Screenshot API for Fintech Compliance: Web Evidence Archiving for Regulated Firms

2026-05-14 | Tags: [screenshot-api, fintech, compliance, regulatory, archiving]

Financial services firms operate under a regulatory burden that demands documentation of everything: vendor representations at contract signing, regulatory guidance at the time of policy creation, third-party disclosures at the moment of customer interaction. When that documentation exists as web content, screenshot archiving becomes a compliance tool.

This post covers appropriate use cases, implementation patterns, and the audit trail requirements that distinguish defensible records from ad-hoc captures.

Why Fintech Compliance Teams Screenshot Web Content

Vendor due diligence. When your firm evaluates a data vendor, KYC provider, or payment processor, their compliance certifications and security representations live on their website. Procurement teams capture these pages at contract signing. When regulators ask "what certifications did you verify at onboarding?", a timestamped screenshot with a cryptographic hash answers the question, even if the vendor's website says something different by audit time.

Regulatory guidance at point in time. SEC guidance, FinCEN advisories, and CFPB bulletins all change. Your policy documentation references "guidance as of [date]." A screenshot documents what that guidance said on that date. Regulatory websites often update pages without visible versioning or changelogs.

Competitive and pricing claims. FINRA's advertising rules require firms to retain promotional materials. If your firm cites a competitor's fee schedule or a market data provider's pricing in sales materials, you need to document what those pages said when you cited them.

Third-party disclosure verification. Payment processors, embedded finance partners, and API providers publish terms, fees, and disclosures on their sites. Periodic captures demonstrate ongoing due diligence — that your firm verified disclosures were accurate and accessible at monitoring checkpoints.

Implementation: Vendor Certification Archiving

import requests
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

API_KEY = "YOUR_KEY"
SCREENSHOT_API = "https://hermesforge.dev/api/screenshot"

def archive_vendor_certification(
    url: str,
    vendor_name: str,
    vendor_id: str,
    certification_type: str,
    output_dir: str,
    reviewed_by: str
) -> dict:
    """
    Capture and archive a vendor certification or compliance page.
    Produces audit-ready record with hash, timestamp, and metadata.
    """
    timestamp = datetime.now(timezone.utc)
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)

    response = requests.get(
        SCREENSHOT_API,
        params={
            "url": url,
            "width": 1440,
            "height": 900,
            "full_page": True,
            "format": "png",
            "delay": 3000
        },
        headers={"X-API-Key": API_KEY},
        timeout=45
    )
    response.raise_for_status()

    image_bytes = response.content
    sha256 = hashlib.sha256(image_bytes).hexdigest()

    filename = f"{vendor_id}_{certification_type}_{timestamp.strftime('%Y%m%dT%H%M%SZ')}.png"
    filepath = output_path / filename
    filepath.write_bytes(image_bytes)

    record = {
        "schema_version": "1.0",
        "record_type": "vendor_certification_capture",
        "vendor_name": vendor_name,
        "vendor_id": vendor_id,
        "certification_type": certification_type,
        "source_url": url,
        "captured_at": timestamp.isoformat(),
        "sha256": sha256,
        "file_size_bytes": len(image_bytes),
        "file": str(filepath),
        "reviewed_by": reviewed_by,
        "retention_category": "vendor_due_diligence",
        "retention_years": 7  # Adjust to your regulatory requirement
    }

    log_path = output_path / "archive_log.jsonl"
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")

    return record


# Example: Onboarding a payment processor
record = archive_vendor_certification(
    url="https://stripe.com/docs/security/stripe",
    vendor_name="Stripe Inc",
    vendor_id="VENDOR-0042",
    certification_type="pci_dss",
    output_dir="/secure/vendor-records/stripe",
    reviewed_by="compliance-system-v1"
)
print(f"Archived: {record['file']} | SHA256: {record['sha256'][:16]}...")
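
Once records accumulate in archive_log.jsonl, a periodic re-check can confirm that each stored file still matches the hash recorded at capture time. The following is a minimal sketch, assuming the JSONL layout written by archive_vendor_certification above (one JSON object per line with "file" and "sha256" keys). The function name audit_archive_log is hypothetical.

```python
import hashlib
import json
from pathlib import Path


def audit_archive_log(log_path: str) -> list:
    """
    Re-hash every file listed in a JSONL archive log and report its status.
    Assumes each line is a JSON object with "file" and "sha256" keys.
    """
    findings = []
    lines = Path(log_path).read_text().splitlines()
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # Skip blank lines
        record = json.loads(line)
        file_path = Path(record["file"])

        if not file_path.exists():
            status = "missing"
        elif hashlib.sha256(file_path.read_bytes()).hexdigest() != record["sha256"]:
            status = "hash_mismatch"
        else:
            status = "ok"

        findings.append({"line": line_number, "file": record["file"], "status": status})
    return findings
```

Any entry whose status is not "ok" would be a candidate for escalation to whoever owns the archive.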

Regulatory Guidance Pipeline

Firms that reference regulatory guidance in internal policies benefit from a systematic capture workflow:

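import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

import requests

# Reuses API_KEY and SCREENSHOT_API from the vendor certification example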

# Regulatory sources to monitor and archive
REGULATORY_SOURCES = [
    {
        "agency": "SEC",
        "topic": "cybersecurity_disclosure",
        "url": "https://www.sec.gov/rules/final/2023/33-11216.pdf",
        "capture_type": "pdf_url",
        "retention_years": 10
    },
    {
        "agency": "CFPB",
        "topic": "open_banking_rule",
        "url": "https://www.consumerfinance.gov/rules-policy/final-rules/personal-financial-data-rights/",
        "capture_type": "webpage",
        "retention_years": 10
    },
    {
        "agency": "FinCEN",
        "topic": "beneficial_ownership",
        "url": "https://www.fincen.gov/boi",
        "capture_type": "webpage",
        "retention_years": 10
    }
]

def archive_regulatory_guidance(sources: list, output_dir: str) -> list:
    """
    Archive regulatory guidance pages at current state.
    Run when updating internal policies that reference specific guidance.
    """
    results = []
    output_path = Path(output_dir)
    output_path.mkdir(parents=True, exist_ok=True)

    for source in sources:
        try:
            response = requests.get(
                SCREENSHOT_API,
                params={
                    "url": source["url"],
                    "width": 1440,
                    "height": 900,
                    "full_page": True,
                    "format": "png",
                    "delay": 4000  # Regulatory sites can be slow
                },
                headers={"X-API-Key": API_KEY},
                timeout=60
            )

            if response.status_code != 200:
                results.append({
                    **source,
                    "status": "capture_failed",
                    "http_status": response.status_code,
                    "captured_at": datetime.now(timezone.utc).isoformat()
                })
                continue

            image_bytes = response.content
            sha256 = hashlib.sha256(image_bytes).hexdigest()
            timestamp = datetime.now(timezone.utc)

            # Include the time so a second capture on the same day does not overwrite the first
            filename = f"{source['agency'].lower()}_{source['topic']}_{timestamp.strftime('%Y%m%dT%H%M%SZ')}.png"
            filepath = output_path / filename
            filepath.write_bytes(image_bytes)

            record = {
                **source,
                "status": "ok",
                "captured_at": timestamp.isoformat(),
                "sha256": sha256,
                "file": str(filepath),
                "file_size_bytes": len(image_bytes)
            }
            results.append(record)

        except Exception as e:
            results.append({
                **source,
                "status": "error",
                "error": str(e),
                "captured_at": datetime.now(timezone.utc).isoformat()
            })

    # Write summary (timestamped so same-day runs do not overwrite each other)
    summary_path = output_path / f"capture_run_{datetime.now(timezone.utc).strftime('%Y%m%dT%H%M%SZ')}.json"
    summary_path.write_text(json.dumps(results, indent=2))

    return results
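
Because each result carries a status field, failed captures can be fed back into the same function. Here is a small sketch, assuming the result dictionaries returned by archive_regulatory_guidance above. The helper name sources_to_retry and the RESULT_ONLY_KEYS constant are hypothetical.

```python
# Keys that archive_regulatory_guidance adds on top of the original source entry
RESULT_ONLY_KEYS = {
    "status", "http_status", "error", "captured_at",
    "sha256", "file", "file_size_bytes",
}


def sources_to_retry(results: list) -> list:
    """Return the original source entries for every capture that did not succeed."""
    retry = []
    for result in results:
        if result.get("status") != "ok":
            # Strip result fields so the entry matches the REGULATORY_SOURCES shape
            retry.append({k: v for k, v in result.items() if k not in RESULT_ONLY_KEYS})
    return retry
```

The returned list can be passed straight back in, for example archive_regulatory_guidance(sources_to_retry(results), output_dir).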

Periodic Disclosure Monitoring

For third-party partners whose disclosures are material to your own customer obligations, compare each new capture against a stored baseline:

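import hashlib
import time  # For a polling loop around schedule (loop not shown)
from datetime import datetime, timezone
from pathlib import Path

import requests
import schedule  # Third-party scheduler for periodic runs (scheduling loop not shown)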

MONITORED_DISCLOSURES = [
    {
        "partner": "payment_processor_a",
        "page": "fee_schedule",
        "url": "https://partner.example.com/fees",
        "baseline": "baselines/pa_fees.png",
        "alert_on_change": True
    },
    {
        "partner": "data_provider_b",
        "page": "terms_of_service",
        "url": "https://dataprovider.example.com/terms",
        "baseline": "baselines/db_terms.png",
        "alert_on_change": True
    }
]

def check_disclosure_change(monitored_page: dict) -> dict:
    """
    Capture current state and compare hash to baseline.
    Returns change detection result.
    """
    response = requests.get(
        SCREENSHOT_API,
        params={
            "url": monitored_page["url"],
            "width": 1440,
            "full_page": True,
            "format": "png",
            "delay": 2000
        },
        headers={"X-API-Key": API_KEY},
        timeout=45
    )
    response.raise_for_status()

    current_hash = hashlib.sha256(response.content).hexdigest()

    # Load baseline hash (stored from previous known-good capture)
    baseline_hash_path = Path(monitored_page["baseline"]).with_suffix(".sha256")
    baseline_hash = baseline_hash_path.read_text().strip() if baseline_hash_path.exists() else None

    changed = baseline_hash is not None and current_hash != baseline_hash

    if changed:
        # Save the changed version for review
        timestamp = datetime.now(timezone.utc)
        changed_path = f"changes/{monitored_page['partner']}_{monitored_page['page']}_{timestamp.strftime('%Y%m%dT%H%M%SZ')}.png"
        Path(changed_path).parent.mkdir(parents=True, exist_ok=True)
        Path(changed_path).write_bytes(response.content)

    return {
        "partner": monitored_page["partner"],
        "page": monitored_page["page"],
        "url": monitored_page["url"],
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "current_hash": current_hash,
        "baseline_hash": baseline_hash,
        "changed": changed,
        "change_file": changed_path if changed else None
    }
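
check_disclosure_change expects a .sha256 file next to each baseline image but does not create one. Below is a minimal sketch of establishing a baseline from a reviewed capture, assuming the same sidecar naming (the baseline path with its suffix swapped to .sha256). The helper name save_baseline is hypothetical.

```python
import hashlib
from pathlib import Path


def save_baseline(image_bytes: bytes, baseline_path: str) -> str:
    """
    Store a reviewed capture as the baseline image and write its SHA-256
    to a sidecar file with the same name and a .sha256 suffix.
    """
    image_path = Path(baseline_path)
    image_path.parent.mkdir(parents=True, exist_ok=True)
    image_path.write_bytes(image_bytes)

    digest = hashlib.sha256(image_bytes).hexdigest()
    image_path.with_suffix(".sha256").write_text(digest + "\n")
    return digest
```

A byte-level hash changes whenever the rendered image differs at all, so a rotating banner or an on-page date can trigger a "changed" result. Flagged captures are best treated as prompts for human review, not as confirmed disclosure changes.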

Audit Trail Requirements

A screenshot archive is only defensible if the record demonstrates:

  1. When it was captured (timestamp, ideally UTC)
  2. What was captured (source URL, full content)
  3. Integrity (cryptographic hash — SHA-256 minimum)
  4. Why it was captured (documented business purpose)
  5. Chain of custody (who or what system initiated the capture)

The hash is essential. A screenshot file can be modified after capture; a hash stored separately and at capture time provides tamper evidence. For high-stakes compliance records, consider storing hashes in an append-only log or a separate system from the images themselves.

class ComplianceArchive:
    """
    Manages a compliance-grade screenshot archive with integrity verification.
    """
    def __init__(self, archive_dir: str, hash_log_path: str):
        self.archive_dir = Path(archive_dir)
        self.hash_log = Path(hash_log_path)
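        self.archive_dir.mkdir(parents=True, exist_ok=True)
        # Ensure the hash log's directory exists before the first append
        self.hash_log.parent.mkdir(parents=True, exist_ok=True)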

    def capture_and_record(self, url: str, purpose: str, category: str) -> dict:
        timestamp = datetime.now(timezone.utc)

        response = requests.get(
            SCREENSHOT_API,
            params={"url": url, "width": 1440, "full_page": True, "format": "png", "delay": 3000},
            headers={"X-API-Key": API_KEY},
            timeout=45
        )
        response.raise_for_status()

        sha256 = hashlib.sha256(response.content).hexdigest()
        filename = f"{category}_{timestamp.strftime('%Y%m%dT%H%M%SZ')}.png"
        filepath = self.archive_dir / filename
        filepath.write_bytes(response.content)

        record = {
            "timestamp": timestamp.isoformat(),
            "url": url,
            "purpose": purpose,
            "category": category,
            "file": str(filepath),
            "sha256": sha256,
            "size_bytes": len(response.content)
        }

        # Hash log is append-only — separate from image storage
        with open(self.hash_log, "a") as f:
            f.write(json.dumps(record) + "\n")

        return record

    def verify_integrity(self, record: dict) -> bool:
        """Verify that a stored file matches its recorded hash."""
        filepath = Path(record["file"])
        if not filepath.exists():
            return False
        current_hash = hashlib.sha256(filepath.read_bytes()).hexdigest()
        return current_hash == record["sha256"]
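
Opening a file in append mode does not by itself prevent later edits to the log. One common way to make such edits detectable is to chain entries, so that each line stores the hash of the line before it. The sketch below illustrates that idea under stated assumptions: the field names prev_entry_hash and entry_hash and the function names append_chained and verify_chain are hypothetical and are not part of ComplianceArchive.

```python
import hashlib
import json
from pathlib import Path

GENESIS_HASH = "0" * 64  # Placeholder "previous hash" for the first entry


def _entry_hash(record: dict, prev_hash: str) -> str:
    # Hash a canonical JSON form so key order does not affect the result
    payload = json.dumps({"record": record, "prev_entry_hash": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_chained(log_path: str, record: dict) -> dict:
    """Append a record whose hash covers both its content and the previous entry."""
    path = Path(log_path)
    prev_hash = GENESIS_HASH
    if path.exists():
        lines = [line for line in path.read_text().splitlines() if line.strip()]
        if lines:
            prev_hash = json.loads(lines[-1])["entry_hash"]

    entry = {
        "record": record,
        "prev_entry_hash": prev_hash,
        "entry_hash": _entry_hash(record, prev_hash),
    }
    with open(path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def verify_chain(log_path: str) -> bool:
    """Return True only if every entry links to its predecessor and hashes correctly."""
    expected_prev = GENESIS_HASH
    for line in Path(log_path).read_text().splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if entry["prev_entry_hash"] != expected_prev:
            return False
        if entry["entry_hash"] != _entry_hash(entry["record"], entry["prev_entry_hash"]):
            return False
        expected_prev = entry["entry_hash"]
    return True
```

A chain like this detects edits and deletions in the middle of the log. It does not detect removal of the most recent entries unless the latest entry_hash is also kept somewhere else.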

Record Retention

Financial services record retention varies by regulation and record type:

Record Type                             | Typical Retention                | Regulation / Basis
----------------------------------------|----------------------------------|--------------------
Vendor due diligence                    | 7 years after relationship ends  | OCC/Fed guidance
Promotional material references         | 3–6 years                        | FINRA Rule 4511
Regulatory guidance at policy creation  | Duration of policy + 3 years     | General practice
Third-party disclosure snapshots        | 5–7 years                        | Varies by activity

Consult your compliance counsel for requirements specific to your firm's regulatory status and activities.
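
The retention_years value stored with each record can be turned into a concrete review date. Here is a minimal sketch, assuming captured_at is an ISO 8601 string as written by the functions above and that the period is counted from the capture date. Some retention periods start at a later event, such as the end of a vendor relationship, so the start date is an assumption to confirm. The function name retention_expiry is hypothetical.

```python
from datetime import datetime


def retention_expiry(captured_at: str, retention_years: int) -> datetime:
    """Return the datetime at which the retention period ends, counted from capture."""
    start = datetime.fromisoformat(captured_at)
    try:
        return start.replace(year=start.year + retention_years)
    except ValueError:
        # Captured on 29 February and the target year is not a leap year: use 28 February
        return start.replace(year=start.year + retention_years, day=28)
```

For a record captured at 2026-05-14T12:00:00+00:00 with retention_years set to 7, this returns 2033-05-14 12:00 UTC.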

What This Is Not

Screenshot archiving of web content is an evidence preservation practice.

Screenshots document what a website said at a point in time. For regulatory exams, they provide supporting evidence for policies and decisions. Their evidentiary weight depends on the integrity of your capture process and chain of custody.


The screenshot API captures full-page screenshots with timestamp metadata suitable for compliance archiving workflows. First 100 requests free.