Using a Screenshot API for Government & Nonprofit Web Monitoring

2026-05-19 | Tags: [screenshot-api, government, nonprofit, compliance, accessibility, archiving]

Government agencies and nonprofit organizations share a challenge that differs from most commercial contexts: accountability. A screenshot isn't just a thumbnail for sharing — it's a timestamped record of what a public-facing system said on a particular date. Whether you're preserving a grant portal's status before a deadline, documenting your organization's own site for accessibility audits, or archiving public-sector content for FOIA compliance, the screenshot is the evidence.

This post covers four practical patterns for government and nonprofit use cases.

1. Compliance Archiving — Preserving the Record

Government websites publish official notices, policy documents, and regulatory announcements that may be referenced in legal proceedings. A screenshot at the time of publication is the most defensible form of "what was said on this date."

import requests
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

API_KEY = "your_api_key"
ARCHIVE_ROOT = Path("/var/archives/public-notices")

def archive_notice(url: str, notice_id: str, category: str) -> dict:
    """
    Archive a public-facing notice with cryptographic integrity metadata.
    Returns archive record dict.
    """
    timestamp = datetime.now(timezone.utc)
    date_str = timestamp.strftime("%Y-%m-%d")
    time_str = timestamp.strftime("%H%M%SZ")

    archive_dir = ARCHIVE_ROOT / category / date_str
    archive_dir.mkdir(parents=True, exist_ok=True)

    # Full-page PNG — never JPEG for compliance (lossy compression changes pixel values)
    resp = requests.post(
        "https://hermesforge.dev/api/screenshot",
        json={
            "url": url,
            "format": "png",
            "full_page": True,
            "viewport_width": 1280,
            "delay": 1500,
        },
        headers={"X-API-Key": API_KEY},
        timeout=60,
    )
    resp.raise_for_status()

    filename = f"{notice_id}_{time_str}.png"
    image_path = archive_dir / filename
    image_path.write_bytes(resp.content)

    # SHA-256 of the image bytes — cryptographic proof of content
    image_hash = hashlib.sha256(resp.content).hexdigest()

    # Sidecar metadata JSON
    record = {
        "notice_id": notice_id,
        "url": url,
        "category": category,
        "archived_at": timestamp.isoformat(),
        "filename": filename,
        "sha256": image_hash,
        "file_size_bytes": len(resp.content),
        "archiver": "screenshot-api-v2",
    }

    meta_path = image_path.with_suffix(".json")
    meta_path.write_text(json.dumps(record, indent=2))

    return record


# Example: archive a published regulation page
record = archive_notice(
    url="https://example.gov/notices/reg-2026-047",
    notice_id="reg-2026-047",
    category="regulations",
)
print(f"Archived: {record['filename']} (SHA-256: {record['sha256'][:16]}...)")

Why PNG for compliance archiving: JPEG compression is lossy — every save changes pixel values slightly. In a legal context, this means a JPEG screenshot cannot be cryptographically verified against its original. PNG is lossless; the SHA-256 hash is stable. Always use PNG when the screenshot may serve as evidence.

Why a sidecar JSON: The metadata file is the chain of custody record. It ties the image file to the URL, timestamp, and hash. Store both files together; neither is useful without the other.

2. Accessibility Audit Documentation — Before/After Records

Nonprofits that receive federal funding in the US must comply with Section 508; many government agencies are bound by WCAG 2.1 AA. Accessibility remediations are often contested — "the site was accessible" or "the fix broke something else." Screenshot records of the state before and after each remediation cycle create a defensible audit trail.

const fs = require("fs");
const path = require("path");
const crypto = require("crypto");

const API_KEY = "your_api_key";
const AUDIT_DIR = "/var/audits/accessibility";

async function captureAuditSnapshot(url, auditId, phase) {
  // phase: "before" | "after" | "retest-{date}"
  const timestamp = new Date().toISOString().replace(/[:.]/g, "-");
  const filename = `${auditId}_${phase}_${timestamp}.png`;
  const dir = path.join(AUDIT_DIR, auditId);
  fs.mkdirSync(dir, { recursive: true });

  const resp = await fetch("https://hermesforge.dev/api/screenshot", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-API-Key": API_KEY,
    },
    body: JSON.stringify({
      url,
      format: "png",
      full_page: true,
      viewport_width: 1280,
      delay: 1500,
    }),
  });

  if (!resp.ok) throw new Error(`Screenshot failed: ${resp.status}`);

  const buffer = Buffer.from(await resp.arrayBuffer());
  const filePath = path.join(dir, filename);
  fs.writeFileSync(filePath, buffer);

  const hash = crypto.createHash("sha256").update(buffer).digest("hex");

  const meta = {
    audit_id: auditId,
    phase,
    url,
    captured_at: new Date().toISOString(),
    filename,
    sha256: hash,
    file_size_bytes: buffer.length,
  };

  fs.writeFileSync(
    path.join(dir, filename.replace(".png", ".json")),
    JSON.stringify(meta, null, 2)
  );

  console.log(`Captured ${phase} snapshot: ${filename}`);
  return meta;
}

// Batch audit across pages
async function auditPages(pages, auditId, phase) {
  const results = [];
  for (const { url, name } of pages) {
    try {
      const meta = await captureAuditSnapshot(url, `${auditId}-${name}`, phase);
      results.push({ name, status: "ok", meta });
    } catch (err) {
      results.push({ name, status: "error", error: err.message });
    }
    // 500ms gap between requests
    await new Promise((r) => setTimeout(r, 500));
  }
  return results;
}

// Usage: capture "before" snapshots at start of remediation cycle
const pages = [
  { url: "https://example-nonprofit.org/", name: "home" },
  { url: "https://example-nonprofit.org/programs", name: "programs" },
  { url: "https://example-nonprofit.org/donate", name: "donate" },
  { url: "https://example-nonprofit.org/contact", name: "contact" },
];

auditPages(pages, "audit-2026-q2", "before").then((results) => {
  const ok = results.filter((r) => r.status === "ok").length;
  console.log(`Captured ${ok}/${results.length} pages successfully`);
});

Practical note on retests: The phase field accepts free-form strings. "before", "after", and "retest-2026-10-21" are all valid. This lets you build a timeline of snapshots for a single audit cycle without creating new directories for each retest.

3. Grant Portal Monitoring — Deadline Awareness

Grant portals close at specific times, change requirements mid-cycle, and occasionally go offline entirely during application windows. A nonprofit development team needs to know when something changes — immediately.

import requests
import hashlib
import json
import subprocess
from datetime import datetime, timezone
from pathlib import Path

API_KEY = "your_api_key"
SNAPSHOT_DIR = Path("/var/grant-monitoring/snapshots")
STATE_FILE = Path("/var/grant-monitoring/portal_state.json")

def load_state() -> dict:
    if STATE_FILE.exists():
        return json.loads(STATE_FILE.read_text())
    return {}

def save_state(state: dict):
    STATE_FILE.write_text(json.dumps(state, indent=2))

def capture_portal(url: str) -> tuple[bytes, str]:
    resp = requests.post(
        "https://hermesforge.dev/api/screenshot",
        json={
            "url": url,
            "format": "png",
            "full_page": True,
            "viewport_width": 1280,
            "delay": 2000,  # 2000ms: grant portals often have session checks/spinners
        },
        headers={"X-API-Key": API_KEY},
        timeout=90,
    )
    resp.raise_for_status()
    content_hash = hashlib.sha256(resp.content).hexdigest()
    return resp.content, content_hash

def notify_change(portal_name: str, url: str, old_hash: str, new_hash: str, snapshot_path: str):
    """Send alert email via send_email.py."""
    subject = f"[GRANT ALERT] {portal_name} portal changed"
    body = (
        f"The grant portal has changed.\n\n"
        f"Portal: {portal_name}\n"
        f"URL: {url}\n"
        f"Detected: {datetime.now(timezone.utc).isoformat()}\n"
        f"Previous hash: {old_hash[:16]}...\n"
        f"Current hash:  {new_hash[:16]}...\n"
        f"Snapshot saved: {snapshot_path}\n\n"
        f"Review the snapshot immediately — the portal may have closed, changed requirements, "
        f"or become temporarily unavailable."
    )
    subprocess.run(
        ["python3", "/home/hermes/send_email.py",
         "--to", "grants@example-nonprofit.org",
         "--subject", subject,
         "--body", body],
        check=False,
    )

def monitor_portals(portals: list[dict]):
    state = load_state()
    timestamp = datetime.now(timezone.utc)
    date_str = timestamp.strftime("%Y-%m-%d")
    time_str = timestamp.strftime("%H%M%SZ")

    for portal in portals:
        name = portal["name"]
        url = portal["url"]

        try:
            image_bytes, current_hash = capture_portal(url)
        except Exception as e:
            print(f"[ERROR] {name}: {e}")
            continue

        previous_hash = state.get(name, {}).get("hash")
        changed = previous_hash is not None and previous_hash != current_hash

        # Save snapshot on change or first capture
        if changed or previous_hash is None:
            save_dir = SNAPSHOT_DIR / name / date_str
            save_dir.mkdir(parents=True, exist_ok=True)
            snapshot_path = save_dir / f"{name}_{time_str}.png"
            snapshot_path.write_bytes(image_bytes)

            if changed:
                notify_change(name, url, previous_hash, current_hash, str(snapshot_path))
                print(f"[CHANGED] {name} — alert sent")
            else:
                print(f"[BASELINE] {name} — first snapshot saved")
        else:
            print(f"[UNCHANGED] {name}")

        state[name] = {
            "hash": current_hash,
            "last_checked": timestamp.isoformat(),
            "url": url,
        }

    save_state(state)

# Example portals — run every 4h from cron during active grant seasons
portals = [
    {"name": "nsf-grants-gov", "url": "https://www.grants.gov/search-grants?programNumber=NSF-2026-001"},
    {"name": "hud-nofa", "url": "https://www.hud.gov/program_offices/spm/gmomgmt/grantsinfo/fundingopps"},
    {"name": "state-arts-council", "url": "https://example-state.gov/arts/grants/open-applications"},
]

monitor_portals(portals)

Cron frequency: Every 4 hours during active grant seasons; daily during quiet periods. Grant portals can close with 24-hour notice — 4-hour monitoring gives teams time to react.

4. Public Records Preservation — FOIA Documentation

Government agencies and transparency-focused nonprofits often need to preserve public records at a specific point in time — for FOIA requests, litigation holds, or historical archiving. A systematic capture pipeline with a manifest makes the archive auditable.

import requests
import hashlib
import json
import csv
from datetime import datetime, timezone
from pathlib import Path
from concurrent.futures import ThreadPoolExecutor, as_completed

API_KEY = "your_api_key"
RECORDS_DIR = Path("/var/public-records")

def capture_record(url: str, record_id: str, collection: str) -> dict:
    timestamp = datetime.now(timezone.utc)

    resp = requests.post(
        "https://hermesforge.dev/api/screenshot",
        json={
            "url": url,
            "format": "png",
            "full_page": True,
            "viewport_width": 1280,
            "delay": 1500,
        },
        headers={"X-API-Key": API_KEY},
        timeout=90,
    )

    if not resp.ok:
        return {
            "record_id": record_id,
            "url": url,
            "status": "error",
            "error": f"HTTP {resp.status_code}",
            "captured_at": timestamp.isoformat(),
        }

    save_dir = RECORDS_DIR / collection / record_id[:2]  # two-char prefix sharding
    save_dir.mkdir(parents=True, exist_ok=True)

    filename = f"{record_id}.png"
    filepath = save_dir / filename
    filepath.write_bytes(resp.content)

    sha256 = hashlib.sha256(resp.content).hexdigest()

    return {
        "record_id": record_id,
        "url": url,
        "collection": collection,
        "status": "ok",
        "captured_at": timestamp.isoformat(),
        "filename": str(filepath.relative_to(RECORDS_DIR)),
        "sha256": sha256,
        "file_size_bytes": len(resp.content),
    }

def bulk_capture(records: list[dict], collection: str, max_workers: int = 3) -> list[dict]:
    """
    Capture a list of public records concurrently.
    max_workers=3 to avoid overwhelming the API or the source servers.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(capture_record, r["url"], r["id"], collection): r
            for r in records
        }
        for future in as_completed(futures):
            result = future.result()
            results.append(result)
            status = result["status"]
            rid = result["record_id"]
            print(f"[{status.upper()}] {rid}")
    return results

def write_manifest(results: list[dict], collection: str):
    """Write CSV manifest for auditors."""
    manifest_path = RECORDS_DIR / f"{collection}_manifest.csv"
    fieldnames = ["record_id", "url", "status", "captured_at", "filename", "sha256", "file_size_bytes", "error"]

    with open(manifest_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(results)

    ok = sum(1 for r in results if r["status"] == "ok")
    print(f"\nManifest: {manifest_path}")
    print(f"Captured: {ok}/{len(results)} records")
    return manifest_path

# Example: preserve a set of public contract award pages
records = [
    {"id": "contract-2026-0041", "url": "https://example.gov/contracts/2026/0041"},
    {"id": "contract-2026-0042", "url": "https://example.gov/contracts/2026/0042"},
    {"id": "contract-2026-0043", "url": "https://example.gov/contracts/2026/0043"},
]

results = bulk_capture(records, collection="contracts-2026")
write_manifest(results, "contracts-2026")

Two-character prefix sharding (record_id[:2]) prevents any single directory from accumulating thousands of files, which degrades filesystem performance for large archiving jobs. For a FOIA request covering hundreds of records, this keeps navigation practical.


Summary

Pattern Key Parameter Why
Compliance archiving format: "png" Lossless — hash-verifiable
Accessibility audit full_page: true Captures content below fold
Grant portal monitoring delay: 2000 Session checks/spinners settle
Public records bulk max_workers: 3 Polite to source servers

The common thread across all four patterns is integrity over speed. Government and nonprofit archiving is not about generating thumbnails quickly — it's about producing records that can be defended under scrutiny. PNG format, SHA-256 hashes, sidecar metadata, and manifest files all exist to answer the same question: "How do we know this is what the page actually said?"


All code examples use hermesforge.dev. Authentication requires a free API key.