Automated Accessibility Auditing with Screenshot API

2026-04-15 | Tags: [tutorial, screenshot-api, accessibility, a11y, testing, python]

Accessibility audits typically happen in two modes: automated static analysis (axe-core, WAVE, Lighthouse) that catches code-level issues, and manual visual review that catches rendering and layout problems that tools miss. The gap between these two modes is where many a11y issues live — things like text that becomes unreadable at large zoom levels, touch targets that are too small on mobile viewports, or focus states that are invisible against certain backgrounds.

Screenshot APIs bridge part of this gap. They can't replace automated axe-core analysis, but they can automate the visual documentation side: capturing what a page actually looks like at different viewports, zoom levels, and focus states.

What Screenshot APIs Contribute to Accessibility Testing

Can do:
- Capture visual evidence at different viewport sizes (mobile, tablet, desktop)
- Screenshot at high zoom levels (200%, 400%) to test text reflow and readability
- Capture pages with forced high-contrast CSS to audit color-independent rendering
- Document focus states by injecting CSS that makes focus rings visible
- Produce consistent visual evidence for auditor reports
- Automate before/after comparisons after a11y fixes

Can't do:
- Run axe-core or other DOM-level accessibility analysis
- Test keyboard navigation sequences interactively
- Detect screen reader compatibility
- Replace manual testing with assistive technology

The right use: combine static analysis tools with screenshot documentation for evidence collection and regression detection.
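
As a rough sketch of that combination, the snippet below merges an axe-core results object with a dictionary of screenshot paths into one report record. It assumes the axe output follows axe-core's usual results shape (a `violations` list whose entries carry `id`, `impact`, and `nodes`); the `build_report` function and the evidence keys are hypothetical names chosen for illustration.

```python
from collections import Counter


def build_report(axe_results: dict, evidence: dict) -> dict:
    """Merge axe-core violations with screenshot evidence paths (illustrative)."""
    violations = axe_results.get("violations", [])
    # Count violations per impact level; axe may report impact as None
    by_impact = Counter(v.get("impact") or "unknown" for v in violations)
    return {
        "static_analysis": {
            "violation_count": len(violations),
            "by_impact": dict(by_impact),
            "rule_ids": sorted(v["id"] for v in violations),
        },
        "visual_evidence": evidence,
    }


# Illustrative inputs only; real axe output contains many more fields.
sample_axe = {
    "violations": [
        {"id": "color-contrast", "impact": "serious", "nodes": [{}, {}]},
        {"id": "label", "impact": "critical", "nodes": [{}]},
    ]
}
sample_evidence = {"zoom_200pct": "./a11y_evidence/home/zoom_200pct.png"}

report = build_report(sample_axe, sample_evidence)
print(report["static_analysis"]["rule_ids"])  # ['color-contrast', 'label']
```

Keeping both halves in one record lets a reviewer see which rule failures have matching visual evidence and which still need a manual look.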

Zoom Level Testing

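One of the most commonly missed accessibility requirements is WCAG 2.1 SC 1.4.4 (Resize Text): text must be resizable up to 200% without loss of content or functionality. SC 1.4.10 (Reflow) goes further and expects content to reflow at 400% zoom, which is equivalent to a 320 CSS px wide viewport. Automated tools rarely catch layout breakage at zoom; a screenshot at 200% zoom does.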

import requests
from pathlib import Path
from typing import Optional

SCREENSHOT_API_KEY = "your-api-key"
SCREENSHOT_API_URL = "https://hermesforge.dev/api/screenshot"


def capture_at_zoom(
    url: str,
    zoom_percent: int,
    base_viewport_width: int = 1280,
    output_dir: str = "./a11y_evidence",
) -> Optional[Path]:
    """
    Capture a page at a specific zoom level.
    Approximates zoom by reducing the viewport width: at 200% zoom,
    a 1280px viewport lays out content like a 640px-wide one.
    The screenshot shows the resulting layout, not magnified pixels.
    """
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    # At N% zoom, content is laid out as if the viewport were base * 100 / N px wide
    effective_width = int(base_viewport_width / (zoom_percent / 100))

    response = requests.get(
        SCREENSHOT_API_URL,
        params={
            "url": url,
            "format": "png",
            "width": effective_width,
            "full_page": "true",
            "wait": "networkidle",
        },
        headers={"X-API-Key": SCREENSHOT_API_KEY},
        timeout=30,
    )

    if response.status_code != 200:
        return None

    filename = f"zoom_{zoom_percent}pct.png"
    output_path = Path(output_dir) / filename
    output_path.write_bytes(response.content)
    return output_path


def audit_zoom_levels(url: str, page_name: str) -> dict:
    """
    Capture the page at WCAG-required zoom levels.
    Returns paths to evidence screenshots.
    """
    output_dir = f"./a11y_evidence/{page_name}"
    zoom_levels = [100, 200, 400]  # SC 1.4.4 requires up to 200%; 400% matches SC 1.4.10 Reflow (AA)

    results = {}
    for zoom in zoom_levels:
        path = capture_at_zoom(url, zoom, output_dir=output_dir)
        results[f"zoom_{zoom}pct"] = str(path) if path else "FAILED"
        print(f"  Zoom {zoom}%: {results[f'zoom_{zoom}pct']}")

    return results
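
To make the zoom arithmetic concrete, here is a small standalone sketch of the width calculation used above. The `effective_width` helper is a hypothetical name; it only repeats the formula from `capture_at_zoom` so the numbers can be checked without calling the API.

```python
def effective_width(base_width: int, zoom_percent: int) -> int:
    """Viewport width (CSS px) that approximates the layout at a given zoom."""
    return int(base_width / (zoom_percent / 100))


widths = {zoom: effective_width(1280, zoom) for zoom in (100, 200, 400)}
print(widths)  # {100: 1280, 200: 640, 400: 320}
```

The 400% case lands on 320 CSS px, the width that SC 1.4.10 (Reflow) uses for its no-horizontal-scrolling requirement.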

Viewport Size Coverage

Different viewport sizes reveal different a11y issues: mobile touch targets, responsive text sizing, collapsing navigation menus that become unusable.

VIEWPORT_SIZES = [
    ("mobile_small", 320, 568),   # iPhone SE (1st gen); 320px is the SC 1.4.10 reflow width
    ("mobile", 375, 667),          # iPhone 8 / SE (2nd gen)
    ("mobile_large", 428, 926),    # iPhone 12/13 Pro Max
    ("tablet", 768, 1024),         # iPad portrait
    ("desktop", 1280, 800),        # Standard desktop
    ("desktop_wide", 1920, 1080),  # Wide desktop
]


def capture_viewport_matrix(url: str, page_name: str) -> list[dict]:
    """
    Capture a page at all standard viewport sizes.
    Documents responsive layout and mobile a11y issues.
    """
    output_dir = f"./a11y_evidence/{page_name}/viewports"
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    results = []
    for name, width, height in VIEWPORT_SIZES:
        response = requests.get(
            SCREENSHOT_API_URL,
            params={
                "url": url,
                "format": "png",
                "width": width,
                "height": height,
                "full_page": "true",
                "wait": "networkidle",
            },
            headers={"X-API-Key": SCREENSHOT_API_KEY},
            timeout=30,
        )

        result = {"viewport": name, "width": width, "height": height}
        if response.status_code == 200:
            path = Path(output_dir) / f"{name}.png"
            path.write_bytes(response.content)
            result["file"] = str(path)
        else:
            result["error"] = response.status_code

        results.append(result)

    return results

High-Contrast Mode Simulation

WCAG 1.4.11 (Non-text Contrast) and 1.4.3 (Contrast (Minimum)) can be partially audited by capturing a forced high-contrast version of the page. This doesn't replicate Windows High Contrast mode and doesn't measure contrast ratios, but it reveals whether the page's information survives when its color scheme is overridden, for example status indicators or borders that exist only as color:

HIGH_CONTRAST_CSS = """
    * {
        background-color: #000000 !important;
        color: #ffffff !important;
        border-color: #ffffff !important;
        outline-color: #ffffff !important;
    }
    img, svg, video {
        filter: invert(1);
    }
    a, a * {
        color: #ffff00 !important;
    }
    button, input, select, textarea {
        background-color: #000000 !important;
        color: #ffffff !important;
        border: 2px solid #ffffff !important;
    }
"""


def capture_high_contrast(url: str, page_name: str) -> Optional[Path]:
    """
    Capture page with forced high-contrast CSS.
    Reveals color-dependent UI that becomes unusable in high-contrast mode.
    """
    output_dir = f"./a11y_evidence/{page_name}"
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    response = requests.get(
        SCREENSHOT_API_URL,
        params={
            "url": url,
            "format": "png",
            "width": 1280,
            "full_page": "true",
            "wait": "networkidle",
            "inject_css": HIGH_CONTRAST_CSS,
        },
        headers={"X-API-Key": SCREENSHOT_API_KEY},
        timeout=30,
    )

    if response.status_code != 200:
        return None

    path = Path(output_dir) / "high_contrast.png"
    path.write_bytes(response.content)
    return path
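
Forced high-contrast captures do not produce contrast numbers, so it can help to check specific color pairs separately. The sketch below implements the WCAG 2.x relative luminance and contrast ratio formulas; the function names are hypothetical, and the colors would have to be sampled by hand (or from the page's CSS), since nothing here reads them from a screenshot.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    r, g, b = (_linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_text_contrast(fg, bg) -> bool:
    """SC 1.4.3 threshold for normal-size text (4.5:1)."""
    return contrast_ratio(fg, bg) >= 4.5


def meets_non_text_contrast(fg, bg) -> bool:
    """SC 1.4.11 threshold for UI components and graphics (3:1)."""
    return contrast_ratio(fg, bg) >= 3.0


white, black = (255, 255, 255), (0, 0, 0)
print(round(contrast_ratio(white, black), 1))  # 21.0
```

For example, the yellow link color used in `HIGH_CONTRAST_CSS` (`#ffff00`) on black passes both thresholds comfortably.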

Focus State Documentation

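Keyboard navigation requires visible focus indicators (WCAG 2.4.7). A common failure: focus styles exist in the CSS but are invisible because they match the background or are too thin. To document which element holds focus, inject CSS that draws an enhanced focus ring. Because the injected ring overrides the page's own focus style, pair it with an unmodified capture when you need to judge the original indicator: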

ENHANCED_FOCUS_CSS = """
    *:focus, *:focus-visible {
        outline: 4px solid #ff6600 !important;
        outline-offset: 2px !important;
        box-shadow: 0 0 0 6px rgba(255, 102, 0, 0.3) !important;
    }
"""


def capture_with_enhanced_focus(url: str, page_name: str) -> Optional[Path]:
    """
    Capture page with enhanced, highly visible focus indicators.
    Use alongside a page that has an interactive element in focus
    (e.g., a URL that auto-focuses a form field on load).
    """
    output_dir = f"./a11y_evidence/{page_name}"
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    response = requests.get(
        SCREENSHOT_API_URL,
        params={
            "url": url,
            "format": "png",
            "width": 1280,
            "full_page": "false",
            "wait": "networkidle",
            "inject_css": ENHANCED_FOCUS_CSS,
        },
        headers={"X-API-Key": SCREENSHOT_API_KEY},
        timeout=30,
    )

    if response.status_code != 200:
        return None

    path = Path(output_dir) / "focus_enhanced.png"
    path.write_bytes(response.content)
    return path

Full Audit Pipeline

Combine all captures into a single audit run:

import json
from datetime import datetime, timezone


def run_a11y_visual_audit(
    url: str,
    page_name: str,
    output_base: str = "./a11y_audits",
) -> dict:
    """
    Run full visual accessibility audit for a URL.
    Produces evidence screenshots for manual review and audit reports.
    """
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC time
    timestamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    audit_dir = f"{output_base}/{page_name}/{timestamp}"

    print(f"Auditing: {url}")
    print(f"Output: {audit_dir}")

    audit = {
        "url": url,
        "page": page_name,
        "timestamp": timestamp,
        "evidence": {},
    }

    # 1. Zoom levels (WCAG 1.4.4)
    print("\n[1/4] Zoom levels...")
    audit["evidence"]["zoom"] = audit_zoom_levels(url, page_name)

    # 2. Viewport matrix
    print("\n[2/4] Viewports...")
    viewport_results = capture_viewport_matrix(url, page_name)
    audit["evidence"]["viewports"] = viewport_results

    # 3. High contrast (WCAG 1.4.11)
    print("\n[3/4] High contrast...")
    hc_path = capture_high_contrast(url, page_name)
    audit["evidence"]["high_contrast"] = str(hc_path) if hc_path else "FAILED"

    # 4. Focus documentation (WCAG 2.4.7)
    print("\n[4/4] Focus states...")
    focus_path = capture_with_enhanced_focus(url, page_name)
    audit["evidence"]["focus_enhanced"] = str(focus_path) if focus_path else "FAILED"

    # Summary
    failures = sum(
        1 for v in audit["evidence"]["zoom"].values() if v == "FAILED"
    ) + sum(
        1 for v in audit["evidence"]["viewports"] if "error" in v
    ) + (1 if audit["evidence"]["high_contrast"] == "FAILED" else 0) + (
        1 if audit["evidence"]["focus_enhanced"] == "FAILED" else 0
    )

    audit["capture_failures"] = failures
    audit["note"] = (
        "Visual evidence only. Combine with axe-core/Lighthouse for complete audit."
    )

    # Save audit record
    Path(audit_dir).mkdir(parents=True, exist_ok=True)
    (Path(audit_dir) / "audit.json").write_text(json.dumps(audit, indent=2))

    return audit


# Usage
result = run_a11y_visual_audit(
    url="https://your-app.com/dashboard",
    page_name="dashboard",
)
print(f"\nCapture failures: {result['capture_failures']}")
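# The capture helpers write screenshots to ./a11y_evidence/<page>/;
# only the JSON record lands in the timestamped audit directory.
print(f"Audit record: ./a11y_audits/dashboard/{result['timestamp']}/audit.json")
print("Screenshots: ./a11y_evidence/dashboard/")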

Regression Detection After Fixes

Once you've documented the baseline state, screenshot comparisons flag visual changes that may signal a11y regressions:

from PIL import Image, ImageChops
import numpy as np


def compare_a11y_screenshots(
    baseline_path: str,
    current_path: str,
    diff_path: str,
    threshold_pct: float = 1.0,
) -> dict:
    """
    Compare two a11y evidence screenshots.
    Returns whether a significant visual change occurred.
    """
    baseline = Image.open(baseline_path).convert("RGB")
    current = Image.open(current_path).convert("RGB")

    # Resize current to match baseline if needed
    if baseline.size != current.size:
        current = current.resize(baseline.size, Image.LANCZOS)

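    diff = ImageChops.difference(baseline, current)
    diff_array = np.array(diff)
    # A pixel counts as changed if any RGB channel differs by more than 10
    # (counting per channel would let the percentage exceed 100)
    changed_pixels = int(np.any(diff_array > 10, axis=-1).sum())
    total_pixels = diff_array.shape[0] * diff_array.shape[1]
    change_pct = (changed_pixels / total_pixels) * 100

    # Save diff image
    diff.save(diff_path)

    # Cast to plain Python types so the result is JSON-serializable
    return {
        "changed": bool(change_pct > threshold_pct),
        "change_pct": round(change_pct, 2),
        "diff_path": diff_path,
    }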
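
Comparisons are usually run over a whole evidence folder, not a single file. The following sketch pairs screenshots from a baseline run and a current run by their relative path; `pair_evidence` and the directory layout are assumptions for illustration, and each returned pair could then be handed to a comparison function such as the one above.

```python
from pathlib import Path


def pair_evidence(baseline_dir: str, current_dir: str) -> dict:
    """Match PNG files in two evidence trees by relative path (illustrative)."""

    def index(root_dir: str) -> dict:
        root = Path(root_dir)
        # Key each file by its path relative to the run's root directory
        return {p.relative_to(root).as_posix(): p for p in root.rglob("*.png")}

    baseline, current = index(baseline_dir), index(current_dir)
    shared = sorted(baseline.keys() & current.keys())
    return {
        "pairs": [(str(baseline[k]), str(current[k])) for k in shared],
        "missing_in_current": sorted(baseline.keys() - current.keys()),
        "new_in_current": sorted(current.keys() - baseline.keys()),
    }
```

Files listed under `missing_in_current` are worth reviewing on their own: a capture that used to succeed and now fails is itself a signal.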

Integrating with CI/CD

Add visual a11y evidence to pull requests that touch UI:

# .github/workflows/a11y-audit.yml
name: Accessibility Visual Audit

on:
  pull_request:
    paths:
      - 'src/components/**'
      - 'src/styles/**'
      - 'public/**'

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install dependencies
        run: pip install requests pillow numpy
      - name: Run visual a11y audit
        run: |
          python3 scripts/a11y_audit.py \
            --url "${{ vars.STAGING_URL }}" \
            --pages home,dashboard,settings \
            --api-key "${{ secrets.SCREENSHOT_API_KEY }}"
      - name: Upload evidence
        uses: actions/upload-artifact@v4
        with:
          name: a11y-evidence-${{ github.sha }}
          path: |
            ./a11y_audits/
            ./a11y_evidence/
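
The workflow calls `scripts/a11y_audit.py`, which this article does not show. A minimal sketch of its command-line handling might look like the following. Everything here is an assumption: the flag names simply mirror the workflow above, the `SCREENSHOT_API_KEY` environment fallback is a convenience, and mapping the `home` page to the site root is a guess about the URL scheme.

```python
import argparse
import os


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the flags used by the CI workflow (hypothetical script)."""
    parser = argparse.ArgumentParser(description="Visual a11y audit runner")
    parser.add_argument("--url", required=True, help="Base URL of the deployment")
    parser.add_argument(
        "--pages",
        required=True,
        # Turn "home,dashboard" into ["home", "dashboard"]
        type=lambda s: [p.strip() for p in s.split(",") if p.strip()],
        help="Comma-separated page names",
    )
    parser.add_argument(
        "--api-key",
        default=os.environ.get("SCREENSHOT_API_KEY"),
        help="API key (falls back to the SCREENSHOT_API_KEY env var)",
    )
    return parser.parse_args(argv)


def page_urls(base_url: str, pages: list[str]) -> dict[str, str]:
    """Build one URL per page; 'home' is assumed to be the site root."""
    base = base_url.rstrip("/")
    return {p: f"{base}/" if p == "home" else f"{base}/{p}" for p in pages}


args = parse_args(["--url", "https://staging.example.com/", "--pages", "home,dashboard"])
print(page_urls(args.url, args.pages))
# {'home': 'https://staging.example.com/', 'dashboard': 'https://staging.example.com/dashboard'}
```

Each resulting URL would then be passed to `run_a11y_visual_audit` with the page name as `page_name`.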

Rate Limit Planning

A typical audit run:
- 3 zoom levels × 1 page = 3 calls
- 6 viewport sizes × 1 page = 6 calls
- High contrast + focus = 2 calls
- Total per page: 11 API calls

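For a 20-page application: 220 calls per full audit run. At the Starter tier (200/day), that is 20 calls over the daily limit, so a full run takes two days or has to drop a few captures. At the Pro tier (1000/day), one day covers up to 90 pages (990 calls), with almost no headroom for re-runs; a 45-page application leaves room for one complete re-run.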
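
The budgeting arithmetic is easy to script when the page count or capture matrix changes. This is a small sketch using the capture counts from this article and the daily limits listed in the pricing line below; the function names are hypothetical.

```python
import math

ZOOM_LEVELS = 3      # 100%, 200%, 400%
VIEWPORTS = 6        # entries in VIEWPORT_SIZES
EXTRA_CAPTURES = 2   # high contrast + enhanced focus


def calls_per_page() -> int:
    return ZOOM_LEVELS + VIEWPORTS + EXTRA_CAPTURES


def days_needed(pages: int, daily_limit: int) -> int:
    """Days required for one full audit run at a given daily quota."""
    return math.ceil(pages * calls_per_page() / daily_limit)


def max_pages_per_day(daily_limit: int) -> int:
    """Largest page count whose full audit fits in a single day."""
    return daily_limit // calls_per_page()


print(calls_per_page())         # 11
print(days_needed(20, 200))     # 2
print(max_pages_per_day(1000))  # 90
```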

hermesforge.dev — screenshot API. Free: 10/day. Starter: $4/30 days (200/day). Pro: $9 (1000/day). Business: $29 (5000/day).