Generating PDFs from URLs Without wkhtmltopdf

2026-04-18 | Tags: [tutorial, screenshot-api, pdf, python, automation]

wkhtmltopdf has been the default tool for URL-to-PDF generation for over a decade. It renders HTML with an embedded, stripped-down build of Qt WebKit and prints the result to PDF. The problem: the web platform has moved on and that engine has not. CSS Grid, Flexbox, modern web fonts, and JavaScript-rendered content are supported only partially, or not at all, by wkhtmltopdf's frozen rendering engine. The result is PDFs that can look very different from the page in a current browser.

The browser-based alternative uses a real, current Chromium instance (the same engine that renders the page for anyone browsing with Chrome) to print to PDF. The fidelity is dramatically better.

The Core Approach

A screenshot API that uses Playwright or Puppeteer under the hood has access to page.pdf(), which drives the same print pipeline Chrome uses when you hit Print → Save as PDF. The resulting PDF matches Chrome's print rendering of the page, so print styles apply by default.
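For intuition, here is roughly what such a service might do internally. This is a minimal sketch, assuming the Playwright Python package and its Chromium build are installed; pdf_options and render_pdf are illustrative names, not part of any hosted API.

```python
from typing import Any


def pdf_options(
    page_format: str = "A4",
    print_background: bool = True,
    margin_mm: int = 15,
    scale: float = 1.0,
) -> dict[str, Any]:
    """Build keyword arguments for Playwright's page.pdf()."""
    margin = f"{margin_mm}mm"
    return {
        "format": page_format,
        "print_background": print_background,
        "margin": {"top": margin, "right": margin, "bottom": margin, "left": margin},
        "scale": scale,
    }


def render_pdf(url: str, **overrides: Any) -> bytes:
    """Load a URL in headless Chromium and return the printed PDF as bytes."""
    # Imported lazily so pdf_options() stays usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            # Wait for network activity to settle before printing.
            page.goto(url, wait_until="networkidle")
            return page.pdf(**pdf_options(**overrides))
        finally:
            browser.close()
```

A hosted API wraps this kind of logic behind an HTTP endpoint, so none of it has to run on your own servers.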

The request looks similar to a screenshot request:

import requests

def url_to_pdf(url: str, api_key: str) -> bytes:
    """Convert a URL to PDF using screenshot API."""
    response = requests.get(
        "https://hermesforge.dev/api/screenshot",
        params={
            "url": url,
            "format": "pdf",
            "wait": "networkidle",
        },
        headers={"X-API-Key": api_key},
        timeout=60,
    )
    response.raise_for_status()
    return response.content


# Usage
pdf_bytes = url_to_pdf("https://example.com/invoice/123", "your-api-key")
with open("invoice_123.pdf", "wb") as f:
    f.write(pdf_bytes)
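An HTTP 200 does not by itself guarantee that the body is a PDF; an error page or an image could come back instead. A cheap sanity check, sketched here with hypothetical helper names, is to confirm the payload starts with the %PDF- signature before writing it to disk.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Return True if the payload starts with the PDF file signature."""
    return data.startswith(b"%PDF-")


def save_pdf(data: bytes, path: str) -> None:
    """Write the payload to disk, refusing anything that is not a PDF."""
    if not looks_like_pdf(data):
        raise ValueError("response body does not look like a PDF")
    with open(path, "wb") as f:
        f.write(data)
```

Calling save_pdf(pdf_bytes, "invoice_123.pdf") in place of the bare write turns a silent bad file into an immediate error.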

PDF-Specific Parameters

For PDFs, some parameters differ from screenshots:

Page format: A4 by default, with other standard paper sizes selectable by name:

format=pdf&page_format=A4          # Standard A4
format=pdf&page_format=Letter      # US Letter
format=pdf&page_format=Legal       # US Legal

Margins: Essential for print-quality PDFs:

format=pdf&margin_top=20mm&margin_bottom=20mm&margin_left=15mm&margin_right=15mm

Background graphics: Off by default in Chrome's print dialog and in page.pdf(), so background colors and images usually need explicit enabling:

format=pdf&print_background=true

Scale: Adjusts the rendered size. 1.0 = 100%, 0.8 = 80%:

format=pdf&scale=0.9
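The parameters above can be collected in one place. The sketch below builds them as a dict ready to pass to requests. build_pdf_params is a hypothetical helper, the parameter names are taken from the examples above, and the 0.1 to 2 scale range is the limit Chromium's print-to-PDF accepts, assumed here to apply to the API as well.

```python
from typing import Optional

VALID_SIDES = ("top", "bottom", "left", "right")


def build_pdf_params(
    url: str,
    page_format: str = "A4",
    margins_mm: Optional[dict[str, int]] = None,
    print_background: bool = True,
    scale: float = 1.0,
) -> dict[str, str]:
    """Assemble query parameters for a PDF request."""
    if not 0.1 <= scale <= 2.0:
        raise ValueError(f"scale must be between 0.1 and 2, got {scale}")

    params = {
        "url": url,
        "format": "pdf",
        "page_format": page_format,
        "print_background": "true" if print_background else "false",
        "scale": str(scale),
    }
    # Only emit margins that were explicitly requested.
    for side, value in (margins_mm or {}).items():
        if side not in VALID_SIDES:
            raise ValueError(f"unknown margin side: {side}")
        params[f"margin_{side}"] = f"{value}mm"
    return params
```

Keeping the validation next to the parameter names means a typo such as margin_tpo fails locally instead of being silently ignored by the server.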

Common Use Cases

Invoices

The most common use case. Build the invoice as an HTML page (you're probably already doing this for the email version), then PDF it:

from pathlib import Path

def generate_invoice_pdf(invoice_id: str, api_key: str) -> Path:
    """Generate a PDF for an invoice page."""
    invoice_url = f"https://your-app.com/invoices/{invoice_id}?print=true"
    output_path = Path(f"./invoices/invoice_{invoice_id}.pdf")
    output_path.parent.mkdir(parents=True, exist_ok=True)

    if output_path.exists():
        return output_path  # Already generated

    response = requests.get(
        "https://hermesforge.dev/api/screenshot",
        params={
            "url": invoice_url,
            "format": "pdf",
            "wait": "networkidle",
            "margin_top": "15mm",
            "margin_bottom": "15mm",
            "margin_left": "15mm",
            "margin_right": "15mm",
            "print_background": "true",
        },
        headers={"X-API-Key": api_key},
        timeout=60,
    )

    if response.status_code == 200:
        output_path.write_bytes(response.content)
        return output_path

    raise RuntimeError(f"PDF generation failed: {response.status_code}")

The ?print=true query param is a pattern worth adopting: your invoice template can detect this and render a print-optimized version (hide nav, remove buttons, apply print CSS).
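On the application side, the detection can be as small as one query-string check. The sketch below uses only the standard library; is_print_mode and the template file names are hypothetical stand-ins for whatever your web framework provides.

```python
from urllib.parse import parse_qs, urlsplit


def is_print_mode(url: str) -> bool:
    """Return True when the URL carries print=true in its query string."""
    query = parse_qs(urlsplit(url).query)
    # If the parameter is repeated, the last value wins.
    return query.get("print", [""])[-1].lower() == "true"


def choose_template(url: str) -> str:
    """Pick the print-optimized template when print mode is requested."""
    return "invoice_print.html" if is_print_mode(url) else "invoice.html"
```

The same URL then serves both audiences: browsers get the interactive page, and the PDF request gets the stripped-down layout.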

Reports and Dashboards

Data visualizations built with Chart.js, D3, or Recharts render faithfully because the page's JavaScript runs in a full browser engine before the PDF is printed:

def generate_dashboard_report(
    dashboard_url: str,
    report_name: str,
    api_key: str,
    wait_ms: int = 2000,  # Extra time for chart animations
) -> bytes:
    """
    Generate a PDF of a dashboard.
    Use a fixed delay to allow chart animations to complete.
    """
    response = requests.get(
        "https://hermesforge.dev/api/screenshot",
        params={
            "url": dashboard_url,
            "format": "pdf",
            "wait": wait_ms,  # Fixed delay for animations
            "page_format": "A4",
            "print_background": "true",
            "scale": "0.85",  # Slightly reduced to fit wide dashboards
        },
        headers={"X-API-Key": api_key},
        timeout=90,
    )

    response.raise_for_status()
    return response.content
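The 0.85 scale in the example is a judgment call. A rough way to estimate a starting value is to divide the printable page width by the dashboard's layout width. This sketch assumes Chromium lays the page out at the printable width divided by the scale, and uses the CSS definition of 96 pixels per inch; fit_scale is a hypothetical helper, and the result is worth checking against real output.

```python
CSS_PX_PER_MM = 96 / 25.4  # CSS defines 1in = 96px = 25.4mm


def fit_scale(
    content_width_px: float,
    page_width_mm: float = 210.0,  # A4 portrait width
    margin_left_mm: float = 15.0,
    margin_right_mm: float = 15.0,
) -> float:
    """Estimate the scale at which content of a given width fits the page."""
    if content_width_px <= 0:
        raise ValueError("content_width_px must be positive")
    printable_px = (page_width_mm - margin_left_mm - margin_right_mm) * CSS_PX_PER_MM
    scale = printable_px / content_width_px
    # Never enlarge narrow content, and stay above Chromium's 0.1 minimum.
    return round(max(0.1, min(scale, 1.0)), 2)


# A 1200px-wide dashboard on A4 with 15mm side margins:
# fit_scale(1200) -> 0.57
```

If the estimate comes out much lower than expected, a landscape page or a narrower print layout is usually a better fix than shrinking the content further.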

For long-form documents where page breaks matter, add print-specific CSS to your HTML:

/* print.css — included with ?print=true */
@media print {
  .no-print { display: none; }

  /* Avoid breaking inside sections */
  .section { page-break-inside: avoid; }

  /* Force new page before major sections */
  .chapter { page-break-before: always; }

  /* Show full URLs when printing links */
  a[href]:after { content: " (" attr(href) ")"; }
}

The screenshot API respects @media print rules when generating PDFs, since it uses Chrome's native print mechanism.

Batch PDF Generation

For generating PDFs for multiple pages:

from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

import requests

def generate_pdfs_batch(
    items: list[dict],  # Each: {"url": "...", "output_path": "..."}
    api_key: str,
    max_workers: int = 3,  # Stay within rate limits
) -> list[dict]:
    """Generate PDFs for multiple URLs in parallel."""

    def generate_one(item: dict) -> dict:
        try:
            response = requests.get(
                "https://hermesforge.dev/api/screenshot",
                params={"url": item["url"], "format": "pdf", "wait": "networkidle"},
                headers={"X-API-Key": api_key},
                timeout=60,
            )
            if response.status_code == 200:
                Path(item["output_path"]).parent.mkdir(parents=True, exist_ok=True)
                Path(item["output_path"]).write_bytes(response.content)
                return {"status": "ok", **item}
            return {"status": f"failed: {response.status_code}", **item}
        except Exception as e:
            return {"status": f"error: {e}", **item}

    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(generate_one, item): item for item in items}
        for future in as_completed(futures):
            results.append(future.result())

    return results

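Keep max_workers at 3 or below on the Starter tier (200/day). On the Pro tier (1000/day), you can push to 5 to 8 concurrent workers. Either way, the daily quota still caps the total: a batch larger than the remaining quota will start failing partway through.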
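Parallel batches are where throttling tends to surface. The sketch below wraps a single request in retries with exponential backoff. It assumes the API signals throttling with HTTP 429 or a transient 5xx status, which is not confirmed here; fetch is any zero-argument callable that returns an object with a status_code attribute, such as a functools.partial around requests.get.

```python
import time
from typing import Any, Callable

# Assumed throttling and transient-failure statuses; adjust to the API's behavior.
RETRYABLE = {429, 500, 502, 503, 504}


def fetch_with_backoff(
    fetch: Callable[[], Any],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        response = fetch()
        if response.status_code not in RETRYABLE:
            return response
        # Wait 1x, 2x, 4x, ... the base delay, but not after the final attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return response  # Last response, still carrying a retryable status
```

Inside generate_one, the requests.get call would be wrapped in a lambda or functools.partial and passed as fetch; the status check that follows stays the same.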

Comparing to wkhtmltopdf

| Feature                 | wkhtmltopdf         | Screenshot API (Chromium) |
|-------------------------|---------------------|---------------------------|
| CSS Grid support        | Partial             | Full                      |
| Flexbox support         | Partial             | Full                      |
| JavaScript rendering    | Limited             | Full                      |
| Custom fonts            | Fragile             | Full                      |
| Modern CSS features     | Poor                | Current Chrome support    |
| Infrastructure required | Binary install      | HTTP call                 |
| Maintenance burden      | High (unmaintained) | None                      |
| Cost                    | Free (self-hosted)  | Per-call                  |

For straightforward HTML documents, wkhtmltopdf still works. For anything with modern CSS or dynamic content, browser-based PDF generation is the right choice.

A few patterns that improve PDF output quality:

<!-- Force exact dimensions for PDF -->
<meta name="viewport" content="width=1200">

<!-- Load all fonts before render -->
<link rel="preload" as="font" href="/fonts/main.woff2" crossorigin>

<!-- Explicit print styles -->
<style>
  @page {
    size: A4;
    margin: 15mm;
  }
  body {
    -webkit-print-color-adjust: exact;
    print-color-adjust: exact;
  }
</style>

The print-color-adjust: exact declaration (shown with its -webkit- prefixed form) is important: without it, Chrome may omit background colors and images from the PDF to save ink.


hermesforge.dev — screenshot API with PDF generation. Free: 10/day. Starter: $4/30 days (200/day). Pro: $9 (1000/day). Business: $29 (5000/day).