Generating PDFs from URLs Without wkhtmltopdf
wkhtmltopdf has been the default tool for URL-to-PDF generation for over a decade. It embeds an old Qt WebKit build to render HTML and print it to PDF. The problem: the web has moved on and that engine has not. CSS Grid, Flexbox, web fonts, and JavaScript-rendered content get partial or no support in wkhtmltopdf's frozen rendering engine. The result is often a PDF that looks nothing like the actual page.
The browser-based alternative uses a real, current Chromium instance, the same engine that renders the page in a Chrome user's browser, to print to PDF. Because the renderer is current, modern layouts and scripts come through intact.
The Core Approach
A screenshot API that uses Playwright or Puppeteer under the hood has access to page.pdf(), which drives the same Chromium print pipeline as Print → Save as PDF in Chrome. The PDF therefore matches Chrome's print output: print styles (@media print) apply, so it can deliberately differ from the on-screen page.
The request looks similar to a screenshot request:
import requests


def url_to_pdf(url: str, api_key: str) -> bytes:
    """Convert a URL to PDF using screenshot API."""
    response = requests.get(
        "https://hermesforge.dev/api/screenshot",
        params={
            "url": url,
            "format": "pdf",
            "wait": "networkidle",
        },
        headers={"X-API-Key": api_key},
        timeout=60,
    )
    response.raise_for_status()
    return response.content


# Usage
pdf_bytes = url_to_pdf("https://example.com/invoice/123", "your-api-key")
with open("invoice_123.pdf", "wb") as f:
    f.write(pdf_bytes)
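For context, here is a rough sketch of what a Playwright-based service might do behind that request. It is an illustration only, not the hermesforge implementation: the function names build_pdf_options and render_pdf_locally are hypothetical, and the defaults (A4, 15mm margins) are assumptions. The Playwright calls themselves (sync_playwright, page.goto with wait_until, page.pdf) are part of Playwright's Python API.

```python
from typing import Any


def build_pdf_options(
    page_format: str = "A4",
    margin_mm: int = 15,
    print_background: bool = True,
    scale: float = 1.0,
) -> dict[str, Any]:
    """Return keyword arguments for Playwright's page.pdf() (hypothetical helper)."""
    if not 0.1 <= scale <= 2.0:
        # Playwright documents 0.1 to 2 as the valid range for scale.
        raise ValueError("scale must be between 0.1 and 2.0")
    margin = f"{margin_mm}mm"
    return {
        "format": page_format,
        "print_background": print_background,
        "scale": scale,
        "margin": {"top": margin, "bottom": margin, "left": margin, "right": margin},
    }


def render_pdf_locally(url: str, **overrides: Any) -> bytes:
    """Render a URL to PDF with a local headless Chromium (hypothetical helper)."""
    # Imported lazily so build_pdf_options works without Playwright installed.
    from playwright.sync_api import sync_playwright

    options = {**build_pdf_options(), **overrides}
    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            page = browser.new_page()
            # Wait until the network is quiet so JavaScript-rendered content is present.
            page.goto(url, wait_until="networkidle")
            return page.pdf(**options)
        finally:
            browser.close()
```

Running this yourself means installing Playwright and its Chromium build, and keeping both patched; that is the infrastructure the HTTP call replaces.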
PDF-Specific Parameters
For PDFs, some parameters differ from screenshots:
Page format: A4 by default, or custom dimensions:
format=pdf&page_format=A4 # Standard A4
format=pdf&page_format=Letter # US Letter
format=pdf&page_format=Legal # US Legal
Margins: Essential for print-quality PDFs:
format=pdf&margin_top=20mm&margin_bottom=20mm&margin_left=15mm&margin_right=15mm
Background graphics: Disabled by default in Chrome's print dialog and in page.pdf(), so background colors and images usually need explicit enabling:
format=pdf&print_background=true
Scale: Adjusts the rendered size. 1.0 = 100%, 0.8 = 80%:
format=pdf&scale=0.9
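The parameters above can be assembled in one place rather than by hand-building query strings. The sketch below is a minimal illustration: build_pdf_params is a hypothetical helper, the parameter names are the ones listed above, and sending print_background as the lowercase string "true" or "false" is an assumption about what the API accepts.

```python
from typing import Optional
from urllib.parse import urlencode


def build_pdf_params(
    url: str,
    page_format: str = "A4",
    margin: Optional[str] = None,
    print_background: Optional[bool] = None,
    scale: Optional[float] = None,
) -> dict[str, str]:
    """Build the query parameters for a PDF request (hypothetical helper)."""
    params = {"url": url, "format": "pdf", "page_format": page_format}
    if margin is not None:
        # Apply the same margin (for example "20mm") to all four sides.
        for side in ("top", "bottom", "left", "right"):
            params[f"margin_{side}"] = margin
    if print_background is not None:
        # Assumption: the API expects a lowercase string flag.
        params["print_background"] = "true" if print_background else "false"
    if scale is not None:
        params["scale"] = str(scale)
    return params


# Usage: pass the dict as `params=` to requests.get, or encode it to inspect the query.
query = urlencode(build_pdf_params("https://example.com/report", page_format="Letter", scale=0.9))
```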
Common Use Cases
Invoices
The most common use case. Build the invoice as an HTML page (you're probably already doing this for the email version), then PDF it:
from pathlib import Path

import requests


def generate_invoice_pdf(invoice_id: str, api_key: str) -> Path:
    """Generate a PDF for an invoice page."""
    invoice_url = f"https://your-app.com/invoices/{invoice_id}?print=true"
    output_path = Path(f"./invoices/invoice_{invoice_id}.pdf")
    output_path.parent.mkdir(parents=True, exist_ok=True)

    if output_path.exists():
        return output_path  # Already generated

    response = requests.get(
        "https://hermesforge.dev/api/screenshot",
        params={
            "url": invoice_url,
            "format": "pdf",
            "wait": "networkidle",
            "margin_top": "15mm",
            "margin_bottom": "15mm",
            "margin_left": "15mm",
            "margin_right": "15mm",
            "print_background": "true",
        },
        headers={"X-API-Key": api_key},
        timeout=60,
    )

    if response.status_code == 200:
        output_path.write_bytes(response.content)
        return output_path

    raise RuntimeError(f"PDF generation failed: {response.status_code}")
The ?print=true query param is a pattern worth adopting: your invoice template can detect this and render a print-optimized version (hide nav, remove buttons, apply print CSS).
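Both sides of that pattern are small. The sketch below uses only the standard library; with_print_flag and wants_print_layout are hypothetical names, and treating only the literal value "true" as the print signal is an assumption. The first function is what the PDF-generating side would call, the second is the check a template or view might run.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit


def with_print_flag(url: str) -> str:
    """Return the URL with print=true set, preserving other query parameters."""
    parts = urlsplit(url)
    # Drop any existing print parameter so the flag is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "print"
    ]
    query.append(("print", "true"))
    return urlunsplit(parts._replace(query=urlencode(query)))


def wants_print_layout(url: str) -> bool:
    """Return True when the request URL asks for the print-optimized version."""
    values = parse_qs(urlsplit(url).query).get("print", [])
    return any(value.lower() == "true" for value in values)
```

In a web framework you would normally read the flag from the request object rather than re-parsing the URL; the parsing here just keeps the example self-contained.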
Reports and Dashboards
Data visualizations built with Chart.js, D3, or Recharts render as they do in the browser because the page's JavaScript runs before the PDF is printed:
def generate_dashboard_report(
    dashboard_url: str,
    report_name: str,
    api_key: str,
    wait_ms: int = 2000,  # Extra time for chart animations
) -> bytes:
    """
    Generate a PDF of a dashboard.
    Use a fixed delay to allow chart animations to complete.
    """
    response = requests.get(
        "https://hermesforge.dev/api/screenshot",
        params={
            "url": dashboard_url,
            "format": "pdf",
            "wait": wait_ms,  # Fixed delay for animations
            "page_format": "A4",
            "print_background": "true",
            "scale": "0.85",  # Slightly reduced to fit wide dashboards
        },
        headers={"X-API-Key": api_key},
        timeout=90,
    )
    response.raise_for_status()
    return response.content
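Before emailing or archiving the returned bytes, it can be worth a cheap check that they really are a PDF and not, say, an HTML error page served with a 200 status. The sketch below is a structural sanity check, not a validator: looks_like_pdf and ensure_pdf are hypothetical names, and the check relies only on the PDF file format's "%PDF-" header and "%%EOF" trailer marker.

```python
def looks_like_pdf(data: bytes) -> bool:
    """Cheap check: PDF header at the start and an EOF marker near the end."""
    # data[-1024:] tolerates trailing whitespace or newlines after %%EOF.
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]


def ensure_pdf(data: bytes) -> bytes:
    """Return the bytes unchanged, or raise if they do not look like a PDF."""
    if not looks_like_pdf(data):
        raise ValueError("response body does not look like a PDF")
    return data
```

A typical use would be wrapping the return value of the report function in ensure_pdf before writing it to disk.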
Contracts and Legal Documents
For long-form documents where page breaks matter, add print-specific CSS to your HTML:
/* print.css: included with ?print=true */
@media print {
  .no-print { display: none; }

  /* Avoid breaking inside sections */
  .section { page-break-inside: avoid; }

  /* Force new page before major sections */
  .chapter { page-break-before: always; }

  /* Show full URLs when printing links */
  a[href]:after { content: " (" attr(href) ")"; }
}
The screenshot API respects @media print rules when generating PDFs, since it uses Chrome's native print mechanism.
Batch PDF Generation
For generating PDFs for multiple pages:
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path

import requests


def generate_pdfs_batch(
    items: list[dict],  # Each: {"url": "...", "output_path": "..."}
    api_key: str,
    max_workers: int = 3,  # Stay within rate limits
) -> list[dict]:
    """Generate PDFs for multiple URLs in parallel."""

    def generate_one(item: dict) -> dict:
        try:
            response = requests.get(
                "https://hermesforge.dev/api/screenshot",
                params={"url": item["url"], "format": "pdf", "wait": "networkidle"},
                headers={"X-API-Key": api_key},
                timeout=60,
            )
            if response.status_code == 200:
                Path(item["output_path"]).parent.mkdir(parents=True, exist_ok=True)
                Path(item["output_path"]).write_bytes(response.content)
                return {"status": "ok", **item}
            return {"status": f"failed: {response.status_code}", **item}
        except Exception as e:
            # Report the failure per item instead of aborting the whole batch.
            return {"status": f"error: {e}", **item}

    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(generate_one, item): item for item in items}
        for future in as_completed(futures):
            results.append(future.result())
    return results
Keep max_workers at 3 or below for the Starter tier (200/day). At the Pro tier (1000/day), you can push to 5-8 concurrent workers.
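Concurrency is only half of staying within a tier; the daily quota caps the total. A minimal sketch of planning around it follows. daily_batches and failed_items are hypothetical helpers, the result shape is the one returned by generate_pdfs_batch above, and whether failed requests count against the quota is an assumption you would need to confirm.

```python
def daily_batches(items: list[dict], daily_limit: int) -> list[list[dict]]:
    """Split items into consecutive chunks no larger than the daily quota."""
    if daily_limit < 1:
        raise ValueError("daily_limit must be at least 1")
    return [items[start:start + daily_limit] for start in range(0, len(items), daily_limit)]


def failed_items(results: list[dict]) -> list[dict]:
    """Return the original items (without the status key) that did not succeed."""
    return [
        {key: value for key, value in result.items() if key != "status"}
        for result in results
        if result.get("status") != "ok"
    ]


# Usage: 450 documents on the Starter tier (200/day) need three daily runs.
items = [{"url": f"https://example.com/doc/{i}", "output_path": f"out/{i}.pdf"} for i in range(450)]
batches = daily_batches(items, daily_limit=200)
```

Feeding failed_items(results) back into the next day's batch gives a simple retry loop without tracking state elsewhere.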
Comparing to wkhtmltopdf
| Feature | wkhtmltopdf | Screenshot API (Chromium) |
|---|---|---|
| CSS Grid support | Partial | Full |
| Flexbox support | Partial | Full |
| JavaScript rendering | Limited | Full |
| Custom fonts | Fragile | Full |
| Modern CSS features | Poor | Current Chrome support |
| Infrastructure required | Binary install | HTTP call |
| Maintenance burden | High (unmaintained) | None |
| Cost | Free (self-hosted) | Per-call |
For straightforward HTML documents, wkhtmltopdf still works. For anything with modern CSS or dynamic content, browser-based PDF generation is the right choice.
Print-Optimizing Your HTML
A few patterns that improve PDF output quality:
<!-- Request a fixed layout width (honored only if the renderer applies the viewport meta tag) -->
<meta name="viewport" content="width=1200">

<!-- Start fetching fonts early so they are ready at render time -->
<link rel="preload" as="font" href="/fonts/main.woff2" crossorigin>

<!-- Explicit print styles -->
<style>
  @page {
    size: A4;
    margin: 15mm;
  }
  body {
    -webkit-print-color-adjust: exact;
    print-color-adjust: exact;
  }
</style>
The print-color-adjust: exact property, paired with its -webkit- prefixed form for older Chromium builds, is critical: without it, Chrome may strip background colors and images from the PDF to save ink.
hermesforge.dev — screenshot API with PDF generation. Free: 10/day. Starter: $4/30 days (200/day). Pro: $9 (1000/day). Business: $29 (5000/day).