Using Screenshots to Build a SaaS Price History Tracker
SaaS companies change their prices more than they admit.
They don't send press releases. They don't tweet about it. They quietly update the pricing page, maybe push a blog post a week later framing it as "improved plans," and move on. If you're not watching, you miss it.
I watch seventeen pricing pages. I've seen three companies change prices in the last two months, one kill a free tier, and two repackage features between plans in ways that are effectively price increases without looking like them.
This is the tool I built to do it.
The Core Problem
Scraping pricing pages sounds simple until you actually try it. SaaS pricing pages are almost universally JavaScript-rendered — React, Vue, Angular, sometimes Next.js with hybrid rendering. The prices live in component state, sometimes fetched from APIs, sometimes behind feature flag layers. The HTML that arrives before JS runs is rarely the HTML that shows you the price.
You have three options:
- Reverse-engineer the billing API — Works if the company has a public API or predictable endpoints. Fragile; requires re-engineering every time they ship.
- Playwright/Puppeteer with DOM extraction — Works but requires per-site selectors. Pricing pages redesign constantly.
- Screenshot the rendered page, diff visually — Works on every site, regardless of stack, and captures what a human actually sees.
I chose screenshots. The insight: I don't need to parse the price as data. I need to know when it changed and what it looked like before and after. Screenshots give me that directly.
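The whole approach hinges on one operation: given two captures of the same region, how much of it changed? Here's a minimal sketch of that comparison using Pillow and NumPy — the function name and the per-channel tolerance of 15 are my choices for illustration; the full script later in this post does the same thing with bounding boxes drawn on top:

```python
import numpy as np
from PIL import Image, ImageDraw

def percent_changed(before: Image.Image, after: Image.Image, tol: int = 15) -> float:
    """Share of pixels where any RGB channel differs by more than `tol`."""
    a = np.array(before.convert('RGB'), dtype=int)
    b = np.array(after.convert('RGB'), dtype=int)
    changed = np.any(np.abs(a - b) > tol, axis=2)
    return changed.mean() * 100

# Two synthetic "captures": the second has a 100x50 block repainted,
# standing in for a price that changed.
before = Image.new('RGB', (400, 200), 'white')
after = before.copy()
ImageDraw.Draw(after).rectangle([50, 50, 149, 99], fill='red')

print(f"{percent_changed(before, after):.1f}% of pixels changed")
```

The tolerance matters: comparing with exact equality would flag JPEG artifacts and sub-pixel font rendering differences; a small per-channel slack ignores those while still catching any real visual change.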
Architecture
┌─────────────────────────────────┐
│ price_tracker.py (scheduler)    │
│  - runs daily at 08:00 UTC      │
│  - reads config/targets.yaml    │
│  - captures each pricing page   │
│  - diffs against last capture   │
│  - if changed: store + notify   │
└─────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────┐
│ data/                           │
│ ├── latest/   (current)         │
│ ├── archive/  (all versions)    │
│ └── reports/  (diff images)     │
└─────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────┐
│ SQLite: price_history.db        │
│  - capture events               │
│  - diff events                  │
│  - change log                   │
└─────────────────────────────────┘
Configuration
I track 17 pages in a YAML config. The key design decision: treat each section of a pricing page separately, using clip to crop to a specific region.
# config/targets.yaml
trackers:
  - name: linear
    url: https://linear.app/pricing
    sections:
      - id: plans_grid
        clip: {x: 0, y: 200, width: 1280, height: 800}
        threshold_pct: 0.5   # any visible change = alert
      - id: enterprise_row
        clip: {x: 0, y: 1000, width: 1280, height: 300}
        threshold_pct: 0.5
  - name: notion
    url: https://www.notion.so/pricing
    sections:
      - id: plans
        clip: {x: 0, y: 100, width: 1280, height: 900}
        threshold_pct: 1.0   # some animation noise at 0.5
  - name: vercel
    url: https://vercel.com/pricing
    sections:
      - id: hobby_pro_enterprise
        clip: {x: 0, y: 0, width: 1280, height: 1200}
        threshold_pct: 0.5
      - id: usage_pricing
        clip: {x: 0, y: 1200, width: 1280, height: 600}
        threshold_pct: 0.5
The clip parameter lets me ignore parts of the page that change constantly (announcements, banners, testimonial carousels) and focus on the plan cards.
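A cheap way to sanity-check clip coordinates before a tracker goes live is to draw each rectangle onto a saved capture and eyeball it. A sketch of that check — the white canvas here is a stand-in for an `Image.open()` of a real saved screenshot, and the preview filename is my own:

```python
import yaml
from PIL import Image, ImageDraw

# A tracker entry in the same shape as config/targets.yaml
entry = yaml.safe_load("""
name: linear
sections:
  - id: plans_grid
    clip: {x: 0, y: 200, width: 1280, height: 800}
""")

# Stand-in for a saved full-height capture, e.g. Image.open('full.png')
full = Image.new('RGB', (1280, 1400), 'white')
draw = ImageDraw.Draw(full)
for sec in entry['sections']:
    c = sec['clip']
    # Outline each clip region in red so it's obvious what will be tracked
    draw.rectangle([c['x'], c['y'], c['x'] + c['width'] - 1, c['y'] + c['height'] - 1],
                   outline='red', width=4)
full.save('clip_preview.png')   # eyeball the boxes before committing the config
```

Ten seconds of looking at the preview beats a week of diffing the wrong region.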
The Capture Script
import io
import requests
import yaml
import sqlite3
import os
import hashlib
import time
from pathlib import Path
from datetime import datetime, timezone
from PIL import Image, ImageChops, ImageDraw
import numpy as np
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage

API_KEY = os.environ['SCREENSHOT_API_KEY']
SCREENSHOT_URL = 'https://hermesforge.dev/api/screenshot'
DATA_DIR = Path('data')
DB_PATH = DATA_DIR / 'price_history.db'


def init_db():
    conn = sqlite3.connect(DB_PATH)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS captures (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            tracker TEXT NOT NULL,
            section TEXT NOT NULL,
            captured_at TEXT NOT NULL,
            image_hash TEXT NOT NULL,
            image_path TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS changes (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            tracker TEXT NOT NULL,
            section TEXT NOT NULL,
            detected_at TEXT NOT NULL,
            before_path TEXT NOT NULL,
            after_path TEXT NOT NULL,
            diff_path TEXT NOT NULL,
            change_pct REAL NOT NULL,
            notified INTEGER DEFAULT 0
        );
    """)
    conn.commit()
    return conn


def capture_section(url, clip, delay=1500):
    """Capture a specific clip region of a page."""
    params = {
        'url': url,
        'width': 1280,
        'height': clip['y'] + clip['height'] + 100,
        'format': 'png',
        'full_page': 'false',
        'delay': delay,
    }
    resp = requests.get(
        SCREENSHOT_URL,
        params=params,
        headers={'X-API-Key': API_KEY},
        timeout=60,
    )
    resp.raise_for_status()
    # Crop to the specified region
    full = Image.open(io.BytesIO(resp.content))
    cropped = full.crop((clip['x'], clip['y'],
                         clip['x'] + clip['width'],
                         clip['y'] + clip['height']))
    return cropped


def image_hash(img):
    """Exact content hash: SHA-256 of the encoded PNG bytes.

    Flags any pixel-level change; the pixel diff downstream decides
    whether the change is big enough to matter."""
    buf = io.BytesIO()
    img.save(buf, format='PNG')
    return hashlib.sha256(buf.getvalue()).hexdigest()[:16]
def compute_diff(before: Image.Image, after: Image.Image):
    """Pixel diff with change percentage and bounding box."""
    # Normalize sizes
    w = min(before.width, after.width)
    h = min(before.height, after.height)
    before_arr = np.array(before.crop((0, 0, w, h)).convert('RGB'))
    after_arr = np.array(after.crop((0, 0, w, h)).convert('RGB'))
    diff = np.abs(before_arr.astype(int) - after_arr.astype(int))
    changed_mask = np.any(diff > 15, axis=2)
    change_pct = changed_mask.sum() / (h * w) * 100
    # Draw a bounding box around the changed region
    diff_img = after.crop((0, 0, w, h)).copy()
    draw = ImageDraw.Draw(diff_img)
    # Find the extent of the changes (simple: rows/cols with changed pixels)
    changed_rows = np.where(changed_mask.any(axis=1))[0]
    changed_cols = np.where(changed_mask.any(axis=0))[0]
    if len(changed_rows) > 0 and len(changed_cols) > 0:
        y0, y1 = int(changed_rows[0]), int(changed_rows[-1])
        x0, x1 = int(changed_cols[0]), int(changed_cols[-1])
        # Red bounding box
        draw.rectangle([x0-2, y0-2, x1+2, y1+2], outline=(255, 0, 0), width=3)
    return diff_img, change_pct
def run_tracker():
    DATA_DIR.mkdir(exist_ok=True)
    (DATA_DIR / 'latest').mkdir(exist_ok=True)
    (DATA_DIR / 'archive').mkdir(exist_ok=True)
    (DATA_DIR / 'reports').mkdir(exist_ok=True)
    conn = init_db()
    config = yaml.safe_load(Path('config/targets.yaml').read_text())
    now = datetime.now(timezone.utc)
    timestamp = now.strftime('%Y%m%dT%H%M%SZ')
    changes_found = []
    for tracker in config['trackers']:
        name = tracker['name']
        url = tracker['url']
        print(f"\n Tracking: {name}")
        for section in tracker['sections']:
            sec_id = section['id']
            threshold = section.get('threshold_pct', 1.0)
            label = f"{name}/{sec_id}"
            try:
                current = capture_section(url, section['clip'])
                current_hash = image_hash(current)
                # Load previous capture
                latest_path = DATA_DIR / 'latest' / f"{name}__{sec_id}.png"
                prev_row = conn.execute(
                    "SELECT image_hash, image_path FROM captures "
                    "WHERE tracker=? AND section=? ORDER BY captured_at DESC LIMIT 1",
                    (name, sec_id)
                ).fetchone()
                if prev_row is None:
                    # First capture — establish baseline
                    current.save(latest_path)
                    archive_path = DATA_DIR / 'archive' / f"{name}__{sec_id}__{timestamp}.png"
                    current.save(archive_path)
                    conn.execute(
                        "INSERT INTO captures (tracker, section, captured_at, image_hash, image_path) "
                        "VALUES (?, ?, ?, ?, ?)",
                        (name, sec_id, now.isoformat(), current_hash, str(archive_path))
                    )
                    conn.commit()
                    print(f" {label}: baseline captured")
                    time.sleep(0.5)
                    continue
                prev_hash, prev_path = prev_row
                if prev_hash == current_hash:
                    print(f" {label}: no change")
                    time.sleep(0.5)
                    continue
                # Hash changed — compute pixel diff for magnitude
                prev_img = Image.open(prev_path)
                diff_img, change_pct = compute_diff(prev_img, current)
                if change_pct < threshold:
                    print(f" {label}: minor change ({change_pct:.2f}% < {threshold}% threshold)")
                    # Still update the latest so minor churn doesn't accumulate
                    current.save(latest_path)
                    time.sleep(0.5)
                    continue
                # Significant change detected
                archive_path = DATA_DIR / 'archive' / f"{name}__{sec_id}__{timestamp}.png"
                diff_path = DATA_DIR / 'reports' / f"{name}__{sec_id}__{timestamp}__diff.png"
                current.save(latest_path)
                current.save(archive_path)
                diff_img.save(diff_path)
                conn.execute(
                    "INSERT INTO captures (tracker, section, captured_at, image_hash, image_path) "
                    "VALUES (?, ?, ?, ?, ?)",
                    (name, sec_id, now.isoformat(), current_hash, str(archive_path))
                )
                conn.execute(
                    "INSERT INTO changes (tracker, section, detected_at, before_path, after_path, diff_path, change_pct) "
                    "VALUES (?, ?, ?, ?, ?, ?, ?)",
                    (name, sec_id, now.isoformat(), prev_path, str(archive_path), str(diff_path), change_pct)
                )
                conn.commit()
                changes_found.append({
                    'tracker': name, 'section': sec_id, 'url': url,
                    'change_pct': change_pct,
                    'before': Image.open(prev_path),
                    'after': current,
                    'diff': diff_img,
                })
                print(f" {label}: CHANGE DETECTED ({change_pct:.2f}%)")
            except Exception as e:
                print(f" {label}: ERROR: {e}")
            time.sleep(0.5)
    conn.close()
    if changes_found:
        send_alert_email(changes_found, timestamp)
        print(f"\n Alert sent: {len(changes_found)} changes")
    else:
        print("\n No significant changes detected.")
    return changes_found


if __name__ == '__main__':
    run_tracker()
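The SHA-256 gate above flags any byte-level difference, including font anti-aliasing jitter if the rendering backend ever changes. I haven't needed anything softer, but if exact hashing proves too twitchy, a perceptual average hash is the usual fix: tiny rendering differences collapse to the same hash, while real layout changes land far apart. A sketch of that alternative (function names are mine, not part of the tracker):

```python
import numpy as np
from PIL import Image

def average_hash(img, size=8):
    """64-bit aHash: downscale to 8x8 grayscale, threshold each pixel at the mean."""
    small = img.convert('L').resize((size, size), Image.LANCZOS)
    px = np.asarray(small, dtype=float)
    bits = (px > px.mean()).flatten()
    return int(''.join('1' if b else '0' for b in bits), 2)

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count('1')

# A vertical gradient and its inversion: structurally opposite images
# should land far apart in Hamming distance, while identical images hash equal.
grad = Image.fromarray(np.repeat(np.arange(256, dtype=np.uint8), 256).reshape(256, 256), 'L')
inv = Image.fromarray(255 - np.asarray(grad), 'L')
print(hamming(average_hash(grad), average_hash(inv)))   # nearly all 64 bits differ
```

You'd then compare hashes by Hamming distance with a small allowance (say, up to 4 bits) instead of requiring exact equality.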
The Alert Email
The most useful feature is the inline diff email. When a change is detected, I get an email with three images side by side: before, after, and a red-box diff showing exactly what moved.
def send_alert_email(changes, timestamp):
    msg = MIMEMultipart('related')
    msg['Subject'] = f"[Price Tracker] {len(changes)} change(s) detected — {timestamp}"
    msg['From'] = os.environ['ALERT_EMAIL_FROM']
    msg['To'] = os.environ['ALERT_EMAIL_TO']
    html_parts = ['<h2>Pricing Page Changes Detected</h2>']
    for i, change in enumerate(changes):
        html_parts.append(f"""
            <h3>{change['tracker']} / {change['section']} — {change['change_pct']:.1f}% changed</h3>
            <p><a href="{change['url']}">{change['url']}</a></p>
            <table>
              <tr>
                <th>Before</th>
                <th>After</th>
                <th>Diff (red = changed)</th>
              </tr>
              <tr>
                <td><img src="cid:before_{i}" width="380"></td>
                <td><img src="cid:after_{i}" width="380"></td>
                <td><img src="cid:diff_{i}" width="380"></td>
              </tr>
            </table>
            <hr>
        """)
    html_body = MIMEText('\n'.join(html_parts), 'html')
    msg.attach(html_body)

    def attach_image(img, cid):
        import io
        buf = io.BytesIO()
        img.save(buf, format='PNG')
        mime_img = MIMEImage(buf.getvalue(), 'png')
        mime_img.add_header('Content-ID', f'<{cid}>')
        mime_img.add_header('Content-Disposition', 'inline')
        msg.attach(mime_img)

    for i, change in enumerate(changes):
        attach_image(change['before'], f'before_{i}')
        attach_image(change['after'], f'after_{i}')
        attach_image(change['diff'], f'diff_{i}')
    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp:
        smtp.login(os.environ['SMTP_USER'], os.environ['SMTP_PASSWORD'])
        smtp.sendmail(msg['From'], msg['To'], msg.as_string())
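One practical note: you don't need to email yourself on every iteration while tuning the HTML. Serializing the message to a .eml file and opening it in a mail client previews the exact layout. The snippet below is a stripped-down stand-in for the real message, with placeholder addresses:

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from pathlib import Path

msg = MIMEMultipart('related')
msg['Subject'] = '[Price Tracker] preview'
msg['From'] = 'tracker@example.com'
msg['To'] = 'me@example.com'
msg.attach(MIMEText('<h2>Pricing Page Changes Detected</h2>', 'html'))

# Most mail clients open .eml files directly, rendering the HTML and inline images
Path('preview.eml').write_text(msg.as_string())
```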
The Historical Record
The SQLite database and archive directory together give you a queryable timeline:
def price_history(tracker_name, section_id=None):
    """Show all change events for a tracker."""
    conn = sqlite3.connect(DB_PATH)
    query = """
        SELECT detected_at, section, change_pct, before_path, after_path
        FROM changes
        WHERE tracker = ?
    """
    params = [tracker_name]
    if section_id:
        query += " AND section = ?"
        params.append(section_id)
    query += " ORDER BY detected_at DESC"
    rows = conn.execute(query, params).fetchall()
    conn.close()
    if not rows:
        print(f"No changes recorded for {tracker_name}")
        return
    print(f"\nChange history: {tracker_name}")
    print(f"{'Date':<25} {'Section':<25} {'Change %':>10}")
    print("-" * 62)
    for row in rows:
        print(f"{row[0][:19]:<25} {row[1]:<25} {row[2]:>9.1f}%")
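For instance, here is the same query shape run against a synthetic in-memory database — the inserted row is made up purely for illustration:

```python
import sqlite3

# In-memory stand-in for price_history.db with one fabricated change event
conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE changes (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        tracker TEXT NOT NULL, section TEXT NOT NULL,
        detected_at TEXT NOT NULL, before_path TEXT NOT NULL,
        after_path TEXT NOT NULL, diff_path TEXT NOT NULL,
        change_pct REAL NOT NULL, notified INTEGER DEFAULT 0
    )
""")
conn.execute(
    "INSERT INTO changes (tracker, section, detected_at, before_path, after_path, diff_path, change_pct) "
    "VALUES ('linear', 'plans_grid', '2024-03-04T08:00:12+00:00', 'a.png', 'b.png', 'd.png', 12.4)"
)

rows = conn.execute(
    "SELECT detected_at, section, change_pct FROM changes "
    "WHERE tracker = ? ORDER BY detected_at DESC", ('linear',)
).fetchall()
for detected_at, section, pct in rows:
    print(f"{detected_at[:19]:<25} {section:<25} {pct:>9.1f}%")
```

Because everything is plain SQLite plus PNG files on disk, any downstream analysis (per-vendor change frequency, time between pricing experiments) is a query away.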
What I've Actually Detected
Running this on seventeen SaaS pricing pages for two months:
- Linear: Changed their Pro plan from $8/user to $10/user (detected the Monday after deployment). The change was in the plan cards section only — the enterprise row stayed identical.
- A billing tool (won't name them): Removed their free tier entirely. The plans grid shrank by one card. A 34% pixel change on the section — unmissable.
- A developer tool: Moved a feature from the Pro plan card to the Business plan card. The listed prices never changed, so a scraper parsing numbers would have seen nothing; visually it was a ~6% change on the section, comfortably above the 0.5% threshold I set for plan cards.
- Notion: Three A/B test variants of their pricing page in one week. Each lasted 2-4 days. Without the tracker, I'd have assumed the page was stable.
The A/B testing observation is the most interesting. SaaS companies run pricing page experiments constantly — trying different feature emphasis, different price anchoring, different plan names. The tracker surfaces this in a way that scraping prices as numbers never would. Numbers can't tell you "they changed the visual weight of the Enterprise CTA" or "they moved the annual discount toggle to a more prominent position."
Cron Setup
# Run at 08:00 UTC daily
0 8 * * * cd /home/user/price-tracker && python3 price_tracker.py >> logs/tracker.log 2>&1
The whole run for seventeen pages takes about 4 minutes (17 pages × ~15 seconds each including diff computation).
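If a capture ever hangs longer than expected, overlapping runs could race on the SQLite file and the latest/ directory. One hedge — assuming a Linux box with util-linux available, and a lock path of my choosing — is to wrap the job in flock:

```shell
# Same schedule, but skip the run entirely if the previous one is still going
0 8 * * * cd /home/user/price-tracker && flock -n /tmp/price_tracker.lock python3 price_tracker.py >> logs/tracker.log 2>&1
```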
Get Your API Key
Free API key at hermesforge.dev/screenshot. The tracker makes one API call per tracked section per day, so seventeen single-section pages come to about 510 calls per month; pages with multiple sections add a call apiece.