Using Screenshots to Build a SaaS Price History Tracker

2026-05-08 | Tags: [screenshot-api, python, pricing, automation, saas, competitive-intelligence, story]

SaaS companies change their prices more than they admit.

They don't send press releases. They don't tweet about it. They quietly update the pricing page, maybe push a blog post a week later framing it as "improved plans," and move on. If you're not watching, you miss it.

I watch seventeen pricing pages. I've seen three companies change prices in the last two months, one kill a free tier, and two repackage features between plans in ways that are effectively price increases without looking like them.

This is the tool I built to do it.

The Core Problem

Scraping pricing pages sounds simple until you actually try it. SaaS pricing pages are almost universally JavaScript-rendered — React, Vue, Angular, sometimes Next.js with hybrid rendering. The prices live in component state, sometimes fetched from APIs, sometimes behind feature flag layers. The HTML that arrives before JS runs is rarely the HTML that shows you the price.

You have three options:

  1. Reverse-engineer the billing API — Works if the company has a public API or predictable endpoints. Fragile; requires re-engineering every time they ship.
  2. Playwright/Puppeteer with DOM extraction — Works but requires per-site selectors. Pricing pages redesign constantly.
  3. Screenshot the rendered page, diff visually — Works on every site, regardless of stack, and captures what a human actually sees.

I chose screenshots. The insight: I don't need to parse the price as data. I need to know when it changed and what it looked like before and after. Screenshots give me that directly.

Architecture

┌─────────────────────────────────┐
│  price_tracker.py (scheduler)   │
│  - runs daily at 08:00 UTC      │
│  - reads config/targets.yaml    │
│  - captures each pricing page   │
│  - diffs against last capture   │
│  - if changed: store + notify   │
└─────────────────────────────────┘
         │
         ▼
┌─────────────────────────────────┐
│  data/                          │
│  ├── latest/     (current)      │
│  ├── archive/    (all versions) │
│  └── reports/    (diff images)  │
└─────────────────────────────────┘
         │
         ▼
┌─────────────────────────────────┐
│  SQLite: price_history.db       │
│  - capture events               │
│  - diff events                  │
│  - change log                   │
└─────────────────────────────────┘

Configuration

I track 17 pages in a YAML config. The key design decision: treat each section of a pricing page separately, using clip to crop to a specific region.

# config/targets.yaml
trackers:
  - name: linear
    url: https://linear.app/pricing
    sections:
      - id: plans_grid
        clip: {x: 0, y: 200, width: 1280, height: 800}
        threshold_pct: 0.5    # any visible change = alert
      - id: enterprise_row
        clip: {x: 0, y: 1000, width: 1280, height: 300}
        threshold_pct: 0.5

  - name: notion
    url: https://www.notion.so/pricing
    sections:
      - id: plans
        clip: {x: 0, y: 100, width: 1280, height: 900}
        threshold_pct: 1.0    # some animation noise at 0.5

  - name: vercel
    url: https://vercel.com/pricing
    sections:
      - id: hobby_pro_enterprise
        clip: {x: 0, y: 0, width: 1280, height: 1200}
        threshold_pct: 0.5
      - id: usage_pricing
        clip: {x: 0, y: 1200, width: 1280, height: 600}
        threshold_pct: 0.5

The clip parameter lets me ignore parts of the page that change constantly (announcements, banners, testimonial carousels) and focus on the plan cards.
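The clip dict maps directly onto the (left, top, right, bottom) box that PIL's crop expects. A minimal sketch of that mapping; clip_to_box is an illustrative name, not a function from the script below:

```python
def clip_to_box(clip):
    """Convert a {x, y, width, height} clip dict into a PIL crop box."""
    return (clip['x'], clip['y'],
            clip['x'] + clip['width'],
            clip['y'] + clip['height'])

# linear's plans_grid section from the config above:
print(clip_to_box({'x': 0, 'y': 200, 'width': 1280, 'height': 800}))
# → (0, 200, 1280, 1000)
```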

The Capture Script

import requests
import yaml
import sqlite3
import os
import hashlib
import time
from pathlib import Path
from datetime import datetime, timezone
from PIL import Image, ImageChops, ImageDraw
import numpy as np
import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.image import MIMEImage

API_KEY = os.environ['SCREENSHOT_API_KEY']
SCREENSHOT_URL = 'https://hermesforge.dev/api/screenshot'
DATA_DIR = Path('data')
DB_PATH = DATA_DIR / 'price_history.db'

def init_db():
    conn = sqlite3.connect(DB_PATH)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS captures (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            tracker TEXT NOT NULL,
            section TEXT NOT NULL,
            captured_at TEXT NOT NULL,
            image_hash TEXT NOT NULL,
            image_path TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS changes (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            tracker TEXT NOT NULL,
            section TEXT NOT NULL,
            detected_at TEXT NOT NULL,
            before_path TEXT NOT NULL,
            after_path TEXT NOT NULL,
            diff_path TEXT NOT NULL,
            change_pct REAL NOT NULL,
            notified INTEGER DEFAULT 0
        );
    """)
    conn.commit()
    return conn

def capture_section(url, clip, delay=1500):
    """Capture a specific clip region of a page."""
    params = {
        'url': url,
        'width': 1280,
        'height': clip['y'] + clip['height'] + 100,
        'format': 'png',
        'full_page': 'false',
        'delay': delay,
    }
    resp = requests.get(
        SCREENSHOT_URL,
        params=params,
        headers={'X-API-Key': API_KEY},
        timeout=60,
    )
    resp.raise_for_status()

    # Decode the PNG and crop to the configured region
    import io
    full = Image.open(io.BytesIO(resp.content))
    cropped = full.crop((clip['x'], clip['y'],
                         clip['x'] + clip['width'],
                         clip['y'] + clip['height']))
    return cropped

def image_hash(img):
    """Exact content hash: truncated SHA-256 of the encoded PNG bytes."""
    import io
    buf = io.BytesIO()
    img.save(buf, format='PNG')
    return hashlib.sha256(buf.getvalue()).hexdigest()[:16]
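The hash is deliberately exact rather than perceptual: any pixel-level difference flips it, and the pixel diff below then decides whether the difference is big enough to matter. The property is easy to see with plain bytes (content_hash here is a stand-in for image_hash, same truncation):

```python
import hashlib

def content_hash(data: bytes) -> str:
    # Truncated SHA-256, same scheme as image_hash above
    return hashlib.sha256(data).hexdigest()[:16]

print(content_hash(b'Pro plan: $8/user'))  # stable for identical input
print(content_hash(b'Pro plan: $9/user'))  # one byte changed, unrelated hash
```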

def compute_diff(before: Image.Image, after: Image.Image):
    """Pixel diff with change percentage and bounding box."""
    # Normalize sizes
    w = min(before.width, after.width)
    h = min(before.height, after.height)
    before_arr = np.array(before.crop((0, 0, w, h)).convert('RGB'))
    after_arr = np.array(after.crop((0, 0, w, h)).convert('RGB'))

    diff = np.abs(before_arr.astype(int) - after_arr.astype(int))
    changed_mask = np.any(diff > 15, axis=2)
    change_pct = changed_mask.sum() / (h * w) * 100

    # Draw one bounding box around the overall extent of the change
    diff_img = after.crop((0, 0, w, h)).copy()
    draw = ImageDraw.Draw(diff_img)

    # Find the extent of changed pixels (rows/cols containing any change)
    changed_rows = np.where(changed_mask.any(axis=1))[0]
    changed_cols = np.where(changed_mask.any(axis=0))[0]

    if len(changed_rows) > 0 and len(changed_cols) > 0:
        y0, y1 = int(changed_rows[0]), int(changed_rows[-1])
        x0, x1 = int(changed_cols[0]), int(changed_cols[-1])
        # Red bounding box
        draw.rectangle([x0-2, y0-2, x1+2, y1+2], outline=(255, 0, 0), width=3)

    return diff_img, change_pct
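The change-percentage arithmetic in compute_diff is easy to sanity-check with synthetic data: build two identical arrays, repaint a region of known size, and confirm the reported percentage. This sketch works on NumPy arrays directly instead of PIL images:

```python
import numpy as np

before = np.zeros((100, 200, 3), dtype=np.uint8)  # all-black 100x200 "screenshot"
after = before.copy()
after[10:30, 50:150] = 255                        # repaint a 20x100 pixel block

# Same tolerance as compute_diff: a pixel counts as changed if any
# channel moved by more than 15 levels
diff = np.abs(before.astype(int) - after.astype(int))
changed_mask = np.any(diff > 15, axis=2)
change_pct = changed_mask.sum() / changed_mask.size * 100

print(f"{change_pct:.1f}% changed")  # → 10.0% changed (2000 of 20000 pixels)
```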

def run_tracker():
    DATA_DIR.mkdir(exist_ok=True)
    (DATA_DIR / 'latest').mkdir(exist_ok=True)
    (DATA_DIR / 'archive').mkdir(exist_ok=True)
    (DATA_DIR / 'reports').mkdir(exist_ok=True)

    conn = init_db()
    config = yaml.safe_load(Path('config/targets.yaml').read_text())
    now = datetime.now(timezone.utc)
    timestamp = now.strftime('%Y%m%dT%H%M%SZ')
    changes_found = []

    for tracker in config['trackers']:
        name = tracker['name']
        url = tracker['url']
        print(f"\n  Tracking: {name}")

        for section in tracker['sections']:
            sec_id = section['id']
            threshold = section.get('threshold_pct', 1.0)
            label = f"{name}/{sec_id}"

            try:
                current = capture_section(url, section['clip'])
                current_hash = image_hash(current)

                # Load previous capture
                latest_path = DATA_DIR / 'latest' / f"{name}__{sec_id}.png"
                prev_row = conn.execute(
                    "SELECT image_hash, image_path FROM captures "
                    "WHERE tracker=? AND section=? ORDER BY captured_at DESC LIMIT 1",
                    (name, sec_id)
                ).fetchone()

                if prev_row is None:
                    # First capture — establish baseline
                    current.save(latest_path)
                    archive_path = DATA_DIR / 'archive' / f"{name}__{sec_id}__{timestamp}.png"
                    current.save(archive_path)
                    conn.execute(
                        "INSERT INTO captures (tracker, section, captured_at, image_hash, image_path) "
                        "VALUES (?, ?, ?, ?, ?)",
                        (name, sec_id, now.isoformat(), current_hash, str(archive_path))
                    )
                    conn.commit()
                    print(f"    {label}: baseline captured")
                    time.sleep(0.5)
                    continue

                prev_hash, prev_path = prev_row
                if prev_hash == current_hash:
                    print(f"    {label}: no change")
                    time.sleep(0.5)
                    continue

                # Hash changed — compute pixel diff for magnitude
                prev_img = Image.open(prev_path)
                diff_img, change_pct = compute_diff(prev_img, current)

                if change_pct < threshold:
                    print(f"    {label}: minor change ({change_pct:.2f}% < {threshold}% threshold)")
                    # Record the capture as the new baseline anyway; otherwise
                    # the same sub-threshold diff would re-trigger on every run
                    current.save(latest_path)
                    archive_path = DATA_DIR / 'archive' / f"{name}__{sec_id}__{timestamp}.png"
                    current.save(archive_path)
                    conn.execute(
                        "INSERT INTO captures (tracker, section, captured_at, image_hash, image_path) "
                        "VALUES (?, ?, ?, ?, ?)",
                        (name, sec_id, now.isoformat(), current_hash, str(archive_path))
                    )
                    conn.commit()
                    time.sleep(0.5)
                    continue

                # Significant change detected
                archive_path = DATA_DIR / 'archive' / f"{name}__{sec_id}__{timestamp}.png"
                diff_path = DATA_DIR / 'reports' / f"{name}__{sec_id}__{timestamp}__diff.png"
                current.save(latest_path)
                current.save(archive_path)
                diff_img.save(diff_path)

                conn.execute(
                    "INSERT INTO captures (tracker, section, captured_at, image_hash, image_path) "
                    "VALUES (?, ?, ?, ?, ?)",
                    (name, sec_id, now.isoformat(), current_hash, str(archive_path))
                )
                conn.execute(
                    "INSERT INTO changes (tracker, section, detected_at, before_path, after_path, diff_path, change_pct) "
                    "VALUES (?, ?, ?, ?, ?, ?, ?)",
                    (name, sec_id, now.isoformat(), prev_path, str(archive_path), str(diff_path), change_pct)
                )
                conn.commit()
                changes_found.append({
                    'tracker': name, 'section': sec_id, 'url': url,
                    'change_pct': change_pct,
                    'before': Image.open(prev_path),
                    'after': current,
                    'diff': diff_img,
                })
                print(f"    {label}: CHANGE DETECTED ({change_pct:.2f}%)")

            except Exception as e:
                print(f"    {label}: ERROR: {e}")

            time.sleep(0.5)

    conn.close()

    if changes_found:
        send_alert_email(changes_found, timestamp)
        print(f"\n  Alert sent: {len(changes_found)} changes")
    else:
        print("\n  No significant changes detected.")

    return changes_found

The Alert Email

The most useful feature is the inline diff email. When a change is detected, I get an email with three images side by side: before, after, and a red-box diff showing exactly what moved.

def send_alert_email(changes, timestamp):
    msg = MIMEMultipart('related')
    msg['Subject'] = f"[Price Tracker] {len(changes)} change(s) detected — {timestamp}"
    msg['From'] = os.environ['ALERT_EMAIL_FROM']
    msg['To'] = os.environ['ALERT_EMAIL_TO']

    html_parts = ['<h2>Pricing Page Changes Detected</h2>']

    for i, change in enumerate(changes):
        html_parts.append(f"""
        <h3>{change['tracker']} / {change['section']} — {change['change_pct']:.1f}% changed</h3>
        <p><a href="{change['url']}">{change['url']}</a></p>
        <table>
        <tr>
          <th>Before</th>
          <th>After</th>
          <th>Diff (red = changed)</th>
        </tr>
        <tr>
          <td><img src="cid:before_{i}" width="380"></td>
          <td><img src="cid:after_{i}" width="380"></td>
          <td><img src="cid:diff_{i}" width="380"></td>
        </tr>
        </table>
        <hr>
        """)

    html_body = MIMEText('\n'.join(html_parts), 'html')
    msg.attach(html_body)

    def attach_image(img, cid):
        import io
        buf = io.BytesIO()
        img.save(buf, format='PNG')
        mime_img = MIMEImage(buf.getvalue(), 'png')
        mime_img.add_header('Content-ID', f'<{cid}>')
        mime_img.add_header('Content-Disposition', 'inline')
        msg.attach(mime_img)

    for i, change in enumerate(changes):
        attach_image(change['before'], f'before_{i}')
        attach_image(change['after'], f'after_{i}')
        attach_image(change['diff'], f'diff_{i}')

    with smtplib.SMTP_SSL('smtp.gmail.com', 465) as smtp:
        smtp.login(os.environ['SMTP_USER'], os.environ['SMTP_PASSWORD'])
        smtp.sendmail(msg['From'], msg['To'], msg.as_string())

The Historical Record

The SQLite database and archive directory together give you a queryable timeline:

def price_history(tracker_name, section_id=None):
    """Show all change events for a tracker."""
    conn = sqlite3.connect(DB_PATH)
    query = """
        SELECT detected_at, section, change_pct, before_path, after_path
        FROM changes
        WHERE tracker = ?
    """
    params = [tracker_name]
    if section_id:
        query += " AND section = ?"
        params.append(section_id)
    query += " ORDER BY detected_at DESC"

    rows = conn.execute(query, params).fetchall()
    conn.close()

    if not rows:
        print(f"No changes recorded for {tracker_name}")
        return

    print(f"\nChange history: {tracker_name}")
    print(f"{'Date':<25} {'Section':<25} {'Change %':>10}")
    print("-" * 62)
    for row in rows:
        print(f"{row[0][:19]:<25} {row[1]:<25} {row[2]:>9.1f}%")

What I've Actually Detected

Running this on seventeen SaaS pricing pages for two months surfaced everything I mentioned up top: three price changes, one killed free tier, and two feature repackagings that were effectively stealth price increases. It also surfaced a steady stream of A/B tests.

The A/B testing observation is the most interesting. SaaS companies run pricing page experiments constantly — trying different feature emphasis, different price anchoring, different plan names. The tracker surfaces this in a way that scraping prices as numbers never would. Numbers can't tell you "they changed the visual weight of the Enterprise CTA" or "they moved the annual discount toggle to a more prominent position."

Cron Setup

# Run at 08:00 UTC daily
0 8 * * * cd /home/user/price-tracker && python3 price_tracker.py >> logs/tracker.log 2>&1

The whole run for seventeen pages takes about 4 minutes (17 pages × ~15 seconds each including diff computation).

Get Your API Key

Free API key at hermesforge.dev/screenshot. A seventeen-page daily tracker runs about 510 API calls per month (17 × 30); note that each configured section is its own capture, so pages split into multiple sections cost one call per section.