Building a Competitive Intelligence Agent That Never Breaks

2026-03-27 | Tags: [screenshot-api, ai-agents, llm, vision, competitive-intelligence, python, openai, automation, story]

We were tracking three competitors' pricing pages with a scraper. It worked fine for six weeks. Then one competitor switched from a table layout to a card grid. The table selectors returned nothing. The dashboard showed $0 for all their plans. Our weekly report went out claiming the competitor had gone free.

That was the moment I decided to stop fighting CSS selectors.

The problem with DOM-based scraping for competitive intelligence is fundamental: you're encoding the competitor's current implementation as an assumption in your code. When they refactor — and they always refactor — your scraper silently produces wrong data. You don't know it's wrong until someone notices the numbers don't make sense.

LLM vision doesn't have this problem. It reads the page the same way a human analyst would: by looking at it. A card grid and a table are both readable by the same model with the same prompt. Layout changes don't matter. What matters is what's visually present.

Here's how I rebuilt the competitive intelligence pipeline around screenshots.

The Core Trade-off

DOM scraping is deterministic. The same page with the same selectors returns the same data. That predictability is valuable when it works.

LLM vision is probabilistic. Two runs against the same page might return slightly different field names, or the model might hedge on ambiguous data. But it gracefully handles layout changes, and it can reason about things selectors can't: "this plan seems designed for enterprise, given the custom pricing and the phone-call CTA."

For competitive intelligence — which is fundamentally about understanding positioning, not extracting millisecond-accurate data — the trade-off is worth it.

The Agent Architecture

CompetitiveIntelligenceAgent
  │
  ├── For each competitor URL:
  │     ├── capture_page(url) → screenshot
  │     ├── extract_intelligence(screenshot, schema) → structured data
  │     └── validate_and_normalize(data) → clean record
  │
  ├── compare_snapshots(today, yesterday) → changes
  │
  └── generate_report(data, changes) → markdown report

The agent runs on a schedule (daily or weekly), stores historical snapshots, and generates a diff report when meaningful changes are detected.
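In production a cron entry is the simplest way to trigger the run; for completeness, here's a minimal in-process scheduling sketch (the `run_on_schedule` helper and its parameters are my own, not part of the pipeline below):

```python
import time


def run_on_schedule(job, interval_hours=24, max_runs=None):
    """Invoke `job` repeatedly with a fixed pause between runs.

    A cron entry (e.g. `0 6 * * *`) is simpler for real deployments; this
    loop is just the minimal sketch. `max_runs` exists so the loop can stop.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        job()
        runs += 1
        if max_runs is None or runs < max_runs:
            time.sleep(interval_hours * 3600)
```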

Step 1: Screenshot Capture

import os
import io
import base64
import requests
from PIL import Image
from datetime import datetime

API_KEY = os.environ['SCREENSHOT_API_KEY']
BASE_URL = 'https://hermesforge.dev/api/screenshot'


def capture_competitor_page(url, full_page=True):
    """
    Capture a full-page screenshot.
    Full-page ensures we get all pricing tiers even if they're below the fold.
    """
    resp = requests.get(
        BASE_URL,
        params={
            'url': url,
            'width': 1440,   # Wide viewport — pricing tables often need horizontal space
            'format': 'png',
            'delay': 2000,   # Competitors may have AB testing or lazy-loading
            'full_page': str(full_page).lower(),
            'block_ads': 'true',  # Remove clutter that might confuse extraction
        },
        headers={'X-API-Key': API_KEY},
        timeout=60,
    )
    resp.raise_for_status()
    img = Image.open(io.BytesIO(resp.content)).convert('RGB')
    return img


def image_to_base64(img):
    buf = io.BytesIO()
    img.save(buf, format='PNG')
    return base64.b64encode(buf.getvalue()).decode('utf-8')

Step 2: Intelligence Extraction

The extraction prompt is the heart of the system. It needs to be specific about what to look for while flexible about layout.

from openai import OpenAI
import json

client = OpenAI()

PRICING_SCHEMA = {
    'plans': [
        {
            'name': 'string — plan name (e.g. Free, Pro, Enterprise)',
            'price': 'string — price as displayed (e.g. $29/mo, Custom, Free)',
            'price_annual': 'string — annual price if shown, else null',
            'key_features': 'array of strings — up to 5 most prominent features',
            'target_audience': 'string — who this plan seems designed for',
            'cta_text': 'string — the button/CTA text (e.g. Get Started, Contact Sales)',
            'highlighted': 'boolean — is this plan visually emphasized (popular badge, etc.)',
        }
    ],
    'positioning_notes': 'string — overall pricing strategy observations (e.g. freemium, usage-based, flat-rate)',
    'free_trial': 'boolean — is a free trial mentioned anywhere?',
    'money_back': 'string — money-back guarantee details if present, else null',
    'enterprise_contact': 'boolean — is there an enterprise/sales contact option?',
}


def extract_pricing_intelligence(img, competitor_name):
    """
    Extract structured pricing data from a screenshot using LLM vision.
    Uses a two-step process: raw observation first, then structured extraction.
    """
    img_b64 = image_to_base64(img)

    # Step 1: Raw observation
    observation_resp = client.chat.completions.create(
        model='gpt-4o',
        messages=[{
            'role': 'user',
            'content': [
                {
                    'type': 'text',
                    'text': (
                        f'This is the pricing page for {competitor_name}. '
                        'Describe everything you can see: all pricing plans, prices, features, '
                        'CTAs, badges, and any other relevant commercial information. '
                        'Be thorough and precise.'
                    )
                },
                {
                    'type': 'image_url',
                    'image_url': {
                        'url': f'data:image/png;base64,{img_b64}',
                        'detail': 'high',
                    }
                },
            ],
        }],
        max_tokens=1000,
    )
    observation = observation_resp.choices[0].message.content

    # Step 2: Structured extraction from the observation
    extraction_resp = client.chat.completions.create(
        model='gpt-4o',
        messages=[
            {
                'role': 'user',
                'content': (
                    f'Based on this description of the {competitor_name} pricing page:\n\n'
                    f'{observation}\n\n'
                    f'Extract the data into this exact JSON schema:\n{json.dumps(PRICING_SCHEMA, indent=2)}\n\n'
                    'Use null for missing fields. Be precise about prices — include the currency symbol and billing period.'
                )
            }
        ],
        response_format={'type': 'json_object'},
        max_tokens=800,
    )

    raw_data = json.loads(extraction_resp.choices[0].message.content)
    raw_data['_observation'] = observation  # Keep raw observation for audit trail
    raw_data['_captured_at'] = datetime.utcnow().isoformat()
    raw_data['_competitor'] = competitor_name

    return raw_data
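
The architecture diagram lists a validate_and_normalize step between extraction and comparison. The model occasionally returns stray whitespace, or nulls where the schema expects booleans, so it's worth cleaning the record before diffing it. A minimal sketch, assuming the schema above (the exact cleanup rules here are my choices, not a fixed requirement):

```python
def validate_and_normalize(data):
    """Sketch of the validate_and_normalize step: drop unusable plans,
    trim whitespace, and coerce booleans so the change detector
    downstream can rely on consistent types."""
    plans = []
    for plan in data.get('plans') or []:
        name = (plan.get('name') or '').strip()
        if not name:
            continue  # a plan without a name can't be diffed against history
        plan['name'] = name
        plan['price'] = (plan.get('price') or '').strip() or None
        plan['highlighted'] = bool(plan.get('highlighted'))
        plans.append(plan)
    data['plans'] = plans
    data['free_trial'] = bool(data.get('free_trial'))
    return data
```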

Step 3: Change Detection

import hashlib


def compute_pricing_fingerprint(data):
    """
    Create a stable hash of the pricing data for change detection.
    Excludes metadata fields that change every run.
    """
    stable = {
        'plans': data.get('plans', []),
        'positioning_notes': data.get('positioning_notes'),
        'free_trial': data.get('free_trial'),
        'enterprise_contact': data.get('enterprise_contact'),
    }
    canonical = json.dumps(stable, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()
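
One property worth sanity-checking: the fingerprint must be insensitive to dict key order, since the LLM won't emit fields in a stable order. A stripped-down stand-in (`fingerprint` below repeats the hashing logic so the snippet runs on its own):

```python
import hashlib
import json


def fingerprint(stable_fields):
    # sort_keys canonicalizes the JSON, so key insertion order never changes the hash
    canonical = json.dumps(stable_fields, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()


a = {'plans': [{'name': 'Pro', 'price': '$49/mo'}], 'free_trial': True}
b = {'free_trial': True, 'plans': [{'name': 'Pro', 'price': '$49/mo'}]}  # same data, reordered
assert fingerprint(a) == fingerprint(b)
```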


def detect_changes(current, previous):
    """
    Compare two pricing snapshots and return human-readable change descriptions.
    """
    if not previous:
        return ['First snapshot — no comparison available']

    changes = []

    # Plan-level changes
    current_plans = {p['name']: p for p in current.get('plans', []) if p.get('name')}
    previous_plans = {p['name']: p for p in previous.get('plans', []) if p.get('name')}

    # New plans
    for name in set(current_plans) - set(previous_plans):
        changes.append(f'NEW PLAN: {name} at {current_plans[name]["price"]}')

    # Removed plans
    for name in set(previous_plans) - set(current_plans):
        changes.append(f'REMOVED PLAN: {name} (was {previous_plans[name]["price"]})')

    # Price changes
    for name in set(current_plans) & set(previous_plans):
        curr = current_plans[name]
        prev = previous_plans[name]

        if curr.get('price') != prev.get('price'):
            changes.append(
                f'PRICE CHANGE: {name}: {prev["price"]} → {curr["price"]}'
            )

        if curr.get('highlighted') != prev.get('highlighted'):
            if curr['highlighted']:
                changes.append(f'NOW HIGHLIGHTED: {name} (new "popular" plan)')
            else:
                changes.append(f'NO LONGER HIGHLIGHTED: {name}')

    # Free trial change
    if current.get('free_trial') != previous.get('free_trial'):
        if current.get('free_trial'):
            changes.append('ADDED: Free trial now offered')
        else:
            changes.append('REMOVED: Free trial no longer offered')

    return changes if changes else ['No significant changes detected']

Step 4: The Full Agent

import json
from datetime import datetime
from pathlib import Path

COMPETITORS = {
    'CompetitorA': 'https://competitor-a.com/pricing',
    'CompetitorB': 'https://competitor-b.com/pricing',
    'CompetitorC': 'https://competitor-c.com/pricing',
}

SNAPSHOTS_DIR = Path('./ci_snapshots')
SNAPSHOTS_DIR.mkdir(exist_ok=True)


def load_previous_snapshot(competitor_name):
    path = SNAPSHOTS_DIR / f'{competitor_name}_latest.json'
    if path.exists():
        return json.loads(path.read_text())
    return None


def save_snapshot(competitor_name, data):
    # Save as latest
    latest_path = SNAPSHOTS_DIR / f'{competitor_name}_latest.json'
    latest_path.write_text(json.dumps(data, indent=2))

    # Also save timestamped version for history
    ts = datetime.utcnow().strftime('%Y%m%d_%H%M%S')
    history_path = SNAPSHOTS_DIR / f'{competitor_name}_{ts}.json'
    history_path.write_text(json.dumps(data, indent=2))


def run_competitive_intelligence():
    report_sections = []
    all_changes = {}

    for competitor, url in COMPETITORS.items():
        print(f'Capturing: {competitor} ({url})')

        try:
            img = capture_competitor_page(url)
            data = extract_pricing_intelligence(img, competitor)
            previous = load_previous_snapshot(competitor)

            # Fast path: identical fingerprints mean nothing meaningful changed
            current_fp = compute_pricing_fingerprint(data)
            previous_fp = compute_pricing_fingerprint(previous) if previous else None

            if previous_fp == current_fp:
                changes = ['No significant changes detected']
            else:
                changes = detect_changes(data, previous)
            all_changes[competitor] = changes

            save_snapshot(competitor, data)

            # Build report section
            section = [f'## {competitor}']
            section.append(f'**URL**: {url}')
            section.append(f'**Positioning**: {data.get("positioning_notes", "N/A")}')
            section.append(f'**Free trial**: {"Yes" if data.get("free_trial") else "No"}')
            section.append('')

            for plan in data.get('plans', []):
                highlighted = ' ⭐' if plan.get('highlighted') else ''
                section.append(f'### {plan["name"]}{highlighted}')
                section.append(f'- **Price**: {plan.get("price", "N/A")}')
                if plan.get('price_annual'):
                    section.append(f'- **Annual**: {plan["price_annual"]}')
                section.append(f'- **CTA**: {plan.get("cta_text", "N/A")}')
                if plan.get('key_features'):
                    section.append('- **Features**:')
                    for f in plan['key_features'][:3]:
                        section.append(f'  - {f}')

            section.append('')
            section.append('**Changes since last run**:')
            for change in changes:
                section.append(f'- {change}')

            report_sections.append('\n'.join(section))

        except Exception as e:
            report_sections.append(f'## {competitor}\n\n⚠️ Extraction failed: {e}')
            all_changes[competitor] = [f'ERROR: {e}']

    # Compile full report
    NO_CHANGE = ['No significant changes detected']
    has_changes = any(
        changes != NO_CHANGE and not any('ERROR' in ch for ch in changes)
        for changes in all_changes.values()
    )

    report = '# Competitive Intelligence Report\n'
    report += f'**Generated**: {datetime.utcnow().strftime("%Y-%m-%d %H:%M UTC")}\n'
    report += f'**Status**: {"⚠️ Changes detected" if has_changes else "✓ No changes"}\n\n'
    report += '\n\n---\n\n'.join(report_sections)

    return report, has_changes


report, has_changes = run_competitive_intelligence()
print(report)

# Only alert if there are real changes
if has_changes:
    # Send to Slack, email, etc.
    pass
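
For the alert itself, a Slack incoming webhook needs nothing beyond the standard library. A hedged sketch (the webhook URL is a placeholder you generate in Slack's app settings, and the 3000-character cap is my own guard against oversized messages):

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build the POST request for a Slack incoming webhook."""
    return urllib.request.Request(
        webhook_url,
        data=json.dumps({'text': text}).encode('utf-8'),
        headers={'Content-Type': 'application/json'},
    )


def send_report(webhook_url, report):
    # Slack truncates very long messages, so send a bounded summary
    with urllib.request.urlopen(build_slack_request(webhook_url, report[:3000])) as resp:
        return resp.status == 200
```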

What the Output Actually Looks Like

After six weeks of running this in production, here's a real-ish example of what the change detection catches:

## CompetitorA

Changes since last run:
- PRICE CHANGE: Pro: $49/mo → $59/mo
- NOW HIGHLIGHTED: Business (new "popular" plan)
- ADDED: Free trial now offered

That's a 20% price increase, a repositioning of their featured plan, and a new free trial — all in one week. The DOM scraper would have returned wrong data for two of those three because the table layout changed when they added the free trial banner. The vision agent caught all three.

Practical Notes

Accuracy: The two-pass approach (observation then extraction) is more accurate than asking for JSON directly from the vision prompt. In testing, direct JSON extraction hallucinated ~15% of the time on edge cases (prices with footnote markers, strikethrough pricing). The observation-first approach drops that to ~3%.

Cost per run: Three competitors × 2 LLM calls × ~$0.004 each = ~$0.024 per full run. For weekly runs that's ~$1.25/year. Negligible.

Rate limiting: Some competitor sites block rapid requests. Add a 2-second delay between competitors. The screenshot API handles the browser rendering; you're rate-limiting your calls to it.
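
A throttled capture loop takes a few lines. The `throttled` helper below is hypothetical; it takes the capture function as a parameter so the pacing logic stays decoupled from the API call:

```python
import time


def throttled(capture_fn, urls, delay_seconds=2.0):
    """Call capture_fn for each named URL, pausing between calls to stay polite."""
    results = {}
    for i, (name, url) in enumerate(urls.items()):
        if i > 0:
            time.sleep(delay_seconds)
        results[name] = capture_fn(url)
    return results
```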

Auth-gated pages: If a competitor requires login to see certain pricing tiers (enterprise quotes, etc.), this approach won't help there. But for public pricing pages — which is most of what matters — it works without credentials.

Seasonal campaigns: If a competitor runs a promotional discount for Black Friday, the agent will detect it as a price change. Flag these with a "POSSIBLE PROMO" label by checking whether the observation text mentions phrases like "limited time" or "through [date]".
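
A sketch of that labeling, assuming the raw observation text is available (it is, via the `_observation` field the extractor stores); the pattern list is a starting point, not exhaustive:

```python
import re

PROMO_PATTERNS = [
    r'limited[ -]time',
    r'through\s+\w+\s+\d',          # e.g. "through November 30"
    r'black friday|cyber monday',
]


def flag_possible_promos(changes, observation):
    """Append a POSSIBLE PROMO label to price changes when the page hints at a sale."""
    if not any(re.search(p, observation.lower()) for p in PROMO_PATTERNS):
        return changes
    return [
        c + ' [POSSIBLE PROMO]' if c.startswith('PRICE CHANGE') else c
        for c in changes
    ]
```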

The Broader Lesson

The fragility of CSS scraping is a coupling problem. Your scraper encodes assumptions about the current implementation of an external system you don't control. Every assumption is a potential failure point. The more assumptions, the more failure points.

LLM vision minimizes the number of assumptions. The prompt says "describe what you see on this pricing page" — it doesn't assume a table layout, specific CSS classes, or any structural choices. The model handles the variance.

This is the same reason vision-based UI testing (Applitools, Percy) is replacing selector-based testing for visual regression. The fundamental insight is the same: some things are better understood by looking at them than by parsing their structure.

Competitive intelligence is one of those things.

Set Up Your Own

Free API key at hermesforge.dev/screenshot. The agent runs with just requests, Pillow, and openai — no browser binaries required.