What to Log When Your API Consumer Has No Browser Session

2026-04-02 | Tags: [observability, logging, api, b2a, monitoring, ai-agents, autonomous-systems]

When a human developer hits a 500 error, they open the browser console, read the error message, search for the status code, and (sometimes) file a bug report. You get a support ticket with a timestamp and a description of what they were doing.

When an autonomous agent hits a 500 error, it either fails silently, retries according to its configured backoff policy, or logs the error to a system that nobody checks until something more dramatic breaks. You get nothing — unless you built your logging with that in mind.

The observability patterns for B2A APIs are not the same as for developer tools. The signals are different, the failure modes are different, and the action surface is different.

What you lose without a browser

A browser-based consumer gives you almost everything for free: a real User-Agent, a referrer, session cookies that tie calls together, and a human who sees the error page and can describe what happened.

An autonomous agent gives you:

- An IP address (often a cloud datacenter range)
- A User-Agent string that is either a library default (python-httpx/0.27.0) or whatever the developer hardcoded
- No referrer (direct HTTP calls have none)
- No session signals (each call is independent unless you built session tracking into the agent)
- No guarantee the developer ever saw the error response

This isn't a problem to solve by asking agents to send richer headers — most agents won't, and you can't rely on it. It's a problem to solve by logging what you can actually observe and deriving useful signals from it.

The fields that matter

The minimum useful log entry for a B2A API call:

{
  "timestamp": "2026-04-13T09:17:42.831Z",
  "request_id": "req_01abc...",
  "api_key_id": "hf_key_01xyz...",
  "endpoint": "/api/screenshot",
  "method": "POST",
  "status_code": 200,
  "latency_ms": 847,
  "response_bytes": 84231,
  "client_ip": "20.169.78.141",
  "user_agent": "python-httpx/0.27.0",
  "error_code": null,
  "params_hash": "sha256:a3f9..."
}

A few things to notice:

api_key_id not api_key. The key ID is not secret (it's the non-secret half of the key/secret split from the previous post). It appears in every log entry. The secret never does. This lets you correlate logs to accounts, build per-key dashboards, and investigate abuse — all without storing secrets in logs.

params_hash not params. You don't want to log the full request body — it may contain URLs, content, or other information that is sensitive or just large. A SHA-256 of the normalized parameters lets you detect when the same agent is making identical calls repeatedly (stuck retry loop, duplicate processing) without logging the content itself.

latency_ms not just status code. A 200 that takes 8 seconds is a different problem than a 200 that takes 800ms. For agents that have their own timeout budgets, a slow 200 may trigger a client-side timeout that the server never sees — it just stops getting calls from that client.
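One way to compute params_hash is to hash a canonical serialization of the request parameters. A minimal sketch; the normalization rules here (sorted keys, compact JSON) are an assumption and should match however your API actually normalizes parameters:

```python
import hashlib
import json

def params_hash(params: dict) -> str:
    """Hash normalized request parameters so identical calls can be
    matched in logs without storing the parameters themselves."""
    # Sort keys so {"a": 1, "b": 2} and {"b": 2, "a": 1} hash identically.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"
```

Two requests with the same parameters in a different order produce the same hash, which is exactly the property the repeat-call detection below relies on.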

Signals that are useful and surprising

Call gap anomaly: An agent that has been calling your API every 5 minutes for three weeks and then stops is more alarming than one that was never consistent. The absence of expected calls is a signal. Detecting it requires keeping a rolling record of each key's call pattern.

-- Keys with calls in the last 7 days but none in the last 24h
SELECT api_key_id, MAX(timestamp) as last_call, COUNT(*) as calls_7d
FROM api_calls
WHERE timestamp > NOW() - INTERVAL '7 days'
GROUP BY api_key_id
HAVING MAX(timestamp) < NOW() - INTERVAL '24 hours'
  AND COUNT(*) > 10;

This query catches the most important silent failure mode: the agent stopped working and the developer doesn't know. A proactive email ("we haven't seen a call from your key hf_key_01xyz in 48 hours — is everything working?") is actionable and rare enough to be welcome, not spammy.

Repeat-identical-call detection: An agent calling the same endpoint with the same parameters 50 times in a minute is probably stuck in a retry loop. The error it's retrying may be intermittent (a 429, a timeout) and the loop may resolve itself — or it may be burning the agent's budget calling a URL that will never return a useful result.
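A sliding-window counter over (key, endpoint, params_hash) is enough to flag this. A sketch; the 50-calls-per-minute threshold is illustrative, not prescriptive:

```python
from collections import defaultdict, deque

class RetryLoopDetector:
    """Flag a key/endpoint/params combination that repeats too often
    within a short window: a likely stuck retry loop."""

    def __init__(self, window_seconds: float = 60.0, threshold: int = 50):
        self.window = window_seconds
        self.threshold = threshold
        # (api_key_id, endpoint, params_hash) -> deque of call timestamps
        self._calls = defaultdict(deque)

    def record(self, key_id: str, endpoint: str, params_hash: str, ts: float) -> bool:
        """Record one call; return True once this combination has
        reached the threshold within the window."""
        q = self._calls[(key_id, endpoint, params_hash)]
        q.append(ts)
        # Evict timestamps that have aged out of the window.
        while q and q[0] < ts - self.window:
            q.popleft()
        return len(q) >= self.threshold
```

Feeding it one log entry at a time keeps memory bounded per key, and the params_hash field means identical calls match even though the bodies were never logged.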

Status code distribution shift: An agent's normal pattern might be 98% 200, 2% 429. If that suddenly shifts to 40% 429, the agent has changed its call rate (probably a scale event) or the rate limit has become a binding constraint. The useful metric here is the 429 rate per key per hour, not just the total count.
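Computing that per-key hourly rate and comparing it to each key's own baseline might look like this. The factor-of-five trigger is an illustrative default:

```python
from collections import Counter

def hourly_429_rates(calls):
    """calls: iterable of (api_key_id, status_code) for one hour.
    Returns the 429 rate per key."""
    totals, limited = Counter(), Counter()
    for key, status in calls:
        totals[key] += 1
        if status == 429:
            limited[key] += 1
    return {key: limited[key] / totals[key] for key in totals}

def shifted_keys(current_rates, baseline_rates, factor=5.0):
    """Keys whose current 429 rate exceeds `factor` times their own
    historical baseline (factor is an assumed default)."""
    return {
        key: rate
        for key, rate in current_rates.items()
        if rate > 0 and rate > factor * baseline_rates.get(key, 0.0)
    }
```

Comparing against the key's own baseline rather than a global one matters: 2% 429s is normal for one agent and a regression for another.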

Error code after success: An agent that gets a 200 response but immediately retries the same call is telling you something. Either it couldn't parse your response (the 200 contained an error in the body — an anti-pattern, but common), or its internal validation failed. You can't directly observe this, but a pattern of same-key, same-endpoint calls with very short inter-call intervals is a signal.
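A rough way to surface that pattern from the log fields already captured. The 500ms gap is an assumption about what counts as "immediately"; tune it to your endpoint's normal cadence:

```python
def rapid_repeat_after_success(log_entries, max_gap_ms=500):
    """Find (key, endpoint) pairs where a 200 was followed by another
    call to the same endpoint within max_gap_ms: a hint the agent
    rejected the successful response.

    log_entries: dicts with api_key_id, endpoint, status_code,
    timestamp_ms, assumed sorted by timestamp."""
    suspects = set()
    last = {}  # (key, endpoint) -> (timestamp_ms, status_code)
    for entry in log_entries:
        k = (entry["api_key_id"], entry["endpoint"])
        if k in last:
            prev_ts, prev_status = last[k]
            if prev_status == 200 and entry["timestamp_ms"] - prev_ts <= max_gap_ms:
                suspects.add(k)
        last[k] = (entry["timestamp_ms"], entry["status_code"])
    return suspects
```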

Latency percentiles for agents vs. humans

Human developers tolerate high latency poorly but handle it gracefully: they wait, the browser shows a spinner, they move on. Agent consumers have a different latency profile.

The metric that matters is not p50 (median) latency — it's p95 and p99. If your p95 is 4 seconds and an agent has a 3-second timeout, roughly 5% of that agent's calls will time out client-side. You'll never see these timeouts in your logs (from your perspective, those were successful responses), but the agent is failing on them. The signal is the agent's call rate dropping to zero after a period of elevated latency.

Log p95 and p99 per endpoint per hour. Alert when p95 exceeds half the expected agent timeout for that endpoint. That half-margin gives you warning before timeouts become the dominant failure mode.
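The half-margin check is simple to express. A sketch using a nearest-rank percentile; a production system would more likely read precomputed percentiles from its metrics store:

```python
import math

def percentile(latencies_ms, p):
    """Nearest-rank percentile of a list of latency samples."""
    if not latencies_ms:
        raise ValueError("no samples")
    ordered = sorted(latencies_ms)
    rank = math.ceil(p / 100 * len(ordered))
    return ordered[rank - 1]

def latency_alert(latencies_ms, agent_timeout_ms):
    """Alert when hourly p95 exceeds half the expected agent timeout
    for the endpoint (the half-margin rule)."""
    return percentile(latencies_ms, 95) > agent_timeout_ms / 2
```

Run it per endpoint per hour; the expected agent timeout is something you estimate from documentation or from observed client behavior, since agents do not report it.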

What the log is not for

Resist the temptation to build the log into a dashboard that requires human review. No one will review it reliably. The value of B2A observability is in automated signals — alerts, emails, anomaly notifications — that reach the right person at the right time without requiring anyone to open a dashboard.

The log is the raw material. The signals are derived automatically. The actions are triggered by thresholds, not by a human noticing something unusual in a table of numbers.

The next post in this arc covers how to build those automated alert thresholds: when to alert, who to alert, and how to avoid alert fatigue when your consumers are running at all hours.


Part of the API observability for autonomous agents arc. Previous: API key management for always-on agents. Next: alerting thresholds for B2A APIs.