How to Add Broken Link Checking to Your CI/CD Pipeline
Broken links are the kind of bug that slips through code review. They don't cause build failures, they don't trigger test failures, and they don't show up in staging until someone actually clicks them. By then, they're already in production.
The fix is simple: add a broken link check to your CI/CD pipeline that runs on every PR.
Why Check Links in CI/CD?
Most teams find broken links through:
- Users reporting them (too late)
- Periodic manual audits (too infrequent)
- SEO crawl reports (delayed by days or weeks)
A CI/CD check catches them before merge. Every PR that touches content, docs, or link references gets validated automatically.
The API Approach
Instead of installing a link checker dependency (and managing its updates, configuration, and runtime), you can use a hosted API:
GET https://hermesforge.dev/api/deadlinks?url=YOUR_URL&format=github&threshold=95
This returns:
- GitHub Actions annotations (errors appear inline on the PR diff)
- A pass/fail result based on your health score threshold
- Status codes for every link found on the page
No npm package. No Docker image. No dependency to maintain. Just an HTTP request.
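One detail the request above glosses over is that the target URL is itself a query parameter value, so characters such as `?`, `&`, and `/` in it should be percent-encoded. The following is a minimal sketch, assuming ASCII input; the helper names `urlencode` and `build_deadlinks_url` are hypothetical and not part of the API, and only the `url`, `format`, and `threshold` parameters come from the article.

```shell
#!/usr/bin/env bash
# Percent-encode a string for use as a query parameter value.
# Sketch only: assumes ASCII input. The function body runs in a subshell
# so the locale change and the working variables do not leak to the caller.
urlencode() (
  LC_ALL=C            # make the a-z / A-Z ranges below match ASCII only
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}       # everything after the first character
    c=${s%"$rest"}    # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;                  # unreserved: keep as-is
      *) out=$out$(printf '%%%02X' "'$c") ;;          # everything else: %XX
    esac
  done
  printf '%s\n' "$out"
)

# Build the request URL. Arguments: target URL, format, threshold.
# The defaults (github, 90) mirror the values used in this article.
build_deadlinks_url() {
  printf 'https://hermesforge.dev/api/deadlinks?url=%s&format=%s&threshold=%s\n' \
    "$(urlencode "$1")" "${2:-github}" "${3:-90}"
}
```

If you would rather not carry a helper, curl can do the encoding itself: `curl -G --data-urlencode "url=$TARGET" https://hermesforge.dev/api/deadlinks` appends the encoded value to the query string.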
GitHub Actions Setup
Create .github/workflows/check-links.yml:
name: Check Links
on:
  pull_request:
    paths:
      - '**.md'
      - '**.html'
      - 'docs/**'
jobs:
  check-links:
    runs-on: ubuntu-latest
    steps:
      - name: Check for broken links
        env:
          # Passed through the environment rather than interpolated into the script
          SITE_URL: ${{ github.event.repository.homepage }}
        run: |
          # -G with --data-urlencode appends the URL-encoded site URL to the query string
          RESULT=$(curl -sG "https://hermesforge.dev/api/deadlinks" \
            --data-urlencode "url=$SITE_URL" -d "format=github" -d "threshold=90")
          echo "$RESULT"
          # Parse the RESULT line for pass/fail
          if echo "$RESULT" | grep -q "RESULT: FAIL"; then
            echo "::error::Broken links detected! Health score below threshold."
            exit 1
          fi
That's it. No action to install and no dependency to update. An API key only becomes necessary once you exceed the anonymous free-tier limit (see Rate Limits).
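One gap in a grep-only gate is that an empty response (for example, if the request fails or times out) contains no "RESULT: FAIL" line and therefore passes silently. The sketch below shows one way to separate the three cases. The function name `classify_result` and its return codes are hypothetical; the only thing taken from the article is the `RESULT: FAIL` marker.

```shell
#!/usr/bin/env bash
# Classify the API output passed as the first argument.
# Return codes (a convention chosen for this sketch, not defined by the API):
#   0 = response received and no failure reported
#   1 = response contains "RESULT: FAIL"
#   2 = response is empty, so the check could not be evaluated
classify_result() {
  if [ -z "$1" ]; then
    return 2
  fi
  if printf '%s\n' "$1" | grep -q "RESULT: FAIL"; then
    return 1
  fi
  return 0
}
```

In a workflow step this could be paired with `RESULT=$(curl -sS --fail --max-time 60 "$API_URL") || RESULT=""` followed by `classify_result "$RESULT" || exit 1`, so that an unreachable API fails the job instead of passing it. Whether an outage should block merges is a policy choice; treating code 2 as a warning is equally defensible.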
What You Get
Inline Annotations
When the check runs with format=github, broken links appear as annotations directly on the PR:
::error file=index.html::Broken link: https://example.com/old-page (404)
::warning file=index.html::Redirect: https://example.com/moved (301 -> https://example.com/new)
Reviewers see the broken links right in the diff view — no need to check a separate report.
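Because the annotations are plain lines that start with `::error` or `::warning`, they are also easy to tally for a log line or job summary. A minimal sketch, assuming the output format matches the two sample lines above; `count_annotations` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Count error and warning annotations in the text passed as the first argument.
# grep -c exits non-zero when it counts 0 matches, so "|| true" keeps the
# function usable under "set -e" while still capturing the printed count.
count_annotations() {
  errors=$(printf '%s\n' "$1" | grep -c '^::error' || true)
  warnings=$(printf '%s\n' "$1" | grep -c '^::warning' || true)
  printf 'errors=%s warnings=%s\n' "$errors" "$warnings"
}

# The two sample annotation lines from this article.
sample='::error file=index.html::Broken link: https://example.com/old-page (404)
::warning file=index.html::Redirect: https://example.com/moved (301 -> https://example.com/new)'

count_annotations "$sample"   # prints: errors=1 warnings=1
```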
Pass/Fail Gating
The threshold parameter sets your minimum health score (0-100). A page with 95% healthy links and a threshold of 90 passes. Drop below the threshold and the check fails, which blocks the merge if the check is marked as required in your branch protection rules.
# Strict: all links must work
curl "https://hermesforge.dev/api/deadlinks?url=https://yoursite.com&threshold=100"
# Lenient: allow some broken external links
curl "https://hermesforge.dev/api/deadlinks?url=https://yoursite.com&threshold=80"
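To make the gating arithmetic concrete, here is a small sketch of how a score could be compared against a threshold. It assumes the health score is the percentage of links that are healthy, rounded down, and that a score equal to the threshold passes; the API's exact formula is not documented in this article, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Integer health score: healthy links as a percentage of all links found.
# Assumption: a page with no links scores 100 (nothing can be broken).
health_score() {
  if [ "$2" -eq 0 ]; then
    echo 100
    return
  fi
  echo $(( $1 * 100 / $2 ))
}

# Succeeds when the score meets or exceeds the threshold.
passes_threshold() {
  [ "$1" -ge "$2" ]
}

score=$(health_score 19 20)   # 19 of 20 links healthy
if passes_threshold "$score" 90; then
  echo "score $score passes at threshold 90"
fi
if ! passes_threshold "$score" 100; then
  echo "score $score fails at threshold 100"
fi
```

With 19 of 20 links healthy the score is 95, which passes the lenient and default thresholds but fails the strict one, so a single broken external link is enough to fail a `threshold=100` check.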
PR Comment Format
Want a markdown summary for PR comments? Use format=markdown:
REPORT=$(curl -s "https://hermesforge.dev/api/deadlinks?url=https://yoursite.com&format=markdown")
# PR_NUMBER is assumed to be set earlier in the workflow
gh pr comment "$PR_NUMBER" --body "$REPORT"
This posts a formatted table of all links with their status codes.
Checking Multiple Pages
For sites with multiple entry points:
steps:
  - name: Check all key pages
    run: |
      BASE="https://hermesforge.dev/api/deadlinks"
      URLS=(
        "https://yoursite.com"
        "https://yoursite.com/docs"
        "https://yoursite.com/blog"
      )
      FAILED=0
      for URL in "${URLS[@]}"; do
        echo "Checking: $URL"
        RESULT=$(curl -s "$BASE?url=$URL&format=github&threshold=90")
        echo "$RESULT"
        if echo "$RESULT" | grep -q "RESULT: FAIL"; then
          FAILED=1
        fi
      done
      if [ $FAILED -eq 1 ]; then
        exit 1
      fi
Checking Internal Links Only
If external links are flaky (rate limiting, geoblocking), filter to internal only:
curl "https://hermesforge.dev/api/deadlinks?url=https://yoursite.com&check_only=internal&threshold=100"
Checking GitHub README Links
Got a repo README with links? Check it directly:
curl "https://hermesforge.dev/api/deadlinks?github=owner/repo"
This fetches the README from GitHub and checks every link in it.
Timeout for Large Sites
For sites with hundreds of links, cap the crawl with max_pages and set a max_duration to avoid CI timeouts:
curl "https://hermesforge.dev/api/deadlinks?url=https://yoursite.com&max_pages=10&max_duration=30"
If the duration is exceeded, you get partial results (links checked so far) instead of a timeout error.
Comparison: API vs Local Tools
| Feature | API approach | html-proofer | linkchecker | broken-link-checker |
|---|---|---|---|---|
| Install | None | Ruby gem | Python package | npm package |
| Config | URL params | Config file | Config file | CLI flags |
| CI setup | 3 lines | 10+ lines | 10+ lines | 5+ lines |
| Updates | Automatic | Manual | Manual | Manual |
| GitHub annotations | Built-in | Plugin | No | No |
| Pass/fail gating | Built-in | Exit codes | Exit codes | Exit codes |
| PR comments | markdown format | Manual | Manual | Manual |
| Dependency | HTTP only | Ruby runtime | Python runtime | Node runtime |
The tradeoff: local tools give you offline capability and no external dependency. The API gives you zero-config setup and built-in CI integrations. For most teams, the API approach is simpler.
Rate Limits
The free tier allows 5 checks per day without an API key. For CI/CD (which typically runs on every PR), get a free API key:
curl "https://hermesforge.dev/api/keys?email=your@email.com"
This bumps you to 50 checks per day — enough for most teams.
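Whether 50 checks per day is enough depends on how many pages each workflow run checks and how often the workflow runs. A back-of-the-envelope sketch, assuming each API request for one page counts as one check (the article does not spell out how checks are counted); the function names are hypothetical.

```shell
#!/usr/bin/env bash
DAILY_LIMIT=50   # keyed free-tier limit quoted in this article

# Arguments: pages checked per workflow run, workflow runs per day.
checks_per_day() {
  echo $(( $1 * $2 ))
}

# Succeeds when the estimated daily usage fits within the limit.
within_budget() {
  [ "$(checks_per_day "$1" "$2")" -le "$DAILY_LIMIT" ]
}

# Example: the three-page loop from "Checking Multiple Pages", 20 runs a day.
if within_budget 3 20; then
  echo "fits within $DAILY_LIMIT checks/day"
else
  echo "needs $(checks_per_day 3 20) checks/day, over the limit of $DAILY_LIMIT"
fi
```

Under that assumption, three pages per run at 20 runs a day needs 60 checks, which is over the limit, while 15 runs a day (45 checks) fits. The `paths` filter in the workflow trigger is the easiest lever for keeping the run count down.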
Summary
Adding broken link checking to CI/CD doesn't require installing a dependency. A single curl command in your GitHub Actions workflow gives you inline annotations, pass/fail gating, and multiple output formats. Broken links get caught in the PR, not in production.