We were getting ready to redesign and simplify phobos.petabridge.com - our Akka.NET observability platform documentation site. The plan was to remove a bunch of old pages, restructure the information architecture, and redirect everything properly so we wouldn’t break any inbound links from Google, Stack Overflow, or the blog posts referencing our documentation.
The problem: how do we know we’re not blowing up those inbound links during the restructuring? We needed full, measurable, observable control over the sitemap as we made changes. Every redirect had to work. Every removed page needed a destination. One broken link could cost us traffic and credibility. At its core, this is a continuous integration problem.
So I built LinkValidator - a CLI tool to validate all internal and external links in our statically generated sites during CI/CD on Azure DevOps and GitHub Actions.
The goal: fail the build if we break anything. Crawl the site, validate every link, and catch problems before they reach production.
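To make that concrete, here’s a rough sketch of what a crawl-and-validate pass can look like in C# with HtmlAgilityPack and `HttpClient`. This is only an illustration of the idea, not LinkValidator’s actual implementation - `CheckPageAsync` is just a name I’m using here:

```csharp
// Illustrative crawl-and-validate pass (not LinkValidator's actual implementation).
// Assumes the HtmlAgilityPack NuGet package; CheckPageAsync is a hypothetical helper.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

public static class CrawlSketch
{
    private static readonly HttpClient Http = new HttpClient();

    public static async Task<IReadOnlyList<string>> CheckPageAsync(Uri pageUri)
    {
        var broken = new List<string>();

        // Download the page and parse it into a DOM we can query with XPath.
        var html = await Http.GetStringAsync(pageUri);
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]")?.ToList() ?? new List<HtmlNode>();

        foreach (var href in anchors.Select(a => a.GetAttributeValue("href", string.Empty)))
        {
            // Resolve relative URLs against the page; skip mailto:, javascript:, fragments, etc.
            if (!Uri.TryCreate(pageUri, href, out var target) ||
                (target.Scheme != Uri.UriSchemeHttp && target.Scheme != Uri.UriSchemeHttps))
                continue;

            using var response = await Http.GetAsync(target, HttpCompletionOption.ResponseHeadersRead);
            if (!response.IsSuccessStatusCode)
                broken.Add($"{target} -> {(int)response.StatusCode}");
        }

        return broken;
    }
}
```

In CI, a non-empty list of broken links becomes a non-zero exit code - that’s what actually fails the build.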
Then I hit an immediate problem.
Our documentation has links to localhost resources - Grafana dashboards at http://localhost:3000, Prometheus at http://localhost:9090, and Jaeger tracing at http://localhost:16686. These are part of our observability stack demos. They’re 100% correct when someone’s running the demo locally, but they fail validation every time in CI because there’s no Grafana instance running on a GitHub Actions runner.
I needed a way to selectively suppress link validation, and I wanted it to be contextual - right there in the HTML where the “broken” link lives, not buried in some global configuration file that’s completely divorced from the context of the page itself.
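Roughly, the end result I wanted looks like this - the opt-out sits on the anchor itself and the crawler filters on it. The attribute name (`data-skip-validation`) and the helper below are hypothetical placeholders, not LinkValidator’s real API:

```csharp
// Hypothetical illustration of in-page suppression. The attribute name
// ("data-skip-validation") and this helper are placeholders, not LinkValidator's real API.
// The markup would look something like:
//   <a href="http://localhost:3000" data-skip-validation>Grafana</a>
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class SuppressionSketch
{
    private const string SkipAttribute = "data-skip-validation";

    public static IEnumerable<string> LinksToValidate(HtmlDocument doc)
    {
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]")?.ToList() ?? new List<HtmlNode>();

        return anchors
            .Where(a => !a.Attributes.Contains(SkipAttribute)) // honor the in-page opt-out
            .Select(a => a.GetAttributeValue("href", string.Empty))
            .Where(href => !string.IsNullOrWhiteSpace(href));
    }
}
```

The appeal of this shape is that the suppression ships with the page that needs it, so anyone reviewing the HTML can see exactly why a link is excluded from validation.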
Here’s how I built that in C# using HtmlAgilityPack and a little sprinkling of Akka.NET.