28 May 2025 · 5 min read · DevOps

DevOps in the Catacombs – Everyday Software Archaeology and Why I’d Still Bet on a Monorepo

Panoramic view of the Pompeii ruins in spring 2024, photographed from a nearby hill, with the archaeological site spread out © Sergio Fernández

I’m coming up on ten years with DevOps in my job title, badge, Slack handle, whatever. In that time the word has stretched like pizza dough: some days it means “person who can also write the missing Python script the backend team never had time for,” other days it means a grandiose “CTO-at-large” who is expected to pick the CI/CD stack, design the cloud topology, mentor the junior frontend hire, and brief the CFO on why the AWS bill doubled last month. In between those extremes sits the task I bump into most often:

Software archaeology.

That’s the unglamorous art of digging through a maze of unrelated Git repositories, hoping to unearth the single line of YAML or environment variable that will let a feature work, a service boot, or an invoice stay under budget. If you’re already nodding, welcome to the club; if not, here’s a field report from one of my more memorable digs.

How a Sentry Upgrade Turned Into a Spelunking Expedition

Some years ago, one of my clients—an e-commerce scale-up—asked for what sounded like a routine chore: “Hey, our self-hosted Sentry is three major versions behind, can you bring it up to date?”

No problem, right? Except:

• The newer Sentry releases moved a few features behind a paywall.
• The SaaS Terms of Service changed in ways Legal didn’t love.
• The related Docker images now demanded more CPU and RAM.

Suddenly an “upgrade” felt suspiciously like “time to shop for alternatives.” After a bit of reconnaissance I proposed GlitchTip. It checks most of the open-source boxes, keeps the familiar Sentry SDKs, and doesn’t penalize you for hosting it yourself. Yes, you lose some bells and whistles—no automatic performance tracing and the UI is a tad spartan—but for error tracking it’s more than fine. And the price tag for the commercial cloud plan? A fraction of Sentry’s.

Everyone nodded; we green-lit the migration.

GlitchTip’s Docs: A Choose-Your-Own-Adventure Novel

I spun up a test cluster and—boom—ran straight into my first wall: we were planning Redis in sentinel mode for high availability, and GlitchTip’s docs simply shrugged. The official install page back then listed plain Redis and called it a day.

Great.

Time to switch hats from “DevOps” to “archaeologist.”

GlitchTip, as I soon learned, is spread across multiple GitLab projects:

• glitchtip-frontend
• glitchtip-helm-charts
• glitchtip-terraform
• glitchtip-backend
• …and a couple of half-archived helpers

The guide links only to the meta-repo; the sentinel keyword appears in exactly zero READMEs. After an hour of grepping through local clones (GitLab does not yet offer a useful code search tool), I nearly gave up, until a stray Dockerfile referenced an ENV called CACHE_URL. That smelled like Django’s config pattern, so I dove into the backend repo directly. Line 570 of settings.py finally revealed the grail:

CACHE_SENTINEL_URL = env("CACHE_SENTINEL_URL", default=None)

One innocuous line, undocumented anywhere else, yet the difference between a resilient deployment and an angry pager at 3 a.m. Once I knew the variable name, wiring sentinel Redis took ten minutes. Finding it took hours.
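For readers who have not wired Sentinel into a Django cache before, here is a minimal sketch of what such a variable could feed into. It is not GlitchTip's actual code. The URL shape (redis://host:port,host:port/service_name), the helper name build_cache_settings, and the choice of the django-redis backend are all assumptions made for illustration, so check the real settings.py before copying anything.

```python
import os
from urllib.parse import urlparse

DEFAULT_SENTINEL_PORT = 26379  # Redis Sentinel's default port


def build_cache_settings(environ=None):
    """Build a Django-style CACHES dict from environment variables.

    Hypothetical helper. Assumes CACHE_SENTINEL_URL looks like
    redis://sentinel-a:26379,sentinel-b:26379/service_name
    and falls back to a plain CACHE_URL when it is not set.
    """
    environ = os.environ if environ is None else environ
    sentinel_url = environ.get("CACHE_SENTINEL_URL")

    if not sentinel_url:
        # Plain single-node Redis, the only mode the old docs described.
        return {
            "default": {
                "BACKEND": "django_redis.cache.RedisCache",
                "LOCATION": environ.get("CACHE_URL", "redis://localhost:6379/0"),
            }
        }

    parsed = urlparse(sentinel_url)
    sentinels = []
    for entry in parsed.netloc.split(","):
        host, sep, port = entry.partition(":")
        sentinels.append((host, int(port) if sep else DEFAULT_SENTINEL_PORT))
    service_name = parsed.path.lstrip("/")

    # Layout follows the Sentinel setup described in the django-redis docs
    # (assumed backend); LOCATION names the monitored service, not a host.
    return {
        "default": {
            "BACKEND": "django_redis.cache.RedisCache",
            "LOCATION": f"redis://{service_name}/0",
            "OPTIONS": {
                "CLIENT_CLASS": "django_redis.client.SentinelClient",
                "SENTINELS": sentinels,
            },
        }
    }
```

The point of the sketch is the fallback: with the variable unset, nothing changes, which is probably why a setting like this can sit unnoticed in a settings file for a long time.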

As of this year, the install guide graciously mentions sentinel. You’re welcome, future travelers.

The Day-to-Day Reality of a DevOps Job

That story isn’t an outlier; it’s Tuesday. Whether you’re deploying a shiny service mesh, resurrecting a flaky CI runner, or re-tagging Docker images so your air-gapped registry stops choking, the pattern is the same:

• Something vaguely infra-related breaks or needs an upgrade.
• Official docs are incomplete, contradictory, or nonexistent.
• You hunt through issue trackers, commit logs, and abandoned branches until a breadcrumb appears.
• You automate the fix, document it locally, and pray the project maintainers merge your pull request.

If the phrase “works on my machine” is a developer meme, “it’s probably in another repo” is the DevOps equivalent.
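When the answer probably is in another repo, a throwaway script over local clones can stand in for the code search you do not have. The sketch below assumes every repository of a project has been cloned as a sibling directory under one root folder; the function name search_clones and the skip list are hypothetical choices, not part of any tool mentioned here.

```python
import re
from pathlib import Path

# Directories that only add noise to a keyword hunt (assumed list, adjust freely).
SKIP_DIRS = {".git", "node_modules", "__pycache__"}


def search_clones(root, pattern):
    """Yield (repo, relative_path, line_number, line) for every matching line.

    `root` is a folder whose immediate subdirectories are repository clones.
    """
    regex = re.compile(pattern)
    root = Path(root)
    for repo in sorted(p for p in root.iterdir() if p.is_dir()):
        for path in sorted(repo.rglob("*")):
            relative = path.relative_to(repo)
            if not path.is_file() or SKIP_DIRS & set(relative.parts):
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # binary or unreadable file, not worth digging into
            for number, line in enumerate(text.splitlines(), start=1):
                if regex.search(line):
                    yield repo.name, relative.as_posix(), number, line.strip()


# Example use, assuming the clones live under ./glitchtip-clones:
# for repo, path, number, line in search_clones("glitchtip-clones", r"SENTINEL"):
#     print(f"{repo}/{path}:{number}: {line}")
```

A tool like ripgrep does the same job faster; the script only earns its place when you want the hits grouped by repository or piped into something else.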

Why Polyrepos Make the Digging So Much Harder

GlitchTip isn’t unique; tons of modern stacks are polyrepo by default. The marketing site lives in one repo, API in another, Helm charts in a third, Terraform in a fourth, and so on. On paper that modularity feels clean. In practice:

• Discovery is awful: GitHub/GitLab search only works when you already know the keyword.

• Cross-cutting changes (think moving a config variable from Docker to Helm) require multiple pull requests and synchronized reviews.

• Documentation scatters; each repo ships its own README, style guide, and sometimes conflicting environment samples.

And that is why, whenever I have green-field influence, I start with a monorepo.

My Monorepo Bias (and Its Honest Drawbacks)

A single repository is not a panacea. At scale it needs special tooling (Bazel, Nx, or your own half-mad scripts), and yes, cloning gigabytes of history can feel like downloading Netflix over dial-up.

But for a team that’s still learning its own product boundaries, the benefits are enormous:

• One git grep shows you all references to CACHE_SENTINEL_URL.
• One PR can rename a function across backend, worker, and CLI without juggling submodules.
• A unified README forces you to write a single source of truth—or at least makes the drift obvious.
• Onboarding becomes “clone the repo, run the dev script,” not “here’s a Notion doc with twelve URLs.”

Splitting later is easy: git filter-branch (or git filter-repo, which the Git documentation now recommends instead), git subtree split, or just copying directories into fresh repos. Combining later is a nightmare.

So my rule of thumb echoes the old startup advice: default to monorepo until it hurts, then measure the pain before you eject pieces.

DevOps, Archaeology, and You

If you’re new to this line of work, the takeaway isn’t “avoid polyrepos” or even “use GlitchTip.” It’s this:

DevOps is 50% automation, 30% communication, and 20% archaeology.

You might spend a week designing a zero-downtime deployment strategy, then burn the same amount of time spelunking through code to find the environment flag that makes it possible. Both tasks matter. Both make you valuable. The second rarely appears on the sprint board, but trust me, people remember the engineer who unbricked production at midnight.

Parting Advice for Fellow Tomb Raiders

• Treat every config variable you discover like an artifact: label it, document it, and store it somewhere searchable, whether that is Confluence, a markdown repo, or even a sticky note wall if that’s all you have.
• When you solve a mystery, upstream the docs. Future-you (and strangers on the internet) will thank you.
• Invest early in unified search: Sourcegraph, OpenGrok, or at least good old ripgrep wired into your editor.
• When a project spans more than two repos, create a “curation” README that links them all. A breadcrumb is better than silence.
• Remember that DevOps scope creep is normal. One day you’re debugging Helm, the next you’re reviewing product telemetry for GDPR compliance. Embrace the learning curve; it keeps the job interesting.
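The first piece of advice can be partly automated. As a rough sketch, the script below scans Python files for a few common ways of reading environment variables and turns the findings into a markdown table you can commit next to your notes. The regular expression, the function names, and the table columns are my own assumptions; it will miss variables read through other helpers and says nothing about what a variable actually does.

```python
import re
from pathlib import Path

# Matches env("NAME"), os.environ.get("NAME"), os.getenv("NAME"), os.environ["NAME"].
ENV_PATTERN = re.compile(
    r"""(?:\benv|os\.environ\.get|os\.getenv)\(\s*["']([A-Z][A-Z0-9_]*)["']"""
    r"""|os\.environ\[\s*["']([A-Z][A-Z0-9_]*)["']"""
)


def inventory_env_vars(root):
    """Map each environment variable name to the 'path:line' spots that read it."""
    root = Path(root)
    found = {}
    for path in sorted(root.rglob("*.py")):
        relative = path.relative_to(root)
        if ".git" in relative.parts:
            continue
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip files that cannot be read as text
        for number, line in enumerate(lines, start=1):
            for match in ENV_PATTERN.finditer(line):
                name = match.group(1) or match.group(2)
                found.setdefault(name, []).append(f"{relative.as_posix()}:{number}")
    return found


def to_markdown(inventory):
    """Render the inventory as a markdown table with an empty notes column."""
    rows = ["| Variable | Read at | Notes |", "| --- | --- | --- |"]
    for name in sorted(inventory):
        rows.append(f"| `{name}` | {', '.join(inventory[name])} | TODO |")
    return "\n".join(rows)
```

Filling in the Notes column is still manual work, but an empty cell is a visible reminder in a way that an undocumented line in settings.py is not.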

Epilogue

GlitchTip has been running smoothly in sentinel mode for years now. The engineers stopped getting 2 a.m. alerts, finance loves the lower bill, and the install guide finally references the feature I dug up in the catacombs.

Some victories are loud, some are footnotes in a changelog. Either way, they start the same way: with a DevOps engineer willing to pick up a shovel and start digging.

Happy excavating.