PyCon Lithuania 2026, Data Day track · Keynote · April 9, 2026 · Vilnius
Technical Debt: When to Pay It Down vs. When to Just Live With It
Every team has code that makes them cringe. Legacy pipelines nobody dares touch, "temporary" tables that became the source of truth, and SQL queries that outlived three team leads. We call it technical debt, and the instinct is always the same: fix it.
But should we? After migrating from Redshift to Snowflake, learning PIG just to rewrite a pipeline deprecated months later, and surviving Airflow OOM kills, I have opinions. And maybe a framework slightly better than "it depends."
The Code We Don't Talk About
The talk opens with something everyone in the room recognizes. Every team has code in production that makes engineers uncomfortable. Legacy pipelines nobody dares touch. Temp tables that somehow became the source of truth. A SQL query over a thousand lines long that has outlived three team leads. The instinct, always, is to fix it. But should you?
Three definitions of technical debt, in order of usefulness. The textbook version: shortcuts taken for speed. The honest version: the gap between your current codebase and what you would build if you started today. The actionable version: debt that is actively slowing you down or costing money. That last one is the only definition worth acting on.
Personal Examples (Yes, Real Ones)
Three real examples: an ETL pipeline from 2018 that nobody fully understands anymore, a temp table someone named "One Source of Truth" that became exactly that, and a SQL query over a thousand lines long that has quietly outlived everyone who wrote it. Not hypothetical. These are things I shipped.
The point is not shame. This is normal. It also leads into Martin Fowler's Technical Debt Quadrant, which sorts debt along two axes: deliberate vs inadvertent, and reckless vs prudent. Most real-world debt lands in the prudent-inadvertent corner: "now we know how we should have done it." That is not failure. It is just how software evolves.
Deciding What to Fix
Before touching any piece of debt, place it on a two-by-two matrix, with pain level on one axis and cost to fix on the other. High pain, low cost: fix it this sprint. High pain, high cost: make it a project and get buy-in. Low pain, low cost: do it when you have slack. Low pain, high cost: leave it alone. That last quadrant is where a lot of well-intentioned rewrites go wrong.
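The matrix can be sketched as a small lookup. This is only an illustration: the 1-to-5 scoring scale, the threshold of 3, and the names `Action` and `triage` are assumptions and do not come from the talk.

```python
from enum import Enum


class Action(Enum):
    FIX_THIS_SPRINT = "fix it this sprint"
    MAKE_IT_A_PROJECT = "make it a project and get buy-in"
    WHEN_SLACK = "do it when you have slack"
    LEAVE_ALONE = "leave it alone"


def triage(pain: int, cost: int, threshold: int = 3) -> Action:
    """Place a debt item in one of the four quadrants.

    pain and cost are hypothetical 1-5 scores. A score at or above
    threshold counts as "high". Scale and threshold are assumptions.
    """
    high_pain = pain >= threshold
    high_cost = cost >= threshold
    if high_pain and not high_cost:
        return Action.FIX_THIS_SPRINT
    if high_pain and high_cost:
        return Action.MAKE_IT_A_PROJECT
    if not high_pain and not high_cost:
        return Action.WHEN_SLACK
    # Low pain, high cost: the quadrant where rewrites tend to go wrong.
    return Action.LEAVE_ALONE


if __name__ == "__main__":
    # Illustrative scores only, not measurements from the talk.
    print(triage(pain=5, cost=4).value)
    print(triage(pain=1, cost=5).value)
```

Scoring is the hard part. The lookup itself is trivial, so the value lies in forcing an explicit pain and cost estimate before anyone starts a rewrite.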
Three War Stories
Redshift to Snowflake. The before picture: three separate Airflow environments, no dev environment, pipelines slow enough to block daily reporting, costs that kept climbing. The migration took 10 to 12 months. After: one production Airflow environment, a proper dev environment, a 20% cost reduction, and engineers who actually liked working in the stack again. The pain was high, spread across the whole team, and the cost was manageable with proper planning. This one was worth doing.
The PIG Pipeline. A legacy PIG job had been running weekly batch processing without incident for years. I learned PIG specifically to rewrite it in Spark. Cleaner code, modern patterns. The right way to do it. The rewrite took one to one and a half months. Five months after it shipped, the upstream data source was deprecated and the pipeline was killed. The lesson: working code that is ugly is often better than perfect code that took a long time to build. Before rewriting anything, ask whether the system it depends on is likely to survive.
Airflow Consolidation. Before: Airflow on EC2 with Docker-in-Docker, no dev environment, no infrastructure as code, pure ClickOps. After: MWAA, a single managed environment, CI/CD pipelines, autoscaling, better monitoring. Mostly. The catch: the old debt was replaced with new debt. Worker autoscaling still hits bottlenecks at peak load. OOM kills still happen, just in a different place. Migrating away from debt does not mean debt-free. You trade one set of problems for a hopefully better set.
Five Questions Before You Refactor
A checklist worth keeping before starting any refactor:
- Is this actually causing pain, or just offending your engineering sensibilities?
- Can you measure the cost?
- What is the opportunity cost of fixing it now versus building the next feature instead?
- Will it get worse if you leave it?
- Can you get stakeholder buy-in?
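If most of your answers do not point toward fixing it (real pain, a measurable cost, an acceptable opportunity cost, debt that gets worse over time, and achievable buy-in), you probably should not start.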
Stable vs Compounding Debt
Not all debt compounds. A single ugly function nobody touches is stable. An isolated legacy system that works is stable. Bad architecture that delivers reliably is stable. Missing tests, inconsistent patterns, and absent documentation are compounding, because they spread. The debt worth paying down is the kind that gets worse the longer you leave it.
Talking to Stakeholders
"The code is messy" gets you nowhere. "We spend five engineer-hours per week on workarounds" gets a meeting. Drop "best practices" and "technical hygiene." Use numbers and consequences instead: a concrete cost per month, the specific feature this debt is blocking, the number of production incidents it caused in the last quarter. That is the language.
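As a rough sketch of turning hours into a monthly figure: the hourly rate below is a made-up placeholder, and `monthly_cost` is a hypothetical helper, not something presented in the talk.

```python
WEEKS_PER_MONTH = 52 / 12  # average number of weeks in a month


def monthly_cost(hours_per_week: float, hourly_rate: float) -> float:
    """Convert weekly workaround hours into an approximate monthly cost."""
    return round(hours_per_week * WEEKS_PER_MONTH * hourly_rate, 2)


if __name__ == "__main__":
    # Five engineer-hours per week is the figure from the talk.
    # The rate of 80 per hour is an assumption for illustration only.
    print(monthly_cost(hours_per_week=5, hourly_rate=80))
```

Even with a rough rate, a single number per month is easier to weigh against a feature request than "the code is messy."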
Four traps most teams fall into: the Perfection Trap (rewriting because it bothers you, not because it costs anything), the Shiny Object Trap (adopting new tech because it is new), the Ignore-It Trap (pretending the debt will pay itself down), and the Death March Trap (committing to a full rewrite with no incremental value along the way).
Technical Debt in the Age of AI
AI tools change the economics. Understanding legacy code faster, generating test coverage, migrating patterns at scale: things that used to be expensive are getting cheaper. More items shift from long strategic projects to quick wins on your matrix.
The risk is a new category of debt. Code that works but nobody on the team fully understands. Accepted suggestions that were not reviewed carefully. "It works" substituting for "it is maintainable." AI-generated debt compounds faster than hand-written debt, because the team did not write it and may not know how to change it.
Living with Debt Intentionally
Document what you keep. Monitor it. Rotate ownership so knowledge does not concentrate in one person. Review the list quarterly. On the budget: 15 to 20 percent of each sprint is a reasonable baseline for debt work, as a continuous habit rather than a periodic big bang. Track the ratio by tagging work as Feature, Tech Debt, Bug, Toil, or Ad-hoc. If Tech Debt never appears in the tag list, something is being ignored.
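One way to track that ratio, sketched under assumptions: each ticket carries exactly one of the five tags, and share is measured by ticket count rather than effort. The function names and the sample sprint are hypothetical.

```python
from collections import Counter

TAGS = ("Feature", "Tech Debt", "Bug", "Toil", "Ad-hoc")


def tag_shares(ticket_tags: list[str]) -> dict[str, float]:
    """Return each tag's share of the tagged tickets in a sprint.

    Tags outside TAGS are ignored. Counting tickets instead of hours
    or story points is a simplifying assumption.
    """
    counts = Counter(ticket_tags)
    total = sum(counts[tag] for tag in TAGS)
    if total == 0:
        return {tag: 0.0 for tag in TAGS}
    return {tag: counts[tag] / total for tag in TAGS}


def debt_share_in_band(shares: dict[str, float],
                       low: float = 0.15, high: float = 0.20) -> bool:
    """Check the Tech Debt share against the 15 to 20 percent baseline."""
    return low <= shares["Tech Debt"] <= high


if __name__ == "__main__":
    # Hypothetical sprint: 10 tickets, 2 of them tagged Tech Debt.
    sprint = ["Feature"] * 6 + ["Tech Debt"] * 2 + ["Bug", "Toil"]
    shares = tag_shares(sprint)
    print(shares["Tech Debt"], debt_share_in_band(shares))
```

A Tech Debt share of zero across several sprints is the signal to look for: it suggests debt work is not being done, or not being tagged.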
The debt catalog is a simple spreadsheet: what, pain level, cost to fix, type, status. The point is not a perfect tracking system. The point is making the debt visible, so decisions about it are intentional rather than reactive.
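A minimal sketch of such a catalog as a CSV file that any spreadsheet tool can open. The column names follow the list above. The example row and the "stable"/"compounding" values for the type column are assumptions, since the talk does not define what the type field holds.

```python
import csv
import io
from dataclasses import asdict, dataclass

COLUMNS = ["what", "pain_level", "cost_to_fix", "debt_type", "status"]


@dataclass
class DebtItem:
    what: str
    pain_level: str   # e.g. "low" or "high" (assumed scale)
    cost_to_fix: str  # e.g. "low" or "high" (assumed scale)
    debt_type: str    # assumed values: "stable" or "compounding"
    status: str       # e.g. "keep", "planned", "done" (assumed values)


def to_csv(items: list[DebtItem]) -> str:
    """Serialize the catalog to CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))
    return buffer.getvalue()


if __name__ == "__main__":
    # Illustrative entry only; the ratings are not from the talk.
    catalog = [
        DebtItem("ETL pipeline from 2018", "high", "high",
                 "compounding", "planned"),
    ]
    print(to_csv(catalog))
```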
Key takeaways
- Not all debt needs to be paid down.
- Measure pain and cost before refactoring.
- Translate debt into business language your stakeholders understand.
- Budget for debt work continuously, not in big-bang cleanups.
- Document and monitor the debt you choose to keep.
- AI lowers the cost of fixing debt but introduces new categories of it.
Technical debt is not a failure. It is a tool. Use it wisely.