Writing
Technical deep-dives, architecture decisions, and lessons from building production systems.
Architecture
GPU Spot Market Arbitrage: Finding Value in Real-Time
The dashboard said "cheapest available: RTX 4090 at $0.35/hr." But the API showed 24 offers under $0.30. The dashboard was optimizing for the wrong thing.
Architecture
The SOCKS5 Escape Hatch: Networking from Restricted Containers
The GPU rental started. Tailscale installed. But nothing could connect out. The container was sandboxed. I needed a tunnel through the only hole they left open.
ML Pipelines
Speed vs. Completeness
When your enrichment service goes down, should your entire index stop updating? The two-phase ingestion pattern solves it.
ML Pipelines
Your Batch Size Shouldn't Be a Constant
I hardcoded batch_size=8. Then I rented an A100 with 80GB of VRAM. My code used 24GB and left 56GB sitting idle. Throughput could have been 4x higher.
Architecture
Building a Form Templating Engine
Every hospital had its own forms. Every department its own fields. Doctors requested modifications. Developers drowned in tickets. I built a templating engine. Doctors now configure their own forms.
Data Architecture
When Your Sources Lie
What happens when three different systems claim different truths about the same product? Trust hierarchies solve it.
Architecture
Concurrency Limits Aren't Just for CPUs
The queue said process 10 messages. My GPU said I can handle 4 at once. The queue won. The service crashed with OOM.
MLOps
Why I Version My ML Outputs
I upgraded the embedding model. Hundreds of thousands of items needed re-embedding. I kicked off the job. Three hours later, I realized half of them had already been processed by v2.
ML Pipelines
When Your Taxonomy Doesn't Match Reality
The input said 'robe.' My taxonomy said 'Dresses.' String matching said 'no match.' But they meant the same thing.
Frontend
Turning an Internal Tool into a PWA
The internal chatbot worked well. Too well. Teams wanted to access it from their phones, on the go, between client meetings. 'Can you make a mobile app?' I said yes. But not the one they imagined.
DevOps
When Your GPU Bill Becomes the Autoscaler
The queue had 10,000 items. I spun up a GPU. The queue emptied. The GPU kept running. For 6 hours. At $0.50/hour. I paid $3 to process nothing.
Architecture
Why Culture Beats Tech
The biggest win wasn't technical; it was cultural. When consultants started answering each other's questions through the platform instead of email chains, we knew it worked.
DevOps
Not All Errors Deserve the Same Retry
Rate limited? Wait and retry. Blocked? Rotate proxy immediately. Parsing failed? Don't retry at all.
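The three cases in that teaser can be sketched as a small classifier. This is only an illustration: the exception classes, the `Action` enum, and the backoff parameters are hypothetical names chosen for the example, not taken from the article, and treating unknown errors as non-retryable is an assumption.

```python
from enum import Enum


class Action(Enum):
    WAIT_AND_RETRY = "wait_and_retry"   # back off, then try again
    ROTATE_PROXY = "rotate_proxy"       # switch proxy and retry immediately
    GIVE_UP = "give_up"                 # retrying cannot help


# Hypothetical error types standing in for whatever the client raises.
class RateLimited(Exception):
    pass


class Blocked(Exception):
    pass


class ParseError(Exception):
    pass


def classify(error: Exception) -> Action:
    """Map an error to the retry behaviour it deserves."""
    if isinstance(error, RateLimited):
        return Action.WAIT_AND_RETRY
    if isinstance(error, Blocked):
        return Action.ROTATE_PROXY
    if isinstance(error, ParseError):
        return Action.GIVE_UP
    # Assumption: anything unrecognised is not retried.
    return Action.GIVE_UP


def backoff_seconds(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Exponential backoff for the wait-and-retry case, capped at `cap`."""
    return min(cap, base * (2 ** attempt))
```

Keeping the decision in one function means the retry loop itself never needs to know why a request failed, only what to do next.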
ML Pipelines
When Stage 4 Fails
Your GPU ran out of memory. Should 10,000 items disappear into a retry queue, or get processed anyway, just worse?
Frontend
Adopting a Design System Mid-Project
Every PR contained a new button variant. Every form had its own styles. Juniors spent more time on CSS than on business logic. I proposed adopting Element Plus. Six months later, our development velocity had doubled.
Search
Why Your Search Needs Both Keywords and Vectors
Semantic search understood intent perfectly but missed brand names. Keyword search found exact terms but missed concepts. The solution combines both using Reciprocal Rank Fusion.
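For readers unfamiliar with Reciprocal Rank Fusion, here is a minimal sketch of the idea: each document scores 1 / (k + rank) in every result list it appears in, and the scores are summed. The function name, the document IDs, and the constant k = 60 (a commonly used default) are assumptions for illustration, not details from the article.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document gets 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Ties are broken by document ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword index and a vector index.
keyword_hits = ["nike-air", "running-shoe", "trail-shoe"]
vector_hits = ["running-shoe", "jogging-sneaker", "nike-air"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that appear in both lists rise to the top, while documents found by only one retriever are still kept, which is why the fusion only needs ranks and not comparable scores.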
Architecture
The Database Lock You Didn't Need
Two workers grabbed the same job. Both started processing. One overwrote the other's work. I almost added Redis. Then I remembered: my database already has locks.
Architecture
The Case for Internal Accelerators
Every SaaS needs auth, mailing, and background jobs. Instead of building these from scratch each time, I invested upfront in a battle-tested template.