Engineering Writing

Thoughts on Building Software

Deep dives into engineering decisions, lessons learned, and practical guides for building production-grade systems.

Next.js, System Design, Data Engineering, AI/ML, Best Practices

April 5, 2026 • 4 min read

That Time Your Pipeline Ran Successfully and Deleted 75% of Your Data

Your DAG completed. No errors. Success metrics green. Then your dashboard showed 75% fewer records than yesterday. Here's what happened — and why it kept happening.

Data Engineering

Airflow, Data Warehousing, Orchestration

April 5, 2026 • 5 min read

Your ML Model Passed All Tests. Then It Failed in Production.

Model evaluation: 94% accuracy. Production: wrong predictions everywhere. Your model is fine. Your features are lying to you.

Machine Learning

ML, Feature Stores, Data Warehousing

April 5, 2026 • 4 min read

Why Your Flawless AI Demo Failed in Production

Your AI demo was flawless. The model answered every question. Stakeholders approved the budget. You deployed to production. Two weeks later, it's falling apart. Here's why — and it's not the model.

Data Engineering

Kafka, Streaming, Data Architecture

March 23, 2026 • 6 min read

Why Every LLM Security Tool Misses Multi-Turn Attacks — And What That Costs You

Stateless tools score 0% on progressive extraction, rephrased blocked attempts, and cross-agent attacks. Here's why the architecture is the problem, and what a stateful approach looks like.

LLM Security

LLM, Security, AI Safety, OWASP

March 10, 2025 • 3 min read

Stateful LLM Security: Lessons from Building StreamGuard

Building StreamGuard taught me that stateful security is table stakes for production LLM apps. Here's what I learned about session history, multi-turn attacks, and architecture decisions.

Security

Python, LLM, Redis, DynamoDB, FastAPI

March 5, 2025 • 4 min read

Real-Time Data Processing: Flink vs Kafka Streams vs Spark Streaming

Three stream processing frameworks, different strengths. Here's when to use each one based on actual production experience.

Data Engineering

Kafka, Flink, Spark, Streaming

February 28, 2025 • 6 min read

Why Your Kafka Pipeline Will Break When You Add an LLM to It

Most teams wire LLM calls directly into Kafka consumers. Here's why that fails in production and what to do instead.

March 10, 2024 • 5 min read

Why Architecture Matters: Lessons from a Production Outage

A deep dive into how a single architectural decision caused a weekend-long outage and what we learned.

System Design

System Design, PostgreSQL, Next.js, Kubernetes

Enjoyed these articles?

Interested in discussing ideas from these articles, or want to collaborate on content? I'd love to hear from you.


raushansingh116@gmail.com linkedin.com/in/singhraushan github.com/raushan-s