// BLOG

Latest notes.

Notes from inside the build. No thought-leadership, no recycled takes, just what I figured out the hard way so the next person doesn't have to.

5 entries

April 8, 2026 · SLM

Small Language Models for Vertical Agents

Small models fine-tuned for a narrow task outperform generalist frontier models on that task, cost a fraction as much to run, and unlock deployment options the big ones cannot touch.

Saif Pasha

March 14, 2026 · Evals

AI Evals in Production: The Work Nobody Sees Until It Breaks

A model that scored well on benchmarks will quietly rot in production. Evals are the unglamorous work that turns an AI demo into a system you can operate.

Saif Pasha

February 24, 2026 · LLM

LLM Cost Control: Strategies That Actually Move the Bill

Most LLM cost advice stops at "use a smaller model." Real bills come down through caching, routing, context discipline, and knowing which 20% of calls drive 80% of spend.

Saif Pasha

February 6, 2026 · RAG

Agentic RAG: When Retrieval Becomes a Reasoning Step

Traditional RAG retrieves once and hopes. Agentic RAG treats retrieval as a loop the model controls, which changes everything about how you build the pipeline.

Saif Pasha

January 18, 2026 · MCP

MCP Servers in Production: What the Hype Misses

Model Context Protocol finally gives agents a clean contract for tool use, but the production story depends on auth, versioning, and rate limits most teams skip.

Saif Pasha