2025

an archive of posts from 2025

Dec 29, 2025 Similarity metrics for embeddings
Dec 25, 2025 Tokenizers: production economics cheat-sheet
Dec 19, 2025 The metric gap: bridging business outcomes and AI component optimization
Dec 15, 2025 Reflection vs evaluation: why the Agent-Critic pattern fails without separation of concerns
Dec 12, 2025 Vector search + hard filters in Elasticsearch: the hidden RAG bottleneck
Dec 09, 2025 Architecture design: a constraint-satisfaction approach
Dec 05, 2025 Classification with LLMs: getting accurate probabilities from structured output
Dec 02, 2025 Token optimization: three production patterns that reduce LLM costs by 70%
Nov 26, 2025 Hierarchical signal tuning: optimizing components before fusion
Nov 22, 2025 Jensen-Shannon divergence for meaningful clustering
Nov 19, 2025 Hybrid intent classification: compact-encoder-first routing for production systems
Nov 15, 2025 Few-shot prompt ordering: the impact of example position
Nov 10, 2025 Temporal for LLM pipelines: durable execution starter pack
Nov 07, 2025 GraphRAG: beyond vector search for connecting the dots
Nov 04, 2025 Domain-driven design for AI systems: architectural patterns and production experience
Oct 30, 2025 Semantic prompt caching: when LLM-judge beats exact match
Oct 27, 2025 The reranking trap: when cross-encoders make things worse
Oct 24, 2025 Structured output engineering for production LLMs
Oct 20, 2025 The chunk size dilemma: identifying the optimal value in RAG systems
Oct 10, 2025 Mitigating positional bias in LLM-as-a-judge evaluation: the swapping technique
Oct 06, 2025 Hybrid retrieval with RRF: solving the score normalization problem
Oct 02, 2025 LLM orchestration: a pragmatic guide to complexity
Sep 28, 2025 How Qdrant's scalar quantization cut our RAG latency by 3x
Sep 21, 2025 Why VLMs ignore visual evidence (and how to fix it)
Sep 14, 2025 Our agents argued endlessly. Here's how a hybrid AI pattern tamed LLM chaos
Sep 09, 2025 VLM pipeline debugging: lessons from visual monitoring
Sep 05, 2025 Machine learning metrics for undefined projects: 3 critical mistakes
Sep 02, 2025 Pragmatic LLM debugging: a survival guide to chaos