Blog

    Notes on data & AI engineering

    Practical, no-fluff writing on the things I build for clients — retrieval-augmented generation, event tracking, experimentation, MLOps, and shipping AI features that hold up in production.

    Experimentation · June 4, 2026 · 6 min read

    CUPED explained: faster A/B tests with variance reduction

    A practical explanation of CUPED for A/B testing: how variance reduction works, when it helps, and what to watch before trusting the results.

    Data Visualization · May 28, 2026 · 6 min read

    Dashboard design for executives: clarity before charts

    How to design executive dashboards that leaders actually use: decision framing, metric hierarchy, context, drill-downs, speed, and trust.

    AI Automation · May 21, 2026 · 7 min read

    An AI workflow automation playbook for operations teams

    How to find, scope, and ship reliable AI workflow automations for operations: intake, triage, enrichment, routing, reporting, human review, and observability.

    Analytics · May 14, 2026 · 7 min read

    Marketing attribution with first-party data

    How to build practical marketing attribution with first-party events, UTMs, ad platform data, CRM stages, revenue, and transparent assumptions.

    Analytics Engineering · May 7, 2026 · 6 min read

    A dbt analytics engineering checklist for trustworthy metrics

    A dbt checklist for analytics engineering: sources, staging models, marts, tests, documentation, naming, performance, and dashboard ownership.

    LLM Engineering · April 30, 2026 · 7 min read

    How to choose a vector database for RAG

    A practical guide to choosing a vector database for RAG: pgvector, Pinecone, Weaviate, Qdrant, filtering, hybrid search, scale, and operations.

    Analytics · April 23, 2026 · 6 min read

    Product analytics metrics that actually matter

    How to choose product analytics metrics that support decisions: activation, retention, adoption, conversion, guardrails, and north-star metrics.

    Analytics · April 16, 2026 · 6 min read

    Server-side tracking explained for analytics and attribution

    What server-side tracking is, when it helps, when it adds unnecessary complexity, and how to design it for cleaner analytics and attribution.

    LLM Engineering · April 9, 2026 · 6 min read

    A RAG evaluation checklist for production AI systems

    A practical checklist for evaluating RAG systems: retrieval relevance, source coverage, grounded answers, citations, abstention, latency, and feedback loops.

    LLM Engineering · April 2, 2026 · 7 min read

    LLM evaluation: what to measure before an AI feature ships

    A production-focused guide to LLM evaluation: golden datasets, groundedness, retrieval quality, refusal behavior, latency, cost, and regression tests.

    LLM Engineering · March 18, 2026 · 5 min read

    How to reduce LLM hallucinations in production

    Practical techniques to reduce LLM hallucinations: retrieval grounding, citations, evaluation harnesses, output guardrails, and knowing when to make the model say 'I don't know'.

    Experimentation · March 5, 2026 · 6 min read

    5 A/B testing mistakes that quietly ruin your results

    Peeking, sample-ratio mismatch, underpowered tests, ignored guardrails, and multiple comparisons — the common A/B testing mistakes that lead to confident but wrong decisions.

    MLOps · February 26, 2026 · 6 min read

    From notebook to production: an MLOps checklist

    A practical MLOps checklist for shipping machine learning models to production: reproducible training, deployment, monitoring, evaluation, retraining, and cost control.

    LLM Engineering · February 10, 2026 · 7 min read

    What is RAG? A practical guide to Retrieval-Augmented Generation

    A plain-English guide to Retrieval-Augmented Generation (RAG): what it is, how the pipeline works, where it beats fine-tuning, and how to keep answers grounded and accurate.

    Analytics · January 22, 2026 · 6 min read

    How to design an event tracking plan that scales

    A practical framework for designing an event tracking plan: naming conventions, schema versioning, governance, and validation that keep your analytics clean as you grow.