Fast Wrong Is Worse Than Slow Right
The fastest optimization I tested was wrong. The working redesign used multi-scale benchmarks, row-level validation, and parallel AI-agent workflows.
Principal consultant with deep experience across enterprise data platforms, time series / telemetry ML, and hands-on AI delivery in messy real-world environments.
~1 week: Cross-platform migration validator, from zero to live dashboard
5–10x faster: SQL pipeline optimization with row-level correctness validation
40% faster: AI-driven quality assessment pilot for circular fashion
Team Lead: Interim lead & project manager for DS and advanced analytics
The problem: You are migrating between data platforms and nobody can answer “is the target data correct?” with confidence. Spot-checks and row counts are not enough.
What I do: Build systematic cross-platform validation — automated comparison of schemas, row counts, key distributions, and date ranges across dozens of tables. I design the framework, deploy dashboards the team can trust, and surface real defects early.
Example outcomes: Validated dozens of tables across Databricks and Snowflake in under a week. Surfaced a row-count discrepancy of more than 2x on the first automated run.
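To make the idea of comparing schemas, row counts, key distributions, and date ranges concrete, here is a minimal sketch in Python. It is not the framework described above. `TableProfile` and `compare_profiles` are hypothetical names, and the sketch assumes each platform's table has already been summarized into a small profile by whatever queries fit that platform.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TableProfile:
    """Summary of one table on one platform (hypothetical structure)."""
    columns: dict          # column name -> normalized type name
    row_count: int
    distinct_keys: int
    min_date: date
    max_date: date


def compare_profiles(source, target, row_tolerance=0.0):
    """Return human-readable findings; an empty list means no mismatch."""
    findings = []

    # Schema: columns present on one side only, then type differences.
    missing = sorted(set(source.columns) - set(target.columns))
    extra = sorted(set(target.columns) - set(source.columns))
    if missing:
        findings.append(f"columns missing in target: {missing}")
    if extra:
        findings.append(f"unexpected columns in target: {extra}")
    for name in sorted(set(source.columns) & set(target.columns)):
        if source.columns[name] != target.columns[name]:
            findings.append(
                f"type mismatch on {name}: "
                f"{source.columns[name]} vs {target.columns[name]}"
            )

    # Row counts, with an optional relative tolerance.
    if abs(source.row_count - target.row_count) > source.row_count * row_tolerance:
        findings.append(
            f"row count differs: source={source.row_count}, target={target.row_count}"
        )

    # Key cardinality and date coverage.
    if source.distinct_keys != target.distinct_keys:
        findings.append(
            f"distinct keys differ: source={source.distinct_keys}, "
            f"target={target.distinct_keys}"
        )
    if (source.min_date, source.max_date) != (target.min_date, target.max_date):
        findings.append("date range differs between source and target")
    return findings


# Made-up numbers, for illustration only.
source = TableProfile({"id": "int", "ts": "timestamp"}, 1000, 1000,
                      date(2024, 1, 1), date(2024, 6, 30))
target = TableProfile({"id": "int", "ts": "timestamp"}, 2100, 1000,
                      date(2024, 1, 1), date(2024, 6, 30))
for finding in compare_profiles(source, target):
    print(finding)
```

In this made-up case the only finding is the row count. Because the number of distinct keys matches, duplicated rows in the target would be a reasonable first hypothesis to check.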
The problem: Your warehouse bill is growing faster than your data. Pipelines run on brute-force full scans. Nobody has profiled the actual bottlenecks or tested whether an incremental approach is both faster and correct.
What I do: Profile pipeline execution step by step, benchmark alternatives at multiple data scales and warehouse sizes, and validate correctness with row-level comparisons — not just aggregate checksums.
Example deliverables: Profiling reports, benchmark harnesses across warehouse sizes, correctness-validation checks, and rollout plans for incremental pipeline changes.
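A small Python sketch shows why row-level comparison matters more than an aggregate checksum. It is illustrative only: the function names and the toy rows are assumptions, and a real pipeline would usually run the equivalent comparison inside the warehouse rather than in memory.

```python
from collections import Counter


def aggregate_checksum(rows):
    """Sum of one numeric column: a coarse check that can hide offsetting errors."""
    return sum(amount for _, amount in rows)


def row_level_diff(baseline, candidate):
    """Multiset diff of full rows.

    Returns (rows missing from candidate, rows unexpected in candidate).
    Counter subtraction keeps duplicates, so repeated rows are compared too.
    """
    base, cand = Counter(baseline), Counter(candidate)
    missing = sorted((base - cand).elements())
    unexpected = sorted((cand - base).elements())
    return missing, unexpected


# Toy data: the candidate has two errors that cancel out in the total.
baseline = [("a", 10), ("b", 20), ("c", 30)]
candidate = [("a", 15), ("b", 15), ("c", 30)]

print(aggregate_checksum(baseline) == aggregate_checksum(candidate))
missing, unexpected = row_level_diff(baseline, candidate)
print(missing)
print(unexpected)
```

The totals agree, yet two rows differ. An optimized pipeline could pass an aggregate check this way while still producing wrong rows.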
The problem: Your analytics team spends too much time on manual notebook runs, repetitive SQL, and copy-paste workflows. You have heard about coding agents but need someone who has shipped real work with them, not just demos.
What I do: Design and implement coding-agent workflows for migration, validation, and pipeline modernization. I write the harness, the test discipline, and the operating constraints that make agents reliable in production.
Example outcomes: Used a coding agent to build a full validation workflow — discovery, per-pipeline scripts, unified validator, historical storage, and live dashboards — across seven working sessions.
The problem: You have sensor, telemetry, or time series data and need production ML — activity recognition, anomaly detection, forecasting, or classification — but your team’s ML experience is limited or focused elsewhere.
What I do: Design and deliver time series ML systems from evaluation harness to production deployment. I specialize in approaches that are fast enough for edge and IoT (ROCKET family, lightweight models) while maintaining accuracy.
Example outcomes: Delivered a time series classification system with state-of-the-art accuracy and 10–100x faster inference than deep learning baselines. Established in-house activity recognition from CAN/telemetry data.
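For readers unfamiliar with the ROCKET family, the core idea fits in a short Python sketch: convolve the series with many random, untrained kernels and keep two numbers per kernel, the maximum response and the proportion of positive values (PPV). This is a dependency-free illustration of the general technique, not the delivered system. `make_kernels` and `transform` are hypothetical names, and a real deployment would use an optimized library implementation and train a linear classifier on the resulting features.

```python
import math
import random


def make_kernels(num_kernels, series_length, seed=0):
    """Sample random kernels in the style of ROCKET (illustrative sketch)."""
    if series_length < 12:
        raise ValueError("this sketch assumes series_length >= 12")
    rng = random.Random(seed)
    kernels = []
    for _ in range(num_kernels):
        length = rng.choice((7, 9, 11))
        weights = [rng.gauss(0.0, 1.0) for _ in range(length)]
        mean = sum(weights) / length
        weights = [w - mean for w in weights]          # mean-centered weights
        bias = rng.uniform(-1.0, 1.0)
        # Dilation on an exponential scale, capped so the kernel fits the series.
        max_exp = math.log2((series_length - 1) / (length - 1))
        dilation = int(2 ** rng.uniform(0.0, max_exp))
        # Zero padding is applied to roughly half of the kernels.
        padding = ((length - 1) * dilation) // 2 if rng.random() < 0.5 else 0
        kernels.append((weights, bias, dilation, padding))
    return kernels


def transform(series, kernels):
    """Return [max, ppv] per kernel, concatenated into one feature vector."""
    n = len(series)
    features = []
    for weights, bias, dilation, padding in kernels:
        span = (len(weights) - 1) * dilation
        best, positives, count = float("-inf"), 0, 0
        for start in range(-padding, n + padding - span):
            total = bias
            for j, w in enumerate(weights):
                idx = start + j * dilation
                if 0 <= idx < n:                       # out-of-range taps act as zeros
                    total += w * series[idx]
            best = max(best, total)
            positives += total > 0
            count += 1
        features.extend((best, positives / count))
    return features


series = [math.sin(i / 3.0) for i in range(50)]        # toy signal
kernels = make_kernels(num_kernels=5, series_length=len(series), seed=1)
features = transform(series, kernels)
print(len(features))
```

Because the kernels are never trained, the only fitted component is the downstream linear model, which is where the speed advantage over deep learning comes from.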
Redesigned a years-old daily full-recompute pipeline into a validated incremental one: 5–10x faster, with correctness validated at row level. The fastest approach tested produced wrong results; multi-scale benchmarks and parallel AI-agent workflows led to the design that works.
Built a production-grade validation system for a large Databricks-to-Snowflake migration: automated comparison across dozens of tables, historical result storage, and a live team dashboard — from zero to deployed in about a week using a coding agent.
Delivered a time series classification system for industrial telemetry: state-of-the-art accuracy with 10–100x faster inference than deep learning, designed for edge and IoT deployment.
End-to-end AI system for second-hand garment quality assessment: object detection, synthetic data generation, and pilot deployment. 40% faster processing, 50%+ reduction in data collection effort.
Let’s talk.