Mixture of Insights.

A long-running notebook about building systems and taking them apart: models, tools, infrastructure, failures, and the judgment behind technical work.

Series

Post-Training in Practice

From data engines to GRPO, reward hacking, DPO and self-play — the math for why each method works, and why the data usually outweighs the optimizer.

Series

ORBIT — orchestrating training on rented GPUs

Make a training run a reproducible artifact, not a shell session: a declarative control plane reconciled against a disposable execution plane.

Series

Shipping a TTS model on OpenVINO

Rebuilding the CUDA serving stack — paged-KV, a quantized cache, continuous batching — on an Intel iGPU, derived from the bandwidth math up.

Series

Hardening a rooted Android device against app detection

How a non-privileged app detects a rooted custom ROM, channel by channel — and the two walls (verified boot, hardware attestation) that userspace cannot move.

Notes

Standalone