Guides from the machine room.

How small models, RAG pipelines, humanoid robots and world models actually work — written by the team that deploys them behind firewalls, with code you can run on your own hardware.


RAG | 16 Feb 2026 | 8 min | Python

Embeddings Explained: Building Semantic Search Over 100,000 Documents

What an embedding vector really encodes, why cosine similarity works, and a complete on-prem semantic search index in 60 lines.


RAG | 02 Feb 2026 | 10 min | SQL · Python

Vector Databases On-Prem: pgvector vs. Qdrant for Self-Hosted RAG

When Postgres with pgvector is enough, when Qdrant earns its place, and the HNSW parameters that actually matter.


RAG | 26 Jan 2026 | 12 min | Python

RAG From First Principles: Architecture of a Private Retrieval Pipeline

Chunking, embedding, hybrid retrieval, re-ranking and grounding — the full anatomy of a production RAG system that never leaves your network.
