Blog

AI, voice agents & platform engineering

Long-form posts on voice AI, WhatsApp automation, RAG, and building production-grade customer platforms.

92 posts

20 articles
6 min read
Article · May 8, 2026

Windsurf and Codeium: The Underdogs of AI IDE

Windsurf — the AI-native IDE from the company formerly known as Codeium — spent 2025 as the most-rumored M&A target in developer tooling. After the OpenAI deal fell through and Cognition AI bought the company in December 2025, the question shifted from "who will own Windsurf?" to "what is Windsurf a…

6 min read
Article · May 8, 2026

AI-Powered Debugging Tools in 2026

Debugging in production is mostly archaeology — finding the trace, the log line, and the diff that explain why something broke. AI debugging tools in 2026 are not about replacing the engineer doing that archaeology; they're about cutting the time-to-context from "twenty minutes of dashboard hopping"…

6 min read
Article · May 8, 2026

AI in Testing: Auto-Generation, Mutation, Coverage

"AI generated my tests" was the 2024 selling point. By 2026 the conversation has moved on to a harder question: are those generated tests actually any good? Coverage numbers say yes; mutation testing says often no. The 2026 stack pairs AI generation with mutation analysis as the truth-teller — and t…

6 min read
Article · May 8, 2026

AI Documentation Tools: From Docstrings to Knowledge Bases

Documentation in 2026 has two readers, not one. Humans still scan headings, copy code samples, and search the FAQ. But the second reader — the AI coding agent inside Cursor, Claude Code, or Copilot — is now consuming docs at a rate that, on many sites, exceeds human traffic. That shift has reshaped …

5 min read
Article · May 8, 2026

AI Startup GTM in 2026: What's Actually Working

The classic SaaS playbook — outbound SDR teams, MEDDIC qualification, 12-month enterprise sales cycles — is breaking against AI-native buying behavior. In 2026, the AI startups crossing $10M ARR fastest are the ones that have abandoned most of that motion. Here is what the live data shows is working…

6 min read
Article · May 8, 2026

AI Product Pricing: Per-Token, Per-Seat, Per-Outcome

Pricing AI products is harder than pricing SaaS for one structural reason: unlike a database row, an AI inference has a real, variable cost. That single fact reshapes every pricing decision. Here are the four pricing models actually deployed in 2026, what each is good for, and where each breaks. Why…

5 min read
Article · May 8, 2026

Qwen 3.5: Alibaba's Multilingual Powerhouse

Alibaba's Qwen line has quietly become the multilingual default for the open-weight world. The Qwen 3.5 release in February 2026 cemented that — the family now spans 201 languages and dialects, leads instruction-following benchmarks, and sets a new baseline for what an open-weight model can do acros…

5 min read
Article · May 8, 2026

Voice Agent Architecture in 2026: LiveKit, Pipecat, and the End of the Pipeline

For most of voice AI's history, the mental model was a pipeline: microphone → STT → LLM → TTS → speaker. Each stage was a discrete component, and the framework's job was to connect them. By 2026 that model is breaking down — partly because of multimodal models that fuse stages, partly because of arc…

5 min read
Article · May 8, 2026

Why Model Context Protocol (MCP) Won the Agent Integration Wars

Eighteen months ago Model Context Protocol (MCP) was an Anthropic-released standard with a small reference implementation and a handful of integrations. As of March 2026, monthly SDK downloads passed 97 million, over 10,000 active public MCP servers exist, and 78% of enterprise AI teams report at le…

5 min read
Article · May 8, 2026

On-Device AI in 2026: Apple Intelligence, Phi, and the Local LLM Renaissance

For most of LLMs' history, "local model" meant either "demo-quality" or "you own a GPU." In 2026 that has shifted. Small models tuned for consumer hardware are crossing the threshold of usefulness — not parity with frontier models, but good enough that real apps are shipping with on-device inference…

5 min read
Article · May 8, 2026

The Agentic AI Stack: From Tool Use to Autonomous Workflows

"Agent" was the most overused word in AI in 2024. By 2026 the term has stratified: a real agent stack now has identifiable layers, each with its own design decisions, failure modes, and competitive landscape. Here is how the stack looks today. Layer 1: The model. This is the bottom of the stack and …

5 min read
Article · May 8, 2026

Gemini 3.1 Pro Benchmarks Explained: ARC-AGI-2 and Beyond

On February 19, 2026, Google released Gemini 3.1 Pro and the benchmark headline that followed was unusual: a verified score of 77.1% on ARC-AGI-2, more than double the previous Gemini 3 Pro number on the same test. ARC-AGI-2 is a benchmark designed to be hard for memorization, so a jump that size is…

5 min read
Article · May 8, 2026

DeepSeek R2: The Open-Source Reasoning Surprise

DeepSeek's R2 is the model that made open-weight reasoning a real category in 2026. Reasoning models — the variants that explicitly think before answering — were a closed-vendor club through 2025. R2 changed that: a 32B-parameter open-weight checkpoint that runs on a single 24GB consumer GPU and cle…

5 min read
Article · May 8, 2026

Inside GPT-5.5 Pro: OpenAI's Power-User Tier

GPT-5.5 Pro is the variant most users never touch — it costs roughly six times as much as standard GPT-5.5, requires a Pro/Business/Enterprise plan, and is reserved for the hardest single-shot tasks. But for the workloads that need it, nothing else in the OpenAI lineup is comparable. Here's where Pr…

5 min read
Article · May 8, 2026

Claude Mythos: Anthropic's Security-Focused Frontier

On April 7, 2026, Anthropic unveiled Claude Mythos Preview — a model the company described as "by far the most powerful AI model we've ever developed" — and immediately did something most labs don't: refused to release it publicly. Mythos is the most concrete public artifact yet of frontier AI being…

5 min read
Article · May 8, 2026

GPT-Rosalind: OpenAI's Frontier Reasoning for Science

On April 16, 2026, OpenAI launched GPT-Rosalind, a frontier reasoning model built specifically for drug discovery, genomics, protein reasoning, and scientific research workflows. It's named for Rosalind Franklin, the British chemist whose X-ray crystallography work was central to discovering the str…

5 min read
Article · May 8, 2026

Gemma 4: Google's Open-Weight Push for 2026

Google's Gemma line has always been the open-weight cousin to the closed-source Gemini family — same training pipeline, same research lineage, public weights, permissive license. Gemma 4 is the 2026 release, and the headline is that the 31B dense variant beats Llama 4 Scout on most reasoning benchma…

6 min read
Article · May 8, 2026

The Context Window Arms Race: 1M to 10M Tokens

The 2026 context-window numbers look like science fiction at first glance: Llama 4 Scout at 10 million tokens, Claude Opus 4.7 at 1 million (at standard pricing, no premium), Gemini 3.1 Pro at 1 million, Mistral Medium 3.5 at 256K. A single prompt can now hold the equivalent of 15,000 pages of text. The …

5 min read
Article · May 8, 2026

Voice Cloning in 2026: Ethics, Consent, and Compliance

Voice cloning crossed the uncanny-valley line in 2024. By 2026 it has crossed the legal one too. What used to be a research curiosity is now a production capability available from a dozen vendors, and regulators on both sides of the Atlantic are catching up. If you ship a product that synthesizes a …

5 min read
Article · May 8, 2026

Real-Time Voice Translation: The State of the Art

Real-time voice translation has been "two years away" for about a decade. In 2026 it finally landed in production — not as a perfect Star Trek universal translator, but as a set of constrained, latency-aware pipelines that work well enough for international meetings, customer support, and consumer a…
