Cekura Addresses AI Agent Quality Assurance Gap
On March 3, 2026, Cekura, a Y Combinator F24 startup, launched an AI agent testing platform that uses simulation-based testing to evaluate agent behavior across voice and chat interactions. The platform addresses a critical gap in AI agent development: catching behavioral regressions without manual spot-checking or waiting for production failures.
Traditional Testing Methods Don't Scale for AI Agents
Manual QA doesn't scale when agents handle thousands of interaction patterns. The founding team of Tarush, Sidhant, and Shashij identified that most teams rely on manual spot-checking, production monitoring, or brittle scripted tests. When teams ship new prompts, swap models, or add tools, they lack a reliable way to verify correct behavior across the full range of user interactions.
Simulation-Based Testing with LLM Judges
Cekura's approach uses synthetic users that interact with agents like real users, with LLM-based judges evaluating responses across full conversational arcs rather than individual turns. The platform includes three core features:
- Scenario generation that bootstraps test suites from agent descriptions and extracts test cases from production conversations
- Mock tool platform that simulates API calls without touching production systems
- Deterministic test cases defined as structured conditional action trees
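A conditional action tree can be pictured as a small recursive data structure: each node pairs a condition on the conversation with the action the agent is required to take when that condition holds. The sketch below is a hypothetical illustration of the idea, not Cekura's actual schema; the node fields, the refund scenario, and the `check` function are all assumptions for demonstration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

@dataclass
class ActionNode:
    # A condition over the conversation, and the action the agent must
    # take when it holds; children are checked only if this node passes.
    condition: str
    expected_action: str
    children: List["ActionNode"] = field(default_factory=list)

def check(node: ActionNode, session_actions: Set[str],
          condition_holds: Callable[[str], bool]) -> bool:
    """Deterministically verify a recorded session against one branch."""
    if not condition_holds(node.condition):
        return True   # branch never triggered in this session
    if node.expected_action not in session_actions:
        return False  # agent skipped a required action
    return all(check(c, session_actions, condition_holds)
               for c in node.children)

# Hypothetical refund-flow test case: identity must be verified
# before a refund is issued.
tree = ActionNode(
    condition="user requests refund",
    expected_action="verify_identity",
    children=[ActionNode("identity verified", "issue_refund")],
)

conditions = {"user requests refund", "identity verified"}
print(check(tree, {"verify_identity", "issue_refund"},
            lambda c: c in conditions))  # True: both required steps taken
print(check(tree, {"issue_refund"},
            lambda c: c in conditions))  # False: verification was skipped
```

Because the tree is evaluated against the recorded actions rather than the free-form text, the same test case produces the same verdict on every run, which is what makes this style of test deterministic.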
Full-Session Evaluation Catches Multi-Turn Failures
Unlike traditional tracing platforms such as Langfuse and LangSmith that evaluate turn-by-turn, Cekura evaluates complete sessions. This catches failures that look fine in isolation but represent regressions across multi-turn interactions. For example, if an agent skips a required verification step but completes the task anyway, each individual turn appears correct while the full session fails.
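The distinction can be made concrete with a toy transcript. The checks below are hypothetical stand-ins (not any platform's actual API): a turn-level judge that scores each reply in isolation, and a session-level rule that requires identity verification before a sensitive action. Every turn passes on its own, yet the session as a whole fails.

```python
# Hypothetical transcript: the agent resets a password without ever
# verifying the user's identity.
transcript = [
    {"role": "user", "text": "I forgot my password."},
    {"role": "agent", "text": "No problem, I can help with that.",
     "action": None},
    {"role": "user", "text": "Great, please reset it."},
    {"role": "agent", "text": "Done! Your password has been reset.",
     "action": "reset_password"},
]

def per_turn_ok(turn):
    # Turn-level check (stub judge): the reply is non-empty and on-topic.
    return turn["role"] != "agent" or len(turn["text"]) > 0

def session_ok(transcript):
    # Session-level rule: sensitive actions must be preceded by
    # an identity-verification step somewhere earlier in the session.
    verified = False
    for turn in transcript:
        if turn.get("action") == "verify_identity":
            verified = True
        if turn.get("action") == "reset_password" and not verified:
            return False
    return True

print(all(per_turn_ok(t) for t in transcript))  # True: each turn looks fine
print(session_ok(transcript))                   # False: verification skipped
```

Only the session-level check, which carries state across turns, can see that the verification step never happened, which is the class of regression turn-by-turn evaluation misses.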
Company Has 1.5 Years of Voice Agent Experience
The team operated voice agent simulation for 18 months before extending the same infrastructure to chat. Cekura has raised $2.4M and works with 75+ customers across healthcare, BFSI, logistics, recruitment, and retail. The platform offers a 7-day free trial with paid plans starting at $30 per month.
Key Takeaways
- Cekura provides simulation-based testing for AI agents that evaluates full conversational sessions rather than individual turns
- The platform includes mock tools, scenario generation from production data, and deterministic test cases
- The team has 1.5 years of experience running voice agent simulations before expanding to chat
- Full-session evaluation catches behavioral regressions that turn-by-turn tracing tools miss
- Paid plans start at $30 per month with a 7-day free trial available