Updates from Atla

Trees, not logs: structured evaluation of agent traces

Atla team

December 2, 2025

The problem with voice agents that no one’s talking about

Henry

November 13, 2025

Latest posts

Comparing AI Agent Frameworks: A Guide to Building Reliable Agents

Kyle

June 12, 2025

AI agent failures in DA-Code: identifying errors and fixing them through critique

Atla team

May 28, 2025

Why LLM Agents Still Fail

Atla team

May 20, 2025

Use Selene with Langwatch’s Evaluation Wizard

Atla team

May 6, 2025

Identifying & auto-correcting agent failures: findings from TAU-bench

Nina

April 29, 2025

Introducing the Atla MCP Server: purpose-built LLM Judges now at your command

Atla team

April 22, 2025

Selene Mini: SOTA 8B LLM Judge, now available via API

Atla team

April 15, 2025

Announcing Atla’s native integration with Langfuse

Atla team

March 25, 2025

Best practices for evaluating AI across multiple criteria

Sashank

March 20, 2025