Blog
Identifying & auto-correcting agent failures: findings from τ-bench
Nina
April 29, 2025
Introducing the Atla MCP Server: purpose-built LLM Judges now at your command
Atla team
April 22, 2025
Latest posts
LLM Judges as Reward Models
Henry
October 31, 2024
Selecting a training objective for an AI evaluator (SFT vs. DPO vs. RPO)
Andrei
October 22, 2024
Evaluating GenAI applications with LLM‑as‑a‑judge
Kyle
October 8, 2024
“AI’s $600B Question” and AGI’s $34T Answer
Maurice
September 10, 2024
Scaling Alignment: Training AI Evaluators to Capture Human Preferences
Maurice
July 11, 2024