Docs
Company
About
Learn about our team and culture
Careers
Open positions at Atla
Security
How Atla protects its users
Case Studies
Fieldly
How Fieldly uses Atla alongside LangSmith to ship agent improvements twice as fast
ClaimWise
How ClaimWise spots failure modes of their agent prompts in days instead of weeks
Josepha
How Atla uncovered critical agent failures in JOSEPHA's Deep Research Agent
Research
Selene Models
The best models for evaluation on the market.
Blog
What’s the latest from the Atla labs
Pricing
Updates from Atla
What works (and what doesn’t) when automating error analysis
Sashank
October 7, 2025
Why your evals keep breaking
Sashank
September 24, 2025
Latest posts
How to use Selene Mini locally in LM Studio
Kyle
February 25, 2025
From reward to reason - the role of LLM judges in training models like DeepSeek-R1
Sashank
February 11, 2025
Cookbooks to get started with Selene Mini
Sashank
February 6, 2025
Selene 1 Mini: the best small language model-as-a-judge
Atla team
January 27, 2025
How to build a general purpose LLM evaluator: Lessons from our literature review
Andrei
January 16, 2025
Inferring the Overseer: Insights from the AISI Research Sprint
Henry
December 18, 2024
Aligning AI with AI-Assisted Human Feedback
Maurice
December 12, 2024
Evaluating our Evaluator: Early Results
Nina
December 3, 2024
Training an LLM-as-a-Judge with Synthetic Data
Andrei
November 25, 2024