Blog
Identifying & auto-correcting agent failures: findings from τ-bench
Nina
April 29, 2025
Introducing the Atla MCP Server: purpose-built LLM Judges now at your command
Atla team
April 22, 2025
Latest posts
Cookbooks to get started with Selene Mini
Sashank
February 6, 2025
Selene 1 Mini: the best small language model-as-a-judge
Sashank
January 27, 2025
How to build a general purpose LLM evaluator: Lessons from our literature review
Andrei
January 16, 2025
Inferring the Overseer: Insights from the AISI Research Sprint
Henry
December 18, 2024
Aligning AI with AI-Assisted Human Feedback
Maurice
December 12, 2024
Evaluating our Evaluator: Early Results
Nina
December 3, 2024
Training an LLM-as-a-Judge with Synthetic Data
Andrei
November 25, 2024
Judge or Jury: What’s the right approach for LLM evaluation?
Maurice
November 19, 2024
LLM Evaluation Tooling - A Review
Josh
November 12, 2024