Blog
Why Deep Research Agents Fail: Lessons from GAIA
Sashank
July 10, 2025
Comparing AI Agent Frameworks: A Guide to Building Reliable Agents
Kyle
June 12, 2025
Latest posts
Best practices for evaluating AI across multiple criteria
Sashank
March 20, 2025
Build custom eval metrics with the Eval Copilot (formerly Alignment Platform)
Atla team
March 5, 2025
Frontier AI needs frontier evaluators. Meet Selene.
Atla team
February 26, 2025
How to use Selene Mini locally in LM Studio
Kyle
February 25, 2025
From reward to reason - the role of LLM judges in training models like DeepSeek-R1
Sashank
February 11, 2025
Cookbooks to get started with Selene Mini
Sashank
February 6, 2025
Selene 1 Mini: the best small language model-as-a-judge
Sashank
January 27, 2025
How to build a general purpose LLM evaluator: Lessons from our literature review
Andrei
January 16, 2025
Inferring the Overseer: Insights from the AISI Research Sprint
Henry
December 18, 2024