All model metrics
Search, sort, and compare reference board models in the selected size class (direct vendor APIs).
How to read scores
Each number is a 0–100 checklist composite from our lab battery — not a real-world accuracy percentage or a user preference ranking.
- 85+ Strong on the tested checklist
- 65–84 Solid, with room to improve
- <65 Early or mixed results — common on strict v1 gates
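As a minimal sketch, the bands above can be expressed as a simple threshold lookup. The function name `score_band` is hypothetical, and treating fractional scores by lower bound (so 84.5 falls in the middle band) is an assumption, since the page only lists whole-number ranges.

```python
def score_band(score: float) -> str:
    """Map a 0-100 checklist composite to its band label (illustrative only)."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be in the range 0-100, got {score}")
    if score >= 85:
        return "Strong on the tested checklist"
    if score >= 65:
        return "Solid, with room to improve"
    # Everything below 65 lands in the lowest band.
    return "Early or mixed results"


if __name__ == "__main__":
    for s in (92, 70, 40):
        print(s, score_band(s))
```

The label describes performance on the tested checklist only; it is not an accuracy percentage.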
Compile means the model passed engine routing gates for deployment; it is not a product endorsement. See the full methodology.
Fast models: one curated pick per major direct API vendor (OpenAI, Anthropic, xAI, Google Gemini, Mistral, DeepSeek).
No models profiled in this size class.