All model metrics
Search, sort, and compare reference board models in the selected size class (direct vendor APIs).
How to read scores
Each number is a 0–100 checklist composite from our lab battery — not a real-world accuracy percentage or a user preference ranking.
- 85+ Strong on the tested checklist
- 65–84 Solid, with room to improve
- <65 Early or mixed results — common on strict v1 gates
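The bands above can be expressed as a small lookup. This is a minimal sketch, not the board's own code: the function name `score_band` is hypothetical, and treating fractional scores between 84 and 85 as "Solid" is an assumption, since the page only lists whole-number boundaries.

```python
def score_band(score: float) -> str:
    """Map a 0-100 checklist composite to its reading-guide band.

    Assumption: scores in the gap between 84 and 85 (e.g. 84.5) fall
    into the "Solid" band; the page does not say how these are handled.
    """
    if not 0 <= score <= 100:
        raise ValueError(f"score must be between 0 and 100, got {score}")
    if score >= 85:
        return "Strong on the tested checklist"
    if score >= 65:
        return "Solid, with room to improve"
    return "Early or mixed results"


# Example: a composite of 54.4 lands in the lowest band.
print(score_band(54.4))
```

A band describes performance on the tested checklist only; it does not by itself say whether a model passed the compile gates.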
Compile means the model passed engine routing gates for deployment; it is not a product endorsement.
Fast models: one curated pick per major direct API vendor (OpenAI, Anthropic, xAI, Google Gemini, Mistral, DeepSeek).
Column groups: Identity, Capability, Safety, Performance, Status.

| Model | Vendor | Deploy | Accuracy | Reasoning | Coding | Slop | Reliability | Cap. safety | Jailbreak | PII | Bias | Latency | Cost | Stability | Badges |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mistral-small-latest | Mistral | 54.4% | 52.5% | 25% | 60% | 0% | 80% | 66.7% | 83.3% | 0% | 0% | 80.1% | 50% | 100% | Below compile bar, Blocked |

Model key: mistral-small-latest
Model ID: mistral-small-latest
Size band: le8b
Provenance: provider_standard
Capability pack: bench-pack-v2
Safety pack: bench-pack-safety-v2
Latency P95: 1988.3 ms
Throughput P50: 40.9 tps
Cost / task: $0.000058
Strengths
Standards
Slop profiles
Compile gates
Safety gates
Weakness tags
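The cost-per-task figure listed above is easier to compare once it is scaled to a realistic volume. The sketch below assumes the cost scales linearly with the number of tasks, which the page does not state; the function name `projected_cost` and the one-million-task volume are illustrative only.

```python
from decimal import Decimal

# Cost per task as listed for mistral-small-latest on this page.
COST_PER_TASK_USD = Decimal("0.000058")


def projected_cost(tasks: int, cost_per_task: Decimal = COST_PER_TASK_USD) -> Decimal:
    """Estimate total spend in USD, assuming cost scales linearly with task count."""
    if tasks < 0:
        raise ValueError("tasks must be non-negative")
    return cost_per_task * tasks


# Example: one million tasks at the listed rate.
print(f"${projected_cost(1_000_000):.2f}")  # prints "$58.00"
```

`Decimal` is used instead of `float` so that very small per-task prices do not pick up binary rounding error when multiplied.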