Infographic
MLEB Benchmark Map
A practical view of coverage and model performance on MLEB (the Massive Legal Embedding Benchmark), designed to help Counterbench users pick safer defaults for retrieval.
Models scored
23
Datasets
12
Open-source models
15
65% of all models
Closed/API models
8
Dataset mix
Coverage by legal task family
Case law: 6
Regulation: 3
Contracts: 3
Top quality band
Highest average scores on MLEB
Kanon 2 Embedder: 0.8186
Voyage 4 Large: 0.8105
Voyage 4: 0.7961
Voyage 4 Lite: 0.7644
Qwen3 Embedding 8B: 0.7588
Qwen3 Embedding 4B: 0.7477
Gemini Embedding 001: 0.7207
Jina Embeddings v5 Text Small: 0.7103
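The numbers above are headline averages. As a rough illustration of how such a figure can be formed (a sketch only; check MLEB's documentation for its actual aggregation), one retrieval score per dataset is averaged into a single per-model number. The dataset names and values below are hypothetical, not actual MLEB results.

```python
# Illustrative sketch: a headline benchmark number as the plain mean of
# per-dataset retrieval scores. Dataset names and values are hypothetical,
# not actual MLEB results; the real benchmark's aggregation may differ.
per_dataset_scores = {
    "case_law_retrieval": 0.84,
    "regulation_retrieval": 0.79,
    "contract_clause_retrieval": 0.81,
}

headline = sum(per_dataset_scores.values()) / len(per_dataset_scores)
print(f"headline score: {headline:.4f}")  # -> headline score: 0.8133
```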
Why this helps Counterbench
Direct product value, not research theater
Benchmark-backed defaults let you recommend model tiers with evidence rather than vendor claims.
Category-level scores (case law, regulation, contracts) map cleanly to your existing playbooks and prompt packs.
Open-source availability and speed let legal teams weigh cost and compliance constraints against quality when picking a tier (see the sketch below).
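For teams starting from the open-source tier, here is a minimal retrieval sketch, not a production setup. It assumes the Qwen3 Embedding 4B model from the quality band is published under the Hugging Face ID "Qwen/Qwen3-Embedding-4B" and that the sentence-transformers library is installed; the clause texts and query are illustrative.

```python
# Minimal retrieval sketch with an open-source model from the quality band.
# Assumptions: the model is available as "Qwen/Qwen3-Embedding-4B" on
# Hugging Face, and sentence-transformers is installed
# (pip install sentence-transformers). Clause texts are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("Qwen/Qwen3-Embedding-4B")

clauses = [
    "The indemnifying party shall defend and hold harmless the indemnified party.",
    "This agreement is governed by the laws of the State of Delaware.",
    "Either party may terminate upon thirty days' written notice.",
]
query = "Which clause covers indemnification?"

# Embed the corpus and the query; normalized vectors make cosine
# similarity a simple dot product.
clause_vecs = model.encode(clauses, normalize_embeddings=True)
query_vec = model.encode(query, normalize_embeddings=True)

# Rank clauses by cosine similarity and take the best match.
scores = util.cos_sim(query_vec, clause_vecs)[0]
best = int(scores.argmax())
print(f"best match ({float(scores[best]):.3f}): {clauses[best]}")
```

Swapping the model ID for a smaller or larger tier from the leaderboard changes quality and latency but not the code, which is what makes benchmark-backed defaults practical to ship.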