
MLEB Benchmark Map

A practical view of legal embedding benchmark coverage and model performance, designed to help Counterbench users pick safer defaults for retrieval.

Models scored: 23
Datasets: 12
Open-source models: 15 (65% of all models)
Closed/API models: 8
Dataset mix

Coverage by legal task family

Case law: 6
Regulation: 3
Contracts: 3
Top quality band

Highest average scores on MLEB

Kanon 2 Embedder: 0.8186
Voyage 4 Large: 0.8105
Voyage 4: 0.7961
Voyage 4 Lite: 0.7644
Qwen3 Embedding 8B: 0.7588
Qwen3 Embedding 4B: 0.7477
Gemini Embedding 001: 0.7207
Jina Embeddings v5 Text Small: 0.7103
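The band above can be treated as data when building a shortlist programmatically. A minimal sketch follows; the score values come from the leaderboard, but the `shortlist` helper and the 0.75 cutoff are illustrative assumptions, not part of MLEB.

```python
# Top-band MLEB average scores, taken from the leaderboard above.
MLEB_SCORES = {
    "Kanon 2 Embedder": 0.8186,
    "Voyage 4 Large": 0.8105,
    "Voyage 4": 0.7961,
    "Voyage 4 Lite": 0.7644,
    "Qwen3 Embedding 8B": 0.7588,
    "Qwen3 Embedding 4B": 0.7477,
    "Gemini Embedding 001": 0.7207,
    "Jina Embeddings v5 Text Small": 0.7103,
}

def shortlist(scores: dict[str, float], min_score: float = 0.75) -> list[str]:
    """Return model names at or above min_score, highest score first.

    Hypothetical helper: the threshold is an arbitrary example, not an
    MLEB recommendation.
    """
    keep = [(name, s) for name, s in scores.items() if s >= min_score]
    return [name for name, _ in sorted(keep, key=lambda kv: kv[1], reverse=True)]

print(shortlist(MLEB_SCORES))
```

Raising `min_score` to 0.80 narrows the list to the two leaders; in practice the cutoff would be tuned against cost and licensing constraints rather than fixed.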
Why this helps Counterbench

Direct product value, not research theater

Benchmark-backed defaults let you recommend model tiers with published evidence rather than vendor claims.

Category-level scores (case law, regulation, contracts) map cleanly to your existing playbooks and prompt packs.

Open-source availability and speed dimensions support practical deployment tradeoffs for legal teams with cost and compliance constraints.
