Benchmark Detail View

School Library Management

MEDIUM Challenge
82 models tested
Top Score: 79.9
Success Rate: 59.1%
Quality Score: 82
Tests Passed: 17
Models Tested: 82
School Library Benchmark - Individual Model Results
Showing 82 of 82 models
Rank  Model                         Provider  Date     Score  Success Rate
1     Claude 3 Haiku                Claude    08/2025  79.9   78.6%
2     Grok 4                        xAI       08/2025  79.5   78.6%
3     Horizon Beta                  Other     08/2025  78.1   78.6%
4     OpenAI GPT-5.2 Chat           OpenAI    12/2025  76.7   75.0%
5     Claude 4.6 Opus               Claude    02/2026  76.5   75.0%
6     OpenAI GPT-5.1 Chat           OpenAI    11/2025  76.3   75.0%
7     OpenAI o4-mini                OpenAI    08/2025  76.1   75.0%
8     Claude 4.5 Sonnet             Claude    10/2025  76.1   75.0%
9     R1                            DeepSeek  08/2025  76.1   75.0%
10    Openai Oss 120b               OpenAI    08/2025  76.1   75.0%
11    OpenAI o1-mini                OpenAI    08/2025  75.9   75.0%
12    Grok 3 Mini                   xAI       08/2025  75.7   75.0%
13    OpenAI o3-mini                OpenAI    08/2025  75.5   75.0%
14    OpenAI o3-mini (High)         OpenAI    08/2025  75.5   75.0%
15    OpenAI GPT-4o                 OpenAI    08/2025  75.3   75.0%
16    Glm 4 7                       Other     12/2025  74.7   75.0%
17    OpenAI o4-mini (High)         OpenAI    08/2025  74.5   75.0%
18    OpenAI GPT-5                  OpenAI    09/2025  74.5   75.0%
19    OpenAI GPT-4.1                OpenAI    08/2025  74.3   75.0%
20    Nova Pro V1                   Amazon    08/2025  73.3   71.4%
21    Mistral Medium 3              Mistral   08/2025  73.1   71.4%
22    OpenAI GPT-5 mini             OpenAI    09/2025  72.9   75.0%
23    OpenAI 5.1 Codex              OpenAI    11/2025  72.7   71.4%
24    Claude 4 Sonnet               Claude    08/2025  72.7   71.4%
25    Kimi K2                       Moonshot  12/2025  72.5   71.4%
26    Mimo V2 Flash Free            Other     12/2025  72.5   71.4%
27    OpenAI GPT-4.1 nano           OpenAI    08/2025  71.9   71.4%
28    Sonoma Sky Alpha              Other     09/2025  71.3   71.4%
29    Nova Lite V1                  Amazon    08/2025  70.9   67.9%
30    OpenAI GPT-4o mini            OpenAI    08/2025  70.7   71.4%
31    OpenAI 5.1 Codex Mini         OpenAI    11/2025  70.1   67.9%
32    OpenAI GPT-5 mini             OpenAI    08/2025  69.3   71.4%
33    Coder Large                   Other     08/2025  66.9   64.3%
34    Grok Code Fast 1              xAI       09/2025  66.5   64.3%
35    Nova Micro V1                 Amazon    08/2025  66.3   64.3%
36    Gemini 3 Flash                Google    12/2025  64.2   60.7%
37    Gemini 3 Pro Preview          Google    11/2025  64.0   60.7%
38    Minimax M2 1                  Other     12/2025  63.6   60.7%
39    OpenAI GPT-5.2                OpenAI    12/2025  63.4   60.7%
40    Gemini 2.5 Flash Lite         Google    08/2025  62.2   60.7%
41    Openai Oss 20b                OpenAI    08/2025  62.2   60.7%
42    Grok 3                        xAI       08/2025  60.8   57.1%
43    Claude 4.5 Opus               Claude    11/2025  59.8   57.1%
44    OpenAI 5 Codex                OpenAI    10/2025  59.2   57.1%
45    DeepSeek V3                   DeepSeek  08/2025  57.8   53.6%
46    OpenAI 5.2 Codex              OpenAI    01/2026  57.4   53.6%
47    DeepSeek V3                   DeepSeek  12/2025  57.4   53.6%
48    Mistral Large 2512            Mistral   12/2025  57.2   53.6%
49    Llama 4 Scout                 Meta      08/2025  56.8   53.6%
50    Claude 4.1 Opus               Claude    08/2025  56.6   53.6%
51    Gemini 2.5 Flash              Google    08/2025  56.6   53.6%
52    Qwen 3 Coder                  Alibaba   10/2025  56.6   53.6%
53    Gemini 2.0 Flash-001          Google    08/2025  56.2   53.6%
54    OpenAI GPT-4                  OpenAI    08/2025  55.8   53.6%
55    Qwen3 14b                     Alibaba   08/2025  55.8   53.6%
56    Gemini 2.5 Pro                Google    08/2025  53.8   50.0%
57    Kimi K2                       Moonshot  08/2025  53.8   50.0%
58    Qwen 3 Coder                  Alibaba   08/2025  53.2   50.0%
59    Claude 4 Opus                 Claude    08/2025  53.0   50.0%
60    DeepSeek V3                   DeepSeek  10/2025  52.6   50.0%
61    Claude 3.7 Sonnet             Claude    08/2025  51.2   46.4%
62    Claude 3.7 Sonnet (Thinking)  Claude    08/2025  51.2   46.4%
63    Claude 3.5 Sonnet             Claude    08/2025  51.0   46.4%
64    Qwen3 Max                     Alibaba   10/2025  50.8   46.4%
65    OpenAI GPT-4 Turbo            OpenAI    08/2025  50.6   46.4%
66    Glm 4 5                       Other     08/2025  50.6   46.4%
67    Kimi K2                       Moonshot  10/2025  50.2   46.4%
68    Claude 3.5 Haiku              Claude    08/2025  50.0   46.4%
69    Glm 4 6                       Other     10/2025  49.6   46.4%
70    Claude 4.5 Haiku              Claude    10/2025  49.2   46.4%
71    Grok 4                        xAI       10/2025  49.0   46.4%
72    OpenAI GPT-4.1 mini           OpenAI    08/2025  48.2   46.4%
73    Llama 4 Maverick              Meta      08/2025  47.8   42.9%
74    Codestral 25.08               Mistral   08/2025  47.2   42.9%
75    OpenAI GPT-5 nano             OpenAI    08/2025  46.0   42.9%
76    OpenAI GPT-4o                 OpenAI    08/2025  44.6   39.3%
77    OpenAI GPT-5 Chat             OpenAI    08/2025  44.4   39.3%
78    Devstral 2512                 Other     12/2025  44.0   39.3%
79    OpenAI GPT-3.5 Turbo          OpenAI    08/2025  41.5   35.7%
80    OpenAI GPT-5                  OpenAI    08/2025  41.4   39.3%
81    OpenAI GPT-5.1                OpenAI    11/2025  31.5   32.1%
82    Command A                     Cohere    08/2025  13.0   3.6%

Top Performers

School Library Champions
#1 Claude 3 Haiku (Claude)
Score: 79.9
Success Rate: 78.6%
Tests Passed: 22 of 28 total tests
Quality: 92
Issues: 4

#2 Grok 4 (xAI)
Score: 79.5
Success Rate: 78.6%
Tests Passed: 22 of 28 total tests
Quality: 88
Issues: 6

#3 Horizon Beta (Other)
Score: 78.1
Success Rate: 78.6%
Tests Passed: 22 of 28 total tests
Quality: 74
Issues: 13
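
The figures on the three champion cards can be cross-checked with a little arithmetic. The sketch below assumes the success rate is simply tests passed divided by total tests, which matches 22 of 28 giving 78.6%. It also tries a 90/10 blend of success rate and quality as the overall score. That weighting is not documented anywhere on this page; it is an inference that happens to reproduce the three listed scores, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Card:
    """One champion card, with the values as listed on the page."""
    model: str
    tests_passed: int
    total_tests: int
    quality: int
    listed_score: float

    def success_rate(self) -> float:
        # Assumption: success rate = tests passed / total tests, as a percentage.
        return 100.0 * self.tests_passed / self.total_tests

    def blended_score(self, w_success: float = 0.9, w_quality: float = 0.1) -> float:
        # Assumption: the 90/10 weighting is inferred from the three cards,
        # not taken from any published scoring formula.
        return w_success * self.success_rate() + w_quality * self.quality


CARDS = [
    Card("Claude 3 Haiku", 22, 28, 92, 79.9),
    Card("Grok 4", 22, 28, 88, 79.5),
    Card("Horizon Beta", 22, 28, 74, 78.1),
]

if __name__ == "__main__":
    for card in CARDS:
        print(
            f"{card.model}: success {card.success_rate():.1f}%, "
            f"blended {card.blended_score():.1f}, listed {card.listed_score}"
        )
```

If the blend holds, it would explain why three models with an identical 78.6% success rate are separated only by their quality values. Three data points are too few to confirm the formula, so treat it as a consistency check rather than the benchmark's scoring rule.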
