Benchmark Detail View

School Library Management

MEDIUM Challenge · 43 models tested · Top Score: 79.9
Success Rate: 59.2%
Quality Score: 84
Tests Passed: 17
Models Tested: 43
School Library Benchmark - Individual Model Results
Showing 43 of 43 models
| Rank | Model | Provider | Tested | Score | Success Rate |
|------|-------|----------|--------|-------|--------------|
| 1 | Claude 3 Haiku | Claude | 08/2025 | 79.9 | 78.6% |
| 2 | Grok 4 | xAI | 08/2025 | 79.5 | 78.6% |
| 3 | Horizon Beta | Other | 08/2025 | 78.1 | 78.6% |
| 4 | OpenAI o4-mini | OpenAI | 08/2025 | 76.1 | 75.0% |
| 5 | R1 | DeepSeek | 08/2025 | 76.1 | 75.0% |
| 6 | OpenAI o1-mini | OpenAI | 08/2025 | 75.9 | 75.0% |
| 7 | Grok 3 Mini | xAI | 08/2025 | 75.7 | 75.0% |
| 8 | OpenAI o3-mini (High) | OpenAI | 08/2025 | 75.5 | 75.0% |
| 9 | OpenAI o3-mini | OpenAI | 08/2025 | 75.5 | 75.0% |
| 10 | OpenAI GPT-4o | OpenAI | 08/2025 | 75.3 | 75.0% |
| 11 | OpenAI o4-mini (High) | OpenAI | 08/2025 | 74.5 | 75.0% |
| 12 | OpenAI GPT-4.1 | OpenAI | 08/2025 | 74.3 | 75.0% |
| 13 | Nova Pro V1 | Amazon | 08/2025 | 73.3 | 71.4% |
| 14 | Mistral Medium 3 | Mistral | 08/2025 | 73.1 | 71.4% |
| 15 | Claude 4 Sonnet | Claude | 08/2025 | 72.7 | 71.4% |
| 16 | OpenAI GPT-4.1 nano | OpenAI | 08/2025 | 71.9 | 71.4% |
| 17 | Nova Lite V1 | Amazon | 08/2025 | 70.9 | 67.9% |
| 18 | OpenAI GPT-4o mini | OpenAI | 08/2025 | 70.7 | 71.4% |
| 19 | Coder Large | Other | 08/2025 | 66.9 | 64.3% |
| 20 | Nova Micro V1 | Amazon | 08/2025 | 66.3 | 64.3% |
| 21 | Gemini 2.5 Flash Lite | Google | 08/2025 | 62.2 | 60.7% |
| 22 | Grok 3 | xAI | 08/2025 | 60.8 | 57.1% |
| 23 | DeepSeek V3 | DeepSeek | 08/2025 | 57.8 | 53.6% |
| 24 | Llama 4 Scout | Meta | 08/2025 | 56.8 | 53.6% |
| 25 | Gemini 2.5 Flash | Google | 08/2025 | 56.6 | 53.6% |
| 26 | Gemini 2.0 Flash-001 | Google | 08/2025 | 56.2 | 53.6% |
| 27 | Qwen3 14b | Alibaba | 08/2025 | 55.8 | 53.6% |
| 28 | OpenAI GPT-4 | OpenAI | 08/2025 | 55.8 | 53.6% |
| 29 | Kimi K2 | Moonshot | 08/2025 | 53.8 | 50.0% |
| 30 | Gemini 2.5 Pro | Google | 08/2025 | 53.8 | 50.0% |
| 31 | Qwen 3 Coder | Alibaba | 08/2025 | 53.2 | 50.0% |
| 32 | Claude 4 Opus | Claude | 08/2025 | 53.0 | 50.0% |
| 33 | Claude 3.7 Sonnet (Thinking) | Claude | 08/2025 | 51.2 | 46.4% |
| 34 | Claude 3.7 Sonnet | Claude | 08/2025 | 51.2 | 46.4% |
| 35 | Claude 3.5 Sonnet | Claude | 08/2025 | 51.0 | 46.4% |
| 36 | OpenAI GPT-4 Turbo | OpenAI | 08/2025 | 50.6 | 46.4% |
| 37 | Claude 3.5 Haiku | Claude | 08/2025 | 50.0 | 46.4% |
| 38 | OpenAI GPT-4.1 mini | OpenAI | 08/2025 | 48.2 | 46.4% |
| 39 | Llama 4 Maverick | Meta | 08/2025 | 47.8 | 42.9% |
| 40 | Codestral 25.08 | Mistral | 08/2025 | 47.2 | 42.9% |
| 41 | OpenAI GPT-4o | OpenAI | 08/2025 | 44.6 | 39.3% |
| 42 | OpenAI GPT-3.5 Turbo | OpenAI | 08/2025 | 41.5 | 35.7% |
| 43 | Command A | Cohere | 08/2025 | 13.0 | 3.6% |

Top Performers

School Library Champions
| Rank | Model | Provider | Score | Success Rate | Tests Passed | Quality | Issues |
|------|-------|----------|-------|--------------|--------------|---------|--------|
| #1 | Claude 3 Haiku | Claude | 79.9 | 78.6% | 22 of 28 | 92 | 4 |
| #2 | Grok 4 | xAI | 79.5 | 78.6% | 22 of 28 | 88 | 6 |
| #3 | Horizon Beta | Other | 78.1 | 78.6% | 22 of 28 | 74 | 13 |
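
The champion cards carry enough detail to see how the headline score lines up with the per-model metrics. The sketch below is a minimal reconstruction, assuming the composite is a 90/10 weighted blend of success rate (tests passed out of 28) and quality score; both the weighting and the `composite_score` helper are inferred from the three cards above, not taken from any published formula for this benchmark.

```python
# Hypothetical reconstruction of the leaderboard's composite score.
# The 90/10 weighting is an assumption inferred from the three champion
# cards above; the benchmark's actual scoring formula is not shown here.

def composite_score(tests_passed: int, total_tests: int, quality: float) -> float:
    """Blend pass rate and quality into a single 0-100 score (assumed 90/10 split)."""
    success_rate = 100.0 * tests_passed / total_tests
    return 0.9 * success_rate + 0.1 * quality

# (model, tests passed, total tests, quality, reported score) from the champion cards
champions = [
    ("Claude 3 Haiku", 22, 28, 92, 79.9),
    ("Grok 4",         22, 28, 88, 79.5),
    ("Horizon Beta",   22, 28, 74, 78.1),
]

for name, passed, total, quality, reported in champions:
    estimate = composite_score(passed, total, quality)
    print(f"{name}: estimated {estimate:.1f} vs reported {reported}")
    # e.g. "Claude 3 Haiku: estimated 79.9 vs reported 79.9"
```

Under this assumed weighting, the estimates match the three reported scores to one decimal place, which also explains why Horizon Beta ranks just below the leaders despite an identical 78.6% success rate: its lower quality score (74) drags the blended total down.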

Explore More Benchmarks

See how models perform across different programming challenges and complexity levels.