Ruby LLM benchmarks - AI Model Performance Dashboard

Ruby LLM Benchmarks

Models Tested

187

Providers

New This Month

models added

Latest Added

Claude Fable 5

2026-07

187 models

1–50 of 187

#	Model	Config	Score ↓	Success	Quality	Date
1	GPT-5.5 · OpenAI	low	80.3	79.7%	86	2026-04
2	↳ GPT-5.5	xhigh	77.6	78.4%	70	2026-04
3	↳ GPT-5.5	high	77.4	77.3%	78	2026-04
4	Claude Fable 5 · Anthropic	low	77.4	78.0%	72	2026-06
5	↳ Claude Fable 5	low	77.2	78.0%	70	2026-07
6	Claude Opus 4.7 · Anthropic	medium	77.1	78.0%	69	2026-05
7	Claude Fable 5 · Anthropic	high	77.0	78.0%	68	2026-07
8	↳ Claude Fable 5	medium	76.9	78.0%	67	2026-07
9	Claude Opus 4.6 · Anthropic	high · 8,192 tokens	76.4	77.2%	70	2026-04
10	Claude Opus 4.7 · Anthropic	low	76.4	77.2%	70	2026-05
11	Claude Sonnet 4.6 · Anthropic	high · 8,192 tokens	76.3	77.2%	69	2026-04
12	Claude Opus 4.6 Fast · Anthropic	—	76.2	77.2%	68	2026-04
13	Claude Opus 4.6 · Anthropic	medium · 2,048 tokens	76.2	77.2%	68	2026-04
14	↳ Claude Opus 4.6	—	76.1	77.2%	67	2026-02
15	GPT-5.5 · OpenAI	—	76.0	75.2%	83	2026-04
16	Claude Sonnet 4.6 · Anthropic	medium · 2,048 tokens	76.0	77.0%	67	2026-04
17	GLM 5 · Z.AI	—	75.6	77.2%	61	2026-02
18	Claude Opus 4.8 (Fast) · Anthropic	medium	75.2	75.6%	71	2026-05
19	↳ Claude Opus 4.8 (Fast)	high	74.9	75.6%	69	2026-05
20	GPT-5.5 · OpenAI	high	74.9	74.8%	76	2026-05
21	GPT 5.4 · OpenAI	high	74.8	73.5%	87	2026-04
22	Claude Opus 4.8 (Fast) · Anthropic	low	74.7	75.6%	66	2026-05
23	GLM 5.2 · Z Ai	medium	74.2	74.8%	69	2026-06
24	Kimi K2.6 · Moonshotai	low	74.2	73.6%	80	2026-04
25	GPT-5.5 · OpenAI	medium	73.9	72.7%	85	2026-04
26	Claude Opus 4.8 · Anthropic	high	72.6	72.7%	73	2026-05
27	Claude Fable 5 · Anthropic	high	72.6	72.7%	72	2026-06
28	Claude Sonnet 4 · Anthropic	—	72.6	73.7%	63	2025-08
29	DeepSeek V4 Pro · DeepSeek	gen	72.5	72.9%	69	2026-05
30	Claude Fable 5 · Anthropic	medium	72.4	72.7%	70	2026-06
31	Claude Sonnet 4.5 · Anthropic	—	72.3	73.2%	64	2025-10
32	Claude Opus 4.7 · Anthropic	medium · 2,048 tokens	72.2	72.7%	68	2026-04
33	Claude Sonnet 4.6 · Anthropic	—	72.2	72.7%	68	2026-02
34	GPT 5.2 · OpenAI	—	72.1	71.6%	77	2025-12
35	Claude Opus 4.8 · Anthropic	low	71.6	71.8%	71	2026-05
36	Claude Opus 4.5 · Anthropic	—	71.6	72.7%	62	2025-11
37	GPT-5.5 · OpenAI	low	71.5	69.9%	87	2026-05
38	↳ GPT-5.5	none	71.5	69.9%	87	2026-04
39	Claude Opus 4.1 · Anthropic	—	71.3	73.2%	55	2025-08
40	Claude Opus 4.8 · Anthropic	medium	71.2	71.1%	72	2026-05
41	Gemma 4 26B A4B · Google	none	71.0	72.2%	61	2026-04
42	Gemma 4 31B · Google	medium	70.9	70.5%	75	2026-04
43	Claude Fable 5 · Anthropic	gen	70.9	70.9%	71	2026-07
44	Claude Opus 4 · Anthropic	—	70.8	72.3%	58	2025-08
45	GPT 4.1 · OpenAI	—	70.8	73.1%	50	2025-08
46	Claude Opus 4.7 · Anthropic	high	70.5	70.7%	68	2026-05
47	Claude Opus 4.8 · Anthropic	gen	70.5	70.9%	67	2026-05
48	Claude Sonnet 5 · Anthropic	gen	70.3	70.0%	73	2026-07
49	↳ Claude Sonnet 5	xhigh	70.2	69.8%	74	2026-07
50	GPT 5.3 Chat · OpenAI	—	70.2	70.9%	64	2026-03
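
The dashboard does not say how the Score column is derived. The rows shown are consistent with a blend of roughly 90% Success and 10% Quality, but that weighting is an assumption inferred from the displayed numbers, not a documented formula. The sketch below uses a hypothetical helper, estimated_score, to check a few rows. Because Success and Quality are displayed already rounded, an estimate can differ from the listed Score by 0.1 on some rows.

```python
# Assumed weights, inferred from the displayed rows.
# The dashboard does not state them, so treat both as hypothetical.
SUCCESS_WEIGHT = 0.9
QUALITY_WEIGHT = 0.1


def estimated_score(success_pct: float, quality: float) -> float:
    """Blend Success (in percent) and Quality into one score, rounded to 0.1."""
    return round(SUCCESS_WEIGHT * success_pct + QUALITY_WEIGHT * quality, 1)


if __name__ == "__main__":
    # (label, Success %, Quality, Score as listed in the table)
    sample_rows = [
        ("GPT-5.5 low", 79.7, 86, 80.3),
        ("GPT-5.5 xhigh", 78.4, 70, 77.6),
        ("GPT 4.1", 73.1, 50, 70.8),
    ]
    for label, success, quality, listed in sample_rows:
        estimate = estimated_score(success, quality)
        print(f"{label}: estimated {estimate}, listed {listed}")
```

If the assumed weighting holds, it would explain why a model with a high Quality value but a lower Success rate (for example GPT 5.4 at 73.5% and 87) ranks below models with lower Quality and higher Success.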