AI Image Quest

Overall Leaderboard

Model                                   Correct  Incorrect  Not Found  Refusal
Google: google/gemini-2.5-flash         88.9%    0%         11.1%      0%
Google: google/gemini-2.5-pro           88.9%    0%         11.1%      0%
OpenAI: openai/gpt-5                    88.9%    0%         11.1%      0%
Qwen: qwen/qwen3-vl-235b-a22b-instruct  88.9%    0%         11.1%      0%
Qwen: qwen/qwen3-vl-8b-instruct         88.9%    0%         11.1%      0%
Anthropic: anthropic/claude-opus-4.1    77.8%    0%         22.2%      0%
OpenAI: openai/gpt-5-mini               77.8%    0%         22.2%      0%
Anthropic: anthropic/claude-sonnet-4.5  66.7%    11.1%      22.2%      0%
Google: google/gemma-3-27b-it           66.7%    33.3%      0%         0%
xAI: x-ai/grok-4                        55.6%    44.4%      0%         0%

Data & Infographics

Model                                   % Correct
Anthropic: anthropic/claude-opus-4.1    100%
Google: google/gemini-2.5-flash         100%
Google: google/gemini-2.5-pro           100%
OpenAI: openai/gpt-5                    100%
OpenAI: openai/gpt-5-mini               100%
Qwen: qwen/qwen3-vl-235b-a22b-instruct  100%
Qwen: qwen/qwen3-vl-8b-instruct         100%
Anthropic: anthropic/claude-sonnet-4.5  0%
Google: google/gemma-3-27b-it           0%
xAI: x-ai/grok-4                        0%

Objects & Products

Model                                   % Correct
OpenAI: openai/gpt-5                    100%
Google: google/gemini-2.5-flash         80%
Google: google/gemini-2.5-pro           80%
Google: google/gemma-3-27b-it           80%
OpenAI: openai/gpt-5-mini               80%
Qwen: qwen/qwen3-vl-235b-a22b-instruct  80%
Qwen: qwen/qwen3-vl-8b-instruct         80%
Anthropic: anthropic/claude-opus-4.1    60%
Anthropic: anthropic/claude-sonnet-4.5  60%
xAI: x-ai/grok-4                        40%

People & Animals

Model                                   % Correct
Anthropic: anthropic/claude-opus-4.1    100%
Anthropic: anthropic/claude-sonnet-4.5  100%
Google: google/gemini-2.5-flash         100%
Google: google/gemini-2.5-pro           100%
Google: google/gemma-3-27b-it           100%
Qwen: qwen/qwen3-vl-235b-a22b-instruct  100%
Qwen: qwen/qwen3-vl-8b-instruct         100%
xAI: x-ai/grok-4                        100%
OpenAI: openai/gpt-5                    0%
OpenAI: openai/gpt-5-mini               0%

Scenes & Environments

No results found for this category yet.

Symbolic & Social

No results found for this category yet.

Text & Documents

Model                                   % Correct
Anthropic: anthropic/claude-opus-4.1    100%
Anthropic: anthropic/claude-sonnet-4.5  100%
Google: google/gemini-2.5-flash         100%
Google: google/gemini-2.5-pro           100%
OpenAI: openai/gpt-5                    100%
OpenAI: openai/gpt-5-mini               100%
Qwen: qwen/qwen3-vl-235b-a22b-instruct  100%
Qwen: qwen/qwen3-vl-8b-instruct         100%
xAI: x-ai/grok-4                        100%
Google: google/gemma-3-27b-it           50%