AI Image Quest
Overall Leaderboard
| Model | Correct | Incorrect | Not Found | Refusal |
|---|---|---|---|---|
| Google: google/gemini-2.5-flash | 88.9% | 0% | 11.1% | 0% |
| Google: google/gemini-2.5-pro | 88.9% | 0% | 11.1% | 0% |
| OpenAI: openai/gpt-5 | 88.9% | 0% | 11.1% | 0% |
| Qwen: qwen/qwen3-vl-235b-a22b-instruct | 88.9% | 0% | 11.1% | 0% |
| Qwen: qwen/qwen3-vl-8b-instruct | 88.9% | 0% | 11.1% | 0% |
| Anthropic: anthropic/claude-opus-4.1 | 77.8% | 0% | 22.2% | 0% |
| OpenAI: openai/gpt-5-mini | 77.8% | 0% | 22.2% | 0% |
| Anthropic: anthropic/claude-sonnet-4.5 | 66.7% | 11.1% | 22.2% | 0% |
| Google: google/gemma-3-27b-it | 66.7% | 33.3% | 0% | 0% |
| xAI: x-ai/grok-4 | 55.6% | 44.4% | 0% | 0% |
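
The four outcome columns in each row sum to 100%, and every value shown is a ninth rounded to one decimal place (for example 8/9 ≈ 88.9% and 1/9 ≈ 11.1%). That is consistent with nine graded answers per model, or a multiple of nine, though the page does not state the count. A minimal sketch of how such a row could be tallied follows. The function name `outcome_percentages`, the label strings, and the nine-answer example are assumptions for illustration, not the site's actual scoring code.

```python
from collections import Counter

# Outcome labels taken from the leaderboard's column headers.
OUTCOMES = ("Correct", "Incorrect", "Not Found", "Refusal")


def outcome_percentages(labels):
    """Return each outcome's share of the graded answers, in percent (one decimal)."""
    if not labels:
        raise ValueError("need at least one graded answer")
    counts = Counter(labels)
    unknown = set(counts) - set(OUTCOMES)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    total = len(labels)
    # Counter returns 0 for outcomes that never occurred.
    return {name: round(100 * counts[name] / total, 1) for name in OUTCOMES}


# Hypothetical run: nine graded answers, eight correct and one "Not Found".
example = ["Correct"] * 8 + ["Not Found"]
print(outcome_percentages(example))
```

Under these assumptions the example reproduces the 88.9% / 0% / 11.1% / 0% pattern shared by the top five rows.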
Data & Infographics
Objects & Products
People & Animals
Scenes & Environments
No results found for this category yet.
Symbolic & Social
No results found for this category yet.
Text & Documents
| Model | % Correct |
|---|---|
| Anthropic: anthropic/claude-opus-4.1 | 100% |
| Anthropic: anthropic/claude-sonnet-4.5 | 100% |
| Google: google/gemini-2.5-flash | 100% |
| Google: google/gemini-2.5-pro | 100% |
| OpenAI: openai/gpt-5 | 100% |
| OpenAI: openai/gpt-5-mini | 100% |
| Qwen: qwen/qwen3-vl-235b-a22b-instruct | 100% |
| Qwen: qwen/qwen3-vl-8b-instruct | 100% |
| xAI: x-ai/grok-4 | 100% |
| Google: google/gemma-3-27b-it | 50% |