Q2 2025
GAIA Benchmark
Current performance comparison on Level 3, the most difficult tier of the GAIA benchmark for autonomous AI agents
Model Performance
OpenAI Deep Research: 47.6% success rate (pass@1)
Source: GAIA Benchmark Q2 2025 Evaluation — Updated May 2025