Today, we at @OpenAI achieved a milestone that many considered years away: gold medal-level performance on the 2025 IMO with a general reasoning LLM.
This isn’t an IMO-specific model. It’s a reasoning LLM that incorporates new experimental general-purpose techniques.

So what’s different? We developed new techniques that make LLMs a lot better at hard-to-verify tasks.
When you work at a frontier lab, you usually know where frontier capabilities are months before anyone else.
But this result is brand new, using recently developed techniques.
It was a surprise even to many researchers at OpenAI.
Looks like it's time for the chokudai benchmark.