On geometry questions from the International Mathematical Olympiad, AlphaGeometry performs nearly as well as elite student competitors.
On some IMO geometry questions, an AI from Google DeepMind can almost match the results of the top human contestants.
The inherent difficulty of the problems is the primary culprit, although a lack of training data also contributes. The competition has been held every year since 1959, with just six questions per edition, whereas sophisticated AI systems may need training datasets of millions or even billions of examples. Of those six questions, one or two are typically geometry problems, which are notoriously hard to translate into a computer-friendly format because they require proving facts about angles or lines in complex diagrams.
Thang Luong and his colleagues at Google DeepMind have sidestepped this problem by generating hundreds of millions of machine-readable geometric proofs. Trained on these, their AI, AlphaGeometry, correctly solved 25 of 30 IMO geometry questions, just below the 25.9 such questions an average IMO gold medallist would be expected to solve.
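The data-generation idea can be caricatured in a few lines. The sketch below is a toy illustration, not DeepMind's code: it forward-chains two made-up rules about parallel lines over randomly sampled premises and records each newly derived fact, together with its premises and the rule used, as a training example, mirroring how machine-readable proofs can be mass-produced without human input.

```python
import random

# Toy sketch of synthetic proof-data generation (illustrative only; the
# real system works over rich geometric constructions, not these two rules).
RULES = [
    # (name, preconditions, conclusion) over a toy "para(X, Y)" predicate
    ("transitivity", [("para", "A", "B"), ("para", "B", "C")], ("para", "A", "C")),
    ("symmetry",     [("para", "A", "B")],                     ("para", "B", "A")),
]

def deduce(facts):
    """Forward-chain RULES over a set of facts; return newly derivable facts."""
    derived = []
    lines = sorted({x for f in facts for x in f[1:]})
    for name, pres, (pred, *slots) in RULES:
        for a in lines:
            for b in lines:
                for c in lines:
                    env = {"A": a, "B": b, "C": c}
                    needed = [(p, env[x], env[y]) for p, x, y in pres]
                    concl = (pred, env[slots[0]], env[slots[1]])
                    if all(n in facts for n in needed) and concl not in facts:
                        derived.append((concl, name))
    return derived

def make_examples(n_premises=3, seed=0):
    """Sample random premises, then record every deduction as a training example."""
    rng = random.Random(seed)
    line_names = list("lmnp")
    facts = set()
    while len(facts) < n_premises:
        a, b = rng.sample(line_names, 2)
        facts.add(("para", a, b))
    examples = []
    frontier = deduce(facts)
    while frontier:  # terminates: the fact space is finite and only grows
        concl, rule = frontier[0]
        examples.append({"premises": sorted(facts), "conclusion": concl, "rule": rule})
        facts.add(concl)
        frontier = deduce(facts)
    return examples
```

Each dictionary in the output is one (premises, conclusion, proof-step) example of the kind a model could be trained on; scaling the same idea to geometry yields the machine-generated corpus described above.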
Luong said at a press conference that mathematics is an important domain for evaluating AI systems. Deep reasoning, which requires planning ahead and seeing the bigger picture, remains a challenge for current AI systems, he said.
As Luong describes it, AlphaGeometry resembles the human brain in having two halves: one faster and more intuitive, the other slower and more analytical. The first is a language model built on the same technology as ChatGPT. Trained on the hundreds of millions of generated proofs, it suggests new theorems and arguments to explore when tackling a problem. When the language model proposes a next step, a more deliberate and systematic "symbolic reasoning" engine applies mathematical and logical rules to work that step into the proof. The two systems alternate, handing the problem back and forth until it is solved.
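The interplay between the two halves can be sketched roughly as a loop. Everything below (function names, the rule format, the stubbed proposer) is illustrative rather than DeepMind's actual design: an exhaustive symbolic engine deduces all it can, and whenever it stalls short of the goal, a stand-in for the language model proposes one auxiliary fact before deduction resumes.

```python
def symbolic_close(facts, goal, rules):
    """Deliberate half: exhaustively forward-chain rules until the goal
    is derived or no new fact appears (a fixed point)."""
    facts = set(facts)
    changed = True
    while changed and goal not in facts:
        changed = False
        for pre, concl in rules:
            if pre <= facts and concl not in facts:
                facts.add(concl)
                changed = True
    return facts

def prove(facts, goal, rules, propose, max_suggestions=5):
    """Alternate symbolic deduction with an 'intuitive' proposer, as the
    article describes; `propose` stands in for the language model."""
    steps = []
    for _ in range(max_suggestions + 1):
        facts = symbolic_close(facts, goal, rules)
        if goal in facts:
            return True, steps
        suggestion = propose(facts, goal)
        if suggestion is None or suggestion in facts:
            return False, steps  # the proposer has nothing new to offer
        steps.append(suggestion)
        facts.add(suggestion)
    return False, steps
```

For example, with rules `[({"a", "aux"}, "b"), ({"b"}, "goal")]` and starting fact `{"a"}`, deduction alone stalls; a proposer that suggests the auxiliary fact `"aux"` unblocks the engine, which then derives `"b"` and the goal. This mirrors how, in geometry, suggesting an auxiliary construction often lets mechanical deduction finish the proof.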
The system performs admirably on IMO geometry problems, but according to Luong, the proofs it produces are often more verbose and less "beautiful" than those written by people. It can, however, spot details that people overlook: for one IMO question from 2004, it found a better, more general solution than the one officially listed.
Despite the system's impressive performance on IMO geometry problems, Yang-Hui He of the London Institute for Mathematical Sciences notes that it is fundamentally constrained in the mathematics it can use, because IMO problems are designed to have solutions simple enough to be taught in an undergraduate course. With access to more mathematical data, he says, AlphaGeometry could improve on its approach or perhaps even do something entirely new in mathematics.
It would also be fascinating to see how AlphaGeometry handles situations where it does not know what proof is needed, he says: exploring ideas without a clear goal is a common way mathematicians arrive at new insights. "If you are unsure of your destination, is it possible to identify a fascinating and novel theorem within the set of all possible mathematical paths?"
Last year, the algorithmic trading firm XTX Markets established a $10 million prize fund for AI mathematical models. Its $5 million main prize will go to a publicly disclosed AI model that wins an IMO gold medal, with smaller awards for major milestones along the way.
"Solving an IMO geometry problem is one of several planned progress prizes supported by the $10 million AIMO challenge fund," says Alex Gerko of XTX Markets. "It's thrilling to witness advancements in this direction, even before we reveal the specifics of this progress prize, which would involve disclosing the model and data and solving an actual geometry problem during a live IMO contest."
It is unclear at this time whether DeepMind intends to enter AlphaGeometry in a live IMO competition, or to extend the system to tackle IMO problems that are not geometric. DeepMind's AlphaFold system, however, has already competed in public protein-folding prediction contests.