AI & Formal Proofs Reshape Physics, Expose Flaws in Landmark Papers

The AI-Powered Physics Revolution

Theoretical physics is undergoing a seismic shift, driven by the dual forces of large language models and formal proof assistants. At a major physics conference, Harvard's Matthew Schwartz declared that AI puts the field "on the chopping block," estimating that problems like unifying quantum theory with general relativity could be solved within five years thanks to AI collaboration.

Schwartz's provocative stance is backed by personal experience. He co-authored a quantum field theory study in just two weeks using Anthropic's Claude chatbot, a task he estimates would have taken two years with a doctoral student. He now refuses to mentor students unwilling to collaborate with AI tools, envisioning a future of "10,000 Einsteins."

The Rise of the Proof Assistant

Simultaneously, a more rigorous revolution is unfolding. A computer language designed to verify mathematical theorems has, for the first time, uncovered a fundamental error in a widely-cited physics paper. Joseph Tooby-Smith, the researcher who found the error, confirmed that the paper's authors agree with the finding and that an erratum will be published.
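
To give a sense of what such a language checks, here is a toy sketch in Lean 4, chosen here only as an example of a proof assistant. The theorem is a hypothetical illustration and is not taken from the paper in question or from any physics library. It shows how a proof assistant forces every assumption into the open: the statement holds only when the hypothesis `h` is supplied, because subtraction on natural numbers stops at zero.

```lean
-- Toy illustration (Lean 4, core library only); not the flagged physics result.
-- Without the hypothesis `h : b ≤ a` the claim is false for natural numbers
-- (for example a = 0, b = 1 gives 0 - 1 + 1 = 1), so no proof would be accepted.
theorem sub_then_add_toy (a b : Nat) (h : b ≤ a) : a - b + b = a :=
  Nat.sub_add_cancel h
```

A pen-and-paper argument can use such a step without ever stating the condition; a formal proof cannot, which is one way a hidden gap in a published derivation could come to light.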

The discovery raises a worrying question: how many other physics papers contain similar, undetected logical flaws? Kevin Buzzard, a mathematician championing formalization, points to an obstacle: physics lacks the vast corpus of formalized theorems that mathematics has built up, and such a corpus is crucial for training effective AI models.

The Human Role: From Calculation to Curation

As AI handles complex calculations and proof assistants verify logic, the human physicist's role is evolving. Schwartz conjectures that humans will become "taste-makers," determining which problems are most interesting and meaningful. This shift mirrors trends in software engineering, where the bottleneck has moved from writing code to defining precise specifications.

Agoda's experience with AI coding assistants reveals a parallel. The highest-value work is no longer implementation but "collaborative specification and architectural alignment." Engineers remain accountable for defining intent and verifying results, not inspecting generated code line by line. This "grey box" approach keeps humans in the loop where it matters most.
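
A minimal Python sketch of what this could look like in practice, assuming the intent can be written down as executable checks. The names `generated_sort` and `meets_spec` are hypothetical and do not come from Agoda. The engineer states what a correct result must satisfy and tests the generated code against it, without reading the implementation itself.

```python
import random
from collections import Counter
from typing import Callable, List


def generated_sort(xs: List[int]) -> List[int]:
    # Hypothetical stand-in for AI-generated code; its body is not inspected.
    return sorted(xs)


def meets_spec(impl: Callable[[List[int]], List[int]],
               trials: int = 200, seed: int = 0) -> bool:
    """Check the stated intent on random inputs: the output must be in
    non-decreasing order and contain exactly the same items as the input."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        out = impl(list(xs))  # pass a copy so the input stays intact
        ordered = all(a <= b for a, b in zip(out, out[1:]))
        same_items = Counter(out) == Counter(xs)
        if not (ordered and same_items):
            return False
    return True


if __name__ == "__main__":
    print(meets_spec(generated_sort))
```

Random checks like these can only find counterexamples, not prove correctness, which is where the proof assistants described earlier go further.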

Technical Foundations and Future Challenges

The AI tools enabling this change are advancing rapidly. Anthropic has unveiled new models like Claude Mythos and Capybara, though sources conflict on their parameter counts. The core capability, however, is clear: these models can engage with advanced physics problems at the level of an early-stage PhD student.

For formal proof systems to truly transform physics, Buzzard estimates a need for "a million lines" of formally verified physics. The initial manual work to create this corpus is substantial, but it's the necessary foundation for machines to eventually "take over" the verification process at scale.

Implications for Scientific Methodology

This convergence of AI and formal methods is more than a productivity boost; it's a methodological overhaul. The ability to rapidly generate insights with AI and then rigorously verify them with proof assistants creates a powerful new scientific workflow. It also exposes the fragility of traditional peer review when faced with complex, error-prone human reasoning.

Schwartz captures the ambivalence of this moment: "It’s amazing and also a little scary." The fear is that the transition will be disruptive; the promise is a new era of faster, more reliable discovery. The definition of a physicist is changing, from a lone calculator to a conductor of silicon-based intelligences.