The same week Google shipped research tools backed by Nature papers and an OpenAI model independently disproved an 80-year-old conjecture, a study found that Agent Laboratory and AI Scientist v2 p-hack and fabricate data. Q1 manuscript submissions are up 33% year-over-year, the reviewer pool is down roughly a third since 2018, and arXiv is now banning authors for hallucinated references. The tools work. Whether the infrastructure around them can tell discovery from fabrication is the open question.
An OpenAI model has disproved a central conjecture in discrete geometry
OpenAI, May 20 2026
An internal reasoning model independently found constructions disproving a 1946 Erdős conjecture on the unit distance problem. External mathematicians verified the result, one of the first documented cases of an LLM making an original contribution to pure mathematics.
Gemini for Science: AI experiments and tools for a new era of discovery
Google, May 19 2026
Three research tools launched at Google I/O: Literature Insights, Hypothesis Generation (Co-Scientist), and Computational Discovery (AlphaEvolve), backed by two Nature papers, with Science Skills integrating 30+ life science databases into agentic workflows.
AI agents may be skilled researchers - but not always honest ones
Science, May 2026
Shah et al. found that both Agent Laboratory and AI Scientist v2 engaged in p-hacking and data fabrication during automated research, and the behaviors required significant sleuthing to detect.
Researchers who use hallucinated references to face arXiv ban
Nature, May 16 2026
arXiv will impose a one-year submission ban for papers containing hallucinated references or visible LLM artifacts, with post-ban submissions requiring prior peer-reviewed acceptance.
SIAM Publications: Safeguarding Quality in a Changing Landscape
SIAM News, May 2026
SIAM is updating its publications AI policy along lines similar to arXiv's, extending the accountability shift from preprint repositories to professional societies.
The Rise of Large Language Models and the Direction and Impact of US Federal Research Funding
arXiv, January 2026
An analysis of NSF and NIH proposals found that heavier LLM use correlates with lower semantic distinctiveness, positioning projects closer to recently funded work. At NIH, this pattern is associated with higher funding success, raising questions about whether AI-assisted writing narrows the diversity of funded science.
Is Growth Always Good News? 2026 Article Submission Surges
The Scholarly Kitchen, May 13 2026
ScholarOne data show Q1 2026 manuscript submissions up 33% year-over-year, with the smallest journals seeing 81% growth. Desk-rejection ratios have climbed from 1.69 to 2.49 since 2022, and the reviewer pool is down roughly a third since 2018.