Nature Research Intelligence

I just noticed something new showing up in Google searches: summaries of the state of scientific research areas, such as this one about String Theory And Quantum Gravity. They’re produced by Nature Research Intelligence, which has been around for a couple of years, trading on the Nature journal brand: “We’ve been the most trusted name in research for over 150 years.” The business model is that you pay them for information about the state of scientific research, which you can then use to make funding decisions.

Each of their pages has a prominent button in the upper right-hand corner inviting you to “Talk to an expert”. The problem with all this, though, is that no experts are involved. The page summarizing String Theory and Quantum Gravity is just one of tens of thousands of such pages produced by some AI algorithm. If you click on the button, you’ll be put in touch with someone expert in getting people to pay for the output of AI algorithms, not someone who knows anything about string theory or quantum gravity.

It’s very hard to guess what the impact of AI on scientific research in areas like theoretical physics will be, but this sort of thing points to one very real possibility. Part of Nature’s previous business model was to sell high-quality summaries of scientific research, produced by the best scientific experts and by journalists who consulted with such experts. This kind of content is difficult and expensive to produce. AI-generated versions of it may not be very good, but they’re very cheap to produce, so you can make money as long as you can find anyone willing to pay something for them.

The relatively good quality of recent AI-generated content has been based on having high-quality content to train on, such as that produced by Nature over the last 150 years. If AI starts getting trained not on old-style Nature but on new-style Nature Research Intelligence, the danger is “model collapse” (for a Nature article about this, see here). Trained on their own output, large language models start producing worse and worse results.
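
For what it’s worth, the mechanism is easy to see in a toy example. Here is a minimal sketch (mine, and nothing like a real LLM training pipeline): repeatedly refit a simple one-dimensional “model” only to samples produced by the previous fit, and its spread drifts and eventually collapses, losing the diversity of the original data.

    # Toy illustration of "model collapse": refit a 1-D Gaussian, generation
    # after generation, to samples drawn only from the previous generation's fit.
    # A sketch of the general idea, not any real training pipeline.
    import numpy as np

    rng = np.random.default_rng(0)
    n_samples = 50          # "training set" size per generation
    mu, sigma = 0.0, 1.0    # generation 0: the real data distribution

    for gen in range(201):
        data = rng.normal(mu, sigma, n_samples)  # train only on the previous model's output
        mu, sigma = data.mean(), data.std()      # refit the "model" to that output
        if gen % 50 == 0:
            print(f"generation {gen:3d}: mean = {mu:+.3f}, std = {sigma:.3f}")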

I’m no expert (you should probably consult an AI about this instead), but it seems to me that one possibility is that, instead of superintelligence producing ever more impressive content, we may have already hit the peak and it’s all downhill from here. A thought that occurred to me recently: back in the 80s, when people were talking about string theory as science that had anomalously fallen out of the 21st century into the 20th, they may have been very right, without realizing what was going to happen to science in the 21st century…

14 Responses to Nature Research Intelligence

  1. Navneeth says:

    The technical term for what you’re describing is “AI slop”.

  2. Peter Woit says:

    Navneeth,
    Is it still “AI slop” if corporations or government funding agencies are paying for it and using it to make research funding decisions?

  3. Janne Sinkkonen says:

    It is obvious that naively training on AI-generated content leads to poor quality and eventual model collapse.

    I don’t know what the leading AI houses are doing with their training data, but the process is much more complicated than just taking “all of the internet” and pouring it into the model. It involves human curation, LLM curation, and, paradoxically, synthetic data.

    Recently, much of the public talk and research activity has been around so-called reasoning models, which use reinforcement learning to learn strategies for achieving goals, such as getting to a correct answer on a relatively formal problem, or agentically fulfilling complex goals. This structure goes beyond passive predictive generation and produces more grounded models, also capable of planning. Multimodality is another path to grounding: basically all proprietary frontier models are multimodal, at some point maybe robotically so.

    Right now, DeepSeek R1 is a good way to see a reasoning model close to the state of the art without paying. (https://chat.deepseek.com, click the “deep think” button. And don’t complain that it doesn’t get twistors right; think of it more as an everyday tool for problem solving. The forthcoming o3 from OpenAI is at yet another level, but expensive.)

    Overall, I do not have any insider information, but I have been following the AI field closely as part of my work. My impression is that model collapse was in the news about a year ago, but it is now nowhere in sight.

  4. Peter Woit says:

    Janne Sinkkonen,
    If you do a Google search for recent information on “string theory and quantum gravity”, the Nature AI summary shows up at #1 or #2 on the list. Maybe other algorithms can tell the difference between high-quality content and AI-generated crud, but right now Google’s best search algorithm can’t do it.

  5. WTW says:

    Peter,
    I recently came across this from Freya Holmér (who I believe is a real person, not an AI-generated clone bot):
    It’s Not Just Google: Generative AI is a Parasitic Cancer
    https://www.youtube.com/embed/-opBifFfsMY
    This is a very long video; perhaps someone could produce an AI tldr summary. ;=(
    After you get the idea of what she is experiencing, skip to 30m:29s to see how many seemingly AI-generated responses she gets to a simple query, on multiple search engines.

    I’ve noticed the same kind of not-quite-English drivel popping up in response to queries about technical topics: it never quite gives any actual technical information, rather than the detailed technical content that a human expert would/should write, and that would/should be carefully considered by any kind of executive agent making decisions, whether human or AI.

    I just assume this is AI-generated, not the work of (possibly Indian or Eastern European, non-native-English-speaking ghost) writers who might write English in similar ways. If so, this definitely IS crap that AI is now being trained on, since there is just no good way to automagically distinguish it from genuine information. If, as JS says above, self-induced AI “collapse” is not happening, that is even more concerning, as this kind of self-serving garbage is being churned out in order to make money from AI for nebulous corporations, with little to no checks on accuracy, legitimacy or objectivity. Now, apparently, that includes companies like Nature.

  6. Peter Woit says:

    WTW,
    There is a very real general problem of huge amounts of AI-generated junk passing itself off as human-generated. What I find remarkable about the Nature story is that this stuff is labeled as AI-generated, yet Google is still putting it at the top of its search results (for recent sources).

    If you want some idea of the quality of these Nature things, take a look at the top-level summaries of what is going on in Mathematical Sciences and in Pure Mathematics:
    https://www.nature.com/research-intelligence/mathematical-sciences
    https://www.nature.com/research-intelligence/pure-mathematics

  7. Peter says:

    Peter, have you already tried DeepSeek?

    One of my friends develops the AI strategy for a university. A few days ago, he showed me DeepSeek and wanted to know what I thought about it. I asked it a math question, nothing sophisticated, but not trivial either.

    I have to admit I was impressed. We typed in the formulas with some made-up notation – and the model interpreted them correctly.

    We could then follow how the model developed a proof strategy. After a minute or so, it arrived at two different strategies. One of them started from Vandermonde’s determinant. It was perfectly adequate, but I was surprised, because Vandermonde was a rather complicated trick for solving this problem (and I had to dig rather deep into my memory to remember what Vandermonde was about).

    The other strategy was much more elegant – basically the five-line proof a mathematician with “insight” would give.
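
    (For reference, since the actual problem isn’t stated here, just a reminder of what Vandermonde is about: the determinant identity \det\big(x_i^{\,j-1}\big)_{i,j=1}^{n} = \prod_{1\le i<j\le n}(x_j - x_i).)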

    We did similar experiments with older versions of ChatGPT about a year ago, but this model was quite superior. Because you could follow the way it was “reasoning”, it was hard to avoid the feeling that the machine was thinking. Of course it wasn’t, but I never had the impression it was offering something that it scraped from the internet. As “an everyday tool for problem solving” it was better than I expected.

  8. Peter Woit says:

    Peter,
    I don’t want to host a general discussion of AI and its capabilities, since I’m quite ignorant and that’s already being done much better by others (I recommend for instance the substack of Michael Harris). That AI can be trained to solve standard sorts of math problems and that it can be a powerful tool for formalizing and checking proofs is very plausible, and many people are hard at work on this. We’ll all see how well this works in coming years, but I’m not all that interested in doing those things anyway.

    The one thing that does interest me is how AI is going to affect the information and research environment that is important to me. If Nature and other journals get rid of their best journalists and stop commissioning high-quality review articles by the best mathematicians and scientists, replacing this content with AI-generated things like Nature Research Intelligence, that seems likely to have a negative impact on my ability to learn new (insightful and true) things.

    One could argue that current fundamental physical theory has already experienced model collapse due to the way research has been consuming its own crud for decades, but I doubt having AI generating papers or making funding decisions is going to help.

  9. WTW says:

    Peter,
    Probably OT except possibly the last paragraph, but just FYI:
    My understanding (I may be wrong) is that DeepSeek was trained entirely on AI-generated “synthetic” data. That is, rather than replicate the scanning of the entire internet-of-today (with the growing problems of websites not allowing their data to be consumed for free and the ever-increasing proportion of AI-regurgitated garbage, plus all the cost, time and infrastructure that data ingestion involves), you create a (distilled) data set from a suite of queries to existing AI engines, such as Llama and Qwen, and train your new model on that. Google DeepMind is also investigating (or doing) this: e.g., see “Best Practices and Lessons Learned on Synthetic Data for Language Models”, arXiv:2404.07503. If you can extract and condense most of the actual value of the huge already-existing OpenAI dataset, then you can build a much more efficient model to which your so-called “reasoning layer” agent can be applied.
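
    In code, the workflow described above reads something like the minimal sketch below. This is only an illustration, not DeepSeek’s or anyone’s actual pipeline; query_teacher is a hypothetical stand-in for a call to an existing model’s API, and a real pipeline would then fine-tune a smaller “student” model on the collected pairs.

        # Minimal sketch of building a synthetic ("distilled") training set.
        # query_teacher() is a hypothetical placeholder for a call to an
        # existing large model; nothing here checks the answers against reality.
        import json

        def query_teacher(prompt: str) -> str:
            # Placeholder: a real pipeline would call an existing model's API here.
            return f"[teacher's answer to: {prompt}]"

        # Choosing these prompts is itself an editorial decision (see below).
        prompts = [
            "Explain gauge symmetry in one paragraph.",
            "Prove that the square root of 2 is irrational.",
        ]

        # Collect (prompt, response) pairs as the synthetic training set.
        with open("synthetic_train.jsonl", "w") as f:
            for p in prompts:
                f.write(json.dumps({"prompt": p, "response": query_teacher(p)}) + "\n")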

    But note what is missing from the above:
    (a) How do you extract the “actually valuable/useful” information? Well, by generating targeted AI queries against existing data sets. But how do you decide what to “target” and what not to? Who makes that call? What is the selection algorithm and how can it either fail or be manipulated?
    The synthetic data against which the new AI model is trained can now be filtered or selectively biased simply by setting overall “system prompts” that alter the output of any subsequent queries, as well as by altering or selecting the individual queries themselves.
    (b) What validation is there that the results of queries correspond to objective (experimentally, reproducibly verified) reality, as opposed to opinion/current fashion/social memes or political/PR BS, before they are included in the synthetic data set? Currently, “validation” seems to be done only of the AI models built from the assembled synthetic data, not of the data itself.
    And note that the categories of data “quality” issues listed above apply to science and academia as well as to society as a whole.
    (Even more prosaically: If you get a code snippet in Rust, Python, or C++ from a prompt, who checks it against which test cases, and who wrote those test cases? Against which edge cases/error conditions? For which version(s) of the software/compiler/OS? How are “back doors”/security holes detected? Etc. What if some of those bugs/security issues are intentional, introduced through manipulation of the synthetic data set or of the queries used to generate it? How can one ask an AI to perform such validations and security checks when those problems originated from that very AI model and its underlying synthetic data set?)
    (c) If such errors in the synthetic data are found, how will they be corrected? Who will be responsible for making and distributing those corrections? Who verifies the corrections, and who is held responsible if they are wrong or cause other problems?

    Which leads us back to the Nature/Google issue: Why not trust AI-created web content equally with human-generated content? Go to the Fermilab DUNE website and try to figure out what that experiment is actually measuring. It’s almost impossible to find. Instead you get nebulous PR BS that is almost entirely factually inaccurate and irrelevant to the experiment itself. How would you use any of the above AI tools to figure that out? Or to figure out that the prompts Springer Nature used generated complete garbage, and that, despite their claims to the contrary, no one bothered to even check it, much less correct it? How can those institutions be held accountable, potentially for things like misappropriation of funds, caused by those errors, both in the AI and by the human decision makers who used it?

  10. Commenter says:

    WTW’s understanding is very wrong. There is a paper about the training; you can read it. Synthetic data is involved, but it is certainly not the only input.

  11. Peter Woit says:

    My mistake for allowing any comments about general issues of what is going on with LLMs, since I’m not competent to moderate such a discussion. Future comments should be specifically about the Nature situation.

  12. John Baez says:

    I despise Nature’s business model: taking taxpayer-funded research papers, refereeing them with freely donated labor, and then selling them at exorbitant rates. I would love to destroy them, but I haven’t figured out how. So I’d be really glad if they started churning out AI slop that damaged their credibility and made researchers more inclined to publish elsewhere.

  13. Peter Woit says:

    John Baez,
    For the sciences I know about (math + theoretical and particle physics), Nature doesn’t publish much in the way of research articles (and when it does, it can be a disaster, see Wormhole Publicity Stunt). But their science news content has always been of the highest quality, and there are few other sources like it (Science is one of the few). If their science news operation gets replaced with AI slop, that will be a serious loss. I can all too easily visualize a depressing future in which the news sources about science are AI slop, TikTok, Twitter, YouTube videos, and a few blogs run by decrepit old guys.

  14. zzz says:

    “a few blogs run by decrepit old guys.”

    the future is now, old man
