Rock, Paper, Scissors, and the Hard Place

Rock, Paper, Scissors, and the Hard Place
Host: Jon Crowcroft (University of Cambridge), Andrew Moore (University of Cambridge)
Scribe: Xing Fang (Xiamen University)

Introduction

The Pace of Science

Jon Crowcroft opened by highlighting the unprecedented pace of scientific production. Top AI conferences now publish over 10,000 papers each year, making it impossible for any individual to read even a fraction. Since Attention is All You Need (2017), more than 50,000 papers on large language models have appeared. Beyond papers, the number of patents has surged, often reflecting incentives rather than genuine innovation. Similarly, open-source software releases are multiplying, but only a handful become impactful while most remain unused.

Possible Factors

Part of this growth reflects positive developments: there are more scientists, more funding, and more problems to solve. Yet Jon Crowcroft questioned whether all this output corresponds to real advances.

  • Are we seeing the spread of “minimal publishable units”?
  • Are many contributions wafer-thin or mere refinements of process?
  • Is the lack of artifacts and the rise of “deep fake” papers eroding trust?

Why is this a problem?

The overwhelming scale risks burying significant contributions. Some of this may be “FOMO” (fear of missing out), but the concern runs deeper: science means knowledge—what humans know. If scale becomes unmanageable, or if machines dominate generation and filtering, science may drift away from its human foundation.

Mitigations and Solutions

Jon Crowcroft asked whether AI might speed up literature reviews—or instead derail them by generating even more content. He mentioned efforts to scale academic publishing to internet scale, and pointed to initiatives such as the Open Conference of AI Agents for Science 2025, which explore how AI can be part of a sustainable scientific ecosystem.

Solutions from Systems

Drawing on networking and distributed systems, Jon Crowcroft proposed several analogies:

  • Congestion control — just as the internet avoided collapse, science may need similar mechanisms.
  • Federation, hierarchy, de-duplication, and consensus — techniques from distributed systems could help manage papers, patents, and results.
  • Reflection — LLMs might assist in artifact evaluation, checking claims in human-generated work.

Questions :

Q&A Discussion

Is there really a scale problem, or did we create it ourselves?

Several participants recalled the early 2000s, when networking was saturated with simulation-based papers. The community eventually raised standards so that papers reporting results without deeper insight were no longer acceptable. They suggested that the current flood of AI model papers requires a similar response—acceptance should depend on explaining why a model works, how it works, and when it fails. Others reflected that by long prioritizing implementation-heavy system papers, the community may have encouraged a culture of quantity over creativity, making the current imbalance partly self-inflicted.

What are the risks of overload?

Participants noted that excessive volume risks obscuring breakthroughs. A striking example was given from medicine: rare cancers are mainly studied in Mandarin publications, while breast cancer dominates English-language literature. This partitioning means that globally important knowledge may remain hidden in silos. The group also worried about discovery infrastructure: Google Scholar has become the de facto entry point for most researchers. If such a tool vanished, the community would lose its primary means of navigating the literature, exposing science to a fragile “single point of failure.”

Do we need better metrics for quality?

Jon Crowcroft stressed that traditional measures like citation counts are flawed. Wrong results often accumulate citations long after they are disproved, inflating their influence. He suggested treating “quality of science” (QoS) as a first-class concern, possibly using ideas from networking—such as congestion control—to match submissions with limited reviewer capacity. Others asked why academic careers are still judged mainly by paper counts rather than originality or depth. Jon Crowcroft drew a parallel from industry: at Microsoft, developers who wrote more code often produced more bugs, showing that raw output is not a meaningful measure of value. He further argued that negative citations should exist, so that widely cited but incorrect papers could be formally downgraded.

How should contributions beyond papers be recognized?

Several participants argued that scientific contributions are broader than publications, encompassing code, datasets, replication studies, and teaching. Yet current incentives push students to resubmit papers with minimal changes until acceptance, creating a cycle where persistence matters more than quality. Jon Crowcroft observed that replication and teaching both produce valuable knowledge, but the community lacks clear channels to reward them. Recognizing these contributions would not only relieve pressure on students but also reduce the burden of redundant publications.

How do we sustain the review system?

Panelists agreed that reviewer capacity has not kept pace with the surge in submissions. Program committee sizes remain relatively fixed, while paper counts grow rapidly, leading to lower review quality and more randomness in outcomes. One proposal was to introduce early-stage filtering, where two-page summaries or extended abstracts are reviewed first before inviting full submissions. Others suggested token-based systems, linking the right to submit with reviewing obligations. Across perspectives, there was consensus that reviewer time is the most scarce and valuable resource, and that any sustainable solution must explicitly protect and respect it.

Personal Thought

I believe the discussion highlights a real need for better metrics of scientific quality. At the same time, designing such a metric will be extremely difficult. As noted during the panel, different reviewers come from different knowledge domains and inevitably bring their own academic taste to evaluation, which makes a universal standard unlikely. Still, having even a broad guiding principle would be valuable: shifting students’ and researchers’ goals away from paper counts and toward paper quality is an important step for the health of the community.