A Tale of Tails: Model Collapse as a Change of Scaling Laws
As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the size of original (human or natural) training data. Yet, the widespread use of popular models means that the ecosystem of online data and text will co-evolve to progressively contain increased amounts of synthesized data. In this paper we ask: How will the scaling laws change in the inevitable regime where synthetic data makes its way into the training corpus? Will future models still improve, or be doomed to degenerate up to total (model) collapse? We develop a theoretical framework of model collapse through the lens of scaling laws. We discover a wide range of decay phenomena, analyzing loss of scaling, shifted scaling with the number of generations, the "un-learning" of skills, and grokking when mixing human and synthesized data. Our theory is validated by large-scale experiments with a transformer on an arithmetic task and text generation using the large language model Llama2.
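For background only (this is the standard power-law form of a neural scaling law, not the paper's specific result), test loss is commonly modeled as decaying with the amount of clean training data; the paper's question is how this form is altered once part of the corpus is model-generated. In the sketch below, the symbols $T$, $A$, $\alpha$, and $E_\infty$ are illustrative notation chosen here, not taken from the paper:

$$
L(T) \;\approx\; \frac{A}{T^{\alpha}} \;+\; E_{\infty},
$$

where $T$ is the number of original training tokens, $\alpha > 0$ is the data-scaling exponent, and $E_{\infty}$ is the irreducible loss. A "change of scaling laws" in the paper's sense refers to departures from this clean-data behavior, such as slowed or arrested decay, when synthesized data enters the training mix.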
Further reading
- Access the paper on arXiv.org