Will the proliferation of computer-generated text create a sea of garbage output? Christ Church computer scientists investigate
In a paper titled "The Curse of Recursion: Training on Generated Data Makes Models Forget", two Christ Church computer scientists raise concerns about the future of advanced text-based systems like ChatGPT.
These systems have an impressive ability to understand and generate human-like text and can have interactive conversations on practically any topic.
However, Christ Church researchers have discovered a problem that could affect these systems as they become more widely used.
Professor Yarin Gal and Junior Research Fellow Dr Ilia Shumailov found that when these models are trained on content generated by other, similar models, they start to lose important aspects of the original content, as if absorbing the misunderstandings of the models that generated the data before them.
This issue, referred to as "model collapse," means that the resulting models become less diverse and less authentic.
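As a rough illustration of this effect, the toy sketch below repeatedly "trains" a word-frequency model on text sampled from the previous generation's model. It is not the authors' experiment: the function names (`fit`, `generate`, `simulate`), the Zipf-like starting vocabulary, and all parameter values are assumptions chosen for illustration. Once a rare word fails to appear in a generation's sample, no later model can produce it, so the number of distinct words can only stay the same or shrink.

```python
import random
from collections import Counter


def fit(samples):
    """'Train' a toy model: estimate word frequencies from the samples."""
    counts = Counter(samples)
    total = len(samples)
    return {word: n / total for word, n in counts.items()}


def generate(model, n, rng):
    """Sample n words from a model's (possibly unnormalised) word weights."""
    words = list(model)
    weights = [model[w] for w in words]
    return rng.choices(words, weights=weights, k=n)


def simulate(generations=10, vocab_size=200, sample_size=500, seed=0):
    """Return the number of distinct words known to each model generation."""
    rng = random.Random(seed)
    # Hypothetical "human" language: a Zipf-like distribution in which
    # a few words are common and most are rare.
    human = {f"w{i}": 1.0 / (i + 1) for i in range(vocab_size)}
    data = generate(human, sample_size, rng)  # human-written text
    diversity = []
    for _ in range(generations):
        model = fit(data)                         # train on the current text
        diversity.append(len(model))              # distinct words it knows
        data = generate(model, sample_size, rng)  # next text is model-made
    return diversity


if __name__ == "__main__":
    print(simulate())
```

Running the script prints one count per generation. In this simplified setting the rare words are the first to disappear, which is one way of picturing the loss of diversity that the term "model collapse" refers to.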
Professor Gal and Dr Shumailov believe that as more text online is generated by these advanced systems, it becomes crucial to address the problem of model collapse. While the generated content can be helpful and interesting, it is also essential to preserve the value of content created by real people. Without human input, model collapse will make the models even more unfair.
The study serves as a warning to researchers, companies, and policymakers, urging them to consider the potential consequences of model collapse, in which models poison other models' perception of reality.
As we move forward into a world where text is increasingly generated by machines, it is vital to find a balance that ensures the best of both worlds: the power of these advanced systems and the authenticity of human-generated content. Continued research and collaboration are necessary to navigate this challenge and shape a future where these technologies enhance our lives without diminishing the role of human creativity and interaction.