What happened to the issue of companies running out of training data for LLMs?

Reddit r/singularity News

Summary

The article revisits the earlier concern that human-generated training data for LLMs would run out, questioning whether the issue has been resolved or remains a problem given the continued improvement of AI models.

I remember about a year or so ago there were a lot of news stories about human-generated training data being in short supply, with training data "running out" in the near future. There was some discussion about using synthetic data, but I heard there were problems with that, i.e., training on it degraded the final model and polluted its outputs. Has this been resolved, or is it still a problem that needs to be addressed? Presumably it's not a huge issue, since models are still improving, but I haven't seen anything new about it in the news cycle and was wondering if anyone here had any additional info. A brief Google search didn't turn up much.
