New research suggests that with sufficient compute, filtering training data for language models may be unnecessary, and models can benefit from low-quality data.
Surprising new results suggest that for large LMs trained with enough compute, the best data filter might be no filter at all, because such models tolerate low-quality data well.