Tag
The article analyzes the economic divergence between open and closed AI models. It argues that premium closed models will sustain high margins through superior intelligence, especially for coding agents, while open models follow a separate trajectory toward commoditization and efficiency.
This paper introduces a novel dataset watermarking method for closed LLMs that uses co-occurrence patterns of word pairs to provably detect whether proprietary data was used in training, even when that data constitutes only a small fraction of the training set.