Tag
This paper formalizes the concept of Bayes-sufficient representations in supervised learning, defining when a representation retains exactly the information needed for Bayes-optimal prediction under a given loss function. It introduces the Bayes quotient as a canonical loss-dependent object and connects the framework to property elicitation, illustrating distinctions between sufficiency, minimality, and excess retained information through experiments.
This paper studies symmetrization of loss functions for robust training under label noise, introducing the SGCE and alpha-MAE loss functions, which interpolate between the multi-class unhinged loss and mean absolute error (MAE), with theoretical guarantees and competitive empirical performance.
This paper identifies a critical 'model collapse' issue in standard fine-tuning for causal reasoning and proposes a semantic loss function with graph-based logical constraints to prevent it.
This paper proposes topological optimal transport-based loss functions for improving structured recipe generation in language models, addressing the limitations of standard cross-entropy training by better handling ingredient composition, quantities, and procedural accuracy. The approach shows significant improvements on recipe-specific metrics and a 62% human preference over baseline methods.