Tag
This paper introduces Interactive Inverse Reinforcement Learning (IIRL), a framework where a learner actively interacts with an expert to infer reward functions, formulated as a stochastic bi-level optimization problem. The authors propose the BISIRL algorithm, providing convergence guarantees and experimental validation for this interactive learning paradigm.
University of Memphis researchers propose HAMR, a model-agnostic meta-learning framework that uses bi-level optimization and neighborhood-aware resampling to adaptively reweight hard examples and minority classes across six imbalanced NLP datasets.