Tag
This paper introduces ERP-XTTN, a cross-attention architecture for interpretable, calibration-free ERP classification across subjects. Evaluated on multiple datasets, it performs competitively with black-box models while providing transparent insight into its routing.
This paper proposes an interpretable decision layer for AI-augmented classrooms that combines teacher and student feedback to rank the course topics needing attention, without relying on grades. In a preliminary study, the approach surfaces isolated learners and aligns with instructor concerns.
This paper introduces Ex-ToxiCN-MM, the first Chinese harmful meme explanation dataset, along with a knowledge base, C-HarmKB, and an attribution analysis framework, RIKE, to improve interpretable detection of harmful memes by accounting for cultural context and ambiguity.
This paper proposes an operational criterion for interpretable text representations based on inter-annotator agreement and label disentanglement, and introduces LLM-assisted Feature Discovery (LFD), a method that uses cross-LLM agreement screening and residual predictive gain to select clear, label-disentangled features. Experiments show LFD matches predictive performance while producing more interpretable features, validated by human audits.
OceanCBM is a concept bottleneck model for spatiotemporal prediction and mechanistic interpretability in ocean forecasting. It uses mixed supervision to predict mixed-layer heat content while imposing soft physical structure, yielding interpretable, physically grounded representations without sacrificing predictive skill.