Tag
This paper describes UOL@IDEM's closed-track submission to the BEA 2026 shared task on L1-aware vocabulary difficulty prediction. The system combines multilingual contextual representations with engineered features and achieves competitive RMSE scores for Spanish, German, and Chinese; frequency is the most stable predictor.
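For readers unfamiliar with the metric reported above, RMSE is the square root of the mean squared difference between predicted and gold difficulty scores. The sketch below is only an illustration of that definition; the function name `rmse` and the example values are assumptions, not taken from the submission.

```python
import math
from typing import Sequence


def rmse(predicted: Sequence[float], gold: Sequence[float]) -> float:
    """Root mean squared error between two equal-length score sequences."""
    if len(predicted) != len(gold):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must be non-empty")
    # Mean of squared differences, then square root.
    squared_errors = [(p - g) ** 2 for p, g in zip(predicted, gold)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical difficulty scores, for illustration only.
print(rmse([0.2, 0.5, 0.9], [0.1, 0.5, 1.0]))
```

Lower values mean the predicted difficulties sit closer to the gold scores.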
This paper proposes a fully local AI cascade for de-identifying educational dialogue that combines a recall-first candidate proposer with a contextual Redact/Keep reviewer. The approach achieves high accuracy without sending data to external APIs and outperforms both smaller local models and commercial APIs on math tutoring transcripts.
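The two-stage structure described above can be sketched as follows. This is a minimal illustration of the control flow only, not the paper's implementation: both stages here are simple rule-based stand-ins, and the names `propose_candidates`, `review`, `deidentify`, and `SAFE_WORDS` are hypothetical. In a real system each stage would presumably be a locally hosted model.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets into the text

# Hypothetical allowlist standing in for a reviewer's learned judgment.
SAFE_WORDS = {"so", "the", "what", "yes", "ok"}


def propose_candidates(text: str) -> List[Span]:
    """Recall-first stage (stand-in): over-flag on purpose.

    Flags every capitalized word and every run of three or more digits,
    accepting many false positives so that few identifiers are missed.
    """
    pattern = re.compile(r"\b[A-Z][a-z]+\b|\b\d{3,}\b")
    return [m.span() for m in pattern.finditer(text)]


def review(text: str, span: Span) -> str:
    """Reviewer stage (stand-in): return "Redact" or "Keep" for one span.

    Uses a little surrounding context: a capitalized word directly
    followed by "theorem" is treated as a math term, not a person.
    """
    token = text[span[0]:span[1]]
    following = text[span[1]:span[1] + 20].lstrip().lower()
    if token.lower() in SAFE_WORDS or following.startswith("theorem"):
        return "Keep"
    return "Redact"


def deidentify(text: str, placeholder: str = "[REDACTED]") -> str:
    """Run the cascade: propose candidates, review each, redact the rest."""
    to_redact = [s for s in propose_candidates(text) if review(text, s) == "Redact"]
    # Replace from right to left so earlier offsets stay valid.
    for start, end in sorted(to_redact, reverse=True):
        text = text[:start] + placeholder + text[end:]
    return text


print(deidentify("So Maria, use the Pythagorean theorem."))
```

The design point is the division of labor: the proposer is tuned to miss as little as possible, and the reviewer restores precision by keeping candidates that context shows to be harmless.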