Tag
This paper presents a deployment-centered evaluation of an LLM system integrated in electronic health records, training a classifier to predict query-level rejection risk using pre-response context like provider type and department, achieving an AUROC of 0.719 over 4.5 months of feedback.