multi-model-evaluation

Federated Survival Analysis in Healthcare: A Multi-Model Evaluation on Cross-Institutional Heterogeneous Breast Cancer Data

arXiv cs.LG · 3d ago

This paper systematically evaluates three survival models (Cox proportional hazards, DeepSurv, and Random Survival Forest, RSF) under federated learning (FL) on heterogeneous, cross-institutional breast cancer data. It finds that FL outperforms local training and that RSF offers the best balance of performance across clients.

Knowledge Index of Noah's Ark

arXiv cs.AI · 2026-06-04

KINA (Knowledge Index of Noah's Ark) is an 899-item LLM benchmark spanning 261 fine-grained disciplines, introducing formal guarantees for disciplinary representativeness, incentive-aligned annotation via bonus-on-bar tournaments, and bootstrap ranking-stability reporting. Evaluating 42 models, top performers include Gemini-3.1-Pro-Preview (53.17%), Claude-Opus-4.6 (49.92%), and GPT-5.4 (48.55%), revealing a tiered rather than smooth leaderboard structure.
