Tag
Introduces PoQ-Judge, a multi-architecture evaluation framework whose reference-free judge models (TextCNN, MiniLM, DeBERTa) support cost-aware Proof-of-Quality in decentralized LLM inference and achieve high correlation with ground-truth proxies.
Granuscore is a reference-free measure of granularity for text analysis and question answering. It uses hierarchical embedding spaces to distinguish fine-grained from coarse language, and it shows consistent differences in model behavior across QA benchmarks.
This paper applies Group Relative Policy Optimization (GRPO) to encoder-decoder Seq2Seq models for machine translation fine-tuning, using reference-free rewards (LaBSE and COMET-Kiwi) that require no parallel data, and achieves consistent improvements across 13 languages.