rag-evaluation

#rag-evaluation

LLM-FACETS: A Privacy-Preserving Framework for Evaluating LLM Transparency and Accountability

arXiv cs.AI · 2026-06-01

LLM-FACETS is an open-source evaluation framework that helps practitioners assess LLM transparency and accountability, with a focus on privacy and data-flow transparency. It provides a browser interface and a plugin architecture, and it supports multiple auditing mechanisms, including token-level log-probability visualization and RAG Triad metrics.

#rag-evaluation

A Comparative Study of Language Models for Khmer Retrieval-Augmented Question Answering

arXiv cs.CL · 2026-05-22

This paper presents a comparative evaluation of embedding models and generator backends for Khmer-language retrieval-augmented question answering in the telecom domain, finding that BGE-M3 performs best for retrieval while generator strengths vary across metrics.

#rag-evaluation

RARE: Redundancy-Aware Retrieval Evaluation Framework for High-Similarity Corpora

arXiv cs.CL · 2026-04-22

RARE introduces a redundancy-aware retrieval evaluation framework that decomposes documents into atomic facts to build realistic benchmarks for high-similarity corpora such as financial, legal, and patent documents, revealing significant performance drops in existing retrievers.
