Tag
Introduces RusFinChain, the first Russian-language symbolic benchmark for verifiable chain-of-thought reasoning in finance, spanning 17 domains with 5,280 parameterized examples and enhanced evaluation metrics including fuzzy numeric alignment.
Benchmarks 17 compact language models (1B to 8B parameters) as generators in Russian-language RAG systems under CPU-only inference, and finds that Qwen-family models offer strong quality-latency tradeoffs for private, GPU-free deployment.