This paper explores cross-lingual prompting strategies to improve access to parametric knowledge in large language models, demonstrating significant gains in knowledge transfer and factual recall across 17 languages on multilingual benchmarks.
MMed-Bench-IR is a heterogeneous benchmark for multilingual medical information retrieval across six languages, evaluating cross-lingual alignment, concept discrimination, and evidence retrieval. It reveals severe performance drops for non-English queries, highlighting gaps in existing English-only evaluations.
Mistral AI releases Mistral OCR 4, which can read historical handwritten manuscripts and provides bounding boxes, block classification, and inline confidence scores in 170 languages.
Mistral AI releases Mistral OCR 4, a compact document intelligence model that provides bounding boxes, block classification, and inline confidence scores for structured text extraction. It supports 170 languages, runs in a single container for self-hosted deployment, and integrates with the Mistral Search Toolkit for enterprise search and RAG pipelines.
ReMMD introduces a realistic multilingual multi-image agentic verification framework for multimodal misinformation detection, including a benchmark (ReMMDBench) with 500 samples and 2,756 images, and an agent (ReMMD-Agent) that achieves superior veracity performance with reduced costs.
PP-OCRv6 is the latest generation of PaddleOCR's universal OCR model family, offering three tiers from 1.5M to 34.5M parameters, supporting 50 languages, and achieving significant accuracy improvements over previous versions.
Apertus is a fully open foundation model for sovereign AI, developed by the Swiss AI Initiative. It offers open weights, open data, and open science, is compliant with the EU AI Act, and is competitive with top open models at 8B and 70B parameters, supporting over 1000 languages.
OpenAI announces GPT-5.5 Instant, now on par with frontier thinking models for health-related questions and available to all free users, with improvements in recognizing when urgent care is needed and in explaining uncertainty.
Liquid AI introduces LFM2.5-Embedding-350M and LFM2.5-ColBERT-350M, two multilingual retrieval models optimized for fast and accurate search across 11 languages, with latency as low as 1.5ms.
MOSS-TTS-Local Transformer v1.5 is an open-source 48 kHz stereo TTS model with zero-shot voice cloning, native streaming, and support for 31 languages, built on a Qwen3-4B backbone and served via SGLang-Omni.
MosiAI has released MOSS-TTS Local Transformer v1.5, a text-to-speech model that supports voice cloning, over 30 languages, and high-quality 48 kHz output.
This paper presents a cross-lingual mechanistic analysis of mathematical reasoning in LLMs, finding partial overlap of math-associated parameters across languages, concentrated in intermediate layers. English has the largest set of math-relevant parameters, while lower-resource languages have smaller sets.
VoxCPM2 is an open-source speech synthesis model from OpenBMB that uses a tokenizer-free diffusion autoregressive architecture and supports 30 languages, voice design, and controllable voice cloning. It can clone a voice from a single sentence or create a brand-new voice from text, outputs high-quality 48 kHz audio, and is commercially usable.
This paper empirically studies cross-lingual transfer in in-context learning across seven tasks, six models, and typologically diverse languages, showing that fine-tuning-based expectations do not consistently apply and offering new heuristics for source language selection.
This paper addresses the problem of spoken language adherence in multimodal LLMs for ASR, proposing a soft prompting approach and a novel metric to quantify language violations. It evaluates three mitigation strategies (zero-shot prompting, supervised fine-tuning, and chain-of-thought reasoning) across multiple languages to improve transcription fidelity.
This paper presents the first systematic study of multilingual instruction following in Vision-Language-Action (VLA) models, revealing significant performance degradation when models trained on English are evaluated on other languages. The authors propose Multilingual Principal Component Alignment (MPCA) to reduce the multilingual performance gap.
This paper introduces Multilingual-IRT, a statistical framework extending Item Response Theory with per-language difficulty deviations and split discriminability, enabling efficient prediction of unobserved evaluations, detection of translation errors, and recovery of culture-specific items across 29 languages.
This paper introduces Grammatical Error Representation (GER), a novel method for retrieving in-context demonstrations based on error patterns rather than semantic similarity, significantly improving multilingual grammatical error correction performance in LLMs with in-context learning.
AmchiBias introduces the first benchmark for measuring socio-cultural stereotypical bias in Goan identity groups, covering 313 minimal pairs in English and Konkani across eight sociodemographic dimensions. Evaluating multilingual encoder models reveals near-chance performance in Konkani and limited Goan cultural competence.
This paper introduces AdaMame, a two-stage training recipe (SFT + GRPO) to adaptively align reasoning language with query language in multilingual mathematical reasoning, mitigating language collapse without sacrificing accuracy.