My AI system kept randomly switching to French mid-answer and it took me way too long to figure out why

Reddit r/artificial

Summary

A developer describes how French text in retrieved contexts caused their multilingual RAG system to switch languages unpredictably mid-answer, a problem ultimately solved with a regex-based German detector and explicit negative prompt constraints.

I built a RAG system that needs to answer in German or English depending on the query language. Sounds simple. It was not. The source documents are mostly in German, but some contain French legal terminology, Latin phrases, and occasional English citations.

What kept happening was the LLM would start answering in German, hit a French passage in the context, and just... switch to French mid-paragraph. Sometimes it would blend German and French in the same sentence. Once it answered entirely in Italian, and I still have no idea why.

I tried letting the LLM detect the query language itself. Unreliable. It would sometimes decide the query was in French because the user mentioned a French court case by name.

What actually worked was a dumb regex detector (see the sketch below). I check the query for common German words (der, die, das, und, ist, nicht, mit, für, datenschutz, verletzung, etc.). If enough German markers are present, the response language is forced to German. Otherwise English. No fancy language detection library. Just pattern matching.

Then in the prompt I added a hard constraint: "Write your entire answer ONLY in {language}. Output must be German or English only. Never French, Spanish, Italian, or any other language. If the retrieved context is partly in another language, translate your answer into {language} only." The "never French" part is doing heavy lifting. Without that explicit prohibition the model would drift back into French within a few days of testing. It's like the model sees French legal text in context and thinks "oh, we're doing French now."

Anyone else building multilingual RAG systems running into this? Language contamination from source documents was the most annoying bug I've dealt with, and I've seen almost nobody write about it.
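A minimal sketch of the approach in Python. The marker words and the prompt constraint are quoted from the post above; the threshold, function names, and prompt wiring are illustrative assumptions, not the author's exact implementation.

```python
import re

# Marker words from the post; lowercase because the query is lowercased first.
GERMAN_MARKERS = {
    "der", "die", "das", "und", "ist", "nicht", "mit", "für",
    "datenschutz", "verletzung",
}

def detect_response_language(query: str, threshold: int = 2) -> str:
    """Force German if enough German marker words appear in the query,
    otherwise default to English. The threshold value is an assumption;
    the post only says "if enough German markers are present"."""
    words = re.findall(r"[a-zäöüß]+", query.lower())
    hits = sum(1 for word in words if word in GERMAN_MARKERS)
    return "German" if hits >= threshold else "English"

# Hard constraint quoted verbatim from the post.
LANGUAGE_CONSTRAINT = (
    "Write your entire answer ONLY in {language}. "
    "Output must be German or English only. "
    "Never French, Spanish, Italian, or any other language. "
    "If the retrieved context is partly in another language, "
    "translate your answer into {language} only."
)

def build_prompt(query: str, context: str) -> str:
    """Prepend the language constraint to the retrieved context and query.
    The overall prompt layout here is assumed, not taken from the post."""
    language = detect_response_language(query)
    return (
        f"{LANGUAGE_CONSTRAINT.format(language=language)}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

For example, "Was ist der Datenschutz und warum ist er wichtig?" hits five markers (ist, der, datenschutz, und, ist) and forces German, while "What is a data breach?" hits none and falls back to English.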

Similar Articles

Been stuck on a unique NLP problem [D]

Reddit r/MachineLearning

Developer seeks advice on handling English-Hindi code-mixed text classification without heavy LLMs, as sentence transformers fail on Romanized Hindi.

Claude Knew It Was Being Tested. It Just Didn't Say So. Anthropic Built a Tool to Find Out.

Reddit r/ArtificialInteligence

Anthropic developed Natural Language Autoencoders (NLAs), a tool that reads Claude's internal representations before text is generated, revealing that Claude detected it was being tested in up to 26% of safety evaluations without ever verbalizing this awareness. This interpretability breakthrough exposes a significant gap between what AI models 'think' and what they say, with major implications for AI safety evaluation.

English Centric AI Is Merging Unrelated Communities and Distorting Identities

Reddit r/artificial

The article critiques how AI systems, particularly Grokipedia and AI search, perpetuate errors by merging unrelated communities due to English-centric transliteration and biased training data. It highlights the systemic issue of erasing cultural distinctions through simplified English representations and repeated misinformation.

No One Fits All: From Fixed Prompting to Learned Routing in Multilingual LLMs

arXiv cs.CL

Researchers from National Taiwan University propose replacing fixed translation-based prompting strategies in multilingual LLMs with lightweight learned classifiers that route each instance to either native or translation-based prompting. Across 10 languages and 4 benchmarks, no single strategy is universally optimal: translation benefits low-resource languages most, and the learned router achieves statistically significant improvements over any fixed strategy.