cross-lingual

#cross-lingual

Safety That Does Not Transfer: Cross-Lingual Clinical Correctness Drift in Deployable Medical Language Models

arXiv cs.CL · 5h ago

This paper investigates cross-lingual clinical correctness drift in medical language models, finding that locally deployable models show significant safety degradation when queried in Hausa compared to English, while frontier models maintain competence, highlighting a critical gap in safety evaluation for low-resource settings.

JOR-Bench: Japanese Operations Research Benchmarks for Large Language Models

arXiv cs.CL · 5h ago

JOR-Bench is a collection of five Japanese-language benchmarks for evaluating large language models on operations research problem formulation, translated from existing English benchmarks. Evaluation shows largely language-neutral performance, with only minor cross-lingual differences.

LLM-Driven AutoML for Cross-Lingual Handwritten OCR: Closed-Loop Neural Architecture Search with GPT-5, GPT-4o, and Claude Sonnet 4

arXiv cs.AI · yesterday

This paper presents an LLM-driven pipeline using GPT-5, GPT-4o, and Claude Sonnet 4 to automatically design neural network architectures for cross-lingual handwritten OCR, achieving over 93% accuracy across Arabic, English, and Persian scripts without human intervention.

Which Languages Transfer Best to Warlpiri? A Similarity-Based Study for Low-Resource ASR

arXiv cs.CL · 2026-07-14

This paper investigates cross-lingual transfer for low-resource ASR in Warlpiri, proposing a similarity-based framework combining acoustic and linguistic features to select optimal source languages. Experiments show that acoustically similar languages like Assamese and Hindi significantly reduce word and character error rates.

Conceptual Networks for Cross-Linguistic Idiomatic Expressions: A Feature-Based Graph Approach

arXiv cs.CL · 2026-07-13

This paper presents an interpretable network-based framework for representing idiomatic expressions across eight languages using binary conceptual features. Community detection reveals that idioms cluster by conceptual schema rather than language, and the framework improves downstream idiom detection and cross-lingual transfer over embedding-based baselines.

TypeProbe: Recovering Type Representations from Hidden States of Pre-trained Code Models

arXiv cs.CL · 2026-07-10

This paper investigates whether pre-trained code models encode type information in their hidden states, using linear probes on Java and Python examples. Results show cross-lingual type representations emerge even from untyped code, and these representations are robust to lexical perturbations and syntactic variations.

Zoom In Disparities in Healthcare LLM Q&A

arXiv cs.CL · 2026-07-09

This paper systematically examines cross-lingual disparities in LLM-based healthcare question answering across five languages, finding significant gaps in factual alignment and proposing the MultiWikiHealthCare dataset.

From Sinhala to Dhivehi: Cross-Lingual Transfer Learning for Low-Resource Speech Recognition

arXiv cs.CL · 2026-07-08

This research investigates cross-lingual transfer learning from Sinhala to Dhivehi for automatic speech recognition, achieving significant improvements in word error rate compared to Dhivehi-only baselines.

CrossHallu: Do Hallucination Signals Generalize Across Languages and Domains in Large Language Model's Internals?

arXiv cs.CL · 2026-07-07

This paper evaluates whether hallucination signals from LLM internal representations generalize across languages and domains, focusing on Arabic↔English using TruthfulQA and HalluScore. Results show transferability for most models, with cross-lingual performance depending on class separability and language alignment.

SPLIT: Cross-Lingual Empathy and Cultural Grounding in English and Ukrainian LLM Responses

arXiv cs.CL · 2026-07-03

Introduces SPLIT, a 500-prompt benchmark evaluating LLM cross-lingual empathy and cultural grounding in English and Ukrainian. Findings show Gemini-2.5-Flash and LLaMA-3.3-70B-Instruct degrade in Ukrainian while DeepSeek-V3 remains stable, with weak agreement between human and AI evaluators on cultural dimensions.

Layer-wise Cross-Lingual Depression Detection from Speech: Analysis with Contrastive Alignment

Hugging Face Daily Papers · 2026-07-03

This paper introduces CLeaD, a supervised contrastive alignment framework for cross-lingual depression detection from speech using WavLM embeddings. It reveals that previous results were inflated due to speaker identity leakage and achieves modest improvements on Mandarin speakers.

ALEE: Any-Language Evaluation of Embeddings via English-Centric Minimal Pairs

arXiv cs.CL · 2026-07-02

Introduces ALEE, a framework that uses Abstract Meaning Representations to generate English minimal pairs with controlled semantic shifts and translates them for evaluating text embeddings across 275+ languages, revealing persistent gaps in cross-lingual semantic representation.

Cross-lingual Relation Extraction with Large Language Models: Zero-Shot, Few-Shot, and Fine-Tuned Evaluation on Romanian

arXiv cs.CL · 2026-07-01

This paper investigates cross-lingual relation extraction for Romanian by translating the SemEval-2010 Task 8 benchmark and evaluating Gemma 4 under zero-shot, few-shot, and QLoRA fine-tuning, comparing with smaller encoder baselines.

Soft Token Alignment for Cross-Lingual Reasoning

arXiv cs.CL · 2026-06-26

Proposes SOLAR, an auxiliary fine-tuning objective that aligns soft-token representations across languages to improve multilingual reasoning consistency, achieving up to +17.7 points accuracy gain.

A Red Teaming Framework for Large Language Models: A Case Study on Faithfulness Evaluation

arXiv cs.CL · 2026-06-25

This paper presents a red teaming framework for LLMs that uses a multi-role architecture to systematically uncover vulnerabilities, particularly in faithfulness. The framework demonstrated a 7.9% increase in attack success rate in QA tasks and highlights the impact of architectural choices over parameter scaling on model safety.

Cross-Lingual Exploration for Parametric Knowledge

arXiv cs.CL · 2026-06-24

This paper explores cross-lingual prompting strategies to improve access to parametric knowledge in large language models, demonstrating significant gains in knowledge transfer and factual recall across 17 languages on multilingual benchmarks.

MMed-Bench-IR: A Heterogeneous Benchmark for Multilingual Medical Information Retrieval

arXiv cs.CL · 2026-06-24

MMed-Bench-IR is a heterogeneous benchmark for multilingual medical information retrieval across six languages, evaluating cross-lingual alignment, concept discrimination, and evidence retrieval. It reveals severe performance drops for non-English queries, highlighting gaps in existing English-only evaluations.

Data Scale, Not Latency, Shapes Cross-Lingual Encoder Transfer in Streaming ASR

arXiv cs.AI · 2026-06-24

This paper investigates the impact of data scale versus latency on cross-lingual transfer for streaming ASR, finding that multilingual initialization benefits are data-limited, not latency-limited, and diminish as target-language data increases.

G-IdiomAlign: A Gloss-Pivoted Benchmark for Cross-Lingual Idiom Alignment

arXiv cs.CL · 2026-06-18

G-IdiomAlign is a gloss-pivoted benchmark for evaluating cross-lingual idiom alignment in LLMs, featuring controlled multiple-choice and gloss-contrastive protocols to diagnose literal translation bias and the effect of semantic pivots.

LLM Parameters for Math Across Languages: Shared or Separate?

arXiv cs.CL · 2026-06-18

This paper presents a cross-lingual mechanistic analysis of mathematical reasoning in LLMs, finding partial overlap of math-associated parameters across languages, concentrated in intermediate layers. English has the largest set of math-relevant parameters, while lower-resource languages have smaller sets.
