cross-lingual

#cross-lingual

A Cross-lingual Comparison of Human and Classification Model Entrainment Behavior in Code-switched Speech Settings

arXiv cs.CL · 2d ago

This paper presents a cross-lingual analysis of entrainment behavior in Mandarin-English, Hindi-English, and Spanish-English code-switched dialogues, finding that lexical entrainment generalizes across the language pairs while acoustic-prosodic and style entrainment vary. It also evaluates how classification models capture these behaviors, showing that they prioritize different features than humans do.

The Cross-Domain Generalization Cost of Offensive Language Detection

arXiv cs.CL · 3d ago

This paper proposes a diagnosis and optimization framework for cross-domain and cross-lingual generalization in offensive language detection, decomposing performance degradation into dataset and language effects, and quantifying the trade-off between multilingual capability and source-task performance.

Do LLM Debates Repeat Arguments Differently Across Languages?

arXiv cs.CL · 3d ago

This paper investigates whether argument repetition in LLM debates differs across languages.

Translation as Augmentation: Effect of Translated Data on Assessment of Difficulty

arXiv cs.CL · 2026-07-22

This paper proposes a cross-lingual data augmentation strategy that uses machine translation to transfer expert-annotated difficulty labels from high-resource languages to low-resource languages. Experiments with BERT-based regression models show that augmenting scarce native data with translated corpora significantly improves the accuracy of text difficulty assessment.

Rationale-Guided Knowledge Distillation for Cross-Lingual Stance Detection

arXiv cs.CL · 2026-07-22

This paper proposes a rationale-guided knowledge distillation framework for cross-lingual stance detection, using chain-of-thought prompting from large language models to train a compact student model with dual-path distillation and contrastive learning.

Safety That Does Not Transfer: Cross-Lingual Clinical Correctness Drift in Deployable Medical Language Models

arXiv cs.CL · 2026-07-21

This paper investigates cross-lingual clinical correctness drift in medical language models, finding that locally deployable models show significant safety degradation when queried in Hausa compared to English, while frontier models maintain competence, highlighting a critical gap in safety evaluation for low-resource settings.

JOR-Bench: Japanese Operations Research Benchmarks for Large Language Models

arXiv cs.CL · 2026-07-21

JOR-Bench is a collection of five Japanese-language benchmarks for evaluating large language models on operations research problem formulation, translated from existing English benchmarks. Evaluation shows overall language-neutral performance with minor cross-lingual differences.

LLM-Driven AutoML for Cross-Lingual Handwritten OCR: Closed-Loop Neural Architecture Search with GPT-5, GPT-4o, and Claude Sonnet 4

arXiv cs.AI · 2026-07-20

This paper presents an LLM-driven pipeline using GPT-5, GPT-4o, and Claude Sonnet 4 to automatically design neural network architectures for cross-lingual handwritten OCR, achieving over 93% accuracy across Arabic, English, and Persian scripts without human intervention.

Which Languages Transfer Best to Warlpiri? A Similarity-Based Study for Low-Resource ASR

arXiv cs.CL · 2026-07-14

This paper investigates cross-lingual transfer for low-resource ASR in Warlpiri, proposing a similarity-based framework that combines acoustic and linguistic features to select optimal source languages. Experiments show that acoustically similar languages such as Assamese and Hindi significantly reduce word and character error rates.

Conceptual Networks for Cross-Linguistic Idiomatic Expressions: A Feature-Based Graph Approach

arXiv cs.CL · 2026-07-13

This paper presents an interpretable network-based framework for representing idiomatic expressions across eight languages using binary conceptual features. Community detection reveals that idioms cluster by conceptual schema rather than language, and the framework improves downstream idiom detection and cross-lingual transfer over embedding-based baselines.

TypeProbe: Recovering Type Representations from Hidden States of Pre-trained Code Models

arXiv cs.CL · 2026-07-10

This paper investigates whether pre-trained code models encode type information in their hidden states, using linear probes on Java and Python examples. Results show cross-lingual type representations emerge even from untyped code, and these representations are robust to lexical perturbations and syntactic variations.

Zoom In Disparities in Healthcare LLM Q&A

arXiv cs.CL · 2026-07-09

This paper systematically examines cross-lingual disparities in LLM-based healthcare question answering across five languages, finding significant gaps in factual alignment and proposing the MultiWikiHealthCare dataset.

From Sinhala to Dhivehi: Cross-Lingual Transfer Learning for Low-Resource Speech Recognition

arXiv cs.CL · 2026-07-08

This research investigates cross-lingual transfer learning from Sinhala to Dhivehi for automatic speech recognition, achieving significant improvements in word error rate compared to Dhivehi-only baselines.

CrossHallu: Do Hallucination Signals Generalize Across Languages and Domains in Large Language Model's Internals?

arXiv cs.CL · 2026-07-07

This paper evaluates whether hallucination signals from LLM internal representations generalize across languages and domains, focusing on Arabic↔English using TruthfulQA and HalluScore. Results show transferability for most models, with cross-lingual performance depending on class separability and language alignment.

SPLIT: Cross-Lingual Empathy and Cultural Grounding in English and Ukrainian LLM Responses

arXiv cs.CL · 2026-07-03

Introduces SPLIT, a 500-prompt benchmark evaluating LLM cross-lingual empathy and cultural grounding in English and Ukrainian. Findings show Gemini-2.5-Flash and LLaMA-3.3-70B-Instruct degrade in Ukrainian while DeepSeek-V3 remains stable, with weak agreement between human and AI evaluators on cultural dimensions.

Layer-wise Cross-Lingual Depression Detection from Speech: Analysis with Contrastive Alignment

Hugging Face Daily Papers · 2026-07-03

This paper introduces CLeaD, a supervised contrastive alignment framework for cross-lingual depression detection from speech using WavLM embeddings. It reveals that previous results were inflated due to speaker identity leakage and achieves modest improvements on Mandarin speakers.

ALEE: Any-Language Evaluation of Embeddings via English-Centric Minimal Pairs

arXiv cs.CL · 2026-07-02

Introduces ALEE, a framework that uses Abstract Meaning Representations to generate English minimal pairs with controlled semantic shifts and translates them for evaluating text embeddings across 275+ languages, revealing persistent gaps in cross-lingual semantic representation.

Cross-lingual Relation Extraction with Large Language Models: Zero-Shot, Few-Shot, and Fine-Tuned Evaluation on Romanian

arXiv cs.CL · 2026-07-01

This paper investigates cross-lingual relation extraction for Romanian by translating the SemEval-2010 Task 8 benchmark and evaluating Gemma 4 under zero-shot, few-shot, and QLoRA fine-tuning, comparing with smaller encoder baselines.

Soft Token Alignment for Cross-Lingual Reasoning

arXiv cs.CL · 2026-06-26

Proposes SOLAR, an auxiliary fine-tuning objective that aligns soft-token representations across languages to improve multilingual reasoning consistency, achieving accuracy gains of up to 17.7 points.

A Red Teaming Framework for Large Language Models: A Case Study on Faithfulness Evaluation

arXiv cs.CL · 2026-06-25

This paper presents a red teaming framework for LLMs that uses a multi-role architecture to systematically uncover vulnerabilities, particularly in faithfulness. The framework demonstrated a 7.9% increase in attack success rate in QA tasks and highlights the impact of architectural choices over parameter scaling on model safety.
