Stochasticity in Tokenization Improves Robustness

arXiv cs.CL 04/20/26, 04:00 AM Papers

Summary

This paper shows that pre-training and fine-tuning large language models with uniformly sampled stochastic tokenization, rather than a deterministic canonical tokenization, improves robustness to adversarial attacks and random perturbations. The study covers pre-training, supervised fine-tuning, and in-context learning across several datasets and model architectures, and finds that accuracy is preserved without increasing inference cost.

arXiv:2604.16037v1 Announce Type: new

Cached at: 04/20/26, 08:29 AM

# Stochasticity in Tokenisation Improves Robustness
Source: https://arxiv.org/abs/2604.16037
View PDF (https://arxiv.org/pdf/2604.16037)

> Abstract: The widespread adoption of large language models (LLMs) has increased concerns about their robustness. Vulnerabilities in perturbations of tokenisation of the input indicate that models trained with a deterministic canonical tokenisation can be brittle to adversarial attacks. Recent studies suggest that stochastic tokenisation can deliver internal representations that are less sensitive to perturbations. In this paper, we analyse how stochastic tokenisations affect robustness to adversarial attacks and random perturbations. We systematically study this over a range of learning regimes (pre-training, supervised fine-tuning, and in-context learning), datasets, and model architectures. We show that pre-training and fine-tuning with uniformly sampled stochastic tokenisations improve robustness to random and adversarial perturbations. Evaluating on uniformly sampled non-canonical tokenisations reduces the accuracy of a canonically trained Llama-1b model by 29.8%. We find that training with stochastic tokenisation preserves accuracy without increasing inference cost.
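
To make "uniformly sampled" tokenisation concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "uniform" means every valid segmentation of a string into vocabulary tokens is equally likely; the abstract does not specify the sampling procedure. The function names `count_segmentations` and `sample_tokenization` and the toy vocabulary in the usage note are hypothetical.

```python
import random


def count_segmentations(text, vocab):
    """Return counts where counts[i] is the number of ways to split text[i:]
    into tokens from vocab (an assumed set of strings)."""
    n = len(text)
    max_len = max(map(len, vocab))
    counts = [0] * (n + 1)
    counts[n] = 1  # the empty suffix has exactly one (empty) segmentation
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, min(n, i + max_len) + 1):
            if text[i:j] in vocab:
                counts[i] += counts[j]
    return counts


def sample_tokenization(text, vocab, rng=random):
    """Draw one segmentation of text uniformly at random from all valid ones."""
    counts = count_segmentations(text, vocab)
    if counts[0] == 0:
        raise ValueError("text cannot be segmented with this vocabulary")
    max_len = max(map(len, vocab))
    n = len(text)
    tokens = []
    i = 0
    while i < n:
        # Candidate end positions for the next token, weighted by how many
        # segmentations of the remaining suffix each one leads to.
        ends = [
            j
            for j in range(i + 1, min(n, i + max_len) + 1)
            if text[i:j] in vocab and counts[j] > 0
        ]
        weights = [counts[j] for j in ends]
        j = rng.choices(ends, weights=weights, k=1)[0]
        tokens.append(text[i:j])
        i = j
    return tokens
```

With a toy vocabulary such as `{"t", "o", "k", "e", "n", "to", "tok", "en", "ken", "token"}`, calling `sample_tokenization("token", vocab)` returns the canonical-looking `["token"]` no more often than `["to", "k", "en"]` or any other valid split. Weighting each step by the number of completions is what makes the overall draw uniform: the step probabilities multiply out to one over the total number of segmentations. Real subword tokenisers expose their own sampling options, which may follow a different distribution.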

## Submission history

From: Sophie Steger

**[v1]** Fri, 17 Apr 2026 13:05:46 UTC (88 KB)

Similar Articles

Probabilistic Attribution For Large Language Models

Demystifying Training-Time Augmentation for Data-Constrained Language Model Pretraining

Entropy-KL Divergence-based Token Masking: A Novel Approach for Selective Fine-tuning of Large Language Models

Emergent retokenization symmetry in large language models: phenomenology and applications

Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation
