adversarial-training

#adversarial-training

Robust Biomedical Publication Type and Study Design Classification with Knowledge-Guided Perturbations

arXiv cs.CL · 15h ago

This paper presents a robust evaluation framework and training strategies for biomedical publication type and study design classification, using knowledge-guided perturbations to mitigate reliance on spurious features.

FragileFlow: Spectral Control of Correct-but-Fragile Predictions for Foundation Model Robustness

arXiv cs.CL · yesterday

This paper introduces FragileFlow, a plug-in regularizer that improves the robustness of LLMs and VLMs by controlling 'correct-but-fragile' predictions through spectral analysis and PAC-Bayes bounds.

Information Theoretic Adversarial Training of Large Language Models

arXiv cs.LG · 5d ago

This paper introduces WARDEN, a distributionally robust adversarial training framework for large language models that uses f-divergence to dynamically reweight adversarial examples, significantly reducing attack success rates while maintaining computational efficiency.

Continuously hardening ChatGPT Atlas against prompt injection

OpenAI Blog · 2025-12-22

OpenAI describes how it is hardening ChatGPT Atlas against prompt injection attacks through adversarial training and strengthened safeguards. The work includes a rapid-response loop for discovering and mitigating novel attack strategies before they appear in the wild.

Adversarial training methods for semi-supervised text classification

OpenAI Blog · 2016-05-25

This paper adapts adversarial and virtual adversarial training to text classification by perturbing the word embeddings of a recurrent neural network rather than the raw inputs. The approach achieves state-of-the-art results on semi-supervised and supervised text classification benchmarks while reducing overfitting.
