Tag
TPA-AD is a two-stage pseudo anomaly-guided method for bearing time-series anomaly detection that generates pseudo-anomalous windows near normal boundaries using reconstruction models and contrastive learning, then scores anomalies with KNN—without requiring real anomaly samples during training. It is evaluated on bearing fault and degradation datasets, including high-speed train axle-box bearing data.
This paper introduces CoughSense, a system that classifies cough recordings into five respiratory disease categories using a fine-tuned Whisper encoder with active-frame pooling, achieving 82.3% balanced accuracy and deployed as a real-time mobile app.
This paper proposes CL-DMDF, a dynamic multimodal data fusion model that uses contrastive learning and a dual-dimensional attention mechanism to handle missing modalities and improve discriminative learning.
This paper introduces StenCE, a pretraining framework that uses cross-modal contrastive learning between ECG and X-ray angiography representations to detect severe coronary stenosis from ECGs, achieving high performance and enabling early diagnosis even in asymptomatic patients.
BRepCLIP introduces contrastive multimodal pretraining on boundary representation (BRep) primitives for CAD understanding, aligning BRep geometry with language and image embeddings to achieve state-of-the-art retrieval and zero-shot classification.
CHAM-net introduces a contrastive hierarchical adaptive meta-network that captures site-specific and cross-year dynamics for robust global methane flux prediction, outperforming baseline methods on simulation and observational datasets.
The paper identifies a misalignment between the softmax-based InfoNCE loss and the normalized embedding setting in modern contrastive learning. It proposes WEINCE, a simple modification that blends softmax logits with an endpoint shortfall correction using extreme value theory, yielding consistent improvements across vision benchmarks.
This paper proposes semantic motion anchors, natural-language abstractions of gesture motion for co-speech gesture retrieval and synthesis. The method discretizes 3D gestures into body-hand motion primitives and grounds them in transcripts, achieving significant improvements in text-to-gesture retrieval and user preference in generation.
DOMINO is a novel framework that learns minimal sufficient domain representations from reference examples to synthesize domain-specific data for LLMs, improving code benchmark performance without requiring explicit domain descriptions.
This paper introduces Guidance Contrastive Policy Optimization (GCPO), a novel algorithm that enables per-token credit assignment in reinforcement learning by contrasting model predictions under positive and negative prompts, consistently outperforming GRPO and DAPO baselines on text-to-image generation and chain-of-thought reasoning benchmarks.
The SAVE framework improves reward model training by using value functions to grade on-policy responses and update models through contrastive objectives, achieving superior results across six benchmarks.
Hide-and-Seek is a framework that detects robot execution failures in VLA models by localizing failure-indicative actions through contrastive learning without step-level annotations, achieving state-of-the-art multi-task failure detection.
This paper proposes GASP, a framework that injects geometric priors into vision-language models via deep supervision with contrastive and depth consistency losses, achieving significant improvements on 3D spatial reasoning benchmarks without using 3D VQA data.
This paper introduces CARL, a method for offline hierarchical reinforcement learning that exploits local dynamics regularity to learn reusable skills. The approach clusters state-goal pairs requiring similar action sequences, enabling more effective skill reuse and improved performance on complex humanoid tasks.
Proposes CALAD, a channel-aware contrastive learning framework for multivariate time series anomaly detection that uses estimated channel relevance to construct contrastive samples, achieving state-of-the-art performance.
PEARL introduces a contrastive percentile approximation framework to mitigate behavioral intensity imbalance in recommender systems, achieving significant gains in engagement metrics in a production livestream platform serving billions of users.
Introduces the Temporal Contrastive Transformer (TCT), a self-supervised framework for learning temporal embeddings from financial transactions for fraud detection. Achieves AUC 0.8644 with embeddings alone but does not improve over strong engineered features (AUC 0.9205 vs 0.9245), indicating learned representations overlap with existing features.
This paper introduces PromptNCE, a method that uses large language models and contrastive prompts to estimate pointwise mutual information zero-shot, achieving high correlation with human-derived ground truth across three datasets.
Proposes DIVE, a compression adapter for embedding dimensionality reduction that uses self-limiting gradient updates and head-wise NT-Xent contrastive loss to prevent overfitting on small datasets, outperforming existing methods on BEIR benchmarks.
CEPO improves reinforcement learning with verifiable rewards by using contrastive signals from rejected rollouts to distinguish decisive reasoning steps from filler tokens, achieving higher accuracy on multimodal math reasoning benchmarks compared to GRPO.
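Several entries above build on the softmax-based InfoNCE loss over normalized embeddings (NT-Xent is a closely related variant). As a point of reference only, here is a minimal sketch of the standard one-directional form, assuming a batch in which the i-th anchor is paired with the i-th positive and all other positives serve as negatives. The function names `l2_normalize` and `info_nce` and the default temperature of 0.1 are illustrative choices, not taken from any of the listed papers.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss for a batch of (anchor, positive) pairs.

    For anchor i, positives[i] is the matching sample and every other
    entry of `positives` acts as a negative. Similarity is the cosine
    similarity divided by `temperature`.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Scaled cosine similarities between anchor i and every candidate.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability assigned to the true pair.
        total += log_denom - logits[i]
    return total / len(a)


aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
print(info_nce(aligned, aligned))  # small: each anchor matches its positive
print(info_nce(aligned, swapped))  # large: each anchor matches the wrong positive
```

Because the embeddings are unit-normalized, every logit is bounded by plus or minus 1/temperature, so the loss cannot reach exactly zero for a finite batch. This is the setting that the WEINCE entry above describes as misaligned with the softmax formulation.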