Tesla promotes its Full Self-Driving Supervised feature, directing users to try the technology.
Jensen Huang reflects on his collaboration with Elon Musk on building the early onboard computers for Tesla vehicles such as the Model S and Model 3 in support of their autonomous-driving efforts.
Elon Musk explains that Tesla FSD uses AI-based photon-count reconstruction rather than standard RGB imagery, enabling superior performance in low-light and high-glare conditions.
A brief mention of Tesla AI Vision, referring to Tesla's computer vision-based approach to autonomous driving.
Elon Musk announces that Tesla's AI Vision system now deploys airbags before impact to reduce injury risk, a feature included free on all new vehicles.
Tesla announces its Vision system can detect unavoidable collisions and deploy airbags up to 70 milliseconds earlier, potentially making the difference between serious injury and walking away from a crash.
ReflectDrive-2 is a new discrete diffusion planner for autonomous driving that uses reinforcement learning to enable self-editing of trajectory tokens, achieving high performance and low latency on the NAVSIM benchmark.
This paper introduces HERMES++, a unified driving world model that integrates 3D scene understanding and future geometry prediction via a BEV representation, LLM-enhanced queries, and joint geometric optimization.
OneVL is a unified vision-language-action framework that compresses chain-of-thought reasoning into latent tokens supervised by both a language decoder and a visual world-model decoder, achieving state-of-the-art trajectory-prediction accuracy for autonomous driving while retaining answer-only inference latency. It is the first latent CoT method to surpass explicit CoT across four benchmarks.
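The latent-CoT idea above can be sketched in a few lines: an encoder produces a handful of latent "reasoning" tokens, auxiliary language and visual heads supervise those latents during training, and inference skips both decoders and maps latents straight to an answer. All dimensions, weights, and function names below are hypothetical stand-ins, not OneVL's actual architecture; real decoders would be learned transformer heads, not random linear maps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper)
D_OBS, D_LAT, N_LAT, D_TXT, D_VIS, D_TRAJ = 64, 32, 4, 16, 16, 6

# Encoder producing N_LAT latent "reasoning" tokens from an observation embedding
W_enc = rng.normal(size=(D_OBS, N_LAT * D_LAT)) * 0.1
# Auxiliary decoder heads used ONLY at training time to supervise the latents
W_txt = rng.normal(size=(D_LAT, D_TXT)) * 0.1   # language-decoder stand-in
W_vis = rng.normal(size=(D_LAT, D_VIS)) * 0.1   # visual world-model stand-in
# Answer head consumed at inference time
W_act = rng.normal(size=(N_LAT * D_LAT, D_TRAJ)) * 0.1

def encode(obs):
    """Map an observation embedding to N_LAT latent reasoning tokens."""
    return np.tanh(obs @ W_enc).reshape(N_LAT, D_LAT)

def train_losses(obs, txt_target, vis_target):
    """Supervise latents with both decoders (squared-error stand-ins)."""
    z = encode(obs)
    txt_loss = np.mean((z @ W_txt - txt_target) ** 2)
    vis_loss = np.mean((z @ W_vis - vis_target) ** 2)
    return txt_loss + vis_loss

def infer(obs):
    """Answer-only inference: decoders are skipped, latents map to a trajectory."""
    return encode(obs).reshape(-1) @ W_act

obs = rng.normal(size=D_OBS)
traj = infer(obs)  # shape (6,), e.g. 3 future (x, y) waypoints
```

The point of the structure is that `infer` never touches `W_txt` or `W_vis`, which is why latency matches an answer-only model even though the latents were shaped by both supervision signals.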
FlashDrive reduces the inference latency of reasoning vision-language-action models from 716 ms to 159 ms on an RTX PRO 6000 (up to 5.7× faster) with zero accuracy loss, enabling real-time autonomous applications.
RAD-2 presents a unified generator-discriminator framework for autonomous driving that combines diffusion-based trajectory generation with RL-optimized reranking, achieving a 56% collision-rate reduction compared to diffusion-based planners. The approach introduces techniques such as Temporally Consistent Group Relative Policy Optimization and a BEV-Warp simulation environment for efficient large-scale training.
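The generator-discriminator loop described above can be illustrated minimally: a generator proposes K candidate trajectories and a learned scorer reranks them, with the top-scoring candidate executed. Everything below is a toy stand-in, not RAD-2's method: random walks replace the diffusion generator, and a hand-written collision/progress score replaces the RL-optimized discriminator.

```python
import numpy as np

rng = np.random.default_rng(1)

def generate_candidates(k=8, horizon=5):
    """Stand-in for the diffusion generator: k candidate (x, y) trajectories
    built as noisy forward-moving random walks."""
    steps = rng.normal(loc=[1.0, 0.0], scale=0.2, size=(k, horizon, 2))
    return np.cumsum(steps, axis=1)

def score(trajs, obstacle=np.array([3.0, 0.0]), radius=1.0):
    """Stand-in for the RL-optimized discriminator: reward forward progress,
    heavily penalize trajectories whose closest approach enters the obstacle."""
    closest = np.linalg.norm(trajs - obstacle, axis=-1).min(axis=1)
    progress = trajs[:, -1, 0]
    return progress - 10.0 * (closest < radius)

cands = generate_candidates()
best = cands[np.argmax(score(cands))]  # reranked pick, shape (5, 2)
```

The design point is that the generator stays cheap and diverse while safety pressure lives in the scorer; in RAD-2 that scorer is trained with RL rather than hand-coded as here.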
Re2Pix is a hierarchical video prediction framework that improves future video generation by first predicting semantic representations using frozen vision foundation models, then conditioning a latent diffusion model on these predictions to generate photorealistic frames. The approach addresses train-test mismatches through nested dropout and mixed supervision strategies, achieving improved temporal semantic consistency and perceptual quality on autonomous driving benchmarks.