Assistant Professor Ernest K. Ryu at UCLA offers the open course 'Reinforcement Learning for Large Language Models,' which analyzes key LLM training techniques such as RLHF, PPO, and DPO, along with supporting course materials, through a blend of theory and practice. The course gives developers and researchers a systematic learning path from foundational algorithms to practical deployment.
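As background on one of the techniques named above (the standard DPO objective from the literature, not material excerpted from the course), DPO trains the policy directly on preference pairs with a frozen reference model:

$$\mathcal{L}_{\mathrm{DPO}}(\theta) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\left[\log \sigma\!\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]$$

where $y_w$ and $y_l$ are the preferred and rejected responses and $\beta$ controls how far the policy may drift from the reference.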
Lecture notes from an Efficient AI course covering Transformer and LLM fundamentals, including multi-head attention, positional encoding, KV cache, and the connection between model architecture and inference efficiency. The content explains how design choices in transformers affect memory, latency, and hardware efficiency.
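To make the architecture-to-memory connection concrete, here is a back-of-the-envelope sketch (not taken from the notes; the model dimensions below are illustrative assumptions) of how KV-cache size scales:

```python
# Rough KV-cache estimate: two tensors (K and V) per layer, each of
# shape [batch, n_kv_heads, seq_len, head_dim].
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch=1, bytes_per_elem=2):  # 2 bytes for fp16/bf16
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Illustrative 7B-class configuration (assumed): 32 layers,
# 32 KV heads of dimension 128, fp16 cache.
gib = kv_cache_bytes(32, 32, 128, seq_len=4096) / 2**30
print(f"~{gib:.1f} GiB of KV cache at 4k context")  # ~2.0 GiB
```

Halving the number of KV heads (as grouped-query attention does) halves this figure, which is one way a design choice translates directly into inference memory.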
The article highlights how Jane Street's quantitative research pushes the frontiers of deep learning, and notes the respect such work earns from strong researchers.
Highlights Andrej Karpathy's free three-hour YouTube course covering LLM fundamentals, including tokenization, neural network internals, RLHF, and reinforcement learning. Emphasizes that understanding these core architectural principles offers major career advantages over simply knowing how to use off-the-shelf AI tools.
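As a taste of the tokenization material (a minimal sketch of byte-pair encoding, not Karpathy's code), BPE repeatedly merges the most frequent adjacent token pair into a new vocabulary entry:

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most common one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None

def merge(tokens, pair, new_id):
    """Replace each occurrence of `pair` with the new token id."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out

tokens = list(b"aaabdaaabac")          # raw bytes as the initial tokens
pair = most_frequent_pair(tokens)      # (97, 97), i.e. b"aa"
tokens = merge(tokens, pair, 256)      # 256 = first id beyond the byte range
print(tokens)                          # [256, 97, 98, 100, 256, 97, 98, 97, 99]
```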
Andrej Karpathy released a free computer vision lecture on YouTube covering image captioning, localization, segmentation, and transfer learning, drawing on his production experience at Tesla and OpenAI.
A comprehensive, open-source GitHub repository providing structured learning roadmaps and curated resources for mastering AI, machine learning, deep learning, and large language models from beginner to advanced levels. Designed for students and professionals, it covers foundational concepts, programming frameworks, career tracks, and emerging AI topics.
A 40-minute walkthrough explains the complete Transformer architecture via whiteboard diagrams and demonstrates a practical implementation in C using Vim.
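For orientation on the operation at the heart of that walkthrough, here is minimal scaled dot-product attention in NumPy (a reference sketch, not the video's C implementation):

```python
import numpy as np

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V -- single head, no masking."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # similarity logits
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ V                              # weighted sum of values

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
print(attention(Q, K, V).shape)  # (4, 8)
```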
This paper presents an automated diagnostic system for grading knee osteoarthritis severity using an optimized ResNet-18 model deployed on edge devices via TensorFlow Lite. It integrates an LLM interface using Gemini 2.0 Flash to provide structured interpretive findings while maintaining offline capability for resource-constrained environments.
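A sketch of the kind of TensorFlow Lite export such an edge deployment involves (illustrative only: the stand-in model, filename, and settings below are assumptions, not the paper's pipeline):

```python
import tensorflow as tf

# Stand-in for the paper's optimized ResNet-18; Keras ships no ResNet-18,
# so a ResNet50 shell is used here purely for illustration.
model = tf.keras.applications.ResNet50(weights=None, classes=5)

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # default quantization
tflite_model = converter.convert()

with open("knee_oa_grader.tflite", "wb") as f:  # hypothetical filename
    f.write(tflite_model)
```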
This academic paper introduces an AI-enabled analytics framework using existing CCTV infrastructure to evaluate the impact of soft traffic interventions on vehicle speed and safety at urban intersections.
This paper proposes LMO-IGT, a new class of stochastic optimization methods that accelerates convergence using implicit gradient transport while maintaining a single-gradient-per-iteration structure. It introduces a unified theoretical framework and demonstrates improved performance over existing LMO-based optimizers like Muon.
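For context on the LMO-based family it builds on (a sketch of the general idea, not the paper's LMO-IGT update): over the spectral-norm ball, the linear minimization oracle applied to a gradient matrix returns its orthogonalized form, which Muon approximates with a Newton-Schulz iteration. A NumPy version, with coefficients assumed from the public Muon implementation:

```python
import numpy as np

def orthogonalize(G, steps=5, eps=1e-7):
    """Approximate U V^T for G = U S V^T via a quintic Newton-Schulz
    iteration (coefficients as in the public Muon implementation)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (np.linalg.norm(G) + eps)   # scale so singular values <= 1
    transposed = G.shape[0] > G.shape[1]
    if transposed:
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

G = np.random.default_rng(0).standard_normal((64, 32))
O = orthogonalize(G)
print(np.abs(O.T @ O - np.eye(32)).max())  # rough orthogonality check
```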
This paper identifies feature starvation in sparse autoencoders as a geometric instability and proposes adaptive elastic net SAEs (AEN-SAEs) to mitigate it without heuristics.
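A minimal sketch of a sparse autoencoder trained with an elastic-net penalty, i.e. combined L1 and L2 terms on the latent code (illustrative; the adaptive weighting that defines AEN-SAEs in the paper is not reproduced here):

```python
import torch
import torch.nn.functional as F

class ElasticNetSAE(torch.nn.Module):
    def __init__(self, d_model, d_latent):
        super().__init__()
        self.enc = torch.nn.Linear(d_model, d_latent)
        self.dec = torch.nn.Linear(d_latent, d_model)

    def forward(self, x):
        h = F.relu(self.enc(x))   # sparse latent features
        return self.dec(h), h

def loss_fn(x, x_hat, h, l1=1e-3, l2=1e-4):
    recon = F.mse_loss(x_hat, x)
    # Elastic net on the code: L1 drives sparsity; the added L2 makes the
    # penalty strictly convex, which can stabilize which features activate.
    return recon + l1 * h.abs().mean() + l2 * h.pow(2).mean()

sae = ElasticNetSAE(d_model=512, d_latent=2048)
x = torch.randn(32, 512)
x_hat, h = sae(x)
print(loss_fn(x, x_hat, h).item())
```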
This paper proposes a neuroevolution-based fine-tuning method to improve the accuracy of quantized deep learning models, showing that nearest-neighbor rounding alone is suboptimal and that evolutionary mutation of weights can yield better results on architectures like VGG and ResNet.
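To illustrate the baseline being improved on (a toy NumPy sketch; the paper's neuroevolution procedure is more elaborate): round-to-nearest snaps each weight to the closest grid point, and a greedy (1+1)-style evolutionary loop mutates quantized weights by one grid step, keeping a mutation only if it lowers the loss:

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(w, scale):
    """Round-to-nearest uniform quantization onto a grid of spacing `scale`."""
    return np.round(w / scale) * scale

def mutate(q, scale, p=0.05):
    """Nudge a random subset of weights by +/- one grid step."""
    step = rng.choice([-scale, scale], size=q.shape)
    mask = rng.random(q.shape) < p
    return q + step * mask

# Toy non-separable objective: fit A w ~ y. Because the loss couples
# weights, rounding each weight independently is no longer optimal.
A = rng.standard_normal((50, 100))
w_star = rng.standard_normal(100)
y = A @ w_star
loss = lambda w: np.mean((A @ w - y) ** 2)

q0 = quantize(w_star, scale=0.25)      # nearest-neighbor baseline
q = q0
for _ in range(500):                   # greedy (1+1) evolution
    cand = mutate(q, scale=0.25)
    if loss(cand) < loss(q):
        q = cand
print(f"baseline {loss(q0):.4f} -> evolved {loss(q):.4f}")
```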
This paper systematically investigates unlearnable examples under diverse training paradigms, revealing that pretrained weights weaken existing methods, and proposes Shallow Semantic Camouflage (SSC) to maintain unlearnability by generating perturbations in a semantically valid subspace.
Goodfire AI announces a new research agenda focused on neural geometry to improve the understanding, debugging, and control of neural networks.
This research examines how deep Transformers with bidirectional masking achieve implicit deductive reasoning comparable to explicit chain-of-thought methods. The study demonstrates that algorithmically aligned models can scale reasoning capabilities across diverse graph topologies and problem widths.
This article highlights how NVIDIA GPUs and AI models like Morpheus are enabling astronomers at UC Santa Cruz to process massive datasets from the James Webb Space Telescope, accelerating the discovery and classification of early universe galaxies.
CTNet introduces a novel neural architecture where computation is framed as the evolution of a persistent state rather than successive rewrites, incorporating re-entrant memory, multi-scale coherence, and projective output.
Google unveils eighth-generation TPU 8t and TPU 8i, purpose-built for massive pre-training and inference with SparseCore, native FP4, and 9,600-chip superpods to power world models and agentic AI.
An article on YOLO, the widely used family of real-time object detection models.
Microsoft Research releases Skala, a deep-learning exchange-correlation functional for DFT that achieves 2.8 kcal/mol accuracy on GMTKN55 at semi-local cost, outperforming traditional functionals across broad chemistry benchmarks.
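For context (standard Kohn-Sham background, not a result from the paper): the exchange-correlation term is the learned piece in the total-energy decomposition

$$E[\rho] = T_s[\rho] + \int v_{\mathrm{ext}}(\mathbf{r})\,\rho(\mathbf{r})\,d\mathbf{r} + E_H[\rho] + E_{xc}[\rho],$$

where $T_s$ is the non-interacting kinetic energy and $E_H$ the Hartree energy; $E_{xc}$ absorbs the remaining many-body effects, and Skala replaces its hand-crafted approximation with a learned functional evaluated at semi-local cost.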