@chessMan786: Fundamentals of GPU Architecture

X AI KOLs Timeline 06/27/26, 11:15 AM News

Summary

A tweet shares a link to an article about the fundamentals of GPU architecture.

Fundamentals of GPU Architecture https://t.co/MYtgFKmdLq

Original Article

Cached at: 06/27/26, 03:57 PM

Similar Articles

@vivekgalatage: Best structured reference I've found for GPU optimization - 450 papers, 14 years of research. Some techniques will have…

X AI KOLs Timeline

A tweet shares a structured reference of 450 papers on GPU optimization spanning 14 years, noting that while some techniques evolve, the mental models remain useful. It also references a lecture on GPU architectures by Onur Mutlu.

@DanKornas: GPU engineering is too broad to learn from random tabs. Awesome GPU Engineering is a curated GitHub list of resources f…

X AI KOLs Timeline

A curated GitHub list of resources for learning GPU engineering, covering architecture, kernel programming, optimization, distributed systems, and AI acceleration with books, frameworks, profilers, and interview prep.

@snowboat84: https://x.com/snowboat84/status/2061962883651731602

X AI KOLs Timeline

This article is the first part of the AI Engineering Panorama series. From a historical perspective, it reviews the evolution of GPUs from gaming graphics cards to AI accelerators, the bold bet of CUDA, the independent path of Google's TPU, and why NVIDIA ultimately prevailed. It also provides a detailed analysis of the underlying logic of AI infrastructure such as chips, supply chain, networking, and power.

@goyal__pramod: Software is evolving, so should you! These are the best blogs I read to understand GPUs and CUDA!

X AI KOLs Timeline

Tweet recommending a collection of blogs to understand GPUs and CUDA, encouraging developers to improve their skills.

@ZhihuFrontier: GPU programming changed because Tensor Cores became too fast to feed. Zhihu contributor THU-PACMAN Lab (THU-PACMAN实验室) shared a sharp bre…

X AI KOLs Timeline

A detailed analysis of how NVIDIA GPU programming evolved from Volta to Blackwell, highlighting the shift from synchronous thread models to asynchronous dataflow and the challenges of feeding Tensor Cores. The article discusses new hardware features like TMA, TMEM, and tcgen05 MMA, and shows how modern kernels like FlashAttention-3 and FlashMLA exploit these changes for higher utilization.
