gpt-2

gpt-2 is too dangerous to be released

Reddit r/singularity · 5h ago

OpenAI trained the GPT-2 language model but initially deemed it too dangerous to release to the public, citing the potential for misuse.

Current World Models Lack a Persistent State Core

Hugging Face Daily Papers · 2026-06-18

This paper argues that current world models lack a persistent state core, proposing a hybrid approach that adds temporal-causal structure via η-pseudo-unitary operator dynamics to convert pretrained GPT-2 into a time-reasoning model.

Nothing from Something: Can a Language Model Discover 0?

arXiv cs.AI · 2026-06-17

This paper examines whether language models can independently discover the concept of zero as a form of out-of-distribution generalization. It finds that GPT-2-sized models cannot do so at test time but improve when trained on examples of zero, and that language pretraining reduces the number of examples required.

The Data Manifold under the Microscope

Hugging Face Daily Papers · 2026-06-14

This paper investigates converting pretrained GPT-2 into a time-reasoning model using η-pseudo-unitary operator dynamics, providing mathematical foundations and key findings on PT-breaking transitions and reversible/irreversible sequences.

Timeline of AI models since GPT-2. Model releases are accelerating over time.

Reddit r/ArtificialInteligence · 2026-06-01

An article chronicling the timeline of AI model releases since GPT-2, highlighting the accelerating pace of model launches over time.

MechRL: Reinforcement Learning Agents Perform Circuit Discovery for Mechanistic Interpretability

arXiv cs.LG · 2026-05-27

Proposes MechRL, a reinforcement learning approach to automate circuit discovery in transformer language models. A PPO agent trained on multiple tasks discovers attention head circuits that match known canonical circuits and generalizes to a held-out task.

I built a tool that shows you what GPT-2 is "thinking" in real-time as it generates 3D graph of concept activations per token [R]

Reddit r/MachineLearning · 2026-05-19

A developer built AXON, a tool that visualizes GPT-2's internal concept activations as a live 3D force graph using Sparse Autoencoders, allowing users to see interpretable features firing before token generation.

@Modular: The MAX-LLM book just made it even easier to build an LLM from scratch. The new notebook format lets you run the GPT-2 …

X AI KOLs Following · 2026-05-14

The MAX-LLM book now provides interactive Jupyter notebooks that walk through building a complete GPT-2 implementation from scratch using the MAX framework, enabling users to explore tensor shapes, run components, and generate text.

Can We Locate and Prevent Stereotypes in LLMs?

arXiv cs.CL · 2026-04-23

This arXiv preprint maps stereotype-encoding neurons and attention heads in GPT-2 Small and Llama 3.2, showing that biases cluster in small subsets of neurons, yet ablating those subsets barely reduces biased text generation.

The game specific meme potential on gpt image 2 is insane

Reddit r/singularity · 2026-04-21

Users are discovering strong meme-generation capabilities in GPT Image 2, particularly for game-specific humor.

Transformer Explainer: Interactive Learning of Text-Generative Models

Papers with Code Trending · 2024-08-08

Transformer Explainer is an interactive visualization tool that allows non-experts to understand the inner workings of the GPT-2 model through real-time experimentation and visualization in a web browser.

Image GPT

OpenAI Blog · 2020-06-17

OpenAI's Image GPT (iGPT) applies GPT-2 transformers to pixel sequences for image generation and classification, demonstrating that the same architecture used for language can learn coherent visual features in an unsupervised manner and achieve competitive performance on image classification benchmarks.

GPT-2: 1.5B release

OpenAI Blog · 2019-11-05

OpenAI releases the 1.5B-parameter GPT-2 model along with analysis of how credible humans find its output, its potential for misuse through fine-tuning on extremist ideologies, and the challenges of detecting synthetic text. Detection models achieve ~95% accuracy but require complementary approaches for practical deployment.

Fine-tuning GPT-2 from human preferences

OpenAI Blog · 2019-09-19

OpenAI demonstrates fine-tuning GPT-2 (774M parameters) with human preference feedback on text continuation and summarization tasks. Stylistic tasks required 5k labels and summarization required 60k; the resulting models achieved 86-88% human preference rates, though the results also reveal exploitation of labeler heuristics.

GPT-2: 6-month follow-up

OpenAI Blog · 2019-08-20

In its 6-month follow-up to the GPT-2 release, OpenAI outlines plans to release the 1558M parameter model in a few months and emphasizes staged release and partnership-based sharing as key to responsible AI publication.

Better language models and their implications

OpenAI Blog · 2019-02-14

OpenAI introduces GPT-2, a 1.5 billion parameter transformer-based language model trained on 40GB of internet text that achieves state-of-the-art performance on language modeling benchmarks and demonstrates zero-shot capabilities in reading comprehension, translation, question answering, and summarization. Due to safety concerns, only a smaller model and technical paper are released publicly rather than the full trained model.
