AI2 released EMO, a Mixture of Experts language model with 1B active parameters out of 14B total, trained on 1 trillion tokens and featuring document-level routing where experts cluster around domains.
Review paper from Kunming University surveys how pre-trained language models automate knowledge-graph construction and introduces LLHKG, a lightweight-LLM framework matching GPT-3.5 performance.
Introduces Token-to-Mask (T2M) remasking to fix generation errors in masked diffusion LMs by resetting suspect tokens to the mask state instead of overwriting them, yielding up to +5.92 accuracy points on CMATH without extra training or parameters.
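The core remasking idea can be sketched in a few lines. This is a minimal illustration, not the paper's method: the confidence score, threshold, and `[MASK]` token shown here are illustrative assumptions about how a suspect token is identified and returned to the mask state for a later denoising step to re-predict.

```python
# Hedged sketch of T2M-style remasking in a masked diffusion LM decode loop.
# Assumption: each decoded token carries a confidence score; tokens below a
# threshold are reset to the mask state rather than overwritten with a new
# guess, so a subsequent denoising step can re-generate them from context.

MASK = "[MASK]"

def remask_low_confidence(tokens, confidences, threshold=0.5):
    """Return tokens with low-confidence positions reset to the mask state."""
    return [
        MASK if conf < threshold else tok
        for tok, conf in zip(tokens, confidences)
    ]

# Toy example: the garbled third token was decoded with low confidence,
# so it is remasked for re-prediction instead of being patched in place.
tokens = ["The", "answer", "si", "42"]
confs = [0.99, 0.95, 0.20, 0.90]
print(remask_low_confidence(tokens, confs))
# → ['The', 'answer', '[MASK]', '42']
```

The key contrast with ordinary iterative refinement is that a suspect token is not replaced by a fresh guess in the same step; it re-enters the masked state, where the model's full denoising machinery decides its value.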
Developer shares a minimalist 7.5M-parameter diffusion language model trained from scratch on Shakespeare, releasing the code as a learning resource.
xAI releases Grok 4.3 beta, an incremental update to Elon Musk's AI assistant platform, available through the company's $300/month subscription service.
Google and DeepMind introduce VaultGemma, a 1B-parameter open-source language model trained with differential privacy, accompanied by new scaling laws research that characterizes the compute-privacy-utility trade-offs in differentially private LLM training.
OpenAI releases gpt-oss-120b and gpt-oss-20b, two state-of-the-art open-weight language models under Apache 2.0 license that achieve near-parity with proprietary models while being optimizable for consumer hardware and edge devices. Both models demonstrate strong reasoning and tool-use capabilities with comprehensive safety evaluations.
OpenAI introduces ChatGPT, a conversational AI model fine-tuned from GPT-3.5 using reinforcement learning from human feedback (RLHF). The model is designed to answer follow-up questions, admit mistakes, and reject inappropriate requests, with free access provided during the research preview.
OpenAI announces that over 300 applications are now using GPT-3 through their API, nine months after launch, generating 4.5 billion words daily. Featured use cases include Viable for customer feedback analysis, Fable Studio for interactive storytelling, and Algolia for semantic search capabilities.
OpenAI announces the release of an API for accessing its AI models with a general-purpose text interface, launching in private beta with strict safety measures including mandatory production reviews and content restrictions to prevent harmful use cases.
OpenAI releases GPT-2 1.5B model with analysis of human perception of credibility, potential for misuse through fine-tuning on extremist ideologies, and challenges in detecting synthetic text. Detection models achieve ~95% accuracy but require complementary approaches for practical deployment.
OpenAI introduces GPT-2, a 1.5 billion parameter transformer-based language model trained on 40GB of internet text that achieves state-of-the-art performance on language modeling benchmarks and demonstrates zero-shot capabilities in reading comprehension, translation, question answering, and summarization. Due to safety concerns, only a smaller model and technical paper are released publicly rather than the full trained model.