AI2 released EMO, a Mixture of Experts language model with 1B active parameters out of 14B total, trained on 1 trillion tokens and featuring document-level routing where experts cluster around domains.
Review paper from Kunming University surveys how pre-trained language models automate knowledge-graph construction and introduces LLHKG, a lightweight-LLM framework matching GPT-3.5 performance.
Introduces Token-to-Mask (T2M) remasking to fix generation errors in masked diffusion LMs by resetting suspect tokens to the mask state instead of overwriting them, yielding up to +5.92 accuracy points on CMATH without extra training or parameters.
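The core remasking idea can be sketched in a few lines. This is a minimal illustration, not the paper's method: the confidence score, threshold, and `[MASK]` token shown here are illustrative assumptions about how a suspect token is identified and returned to the mask state for a later denoising step to re-predict.

```python
# Hedged sketch of T2M-style remasking in a masked diffusion LM decode loop.
# Assumption: each decoded token carries a confidence score; tokens below a
# threshold are reset to the mask state rather than overwritten with a new
# guess, so a subsequent denoising step can re-generate them from context.

MASK = "[MASK]"

def remask_low_confidence(tokens, confidences, threshold=0.5):
    """Return tokens with low-confidence positions reset to the mask state."""
    return [
        MASK if conf < threshold else tok
        for tok, conf in zip(tokens, confidences)
    ]

# Toy example: the garbled third token was decoded with low confidence,
# so it is remasked for re-prediction instead of being patched in place.
tokens = ["The", "answer", "si", "42"]
confs = [0.99, 0.95, 0.20, 0.90]
print(remask_low_confidence(tokens, confs))
# → ['The', 'answer', '[MASK]', '42']
```

The key contrast with ordinary iterative refinement is that a suspect token is not replaced by a fresh guess in the same step; it re-enters the masked state, where the model's full denoising machinery decides its value.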
Developer shares a minimalist 7.5M-parameter diffusion language model trained from scratch on Shakespeare, releasing the code as a learning resource.
xAI releases Grok 4.3 beta, an incremental update to Elon Musk's AI assistant platform, available through the company's $300/month subscription service.
Google and DeepMind introduce VaultGemma, a 1B-parameter open-source language model trained with differential privacy, accompanied by new scaling laws research that characterizes the compute-privacy-utility trade-offs in differentially private LLM training.
OpenAI releases gpt-oss-120b and gpt-oss-20b, two state-of-the-art open-weight language models under Apache 2.0 license that achieve near-parity with proprietary models while being optimizable for consumer hardware and edge devices. Both models demonstrate strong reasoning and tool-use capabilities with comprehensive safety evaluations.
OpenAI introduces ChatGPT, a conversational AI model fine-tuned from GPT-3.5 using reinforcement learning from human feedback (RLHF). The model is designed to answer follow-up questions, admit mistakes, and reject inappropriate requests, with free access provided during the research preview.
OpenAI announces that over 300 applications are now using GPT-3 through their API, nine months after launch, generating 4.5 billion words daily. Featured use cases include Viable for customer feedback analysis, Fable Studio for interactive storytelling, and Algolia for semantic search capabilities.
OpenAI announces the release of an API for accessing its AI models with a general-purpose text interface, launching in private beta with strict safety measures including mandatory production reviews and content restrictions to prevent harmful use cases.
OpenAI releases GPT-2 1.5B model with analysis of human perception of credibility, potential for misuse through fine-tuning on extremist ideologies, and challenges in detecting synthetic text. Detection models achieve ~95% accuracy but require complementary approaches for practical deployment.
OpenAI introduces GPT-2, a 1.5 billion parameter transformer-based language model trained on 40GB of internet text that achieves state-of-the-art performance on language modeling benchmarks and demonstrates zero-shot capabilities in reading comprehension, translation, question answering, and summarization. Due to safety concerns, only a smaller model and technical paper are released publicly rather than the full trained model.