large-language-model

Tag

Cards List
#large-language-model

next MiniMax will be released in ~10 Days

Reddit r/LocalLLaMA · 2026-06-01

The next MiniMax AI model is expected to be released in about 10 days, but the tweeter anticipates it may be too large for their hardware.

@charles_irl: more vibe checks available from your friendly local lunatics at r/localllama https://reddit.com/r/LocalLLaMA/s/vqBVXvIT…

X AI KOLs Following · 2026-05-29 Cached

Modal announces day 0 support for Step 3.7 Flash, a 198B parameter MoE model with 256K context and native image/video understanding.

Liquid AI releases LFM2.5-8B-A1B

Reddit r/LocalLLaMA · 2026-05-29

Liquid AI released LFM2.5-8B-A1B, an edge model with a 128K context window, 38T tokens of pre-training, and large-scale reinforcement learning, capable of tool calling and complex tasks while fitting on an entry-level laptop.

Aryabhata 2: Scaling Reinforcement Learning for Advanced STEM Reasoning

arXiv cs.CL · 2026-05-29 Cached

Introduces Aryabhata 2, a reasoning-focused language model for competitive STEM exams, trained via reinforcement learning on PhysicsWallah's question banks, outperforming its base model with fewer tokens.

@mfpiccolo: Opus 4.8 is out. Here is the verdict from @iiidevs lead engineer: did a stress test it’s just another llm can’t rea…

X AI KOLs Timeline · 2026-05-28 Cached

Anthropic released Claude Opus 4.8, an incremental update over Opus 4.7 with sharper judgment and longer autonomous work capability, though some engineers remain skeptical about its code generation without extensive guidance.

@XAMTO_AI: Stop bookmarking those flashy but useless AI tutorials. This 'Hands-On Large Models' is what you really need—open source, free, and code that runs. The book covers 12 chapters, guiding you step by step through the complete workflow of deploying large models: ① Language Model Basics ② Prompt Engineering ③ Semantic Search ④ Model Fine-Tuning ⑤ Multimodal…

X AI KOLs Timeline · 2026-05-28 Cached

Recommends 'Hands-On Large Models', a free, open-source tutorial whose 12 chapters cover language model basics, prompt engineering, semantic search, model fine-tuning, multimodal applications, and more. All code can be run directly in Colab.

ORCA: An End-to-End Interactive Copilot for Optimized Root Cause Analysis

arXiv cs.AI · 2026-05-27 Cached

ORCA is a copilot for end-to-end causal analysis that uses agents to guide users through workflows including causal discovery, effect estimation, and root cause analysis, with structured reports.

Personalizing Embodied Multimodal Large Language Model Agents over Long-term User Interactions

arXiv cs.AI · 2026-05-27 Cached

This paper proposes Polar, a multimodal memory-augmented framework for personalizing embodied MLLM agents over long-term user interactions, using a knowledge graph and episodic memory to ground user-intended instances from accumulated context.

Improving the Completeness and Comparability of Segment Disclosures: A Large Language Model Approach

arXiv cs.CL · 2026-05-26 Cached

This paper proposes an LLM-based framework to extract segment disclosures from 10-K filings, improving completeness and comparability through retrieval-augmented systems for longitudinal and cross-firm analysis.

Scientists trained an AI model using an IBM quantum computer — and it answered questions correctly that the base model couldn't

Reddit r/artificial · 2026-05-26 Cached

Researchers used an IBM quantum computer to reduce uncertainty in an AI model, achieving the first demonstration of quantum enhancement in a pretrained large language model, allowing it to answer questions correctly where the base model failed.

@elonmusk: Grok foundation model V9-Medium (1.5T) has finished training. Evals look good. A lot of Cursor data was added in supple…

X AI KOLs Following · 2026-05-25 Cached

Elon Musk announced that the Grok foundation model V9-Medium (1.5T parameters) has finished training with strong evaluations, and will be publicly released in 2-3 weeks after fine-tuning and reinforcement learning.

@KtAIFeed: Straight to the point, no fluff. The recently popular Qwen 3.6 (35B/43B) latest open-source 'uncensored' model on Hugging Face (over a million downloads per month) can run locally with just 6GB VRAM on a single GPU. It completely breaks the original model's moral preaching and safety restrictions—no censorship, it will answer whatever you ask...

X AI KOLs Timeline · 2026-05-25 Cached

Introduces an open-source 'uncensored' variant of Qwen 3.6 (35B/43B) that removes the original model's moral and safety restrictions. It can run locally on a single GPU with only 6GB of VRAM and has over a million downloads per month on Hugging Face.

@percyliang: Not only do we want to train a good model, we want to know it'll be good before we even start training. About a month a…

X AI KOLs Following · 2026-05-24 Cached

The Marin team pre-registered a predicted loss of 2.252 for a 129B parameter MoE model training run, and the actual result landed at 2.234, demonstrating accurate loss prediction before training.

Macaron-A2UI: A Model for Generative UI in Personal Agents

Hugging Face Daily Papers · 2026-05-24 Cached

Presents Macaron-A2UI, a model for generative UI in personal agents that synthesizes dynamic interfaces with lightweight executable actions, moving beyond text-only chat. The paper introduces a large-scale corpus, the A2UI-Bench benchmark, and trains models up to 754B parameters using LoRA fine-tuning and reinforcement learning, achieving strong results.

DeepSeek makes the V4 Pro price discount permanent

Hacker News Top · 2026-05-22 Cached

DeepSeek has made the 75% discount on V4 Pro API pricing permanent, reducing input/output token costs significantly.

@lxfater: NetEase Youdao open-sourced ZiYue 4 model, within 27B parameters, SOTA in math and science. But what really interests me is its voice feature!! Cloning a voice is nothing new, ElevenLabs could do it long ago. But they all share a common flaw: cross-language accent. Take your Chinese voice and use it to speak Japanese — it has a Chinese accent, you can tell it's a foreigner struggling...

X AI KOLs Timeline · 2026-05-22 Cached

NetEase Youdao open-sourced the ZiYue 4 model with 27B parameters, achieving SOTA in math and science. Its voice feature supports 3-second cross-language voice cloning across 14 languages without carrying over the speaker's native accent. The company also open-sourced the all-scenario intelligent agent 'Longxia' (Lobster).

VBFDD-Agent for Electric Vehicle Battery Fault Detection and Diagnosis: Descriptive Text Modeling of Battery Digital Signals

arXiv cs.AI · 2026-05-22 Cached

This paper proposes VBFDD-Agent, a vehicle battery fault detection and diagnosis agent that uses descriptive text modeling of battery signals, large language models, and historical cases to generate interpretable diagnostic results and maintenance recommendations for electric vehicle batteries.

Re. what ever happened to Cohere’s Command-A series of models?

Reddit r/LocalLLaMA · 2026-05-20

Cohere launches Command A+, its first Mixture-of-Experts model, released under Apache 2.0 with efficient quantization for 1-2 GPU deployment, prioritizing practicality and open access for developers.

@ClementDelangue: Cohere is on such a great open-source trajectory lately. Beautiful Apache 2.0 model! https://huggingface.co/CohereLabs/…

X AI KOLs Following · 2026-05-20 Cached

Cohere has released Command A+, an open-source model with 25B active parameters and 218B total parameters under Apache 2.0, optimized for agentic, multilingual, and reasoning-heavy tasks.

CohereLabs/command-a-plus-05-2026-bf16 · Hugging Face

Reddit r/LocalLLaMA · 2026-05-20 Cached

Cohere releases Command A+, an open-source model with 25B active parameters (218B total) optimized for agentic, multilingual, and reasoning-heavy tasks, supporting vision inputs and 128K context under Apache 2.0.
