large-language-models

Tag · Cards List

GPT-5.5 may burn fewer tokens, but it always burns more cash

Reddit r/artificial · 3h ago

OpenAI's GPT-5.5 costs 49–92% more than GPT-5.4 in practice despite claimed token efficiency improvements, while Anthropic's Claude Opus 4.7 also raised effective costs by 12–27% for longer prompts, reflecting a broader trend of rising frontier model prices as both companies face massive projected losses.


@tom_doerr: Fully open-sources training data for 30B-scale search agents https://github.com/PolarSeeker/OpenSeeker…

X AI KOLs Timeline · 6h ago

OpenSeeker fully open-sources training data and models for 30B-scale ReAct-based search agents, achieving state-of-the-art performance on multiple benchmarks including BrowseComp and Humanity's Last Exam. It is the first purely academic project to reach frontier search benchmark performance while releasing complete training data.


@amitiitbhu: New article: LLM Routing. Read here: https://outcomeschool.com/blog/llm-routing…

X AI KOLs Timeline · 9h ago

A tutorial blog post explaining LLM routing, the practice of directing each user query to the most appropriate LLM based on cost, latency, and quality. It covers routing strategies, the anatomy of an LLM router, and how routing compares with Mixture of Experts.
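
To make the idea concrete, here is a minimal routing sketch; the model names, the complexity heuristic, and the threshold are all illustrative assumptions, not taken from the article:

```python
# Minimal cost/quality router sketch. Model names, the complexity
# heuristic, and the threshold are illustrative assumptions.

CHEAP_MODEL = "small-llm"      # hypothetical low-cost endpoint
STRONG_MODEL = "frontier-llm"  # hypothetical high-quality endpoint

def estimate_complexity(query: str) -> float:
    """Crude difficulty proxy: query length plus reasoning cue words."""
    cues = ("prove", "step by step", "compare", "debug", "why")
    score = min(len(query) / 500.0, 1.0)
    score += 0.6 * any(c in query.lower() for c in cues)
    return min(score, 1.0)

def route(query: str, threshold: float = 0.6) -> str:
    """Send easy queries to the cheap model, hard ones to the strong one."""
    return STRONG_MODEL if estimate_complexity(query) >= threshold else CHEAP_MODEL

print(route("What is the capital of France?"))       # -> small-llm
print(route("Prove the bound and debug my proof."))  # -> frontier-llm
```

Real routers typically replace the heuristic with a learned classifier, but the dispatch structure stays the same.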


@wsl8297: UCLA's open course on Reinforcement Learning for LLMs uses a 'theory + practice' approach to thoroughly explain key AI training techniques from the ground up, helping you systematically build a complete framework spanning RL through LLM training. The comprehensive curriculum is paired with complete resources: lecture slides, full videos, and practical exercises are all provided so you can start implementing right away…

X AI KOLs Timeline · 11h ago

Assistant Professor Ernest K. Ryu at UCLA offers the open course 'Reinforcement Learning for Large Language Models,' comprehensively analyzing key LLM training techniques such as RLHF, PPO, and DPO through a blend of theory and practice, with full supporting resources. The course gives developers and researchers a systematic learning path from foundational algorithms to practical deployment.
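
As a taste of the material covered, here is a compact sketch of the standard DPO objective on a single preference pair; the tensor values and function shape are illustrative, not course code:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO objective: push the policy's preference margin
    above the reference model's margin on (chosen, rejected) pairs."""
    policy_margin = policy_chosen_logp - policy_rejected_logp
    ref_margin = ref_chosen_logp - ref_rejected_logp
    return -F.logsigmoid(beta * (policy_margin - ref_margin)).mean()

# Toy example with made-up sequence log-probabilities for one pair.
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.0]))
print(loss.item())
```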


@no_stp_on_snek: mrcr v2 8-needle at 1m, open weights stack, single rented mi300x. longctx directional 0.688 (n=30, mass-val rerun pendi…

X AI KOLs Following · 17h ago

Shares early benchmark scores and evaluation metrics for an open-weight model stack run on a single AMD MI300X, noting competitive performance against closed-source alternatives.


@NFTCPS: Brothers, doing AI without large models is doing it in vain! Today I have to recommend an open-source masterpiece, 'Foundations of LLMs'. Don't wait, just read it! This book doesn't beat around the bush; it goes deep from the start, from getting started with large language models through architectural evolution, then breaks down Prompt engineering, parameter-efficient fine-tuning, model editing, RAG (Retrieval-Augmented Generation), and other hardcore techniques in one go. A true one-stop service.

X AI KOLs Timeline · yesterday

This post promotes the open-source book 'Foundations of LLMs', which systematically covers the fundamentals of large language models, and also introduces the multi-agent development framework Agent-Kernel.


Knee Osteoarthritis Severity Grading Using Optimized Deep Learning and LLM-Driven Intelligent AI on Computationally Limited Systems

arXiv cs.AI · yesterday

This paper presents an automated diagnostic system for grading knee osteoarthritis severity using an optimized ResNet-18 model deployed on edge devices via TensorFlow Lite. It integrates an LLM interface using Gemini 2.0 Flash to provide structured interpretive findings while maintaining offline capability for resource-constrained environments.
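
A rough sketch of that pipeline shape, assuming a TFLite image classifier feeding a text-generation step; the file name, label set, and prompt below are placeholders rather than the paper's code:

```python
import numpy as np
import tensorflow as tf

# Hypothetical Kellgren-Lawrence grade labels for knee OA severity.
GRADES = ["KL-0", "KL-1", "KL-2", "KL-3", "KL-4"]

# Run the quantized classifier on-device via the TFLite interpreter.
interpreter = tf.lite.Interpreter(model_path="knee_resnet18.tflite")  # placeholder path
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

image = np.random.rand(1, 224, 224, 3).astype(np.float32)  # stand-in for an X-ray
interpreter.set_tensor(inp["index"], image)
interpreter.invoke()
probs = interpreter.get_tensor(out["index"])[0]
grade = GRADES[int(np.argmax(probs))]

# The LLM stage would turn the grade into structured findings; stubbed here
# since the paper's exact prompt and offline fallback are not public.
prompt = f"Write structured radiology-style findings for knee OA grade {grade}."
print(prompt)
```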


Detecting Time Series Anomalies Like an Expert: A Multi-Agent LLM Framework with Specialized Analyzers

arXiv cs.AI · yesterday

The article introduces SAGE, a multi-agent LLM framework for time-series anomaly detection that uses specialized analyzers to improve interpretability and reliability. It demonstrates superior performance over baselines on three benchmarks and enhances diagnostic reporting through structured evidence consolidation.
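
The specialized-analyzer idea can be sketched in a few lines; the analyzers and consolidation rule below are invented for illustration, while SAGE's actual agents are LLM-driven:

```python
import numpy as np

def spike_analyzer(x):
    """Flag points far from the series mean (global outliers)."""
    z = (x - x.mean()) / (x.std() + 1e-8)
    return np.abs(z) > 3

def drift_analyzer(x, window=20):
    """Flag points that deviate sharply from a rolling mean."""
    rolling = np.convolve(x, np.ones(window) / window, mode="same")
    return np.abs(x - rolling) > 2 * x.std()

def consolidate(x, analyzers):
    """Consolidate evidence: flag a point if any analyzer votes for it,
    keeping per-analyzer votes for the diagnostic report (stand-in for
    the framework's LLM-written reports)."""
    votes = {a.__name__: a(x) for a in analyzers}
    flags = np.any(list(votes.values()), axis=0)
    return flags, votes

x = np.sin(np.linspace(0, 20, 200)); x[57] += 5.0  # inject one anomaly
flags, votes = consolidate(x, [spike_analyzer, drift_analyzer])
print(np.where(flags)[0])  # anomaly indices, including 57
```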


Saliency-Aware Regularized Quantization Calibration for Large Language Models

arXiv cs.AI · yesterday

This paper proposes Saliency-Aware Regularized Quantization Calibration (SARQC), a unified framework that improves Post-Training Quantization (PTQ) for LLMs by adding a regularization term that keeps quantized weights close to the originals, improving generalization and downstream performance.
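
A minimal sketch of such a regularized calibration objective, with a reconstruction term plus a proximity penalty; the saliency weighting below is a stand-in, and the paper's exact formulation will differ:

```python
import torch

def calibration_loss(W, W_q, X, saliency, lam=0.01):
    """Layer-output reconstruction error plus a saliency-weighted
    proximity penalty keeping quantized weights near the originals."""
    recon = ((X @ W.T - X @ W_q.T) ** 2).mean()
    proximity = (saliency * (W_q - W) ** 2).mean()
    return recon + lam * proximity

W = torch.randn(64, 128)                 # original weights
W_q = W + 0.05 * torch.randn_like(W)     # stand-in for dequantized weights
X = torch.randn(256, 128)                # calibration activations
saliency = X.abs().mean(0).expand_as(W)  # toy saliency: mean input magnitude
print(calibration_loss(W, W_q, X, saliency).item())
```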


AgenticRAG: Agentic Retrieval for Enterprise Knowledge Bases

arXiv cs.AI · yesterday

This paper introduces AgenticRAG, a framework from Microsoft that enhances enterprise knowledge base retrieval by equipping LLMs with tools for iterative search, document navigation, and analysis. It demonstrates significant improvements in recall and factuality over standard RAG pipelines on multiple benchmarks.
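
A schematic of the iterative tool loop described above; the LLM stub, tool names, and action format are placeholders, not Microsoft's implementation:

```python
def agentic_rag(question, llm, tools, max_steps=5):
    """Let the model iteratively search and read before answering."""
    context = []
    for _ in range(max_steps):
        action = llm(question, context)        # {"tool": ..., "arg": ...}
        if action["tool"] == "answer":
            return action["arg"]
        context.append((action, tools[action["tool"]](action["arg"])))
    return "no answer within budget"

# Stub model and tools so the loop runs end to end.
docs = {"kb-1": "The reimbursement cap is $500."}
tools = {"search": lambda q: ["kb-1"], "open_doc": lambda i: docs[i]}

def stub_llm(question, context):
    if not context:
        return {"tool": "search", "arg": question}
    if context[-1][0]["tool"] == "search":
        return {"tool": "open_doc", "arg": context[-1][1][0]}
    return {"tool": "answer", "arg": context[-1][1]}

print(agentic_rag("What is the reimbursement cap?", stub_llm, tools))
```

The gain over standard RAG comes from letting the model decide when retrieved evidence is sufficient instead of answering from a single fixed retrieval.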


PRISM: Perception Reasoning Interleaved for Sequential Decision Making

arXiv cs.AI · yesterday

This paper introduces PRISM, a framework that integrates Vision-Language Models and Large Language Models through a dynamic question-answering pipeline to improve sequential decision-making in embodied AI tasks.


When Helpfulness Becomes Sycophancy: Sycophancy is a Boundary Failure Between Social Alignment and Epistemic Integrity in Large Language Models

arXiv cs.AI · yesterday

This position paper analyzes sycophancy in LLMs as a boundary failure between social alignment and epistemic integrity, proposing a new framework and taxonomy to classify and mitigate these behaviors.


Feature Starvation as Geometric Instability in Sparse Autoencoders

arXiv cs.LG · yesterday

This paper identifies feature starvation in sparse autoencoders as a geometric instability and proposes adaptive elastic net SAEs (AEN-SAEs) to mitigate it without heuristics.
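
For reference, a compact sketch of a sparse-autoencoder objective with an elastic-net (L1 plus L2) activation penalty, the ingredient the title names; the adaptive coefficient schedule that defines AEN-SAEs is not reproduced here:

```python
import torch
import torch.nn as nn

class ElasticNetSAE(nn.Module):
    """SAE with an elastic net (L1 + L2) penalty on hidden activations."""
    def __init__(self, d_model=512, d_hidden=2048):
        super().__init__()
        self.enc = nn.Linear(d_model, d_hidden)
        self.dec = nn.Linear(d_hidden, d_model)

    def forward(self, x, l1=1e-3, l2=1e-4):
        z = torch.relu(self.enc(x))
        recon = self.dec(z)
        loss = ((recon - x) ** 2).mean() \
             + l1 * z.abs().mean() + l2 * (z ** 2).mean()
        return loss, z

sae = ElasticNetSAE()
loss, z = sae(torch.randn(32, 512))
print(loss.item(), (z > 0).float().mean().item())  # loss, activation density
```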


Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits

arXiv cs.CL · yesterday

This paper presents PCNet, a probabilistic circuit trained as a tractable density estimator on LLM residual streams to detect hallucinations as geometric anomalies. It also introduces PC-LDCD, a dynamic correction method that only intervenes on hallucinated tokens, achieving near-perfect detection and reduced corruption rates.
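
The detection idea, flagging tokens whose hidden states fall in low-density regions, can be sketched without the probabilistic-circuit machinery; a diagonal Gaussian stands in for PCNet below, and the data is synthetic:

```python
import numpy as np

# Fit a density model on residual-stream states from "trusted" tokens,
# then flag new states with low log-density as geometric anomalies.
rng = np.random.default_rng(0)
trusted = rng.normal(0, 1, size=(5000, 64))  # stand-in hidden states
mu, var = trusted.mean(0), trusted.var(0) + 1e-6

def log_density(h):
    return -0.5 * (((h - mu) ** 2) / var + np.log(2 * np.pi * var)).sum(-1)

threshold = np.percentile(log_density(trusted), 1)  # bottom 1% of trusted

normal_state = rng.normal(0, 1, size=64)
odd_state = rng.normal(3, 1, size=64)           # geometric outlier
print(log_density(normal_state) > threshold)    # typically True: in-distribution
print(log_density(odd_state) > threshold)       # False: flag as hallucination
```

The dynamic-correction part (PC-LDCD) would then intervene only on the flagged tokens, leaving the rest of the generation untouched.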


Attribution-Guided Continual Learning for Large Language Models

arXiv cs.LG · yesterday

This paper proposes an attribution-guided continual fine-tuning framework for large language models that estimates task-specific parameter importance in Transformer layers and modulates gradients accordingly, mitigating catastrophic forgetting while maintaining performance on new tasks.
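
A bare-bones sketch of importance-modulated gradient updates, the mechanism the summary describes; the attribution method that would produce the importance scores is the paper's contribution and is faked here with random values:

```python
import torch

def modulate_gradients(model, importance, strength=1.0):
    """Scale down gradients on parameters deemed important for old tasks,
    so new-task updates avoid overwriting them (importance in [0, 1])."""
    for name, p in model.named_parameters():
        if p.grad is not None and name in importance:
            p.grad.mul_(1.0 - strength * importance[name])

# Toy usage: one linear layer with a made-up importance map.
model = torch.nn.Linear(8, 2)
importance = {"weight": torch.rand(2, 8), "bias": torch.zeros(2)}
model(torch.randn(4, 8)).sum().backward()
modulate_gradients(model, importance)  # call between backward() and step()
```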


Estimating the Black-box LLM Uncertainty with Distribution-Aligned Adversarial Distillation

arXiv cs.CL · yesterday

This paper proposes Distribution-Aligned Adversarial Distillation (DisAAD), a method that uses a lightweight proxy model, only about 1% of the original model's size, to estimate uncertainty in black-box LLMs, achieving reliable quantification without access to internal parameters or repeated sampling.
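
The proxy idea reduces to distillation plus an entropy readout, sketched below; the adversarial distribution-alignment step that distinguishes DisAAD is omitted, and the teacher here is synthetic:

```python
import torch
import torch.nn.functional as F

# Train a small proxy to match black-box output distributions, then read
# uncertainty off the proxy's predictive entropy. Plain distillation only.
proxy = torch.nn.Linear(16, 4)  # stand-in for a tiny LM head
opt = torch.optim.Adam(proxy.parameters(), lr=1e-2)

for _ in range(200):
    x = torch.randn(32, 16)
    teacher = F.softmax(x[:, :4] * 2, dim=-1)  # stand-in black-box outputs
    loss = F.kl_div(F.log_softmax(proxy(x), -1), teacher,
                    reduction="batchmean")
    opt.zero_grad(); loss.backward(); opt.step()

def uncertainty(x):
    p = F.softmax(proxy(x), dim=-1)
    return -(p * p.log()).sum(-1)  # predictive entropy as uncertainty

print(uncertainty(torch.randn(2, 16)))
```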


Internalizing Outcome Supervision into Process Supervision: A New Paradigm for Reinforcement Learning for Reasoning

arXiv cs.LG · yesterday

Introduces IOP, a framework that internalizes outcome supervision into process supervision for reinforcement learning on reasoning tasks, enabling fine-grained credit assignment without external annotations.


BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models

arXiv cs.CL · yesterday

BioTool introduces a comprehensive biomedical tool-calling dataset with 34 tools and 7,040 human-verified query-API pairs, enabling fine-tuned LLMs to outperform GPT-5.1 on biomedical tool use and significantly enhance answer quality.
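
For a sense of what a query-API training pair in such a dataset might look like, here is a hypothetical example in a common function-calling layout; the tool name, schema, and query are invented, not BioTool's actual schema:

```python
import json

# Hypothetical training example pairing a biomedical query with a tool
# call, in the common chat/function-calling layout.
example = {
    "query": "What protein does the gene TP53 encode?",
    "tools": [{
        "name": "gene_lookup",  # invented tool name
        "description": "Look up a gene record by symbol.",
        "parameters": {"type": "object",
                       "properties": {"symbol": {"type": "string"}},
                       "required": ["symbol"]},
    }],
    "label": {"tool": "gene_lookup", "arguments": {"symbol": "TP53"}},
}
print(json.dumps(example, indent=2))
```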


Decomposing the Basic Abilities of Large Language Models: Mitigating Cross-Task Interference in Multi-Task Instruct-Tuning

arXiv cs.CL · yesterday

This paper proposes Badit, a method that decomposes large language model parameters into orthogonal high-singular-value LoRA experts to mitigate cross-task interference during multi-task instruction tuning.
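
A sketch of the decomposition step the title suggests: slice a weight matrix's leading singular directions into disjoint low-rank experts, which are orthogonal by construction; the expert routing and training recipe from the paper are omitted:

```python
import torch

def svd_lora_experts(W, n_experts=4, rank=8):
    """Split W's leading singular directions into disjoint (B, A) pairs.
    Distinct singular vectors are orthogonal, so expert subspaces do not
    overlap, which is the stated defense against cross-task interference."""
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    experts = []
    for e in range(n_experts):
        lo, hi = e * rank, (e + 1) * rank
        B = U[:, lo:hi] * S[lo:hi]  # (out, rank), scaled by singular values
        A = Vh[lo:hi, :]            # (rank, in)
        experts.append((B, A))
    return experts

W = torch.randn(256, 512)
experts = svd_lora_experts(W)
approx = sum(B @ A for B, A in experts)  # rank-32 partial reconstruction
print(torch.linalg.matrix_rank(approx).item())
```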


@GoSailGlobal: Nathan Lambert visited all of China's top AI labs — Moonshot, Zhipu, Meituan, Xiaomi, Qwen / Ant Ling, http://01.AI — and wrote a piece titled Notes from inside Chi…

X AI KOLs Timeline · yesterday

Nathan Lambert shares observations from visiting top Chinese AI labs, highlighting cultural differences in research focus and ego compared to US counterparts, while noting parity in hardware and model capabilities.
