llama

#llama

AI can finally pass the Turing Test better than a human, study warns

Reddit r/ArtificialInteligence · 2026-05-20

A new study published in PNAS shows that advanced LLMs like GPT-4.5 can pass the Turing Test, with participants judging them to be human more often than they judged actual humans, prompting a reevaluation of what the test measures.

@dair_ai: NEW paper from Meta: Agentic Discovery of Neural Architectures. This is a hot new area of research! Keep an eye on it.

X AI KOLs Following · 2026-05-18

Meta's new paper presents an agentic system that autonomously discovers neural architectures outperforming Llama 3.2 at 350M, 1B, and 3B scales within a 24-hour compute budget.

MTP PR Merged!!!

Reddit r/LocalLLaMA · 2026-05-16

A pull request for MTP (most likely multi-token prediction) has been merged, which the post celebrates as a milestone.

I Let a Small Model Train on Its Own Mistakes. It Reached 80% on HumanEval and Beat GPT-3.5 on Math

Reddit r/LocalLLaMA · 2026-05-14

A researcher trained a small language model on its own coding mistakes and the corresponding corrections, reaching 80% on HumanEval and surpassing GPT-3.5 on math, which suggests that self-improvement can work with minimal resources.

Introducing cyankiwi AWQ 4-bit Quantization — 26.05 update

Reddit r/LocalLLaMA · 2026-05-14

Cyankiwi introduced an updated version of their AWQ 4-bit quantization method that jointly optimizes scales and quantization ranges, achieving lower KL divergence than existing methods on Llama-3 models.

MLX 16/8/4/2-bit quants of nvidia/llama-embed-nemotron-8b

Reddit r/LocalLLaMA · 2026-05-14

A user converted Nvidia's Llama-Embed-Nemotron-8B model to MLX format in fp16, 8-bit, 4-bit, and 2-bit variants, so the embedding model can be loaded in-process on Apple Silicon via mlx-embeddings.

Mark Zuckerberg ‘Personally Authorized and Actively Encouraged’ Meta’s Massive Copyright Infringement to Train AI Systems, Publishers and Scott Turow Allege in Lawsuit

Reddit r/singularity · 2026-05-11

Book publishers and author Scott Turow have filed a class-action lawsuit against Meta and CEO Mark Zuckerberg, alleging the company illegally copied millions of copyrighted works to train its Llama AI models, circumventing licensing and copyright protections.

The Granularity Axis: A Micro-to-Macro Latent Direction for Social Roles in Language Models

Hugging Face Daily Papers · 2026-05-07

This paper investigates how large language models encode social role granularity as a structured latent dimension. It shows that this "Granularity Axis" is consistent across architectures such as Qwen3 and Llama-3 and can be causally manipulated via activation steering.

UniPool: A Globally Shared Expert Pool for Mixture-of-Experts

Hugging Face Daily Papers · 2026-05-07

UniPool introduces a shared expert pool architecture for Mixture-of-Experts models, reducing parameter growth with depth while improving efficiency and performance over standard MoE baselines.

When are we getting consumer inference chips?

Reddit r/LocalLLaMA · 2026-04-23

The post asks why no startup has shipped a $200-300 consumer inference chip with Llama 3 baked in, and suggests the industry prefers API subscription revenue over one-time hardware sales.

CBRS: Cognitive Blood Request System with Bilingual Dataset and Dual-Layer Filtering for Multi-Platform Social Streams

arXiv cs.CL · 2026-04-21

Researchers from Bangladesh University of Engineering and Technology present CBRS, a multi-platform framework that filters and parses blood donation requests from social media using a dual-layer architecture and a novel 11K bilingual dataset in Bengali and English. Their LoRA fine-tuned Llama-3.2-3B model achieves 99% filtering accuracy and 92% zero-shot parsing accuracy, outperforming GPT-4o-mini and other LLMs while using 35× fewer tokens.

QWEN3.6 + ik_llama is fast af

Reddit r/LocalLLaMA · 2026-04-19

A user reports running Qwen 3.6 with ik_llama.cpp (a llama.cpp fork) at 50+ tokens/second on consumer hardware (16 GB VRAM, 32 GB RAM) with a 200k context window.
