llama

#llama

@jerryjliu0: Fully solving document parsing includes covering every point on the Pareto curve of accuracy, cost, and latency: High-a…

X AI KOLs Timeline · 5h ago Cached

Jerry Liu frames document parsing as a tradeoff among accuracy, cost, and latency, and introduces LiteParse, an open-source, low-latency parser for AI agent loops, alongside LlamaParse for high-accuracy modes.


Meta was secretly running on Google's Gemini the whole time and then got cut off for using too much

Reddit r/artificial · yesterday

Meta was secretly using Google's Gemini for customer service, ad tools, and content moderation because it outperformed their own Llama models, until Google cut off access due to excessive capacity usage.


A Tree-of-Thoughts Inspired Hybrid Approach for Legal Case Judgement Summarization using LLMs

arXiv cs.CL · yesterday Cached

Proposes a tree-of-thoughts-inspired extractive-abstractive approach for legal case judgement summarization using LLMs, with experiments on DeepSeek and Llama showing improved summaries over extractive or abstractive methods alone.


From Black-Box to Clinical Insight: A Multi-Stage Explainable Framework for Speech-Based Cognitive Impairment Detection

arXiv cs.CL · yesterday Cached

This paper presents a multi-stage explainable framework that combines SHAP-based token attribution, theory-informed linguistic features, and LLaMA-3.1-70B-Instruct LLM reasoning to interpret transformer-based speech models for cognitive impairment detection, achieving strong clinical alignment and high usability scores.


Political bias in AI: Where the AI models stand

Hacker News Top · 5d ago Cached

An analysis of political leanings in six major AI models, showing that 4 out of 6 lean left of center on the economic axis, with some models being unaware of their own bias.


Llama bench and real performance wayy different(Help)

Reddit r/LocalLLaMA · 2026-06-18

Discussion about the significant gap between Llama model benchmark scores and actual real-world performance, with the author seeking assistance.


@Akashi203: i open-sourced automegakernel -- compiles any huggingface model into a single persistent megakernel batch-1 decode is b…

X AI KOLs Timeline · 2026-06-17 Cached

AutoMegaKernel is an open-source agent harness that compiles any HuggingFace model into a single persistent megakernel, fusing the entire forward pass into one GPU launch to reduce overhead. It achieves up to 1.33x speedup over CUDA-graphed cuBLAS on inference-class GPUs like L4 and L40S, while proving schedules deadlock- and race-free.


Frame-Conditioned Moral Computation in LLaMA 3.1-8B-Instruct: A Mechanistic Interpretability Audit of Ethical Reasoning

arXiv cs.AI · 2026-06-16 Cached

This paper uses mechanistic interpretability to audit ethical reasoning in LLaMA 3.1-8B-Instruct, finding a 'Situational Anchor Effect' where domain-specific representations dominate moral computation, and proposing 'Mechanistic Alignment' as a research program.


@rewind02: A Stanford professor just gave a public lecture on exactly how GPT, Claude, and LLaMA are built under the hood no insid…

X AI KOLs Timeline · 2026-06-14 Cached

A Stanford professor delivered a public lecture providing a comprehensive breakdown of how modern LLMs like GPT, Claude, and LLaMA are built under the hood, making advanced architecture accessible to the public.


Open sourcing InfiniteKV: a KV cache that files old tokens as 104-byte searchable records in RAM or on disk instead of deleting them. Mistral-7B answered from token 76,747, 2.3x past its trained window. Colab demo

Reddit r/LocalLLaMA · 2026-06-12

InfiniteKV is an open-source KV cache technique that compresses old tokens into 104-byte searchable records stored in RAM or on disk, enabling models to handle million-token contexts beyond their trained window without discarding data. Verified working with Mistral-7B and SmolLM2.


The Order Matters: Sequential Fine-Tuning of LLaMA for Coherent Automated Essay Scoring

arXiv cs.CL · 2026-06-10 Cached

This paper investigates sequential fine-tuning of LLaMA-3.1-8B for automated essay scoring using a curriculum aligned with discourse structure, showing improved coherence and performance compared to independent or randomized training.


Meta Abandons Llama for Muse Spark — The End of Open-Source AI's Biggest Champion

Reddit r/AI_Agents · 2026-06-08

Meta has abandoned its open-weight Llama model family in favor of a fully proprietary model called Muse Spark, developed by Alexandr Wang's team, marking the end of Meta's role as a champion of open-source AI.


ImmigrationQA: A Source-Grounded Dataset and Small-Model Adaptation for U.S. Immigration Law

arXiv cs.CL · 2026-06-01 Cached

This paper presents ImmigrationQA, a source-grounded dataset of 17,058 QA pairs for U.S. immigration law, and fine-tunes a Llama 3.2 3B model using LoRA, achieving a 27% improvement over the base model on a held-out evaluation set.


Llama Surgery: Continuous Sparsification of Pre-Trained Language Models via Differentiable Ultrametric Topology Injection

Reddit r/artificial · 2026-05-31

Llama Surgery injects learned block-sparse attention topologies into pre-trained Llama 3.1 8B without retraining from scratch, using a Dynamic Topology Router with Gumbel-Softmax routing, temperature annealing, and a Straight-Through Estimator to avoid gradient collapse, achieving stable convergence and coherent output.


World-State Transformations for Neuro-symbolic Interactive Storytelling

arXiv cs.CL · 2026-05-26 Cached

This paper explores using LLMs to predict state changes within rule-based interactive storytelling systems, aiming to improve coherence and player expression. Experiments with Llama 3 70B and Gemini 1.5 Flash show that world-state transformations can maintain consistency while encouraging creative player input.


@steeve: Progress: 26 tok/s (llama 3.1 3b) .@tenstorrent claims 33 tok/s so we’re not far off

X AI KOLs Following · 2026-05-22 Cached

Steeve Morin reports running Llama 3.1 3B on Tenstorrent hardware via ZML, achieving 26 tok/s, close to Tenstorrent's claimed 33 tok/s.


LLM de-censorer Heretic has been served a legal notice by Facebook ("Meta")

Reddit r/singularity · 2026-05-21 Cached

The Heretic LLM de-censorship project received a legal notice from Meta, leading to the removal of derivative Llama models; the project has since moved to a Codeberg mirror and plans technological measures to preserve access.


Heretic has been served a legal notice by Meta, Inc.

Reddit r/LocalLLaMA · 2026-05-21

Meta served a legal notice to the Heretic Project over derivatives of its Llama AI models, prompting the project to remove the weights and announce plans to diversify infrastructure with an official Codeberg mirror.


MisoLabs/MisoTTS

Hugging Face Models Trending · 2026-05-21 Cached

Miso Labs releases Miso TTS 8B, a text-to-speech model based on the Sesame CSM architecture with a Llama 3.2-style backbone, designed for high-quality conversational speech generation and voice continuation.


AI can finally pass the Turing Test better than a human, study warns

Reddit r/ArtificialInteligence · 2026-05-20 Cached

A new study published in PNAS shows that advanced LLMs like GPT-4.5 can pass the Turing Test, with participants finding them more human than actual humans, prompting a reevaluation of what the test measures.
