edge-ai

#edge-ai

@zhixianio: After receiving the new machine, I began an 'ascetic' practice of forcing myself to use local models for common tasks. I thought it would be painful, but both speed and quality greatly exceeded my expectations: Model: Qwen3.6-35B-A3B-oQ6-fp16-mtp, Running: oMLX, with N…

X AI KOLs Timeline ↗ · 2026-06-03 Cached

The author runs the Qwen3.6-35B-A3B model with the oMLX tool on a new local machine for daily tasks and finds that both speed and quality far exceed expectations, even outperforming remote LLMs in PA and coding scenarios, which points to a significant improvement in on-device AI capabilities.

#edge-ai

Tiny LLM Benchmark: Jetson Orin Nano Super 8GB - Four Power Modes × Eight Models

Reddit r/LocalLLaMA ↗ · 2026-06-02

A deep benchmark of 8 tiny LLMs (135M to 1B parameters) on a $250 Jetson Orin Nano Super across four power modes finds 25W to be Pareto-optimal, with SmolLM2-135M achieving 165.1 tok/s and best efficiency.
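For readers unfamiliar with the term, "Pareto-optimal" here means that no other power mode is both cheaper in watts and faster in tokens per second. A minimal sketch of that check follows; the mode names and numbers are hypothetical placeholders chosen for illustration, not figures from the benchmark.

```python
def pareto_front(points):
    """Return the names of points that no other point dominates.

    Each point is (name, watts, tok_per_s). Lower watts and higher
    tok_per_s are better. A point is dominated if another point is at
    least as good on both axes and strictly better on one.
    """
    front = []
    for name, watts, tok_s in points:
        dominated = any(
            w <= watts and t >= tok_s and (w < watts or t > tok_s)
            for _, w, t in points
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical placeholder measurements, NOT taken from the benchmark.
example_modes = [
    ("mode_a", 7, 40.0),
    ("mode_b", 15, 90.0),
    ("mode_c", 25, 160.0),
    ("mode_d", 40, 150.0),  # more power, fewer tokens: dominated by mode_c
]
front = pareto_front(example_modes)
```

With these placeholder values, `mode_d` drops out because `mode_c` draws less power and is faster, while the other three each represent a distinct trade-off.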

#edge-ai

NVIDIA Jetson Brings Agentic AI to the Physical World

NVIDIA Blog ↗ · 2026-06-02 Cached

NVIDIA announces JetPack 7.2 and NemoClaw support on Jetson, bringing agentic AI capabilities to edge devices like robotics and industrial automation, with performance boosts and new developer tools.

#edge-ai

NVIDIA Factory Operations Blueprint Gives Factories a New AI Brain

NVIDIA Blog ↗ · 2026-06-01 Cached

NVIDIA announced the Factory Operations Blueprint (FOX), a reference design for building autonomous factory manager AI agents that integrate real-time data, automate model training, and orchestrate specialized agents, with early adoption by major manufacturers.

#edge-ai

@abidlabs: Remarkable for an 8B model! Check out the @Gradio app here: https://huggingface.co/spaces/LiquidAI/LFM2.5-8B-A1B…

X AI KOLs Following ↗ · 2026-05-28 Cached

Liquid AI releases LFM2.5-8B-A1B, an 8B MoE model with 1.5B active parameters and 128K context, optimized for edge devices.

#edge-ai

A Tiny Open-Source Self-Driving AI That Runs on a Phone [P]

Reddit r/MachineLearning ↗ · 2026-05-27

A 7MB open-source L4 self-driving AI is trained to learn navigation, lane following, and drift recovery from visual and sensor input, designed to run on phones and embedded devices without heavy infrastructure.

#edge-ai

@Soranlan: https://x.com/starmexxx/status/2058933808406130855/video/1… Jensen Huang sells a $249 AI computer on stage that can replace your $200 monthly OpenAI bill. The video has 217,000 likes. This box…

X AI KOLs Timeline ↗ · 2026-05-25 Cached

NVIDIA has launched the $249 Jetson Orin Nano Super developer kit, an AI computer that runs large models like Llama 3 and Mistral locally, cutting monthly OpenAI costs from $200 to just $22 in electricity.

#edge-ai

DCGAN inference on a microcontroller: 12.6M parameters, 512KB SRAM, 26-second generation, pure C [P]

Reddit r/MachineLearning ↗ · 2026-05-25

Demonstrates running a DCGAN with 12.6M int8 quantized parameters on a low-cost RISC-V microcontroller (CH32H417), generating 64x64 cat faces in 26 seconds using pure C inference and quantum entropy sampling.

#edge-ai

@heyshrutimishra: Full-sized AI models now run on phones. That's BitCPM, a new open-source model from ModelBest, Tsinghua, and OpenBMB. T…

X AI KOLs Following ↗ · 2026-05-25 Cached

BitCPM is a new open-source model from ModelBest, Tsinghua, and OpenBMB that uses ternary weights (-1,0,1) to run full-sized AI models on phones.

#edge-ai

Wrote a custom C++ engine for MiniCPM-V 4.6 on Orange Pi AIPro (Ascend 310B) to bypass framework overhead

Reddit r/LocalLLaMA ↗ · 2026-05-25

The author developed a custom C++ inference engine for MiniCPM-V 4.6 on the Orange Pi AIPro (Ascend 310B NPU), achieving a 2x speedup over the stock framework by writing optimized AscendC kernels for matmul and causal-conv1d and reaching 5.90 tokens/s.

#edge-ai

@rohanpaul_ai: BitCPM-CANN just became the world’s first open-sourced 1.58-bit ternary LLM trained entirely on Chinese-developed AI in…

X AI KOLs Following ↗ · 2026-05-22 Cached

BitCPM-CANN is the first open-source 1.58-bit ternary LLM trained entirely on Chinese-developed AI infrastructure (Huawei Ascend 910B), offering extreme memory reduction for edge deployment.
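The "1.58-bit" label follows from the three possible weight values: a ternary weight carries log2(3) ≈ 1.585 bits of information. The sketch below shows one common way to ternarize weights, the absmean scheme used by BitNet b1.58. Whether BitCPM uses this exact scheme is an assumption, since the summaries here do not say, and the function names are illustrative only.

```python
import math

# Three states per weight carry log2(3) bits of information.
BITS_PER_TERNARY_WEIGHT = math.log2(3)


def ternarize(weights):
    """Quantize a non-empty list of floats to {-1, 0, 1} plus one scale.

    Assumed scheme (BitNet b1.58 style): divide by the mean absolute
    value, round to the nearest integer, then clip to [-1, 1].
    """
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    quantized = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Approximate the original weights from ternary values and the scale."""
    return [q * scale for q in quantized]


quantized, scale = ternarize([0.9, -0.05, -1.2, 0.4])
approx = dequantize(quantized, scale)
```

Each ternary value can be packed into well under 2 bits, compared with 16 bits per FP16 weight, which is where the memory reduction for edge deployment comes from.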

#edge-ai

@ycombinator: General Instinct (@gen_instinct) deploys frontier AI models onto constrained edge hardware, helping robotics and physic…

X AI KOLs Following ↗ · 2026-05-19

General Instinct launches a deployment layer that enables frontier AI models to run on constrained edge hardware like Jetsons and mobile NPUs, helping robotics and physical AI teams achieve low-latency offline inference.

#edge-ai

@paulabartabajo_: The next AI boom won't be bigger data centers. It'll be compact intelligence running on the edge. You (and the planet) …

X AI KOLs Timeline ↗ · 2026-05-19 Cached

A tweet argues the next AI boom will be compact intelligence on edge devices rather than larger data centers, with Liquid AI supporting the vision of running AI on phones, cars, and everyday devices.

#edge-ai

How do you handle firmware updates for AI models on devices deployed in places with no reliable connectivity, do you wait for a technician visit or accept the model staying stale?

Reddit r/AI_Agents ↗ · 2026-05-19

A detailed examination of the real-world challenges faced when updating AI models on edge devices deployed in remote or disconnected environments, covering strategies like connectivity windows, technician visits, mesh propagation, and accepting staleness.

#edge-ai

Edge-AI-Driven Learning-to-Rank for Decentralized Task Allocation in Circular Smart Manufacturing

arXiv cs.LG ↗ · 2026-05-19 Cached

This paper proposes an Edge-AI-driven decentralized task allocation framework for circular smart manufacturing that uses learning-to-rank to align with the ordering-based nature of winner selection. Simulation results show improved delay, deadline adherence, and energy efficiency under high-load and tight-deadline scenarios.

#edge-ai

Sustainable Intelligence for the Wild: Democratizing Ecological Monitoring via Knowledge-Adaptive Edge Expert Agents

arXiv cs.AI ↗ · 2026-05-19 Cached

This paper proposes a knowledge-adaptive edge expert agent architecture for ecological monitoring, separating visual perception from reasoning to reduce reliance on cloud resources and enable sustainable on-device AI in remote deployments.

#edge-ai

@PyTorch: Big congrats to the ExecuTorch team. Their paper just won the Best Industry Paper Award at @MLSysConf 2026. ExecuTorch …

X AI KOLs Following ↗ · 2026-05-15 Cached

ExecuTorch, PyTorch's on-device AI deployment framework, won the Best Industry Paper Award at MLSys 2026. The paper introduces a unified solution for running models on diverse hardware, from microcontrollers to SoCs.

#edge-ai

Gemma 4 + LiteRT-LM on mobile: much better memory/perf than my llama.cpp setup

Reddit r/LocalLLaMA ↗ · 2026-05-15

A user shares a hands-on comparison of running Gemma 4 with LiteRT-LM on mobile devices versus their previous llama.cpp setup, noting significantly better memory usage (1.5-2 GB vs 4-5 GB) and faster inference (2-4 seconds vs 7-10 seconds) on smartphones like Samsung S25 Ultra and iPhone 13 Pro Max.

#edge-ai

Sipeed's K3 RISC-V SBCs can run 30B-parameter LLMs 60 TOPS (INT4), Supports BF16/FP16/INT4

Reddit r/LocalLLaMA ↗ · 2026-05-13

Sipeed's new K3 RISC-V single-board computers feature 32GB LPDDR5 and a 60 TOPS NPU, enabling local inference of large language models at up to 15 tokens per second.

#edge-ai

@_lewtun: You can now have an AI researcher running on your laptop 24/7 for free! Running Qwen3-35B-A3B with llama.cpp and a 4-bi…

X AI KOLs Timeline ↗ · 2026-05-13 Cached

The tweet highlights that Qwen3-35B-A3B can run locally on a laptop for free using llama.cpp and a 4-bit Unsloth quantization.
