local-inference

Diffusion Model that can turn any Image into a Playable Hallucination! BUT LOCALLY, NOT ON DATACENTER

Reddit r/ArtificialInteligence · 20h ago

A diffusion model that can transform any image into an interactive, playable hallucination, running locally on user hardware.

Unsloth GLM-5.2 – How to Run Locally

Hacker News Top · yesterday

A guide to running Z.ai's open model GLM-5.2 locally using Unsloth Dynamic GGUFs. The model has 744B total parameters (40B active) and a 1M context window. The 2-bit quantized version reduces the memory footprint to 239GB, enabling local inference on 256GB Macs.

Local LLM Inference Optimization: The Complete Guide

Reddit r/LocalLLaMA · 2d ago

A comprehensive guide to optimizing local LLM inference on consumer hardware, covering tools like llama.cpp, vLLM, and LM Studio, with practical advice on memory hierarchy, layer placement, and common failure modes.

@QuixiAI: https://x.com/QuixiAI/status/2068776183102067086

X AI KOLs Following · 2d ago

DwarfStar is a self-contained native inference engine optimized for DeepSeek V4 Flash and PRO models, supporting Metal, CUDA, and ROCm backends, with a focus on high-end personal machines and Mac Studios.

@antirez: First kinda working implementation of GLM 5.2 in DwarfStar. Will take some time to be good enough, but it is a promisin…

X AI KOLs Following · 2d ago

Antirez reports the first working implementation of GLM 5.2 in DwarfStar, using a 433 GB GGUF file on an M3 Ultra with 512GB RAM, though it needs further refinement.

GLM 5.2: 98% of max level intelligence with less than half of tokens usage

Reddit r/LocalLLaMA · 4d ago

GLM 5.2 offers improved token efficiency, allowing users to achieve 98% of max-level intelligence using less than half the tokens. The model's 'high' effort level provides a practical alternative for day-to-day use compared to the resource-intensive 'max' level.

GLM-5.2 can now run locally in llama.cpp and Unsloth Studio.

Reddit r/LocalLLaMA · 5d ago

GLM-5.2 is now supported for local execution via llama.cpp and Unsloth Studio.

@10xmylife: Unsloth successfully deployed a 2-bit version of GLM-5.2 on a 256GB Mac

X AI KOLs Following · 5d ago

Unsloth compressed the GLM-5.2 model to 238GB using 2-bit quantization, allowing it to run locally on a 256GB Mac while retaining about 82% of its accuracy.

Giving GLM-5.2 a spin locally on CPU only! (poor man's rig for big models)

Reddit r/LocalLLaMA · 5d ago

A user runs GLM-5.2 locally on CPU only, demonstrating how to run a large model on a modest setup.

@MaximeRivest: glm 5.2 is good (enough) and this is important. glm 5.2 is good enough to change information technology in very fundame…

X AI KOLs Following · 6d ago

GLM 5.2 is an open-weights LLM capable enough to let businesses handle their IT needs locally on affordable hardware, potentially transforming data management for small and medium-sized enterprises.
