I turned an Android phone into a Vulkan-accelerated local LLM node (GGUF + LiteLLM + Tailscale)

Reddit r/LocalLLaMA 06/03/26, 11:15 PM Tools

android vulkan llm gguf litellm tailscale self-hosted inference

Summary

An Android phone is repurposed as a portable GGUF inference server with Vulkan acceleration, exposing an OpenAI-compatible endpoint via LiteLLM and Tailscale mesh for integration into a self-hosted AI cluster.

Hey everyone, I’ve been working on something that finally reached a stable enough point to share.

I’ve been experimenting with using an Android device as a local inference node inside a self-hosted AI mesh. The goal wasn’t “run a chatbot on Android,” but to make the phone behave like a portable GGUF inference server that plugs into an existing cluster.

## What it currently does

- Loads GGUF models locally on-device
- Uses Vulkan for mobile GPU acceleration
- Exposes an OpenAI-compatible endpoint on the mesh
- Routes through LiteLLM like any other backend
- Joins the cluster through Tailscale
- Supports fallback routing to larger local nodes
- Can run standalone when the rest of the mesh is unavailable

## Architecture

```text
[Android Pocket Node / Z Fold 6]
  GGUF + Vulkan (gpu_layers=89)
  llama.cpp JNI/NDK bridge
  OpenAI-compatible local endpoint
        ↓
[Tailscale Mesh]
        ↓
[Edge Gate on neo-x510uar]
  request pre-flight
  battery / thermal / prompt-size routing
        ↓
[LiteLLM Router on neo-x510uar]
  OpenAI-compatible gateway
  model aliases
  fallback routing
        ↓
[Fallback Nodes]
  sheens-mac-studio: heavier reasoning / judge models
  moolah: RTX box for GPU-heavy workloads
```
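
The Edge Gate step in the architecture (request pre-flight with battery / thermal / prompt-size routing) can be illustrated with a small sketch. This is not the gate code from the post, which is not shown. It is a minimal Python example of how such a check might pick a model alias before handing the request to the LiteLLM router. The thresholds, the alias names `pocket-node` and `fallback-node`, the `NodeStatus` fields, and the 4-characters-per-token estimate are all assumptions made for illustration.

```python
from dataclasses import dataclass

# All thresholds and alias names are illustrative assumptions,
# not values taken from the original post.
MIN_BATTERY_PCT = 20       # below this (and unplugged), avoid the phone
MAX_TEMP_C = 42.0          # at or above this, treat the phone as too hot
MAX_PROMPT_TOKENS = 2048   # larger prompts go to a bigger node

POCKET_ALIAS = "pocket-node"      # hypothetical alias for the Android node
FALLBACK_ALIAS = "fallback-node"  # hypothetical alias for a larger node


@dataclass
class NodeStatus:
    """Snapshot of the phone's state, as reported by some status probe."""
    battery_pct: int
    temp_c: float
    charging: bool = False


def estimate_tokens(prompt: str) -> int:
    """Very rough estimate: about 4 characters per token."""
    return max(1, len(prompt) // 4)


def choose_alias(status: NodeStatus, prompt: str) -> str:
    """Return the model alias a request should be routed to."""
    if status.temp_c >= MAX_TEMP_C:
        return FALLBACK_ALIAS
    if status.battery_pct < MIN_BATTERY_PCT and not status.charging:
        return FALLBACK_ALIAS
    if estimate_tokens(prompt) > MAX_PROMPT_TOKENS:
        return FALLBACK_ALIAS
    return POCKET_ALIAS


if __name__ == "__main__":
    status = NodeStatus(battery_pct=64, temp_c=36.5)
    print(choose_alias(status, "Summarize this paragraph in one sentence."))
```

A gate along these lines would run before the router, so the router only sees an alias and needs no knowledge of device state. In a real setup the status values would presumably come from the phone itself over the mesh.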
