How would you set up a local LLM server for a business of 7 people?

Reddit r/LocalLLaMA Tools

Summary

A user asks for advice on setting up a local LLM server for a 7-person business, considering models like Gemma 4 and Qwen 3.6, hardware options like a 5090 or MacBook Pro, and scaling with concurrent users.

Okay, so I've been lurking in this sub for some time, and I run the occasional small 2-8B model on my laptop (not the best) for fun. But say my role at a company is to set up a local LLM server, since we obviously don't want confidential data going to other companies. The main use cases would be queries, RAG, and general use, nothing crazy except for maybe 1 or 2 people using it for programming. I was thinking of Gemma 4 26/31 or Qwen 3.6 27/35. How do these models scale with concurrent users? I know I could run one of these on a 5090 plus some extra, or a 48 GB MacBook Pro with unified memory, but I'm not sure how those scale with multiple users.
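One way to reason about the concurrency question: with a continuous-batching server (vLLM, llama.cpp's server, and similar), each active user mostly costs extra KV cache, not a second copy of the weights. Here's a back-of-the-envelope sketch in Python; the layer/head counts and VRAM figures below are illustrative assumptions for a 27B-class grouped-query-attention model, not the real Gemma or Qwen configs:

```python
def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       dtype_bytes: int = 2) -> int:
    """Bytes of KV cache one token occupies: K and V, across all layers."""
    return 2 * layers * kv_heads * head_dim * dtype_bytes

# Assumed dims for a hypothetical 27B-class GQA model, fp16 KV cache.
per_token = kv_bytes_per_token(layers=62, kv_heads=16, head_dim=128)

context = 8192  # tokens of context per user
per_user_gib = per_token * context / 2**30

vram_gib = 32       # e.g. a 32 GB GPU
weights_gib = 16    # rough estimate: ~27B weights at 4-bit quantization
overhead_gib = 2    # activations, CUDA context, fragmentation
concurrent = int((vram_gib - weights_gib - overhead_gib) // per_user_gib)

print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"Per 8K-context user: {per_user_gib:.3f} GiB")
print(f"Full-context users that fit: {concurrent}")
```

The exact numbers shift a lot with quantized KV cache, smaller contexts, or paged attention, but the shape of the problem holds: weights are paid once, KV cache per concurrent request. So a batching server on one big GPU can serve 7 light users as long as contexts stay modest, and throughput per user degrades gracefully rather than falling off a cliff.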