How would you set up a local LLM server for a business of 7 people?

Reddit r/LocalLLaMA Tools

Summary

A user asks for advice on setting up a local LLM server for a 7-person business, considering models like Gemma 4 and Qwen 3.6, hardware options like a 5090 or MacBook Pro, and scaling with concurrent users.

Okay, so I've been lurking in this sub for some time, and I run the occasional small 2-8B model on my laptop (not the best) for fun. But say my role at a company is to set up a local LLM server, since we obviously don't want confidential data going to other companies. The main use cases would be queries, RAG, and general use, nothing crazy except for maybe 1 or 2 people using it for programming. I was thinking of Gemma 4 26/31 or Qwen 3.6 27/35. How do these models scale with concurrent users? I know I could run one of these on a 5090 plus some extra, or a 48 GB MacBook Pro with unified memory, but I'm not sure how those scale with multiple users.
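One way to reason about the concurrency question: with a continuous-batching server (vLLM, llama.cpp's server, and similar), each active user mostly costs extra KV cache, not a second copy of the weights. Here's a back-of-the-envelope sketch in Python; the layer/head counts and VRAM figures below are illustrative assumptions for a 27B-class grouped-query-attention model, not the real Gemma or Qwen configs:

```python
def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       dtype_bytes: int = 2) -> int:
    """Bytes of KV cache one token occupies: K and V, across all layers."""
    return 2 * layers * kv_heads * head_dim * dtype_bytes

# Assumed dims for a hypothetical 27B-class GQA model, fp16 KV cache.
per_token = kv_bytes_per_token(layers=62, kv_heads=16, head_dim=128)

context = 8192  # tokens of context per user
per_user_gib = per_token * context / 2**30

vram_gib = 32       # e.g. a 32 GB GPU
weights_gib = 16    # rough estimate: ~27B weights at 4-bit quantization
overhead_gib = 2    # activations, CUDA context, fragmentation
concurrent = int((vram_gib - weights_gib - overhead_gib) // per_user_gib)

print(f"KV cache per token: {per_token / 1024:.0f} KiB")
print(f"Per 8K-context user: {per_user_gib:.3f} GiB")
print(f"Full-context users that fit: {concurrent}")
```

The exact numbers shift a lot with quantized KV cache, smaller contexts, or paged attention, but the shape of the problem holds: weights are paid once, KV cache per concurrent request. So a batching server on one big GPU can serve 7 light users as long as contexts stay modest, and throughput per user degrades gracefully rather than falling off a cliff.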