TL;DR: Factorio employs a Deterministic Lockstep architecture rather than traditional snapshot synchronization. By shifting the computational load from network bandwidth to the CPU, it achieves efficient synchronization of millions of entities. This article dissects the implementation details used to resolve issues of non-determinism and network latency, including a custom math library, standardized PRNGs, latency state buffers, and server-side "Mega Packets."
## From Snapshot Architecture to Lockstep Architecture
In traditional FPS games, servers typically use a "Snapshot Architecture," sending changes to the game state to clients many times per second via delta compression. To save bandwidth, servers calculate a Potentially Visible Set (PVS), transmitting only data within the player's field of view. This architecture aims to address the network trilemma of bandwidth, latency, and reliability.
However, *Factorio* adopts a radically different **Deterministic Lockstep Architecture**.
Under this architecture, players send only **inputs** (such as mouse clicks) to the server or other players, rather than game state. All participants in the session then simulate the game progression independently based on the same inputs. This implies:
1. **Extremely Low Bandwidth Usage**: The server does not care about the position or status of every single machine among hundreds of thousands; it only cares about player input commands.
2. **Load Shifting**: The computational burden shifts from network bandwidth and server load to the players' CPUs.
3. **Input Lag**: Since all members must collect all inputs before simulating each frame, the visual display lags behind recent actions, resulting in inherent input latency.
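The idea in the list above can be sketched in a few lines of Python. This is a toy model, not Factorio's code: the `World` class, the `(player, amount)` input format, and the `simulate` helper are all hypothetical names chosen for illustration. Two peers receive the same inputs (in a different arrival order) and still end up with identical state, because each tick applies inputs in one fixed order.

```python
from dataclasses import dataclass, field

@dataclass
class World:
    """Toy game state: a tick counter and an item count per player."""
    tick: int = 0
    items: dict = field(default_factory=dict)

    def step(self, inputs):
        """Advance one tick, applying every input in a fixed (sorted) order."""
        for player, amount in sorted(inputs):
            self.items[player] = self.items.get(player, 0) + amount
        self.tick += 1

# One list of (player, amount) inputs per tick.
# In lockstep, this log is all that has to cross the network.
input_log = [
    [("alice", 2), ("bob", 1)],
    [],
    [("bob", 5)],
]

def simulate(log):
    """Replay an input log from an empty world."""
    world = World()
    for inputs in log:
        world.step(inputs)
    return world

peer_a = simulate(input_log)
# Peer B receives the same inputs, but in a different arrival order per tick.
peer_b = simulate([list(reversed(tick_inputs)) for tick_inputs in input_log])
```

Both peers finish with the same `World`, even though no game state was ever transmitted. The cost is that each peer has to run the full simulation itself.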
## The Prerequisite for Determinism: Strict Consistency
The core premise of the lockstep architecture is **Determinism**. Given the same inputs, all participants must calculate exactly the same result for every tick (game logic frame). If discrepancies arise, the simulation will desynchronize, causing players to see entirely different game worlds.
### The Challenge of Floating-Point Arithmetic
Floating-point operations often vary across different platforms, compilers, and instruction sets, especially when involving vectorized instructions. For example:
* **RCPPS Instruction**: Computes approximate reciprocals of four single-precision floating-point numbers at once.
* **RSQRTPS Instruction**: Computes approximate reciprocal square roots of four single-precision floating-point numbers at once.
Although both Intel and AMD state that their relative error is less than $1.5 \times 2^{-12}$, these approximate implementations can still lead to minute deviations. In long-term simulations, these tiny differences accumulate and cause severe desynchronization. While AMD's average precision is usually higher, this uncertainty is fatal for a lockstep architecture.
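To get a feel for how such small errors compound, here is a rough Python sketch. It does not execute the real instructions. It models two hypothetical CPUs whose approximate reciprocal is off by a constant relative error, each within the documented bound but in opposite directions. The function names, the constant-error model, and the force and mass values are assumptions made for illustration.

```python
REL_ERROR_BOUND = 1.5 * 2.0 ** -12  # bound quoted for the approximate instructions

def make_approx_reciprocal(relative_error):
    """Build a stand-in for a hardware approximate reciprocal (hypothetical model)."""
    assert abs(relative_error) < REL_ERROR_BOUND
    def approx_reciprocal(x):
        return (1.0 / x) * (1.0 + relative_error)
    return approx_reciprocal

def simulate(approx_reciprocal, ticks):
    """Accumulate a position, using force * (1 / mass) as the per-tick movement."""
    position, force, mass = 0.0, 2.0, 3.0
    for _ in range(ticks):
        position += force * approx_reciprocal(mass)
    return position

# Two hypothetical CPUs: both respect the bound, but they do not agree.
cpu_a = make_approx_reciprocal(+(2.0 ** -13))
cpu_b = make_approx_reciprocal(-(2.0 ** -13))

ticks = 60 * 60  # one minute of simulation at 60 updates per second
drift = abs(simulate(cpu_a, ticks) - simulate(cpu_b, ticks))
```

In this model the two machines disagree by more than half a unit of position after a single simulated minute, even though each individual result stayed inside the documented error bound.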
Many games address this by enabling the compiler's "strict mode" to limit aggressive optimizations, or by switching entirely to fixed-point arithmetic (e.g., *StarCraft II*). However, Factorio chose a more complex path: **retaining floating-point numbers for their precision and range, while standardizing every mathematical operation.**
The development team implemented a custom trigonometric function library and ensured that all mathematical calculations are executed in a strictly defined order, bypassing the system standard library. This ensures **bitwise consistency** in calculation results, whether on x86 or ARM architectures. This represents a significant engineering investment but guarantees the precision and determinism required for large-scale simulations.
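Why the order of operations matters can be shown with nothing more than three additions. The sketch below, in Python, compares the raw bit patterns of the same sum evaluated in two orders. The `bits` and `fixed_order_sum` helpers are illustrative names, not anything from Factorio.

```python
import struct

def bits(x):
    """Return the raw IEEE 754 double-precision bit pattern of x as hex."""
    return struct.pack(">d", x).hex()

# The same three numbers, added in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

same_bits = bits(left_to_right) == bits(right_to_left)

def fixed_order_sum(values):
    """Sum in one agreed order (index order) so every machine rounds identically."""
    total = 0.0
    for value in values:
        total += value
    return total
```

Floating-point addition is not associative, so `same_bits` is `False` here. If a compiler or library is free to reorder the additions, two machines can produce different bits from the same inputs. Pinning the order, as `fixed_order_sum` does, removes that freedom.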
### Standardization of Random Numbers
Another source of non-determinism is the Pseudo-Random Number Generator (PRNG). Computers cannot generate true random numbers; instead, they generate sequences of seemingly random numbers based on a seed. As long as all players start with the same seed, they should extract the same sequence of random numbers.
However, implementation details can cause problems. In C and C++ (Factorio is written in C++), the order in which function arguments are evaluated is unspecified. For example, if both arguments of a call draw from the PRNG, one compiler may evaluate them left-to-right and another right-to-left, so the two draws land in swapped positions. The simulations then diverge and ultimately desynchronize. Factorio eliminated this compiler-dependent behavior by strictly standardizing code logic.
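The evaluation-order hazard can be imitated in Python. Python itself always evaluates arguments left to right, so the sketch below writes out the two orders by hand to stand in for two different compilers. The `Lcg` class (a small linear congruential generator), `spawn_position`, and the two `compiler_*` functions are hypothetical names for illustration only.

```python
class Lcg:
    """Tiny linear congruential generator, used only as a deterministic PRNG."""
    def __init__(self, seed):
        self.state = seed & 0xFFFFFFFF

    def next(self):
        self.state = (1664525 * self.state + 1013904223) & 0xFFFFFFFF
        return self.state

def spawn_position(x, y):
    """Stand-in for game logic that takes two random arguments."""
    return (x, y)

def compiler_left_to_right(rng):
    x = rng.next()   # x argument evaluated first
    y = rng.next()
    return spawn_position(x, y)

def compiler_right_to_left(rng):
    y = rng.next()   # y argument evaluated first
    x = rng.next()
    return spawn_position(x, y)

# Same seed on both machines, different evaluation order.
a = compiler_left_to_right(Lcg(seed=42))
b = compiler_right_to_left(Lcg(seed=42))
```

Both generators advance by the same two steps, yet `a` and `b` hold the two values in swapped positions, so the two game worlds already differ. Drawing each value into a named local in an explicit order, as both functions above do, is one way to take the decision away from the compiler.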
## Fixed Tick Rate and Latency Handling
To match the simulation rhythm among players, Factorio uses a fixed **60 UPS** (Updates Per Second), which also corresponds to a 60 FPS cap. Even if hardware performance allows for higher frame rates, the game is locked to this limit. For an automation game that does not require rapid reactions, this is an acceptable trade-off aimed at ensuring network synchronization stability.
To address the input lag inherent in the lockstep architecture, Factorio uses a **Latency State Buffer**.
* **Client Prediction**: By copying the current game state and applying local inputs, it provides instant feedback to the player, masking network latency.
* **Buffer Overflow**: If the buffer fills up, it may indicate that the CPU cannot keep up with the simulation speed, or that inputs are taking too long to be confirmed because network throughput is insufficient.
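A minimal sketch of the prediction idea, in Python, assuming a state that is a plain dictionary and inputs that are confirmed in the order they were issued. The `LatencyState` class and its method names are hypothetical. The point is only that the displayed state is a copy of the confirmed state with unconfirmed local inputs replayed on top.

```python
import copy

def apply_input(state, player_input):
    """Apply one input to a state dict in place (toy rule: move along x)."""
    state["x"] += player_input["dx"]

class LatencyState:
    def __init__(self, confirmed_state):
        self.confirmed = confirmed_state   # state agreed on by all peers
        self.pending = []                  # local inputs not yet confirmed

    def local_input(self, player_input):
        """Queue a local input; it affects only the predicted view for now."""
        self.pending.append(player_input)

    def confirm(self, player_input):
        """A confirmed input arrives: apply it for real and drop it from pending."""
        apply_input(self.confirmed, player_input)
        if self.pending and self.pending[0] == player_input:
            self.pending.pop(0)

    def predicted(self):
        """Copy the confirmed state and replay pending inputs, for display only."""
        view = copy.deepcopy(self.confirmed)
        for player_input in self.pending:
            apply_input(view, player_input)
        return view

latency = LatencyState({"x": 0})
latency.local_input({"dx": 1})
latency.local_input({"dx": 2})
shown_before = latency.predicted()   # the player sees both moves immediately
latency.confirm({"dx": 1})           # the first move comes back confirmed
shown_after = latency.predicted()
```

The player's view stays at `x == 3` throughout, while the confirmed state catches up one input at a time. A `pending` list that keeps growing is the situation the second bullet describes.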
## From P2P to Server-Client Model
Factorio initially adopted a Peer-to-Peer (P2P) architecture, but this presented numerous challenges:
1. **Difficulty in Topology Maintenance**: It required handling NAT UDP traversal, relay server connections, and dynamic join/leave events.
2. **Code Complexity**: As Terry Davis put it, "An idiot admires complexity, a genius admires simplicity." Complex P2P code is prone to hard-to-debug bugs.
3. **State Synchronization Burden**: In P2P, it is difficult to efficiently store and transmit the complete game state when new players join.
4. **Inefficient Broadcasting**: In P2P, inputs generated by each player must be broadcast to everyone. As the number of players increases, the number of data packets grows quadratically, contradicting the original goal of "low bandwidth."
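The broadcasting cost in the last item is easy to put numbers on. The Python sketch below assumes the simplest possible model, in which every player sends one message to every other player each tick. The function name is made up for this example.

```python
def mesh_messages_per_tick(players):
    """Full mesh: each player sends its input to every other player."""
    return players * (players - 1)

# Doubling the player count roughly quadruples the traffic.
growth = [(n, mesh_messages_per_tick(n)) for n in (2, 4, 8, 16, 32)]
```

Under this model, 2 players exchange 2 messages per tick, 8 players exchange 56, and 32 players exchange 992.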
As described in Friday Facts #147 (FFF-147), Factorio rewrote its multiplayer system, shifting to a **Server-Client Model** (though some P2P features, such as player-hosted servers, were retained). The new architecture introduced the following key mechanisms:
### Mega Packet
The server collects inputs from all players in each tick, packages them into an optimized data structure, and broadcasts this "source of truth" to all clients at once. This transforms chaotic mesh connections into a clean, synchronized star topology. Regardless of how many engineers are online, each client only needs to communicate with the server.
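As a rough illustration of the bundling step, the Python sketch below collects per-player inputs for a tick and emits one payload for everyone. The `Server` class, the JSON encoding, and the input names are assumptions for readability. The real packet format is not described in the source.

```python
import json

class Server:
    def __init__(self, player_ids):
        self.player_ids = sorted(player_ids)
        self.inbox = {}   # tick -> {player_id: [inputs]}

    def receive(self, tick, player_id, inputs):
        """Store the inputs one player produced for one tick."""
        self.inbox.setdefault(tick, {})[player_id] = list(inputs)

    def build_mega_packet(self, tick):
        """Bundle every player's inputs for one tick into a single payload."""
        per_player = self.inbox.pop(tick, {})
        body = {
            "tick": tick,
            # Fixed player order, so every client applies inputs identically.
            "inputs": [[pid, per_player.get(pid, [])] for pid in self.player_ids],
        }
        return json.dumps(body, separators=(",", ":")).encode("utf-8")

server = Server(["bob", "alice"])
server.receive(7, "bob", ["place_belt"])
server.receive(7, "alice", ["rotate", "mine"])
packet = server.build_mega_packet(7)   # the same bytes go to every client
decoded = json.loads(packet)
```

Each client then receives exactly one packet per tick, regardless of how many players contributed inputs to it.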
### Connection Handshake and Heartbeat
* **Heartbeat Mechanism**: Players periodically send heartbeat packets to keep the connection active.
* **Handshake Process**: Includes connection request, connection response, client acknowledgment of the response, and final server confirmation. This process not only establishes the connection but also serves as a basic measure against DDoS attacks.
### Tick Closures
Tick closures allow clients to declare that they have generated a fixed number of inputs within the current tick. This provides the server with a clear verification window to confirm whether the client has completed its simulation tasks.
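One way to picture a tick closure is as a count the server can check arrivals against. The Python sketch below is a guess at the bookkeeping, not Factorio's actual protocol. `TickCollector` and its methods are hypothetical.

```python
class TickCollector:
    """Server-side bookkeeping for a single tick (hypothetical structure)."""
    def __init__(self, player_ids):
        self.inputs = {pid: [] for pid in player_ids}
        self.declared = {}   # player_id -> number of inputs the client says it sent

    def add_input(self, player_id, player_input):
        self.inputs[player_id].append(player_input)

    def close(self, player_id, declared_count):
        """Record a closure: the client produced this many inputs this tick."""
        self.declared[player_id] = declared_count

    def is_complete(self):
        """True once every player has closed and all declared inputs have arrived."""
        return all(
            pid in self.declared and len(received) == self.declared[pid]
            for pid, received in self.inputs.items()
        )

tick = TickCollector(["alice", "bob"])
tick.add_input("alice", "rotate")
tick.close("alice", 1)
tick.close("bob", 1)             # bob's single input is still in flight
waiting = tick.is_complete()
tick.add_input("bob", "mine")    # the delayed input arrives
ready = tick.is_complete()
```

Here `waiting` is `False` and `ready` is `True`: the server knows the tick is finished only when the received inputs match what each client declared.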
### Cyclic Redundancy Check (CRC)
To detect cheating or synchronization errors, each client must calculate a checksum of the current game state.
* **Heuristic Optimization**: Since traversing the full state of millions of entities takes too long, the checksum covers only critical parts (e.g., logic networks, inventory states).
* **Failure Handling**: If the check fails, the client must re-download the entire game state, just like a newly joined player.
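The check described above can be sketched with Python's standard `zlib.crc32`. The state layout, the choice to checksum only inventories, and the helper names are assumptions for illustration. Which parts of the state Factorio actually covers is not specified beyond the examples in the list.

```python
import struct
import zlib

def state_checksum(state):
    """CRC-32 over selected parts of the state, serialized in a fixed order."""
    crc = 0
    for entity_id in sorted(state["inventories"]):
        count = state["inventories"][entity_id]
        # Fixed-width little-endian encoding, so the bytes match on every platform.
        crc = zlib.crc32(struct.pack("<II", entity_id, count), crc)
    return crc

def needs_resync(local_state, reference_crc):
    """A mismatch means this client must re-download the game state."""
    return state_checksum(local_state) != reference_crc

server_state = {"inventories": {1: 50, 2: 7}, "decorations": ["tree"]}
in_sync = {"inventories": {2: 7, 1: 50}, "decorations": []}        # skipped data differs
diverged = {"inventories": {1: 50, 2: 8}, "decorations": ["tree"]}  # one item count off

reference_crc = state_checksum(server_state)
```

`needs_resync(in_sync, reference_crc)` is `False` because only unchecked data differs, while `needs_resync(diverged, reference_crc)` is `True`. Skipping part of the state makes the check cheaper, at the price of missing divergence that is confined to the skipped parts.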
## Conclusion: The Silent Network
Factorio's network architecture demonstrates the power of "mathematically supported trust." Instead of transmitting massive state data over the network, it transmits clear **intentions** (inputs). This discipline allows the game to handle millions of entities without overwhelming the network connection.
In a game genre where scale typically leads to desynchronization and lag, Factorio shows that large-scale simulations can maintain perfect synchronization for hundreds of hours, provided the foundation is sufficiently deterministic. Its core philosophy, as stated at the end of the video, is: **Sometimes, the most effective network is the one that remains as silent as possible.**
Source: How Factorio Syncs A Million Objects over the network (https://www.youtube.com/watch?v=0FHSZ1hani0)