High-Level Goal
Support a 64 v 64 (128 total) "Hell-Let-Loose"-style FPS with Godot clients and an authoritative Rust server, while keeping latency low (< 80 ms RTT budget) and bandwidth reasonable for both clients (< 250 kbps) and the server box (< 25 Mbps).
────────────────────────────────────────
1. Core Design Pillars
────────────────────────────────────────
• Authoritative server → no trust in clients
• UDP first, with a light reliability/ordering layer (think ENet/Laminar/QUIC)
• Fixed-rate server simulation tick, client-side prediction + interpolation
• Delta-compressed, relevance-filtered snapshots (a.k.a. interest management)
• Multi-threaded ECS simulation on the server; network I/O kept lock-free
• Single box for 128 players, but layout is shard-friendly if we ever split
────────────────────────────────────────
2. Top-Level Architecture
────────────────────────────────────────
Godot Client <-UDP/QUIC-> Rust "Game-Core" (authoritative) <-TCP-> Lobby / DB

 ┌────────┐   inputs    ┌────────────┐   events   ┌─────────┐
 │ Godot  │────────────►│ Net Front  │───────────►│  Match  │
 │ Client │◄────────────│ Gate (IO)  │◄───────────│  Lobby  │
 └────────┘  snapshots  └──────┬─────┘            └─────────┘
                               │
                          lock-free
                           channels
                               │
                        ┌──────▼─────┐
                        │  Game ECS  │
                        │  (Bevy?)   │
                        └──────┬─────┘
                               │
                        ┌──────▼─────┐
                        │   Worker   │
                        │   Threads  │
                        └────────────┘
Why two layers inside the server?
• Net Front Gate = purely async I/O, packet (de)frag, (de)crypt, acks.
• Game ECS = deterministic world updated at a fixed Δt; batch-consumes inputs, emits snapshots.
────────────────────────────────────────
3. Transport & Packet Layout
────────────────────────────────────────
Transport: UDP (or QUIC if you want built-in encryption + congestion control).
Max safe MTU target: 1200 bytes (stays below typical home/ISP path MTUs, so no IP fragmentation).
Packet Header (9 bytes; a byte-packing sketch follows at the end of this section):
uint16 seq_id
uint16 ack_of_remote
uint32 ack_bitfield (32 earlier acks)
uint8 flags (bit0=reliable, bit1=frag, bit2=control…)
Payload = 1–N "messages" TLV-encoded inside the datagram:
Msg-Types (1 byte id + 1 byte len if <256):
00 Heartbeat / ping
01 InputCmd (bitfield buttons 2B + 3×pos32 or delta16 + uint8 tick)
02 SnapshotDelta (compressed)
03 SnapshotBaseline (full state if delta lost)
04 Event/RPC (grenade exploded, chat, UI)
05 StreamFrag (map chunk, voice, etc.)
Reliability:
• "reliable" flag + sliding-window resends.
• Unreliable for InputCmds (they become obsolete quickly).
• Semi-reliable for SnapshotBaselines.
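As a concrete reference for the header above, here is a minimal Rust sketch of the 9-byte layout (2 + 2 + 4 + 1) and flag bits; the field names, flag assignments, and little-endian byte order are assumptions for this sketch, not a fixed wire spec:

```rust
// Illustrative 9-byte header for the custom UDP protocol sketched above.
const FLAG_RELIABLE: u8 = 1 << 0;
const FLAG_FRAG: u8 = 1 << 1;
const FLAG_CONTROL: u8 = 1 << 2;

struct PacketHeader {
    seq_id: u16,        // our outgoing sequence number
    ack_of_remote: u16, // newest remote seq we have seen
    ack_bitfield: u32,  // bit N set => (ack_of_remote - 1 - N) was also received
    flags: u8,
}

impl PacketHeader {
    fn write(&self, buf: &mut Vec<u8>) {
        buf.extend_from_slice(&self.seq_id.to_le_bytes());
        buf.extend_from_slice(&self.ack_of_remote.to_le_bytes());
        buf.extend_from_slice(&self.ack_bitfield.to_le_bytes());
        buf.push(self.flags);
    }

    fn read(buf: &[u8]) -> Option<Self> {
        if buf.len() < 9 { return None; }
        Some(Self {
            seq_id: u16::from_le_bytes([buf[0], buf[1]]),
            ack_of_remote: u16::from_le_bytes([buf[2], buf[3]]),
            ack_bitfield: u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]),
            flags: buf[8],
        })
    }
}
```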
────────────────────────────────────────
4. Tick & Time Model
────────────────────────────────────────
Simulation tick = 60 Hz (Δt ≈ 16.7 ms)
Networking tick = 20 Hz (every 3rd sim tick we send a snapshot)
Client render: 144 Hz
Inputs         I I I │ I I I │ I I I │ …   (60 Hz)
Server sim     S S S │ S S S │ S S S │ …   (60 Hz)
Snapshot Tx          ▲       ▲       ▲     (20 Hz)
Interpolation buffer: 2.5 sim ticks ≈ 40 ms
Client-side:
• Sends InputCmd every render frame (ideally capped at 60 Hz).
• Predicts locally.
• Keeps 100 ms of input history; on mismatch vs. authoritative state → smooth rewind/correct (see the sketch below).
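A minimal sketch of that prediction history and rewind/correct step, written in Rust for illustration (the real client lives in Godot; the InputCmd fields, the 6-tick history depth, and simulate_step are simplified placeholders):

```rust
// Minimal client-side prediction with rewind & replay (illustrative only).
use std::collections::VecDeque;

#[derive(Clone, Copy)]
struct InputCmd { tick: u32, move_x: f32, move_y: f32 }

#[derive(Clone, Copy)]
struct PredictedState { tick: u32, pos: [f32; 3] }

struct Predictor {
    history: VecDeque<(InputCmd, PredictedState)>, // ~100 ms ≈ 6 ticks @ 60 Hz
    latest: PredictedState,
}

impl Predictor {
    // Apply a local input immediately and remember it for later reconciliation.
    fn predict(&mut self, input: InputCmd) {
        self.latest = simulate_step(&self.latest, &input);
        self.history.push_back((input, self.latest));
        while self.history.len() > 6 { self.history.pop_front(); }
    }

    // The server says "at tick T you were actually here": rewind, then replay newer inputs.
    fn reconcile(&mut self, server_state: PredictedState) {
        self.history.retain(|(cmd, _)| cmd.tick > server_state.tick);
        let mut state = server_state;
        for (cmd, predicted) in self.history.iter_mut() {
            state = simulate_step(&state, cmd);
            *predicted = state; // overwrite any mispredicted entries
        }
        self.latest = state; // then visually smooth toward this, never snap
    }
}

fn simulate_step(state: &PredictedState, input: &InputCmd) -> PredictedState {
    // Placeholder: the real code runs the same movement logic as the server at fixed Δt.
    let mut next = *state;
    next.tick = input.tick;
    next.pos[0] += input.move_x / 60.0;
    next.pos[2] += input.move_y / 60.0;
    next
}
```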
Server:
• Collects all inputs with tick ID ≤ current tick.
• Simulates physics, hit-scan.
• Serializes state diff vs. the last ACKed snapshot per client.
• Runs interest management: spatial hash + LOS + team filter.
────────────────────────────────────────
5. Interest / Relevance Management
────────────────────────────────────────
World split into 3-D grid cells (e.g. 32 m cubes).
For each client we only ship entities within R = 250 m and inside a 120° forward FOV, plus team markers (relevance query sketched below the entity counts).
Typical relevant entity count:
• Players: ≈ 40
• Projectiles (bullets & tracers): ≈ 30 (fade quickly)
• Grenades / effects: ≈ 10
• Buildables / vehicles: ≈ 20
TOTAL ≈ 100 entities / player on average.
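A sketch of the relevance query under the numbers above (32 m cells, R = 250 m, 120° forward FOV); the SpatialGrid type and its methods are illustrative, not a specific crate API:

```rust
// Illustrative spatial-hash relevance query: 32 m cells, 250 m radius, 120° forward FOV.
// Team markers (always relevant) would be appended separately and are not shown here.
use std::collections::HashMap;

type EntityId = u32;

struct SpatialGrid {
    cell_size: f32,                               // e.g. 32.0
    cells: HashMap<(i32, i32, i32), Vec<EntityId>>,
}

impl SpatialGrid {
    fn cell_of(&self, p: [f32; 3]) -> (i32, i32, i32) {
        ((p[0] / self.cell_size).floor() as i32,
         (p[1] / self.cell_size).floor() as i32,
         (p[2] / self.cell_size).floor() as i32)
    }

    // Gather entities within radius `r` of `origin`, then keep those inside the FOV cone.
    // `forward` is the viewer's horizontal look direction, assumed unit length.
    fn relevant(&self, origin: [f32; 3], forward: [f32; 2], r: f32,
                pos_of: impl Fn(EntityId) -> [f32; 3]) -> Vec<EntityId> {
        let reach = (r / self.cell_size).ceil() as i32;
        let (cx, cy, cz) = self.cell_of(origin);
        let mut out = Vec::new();
        for dx in -reach..=reach {
            for dy in -reach..=reach {
                for dz in -reach..=reach {
                    let Some(ids) = self.cells.get(&(cx + dx, cy + dy, cz + dz)) else { continue };
                    for &id in ids {
                        let p = pos_of(id);
                        let to = [p[0] - origin[0], p[2] - origin[2]]; // horizontal check only
                        let dist2 = to[0] * to[0] + to[1] * to[1];
                        if dist2 > r * r { continue; }
                        // 120° FOV => keep if the angle to `forward` is < 60°, i.e. cos > 0.5.
                        let len = dist2.sqrt().max(1e-3);
                        let cos = (to[0] * forward[0] + to[1] * forward[1]) / len;
                        if cos > 0.5 { out.push(id); }
                    }
                }
            }
        }
        out
    }
}
```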
Entity State Quantization (per delta entry):
  id                    uint16      2 B
  position (x, y, z)    3 × int16   6 B   (≈ 3 cm resolution across a 2 km map)
  yaw / pitch           2 × int16   4 B
  velocity              3 × int16   6 B
  state bits            uint8       1 B
  TOTAL                ~19 B → delta often ~10 B after XOR & RLE (quantization sketched below)
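A sketch of the position quantization assumed by the table (3 × int16 at roughly 3 cm steps across a 2 km map; the map origin and axis ranges are assumptions), plus the XOR step that makes deltas compress well:

```rust
// Quantize a world-space position (metres, assumed 0..2000 on each axis) into 3 × i16.
// 2000 m / 65 536 steps ≈ 3.05 cm per step, which is where the table's resolution comes from.
const MAP_SIZE_M: f32 = 2000.0;
const STEP_M: f32 = MAP_SIZE_M / 65536.0;

fn quantize_pos(p: [f32; 3]) -> [i16; 3] {
    let q = |v: f32| -> i16 {
        // Shift so 0..2000 m maps onto the signed i16 range, then round to the nearest step.
        let centered = (v - MAP_SIZE_M * 0.5) / STEP_M;
        centered.round().clamp(i16::MIN as f32, i16::MAX as f32) as i16
    };
    [q(p[0]), q(p[1]), q(p[2])]
}

fn dequantize_pos(q: [i16; 3]) -> [f32; 3] {
    let d = |v: i16| v as f32 * STEP_M + MAP_SIZE_M * 0.5;
    [d(q[0]), d(q[1]), d(q[2])]
}

// Delta entries XOR against the last ACKed baseline, so unchanged fields become
// zero and compress to almost nothing under run-length encoding.
fn xor_delta(current: [i16; 3], baseline: [i16; 3]) -> [u16; 3] {
    [(current[0] as u16) ^ (baseline[0] as u16),
     (current[1] as u16) ^ (baseline[1] as u16),
     (current[2] as u16) ^ (baseline[2] as u16)]
}
```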
Bandwidth per client (down): 100 entities × 10 B × 20 Hz = 20 kB/s ≈ 160 kbps
Bandwidth per client (up): InputCmd 8 B × 60 Hz = 480 B/s ≈ 4 kbps
Server aggregate:
Down: 20 kB/s × 128 ≈ 2.6 MB/s ≈ 20 Mbps
Up: 0.48 kB/s × 128 ≈ 61 kB/s ≈ 0.5 Mbps
Well within a single gig-E NIC.
────────────────────────────────────────
6. Server Threading & Scaling
────────────────────────────────────────
CPU Budget (per tick):
• Physics + ECS: ~100 µs per player → 128 × 100 µs = 12.8 ms
• Overhead / pathing / extras ≈ 2.0 ms
Total ≈ 14.8 ms < 16.6 ms budget ✔
Implementation:
• 1 async thread (Tokio/Quinn) for recv/send (zero-copy to/from an mpsc channel).
• N-1 worker threads (rayon or the Bevy schedule) own the ECS; partition by entity or system.
• End of tick = barrier; the snapshot builder runs and pushes bytes back to the net thread (tick loop sketched below).
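Roughly, the hand-off between the net task and the simulation thread could look like this; std::sync::mpsc stands in for whatever (lock-free) channel you pick, and apply_input / step_world / build_snapshots are placeholders for the real ECS-backed code:

```rust
use std::sync::mpsc;
use std::time::{Duration, Instant};

enum NetToSim { Input { client: u32, bytes: Vec<u8> } }
enum SimToNet { Snapshot { client: u32, bytes: Vec<u8> } }

fn run_simulation(rx: mpsc::Receiver<NetToSim>, tx: mpsc::Sender<SimToNet>) {
    const TICK: Duration = Duration::from_micros(16_667); // 60 Hz
    let mut next_tick = Instant::now();
    let mut tick_count: u64 = 0;
    loop {
        // 1) Drain everything the net task has queued since the last tick.
        while let Ok(msg) = rx.try_recv() {
            match msg {
                NetToSim::Input { client, bytes } => apply_input(client, &bytes),
            }
        }
        // 2) Step the world at fixed Δt (ECS systems, physics, hit-scan…).
        step_world(1.0 / 60.0);
        // 3) Every 3rd sim tick (20 Hz), build per-client delta snapshots and push
        //    the serialized bytes back to the net task for sending.
        if tick_count % 3 == 0 {
            for (client, bytes) in build_snapshots() {
                let _ = tx.send(SimToNet::Snapshot { client, bytes });
            }
        }
        tick_count += 1;
        next_tick += TICK;
        std::thread::sleep(next_tick.saturating_duration_since(Instant::now()));
    }
}

// Placeholders for the real ECS-backed implementations.
fn apply_input(_client: u32, _bytes: &[u8]) {}
fn step_world(_dt: f32) {}
fn build_snapshots() -> Vec<(u32, Vec<u8>)> { Vec::new() }
```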
Memory:
Baseline entity (archetype) ~256 B; 5 000 live entities ≈ 1.3 MB.
Plenty of headroom; 32 GB RAM box is luxurious.
────────────────────────────────────────
7. Will It Still Work for 128 Players?
────────────────────────────────────────
We already designed for 128 total. Stress-test scenario: everybody in one courtyard.
Entity count might double to 200 relevant.
Bandwidth per client ≈ 40 kB/s (≈ 320 kbps), still OK.
Server outbound ≈ 5 MB/s (≈ 40 Mbps), still < 1/20th of gig-E.
CPU: bullet spam could spike physics to 25 ms → mitigations:
• cap projectile simulation (hitscan on the server, clients draw fake tracers)
• off-thread async jobs for explosions etc.
So yes, still viable on one modern 6-to-8-core CPU (Ryzen 5 7600, Xeon E-2288G, etc.). For > 128 players you'd shard or open "region servers" (same executable, different port).
────────────────────────────────────────
8. Special Topics / Trade-offs
────────────────────────────────────────
Anti-cheat:
• Server validates hits; the client only raycasts for FX.
• CRC on resources, obfuscation of packet opcodes.
• Optional: kernel driver, not covered here.
Matchmaking & Persistence:
• Separate micro-service; the game server receives a "SpawnBlob" (loadout, cosmetics).
• At end of match, flush stats via TCP to the DB.
Voice:
Don't mix it into the main data path; use a separate SFU or a Vivox-like relay.
Tick vs. Event-Driven alternative?
If you want to ditch the fixed 60 Hz tick, you could go "snapshot-based" with a variable Δt (the Apex approach), but you'll complicate determinism and physics. For indie scope, the classic fixed tick is safer.
────────────────────────────────────────
9. Checklist Summary
────────────────────────────────────────
✅ UDP + reliability layer (seq/acks)
✅ 60 Hz sim / 20 Hz snapshots, client interp 100 ms
✅ Entity relevance + delta compression to keep < 320 kbps per user
✅ Rust: ECS (Bevy/Legion/Shipyard) + Tokio/Quinn net I/O
✅ 8-core box, 40 Mbps peak outbound, < 16 ms/frame CPU
✅ Scales to 128 players; > 128 → shard or stream sections to sub-servers
You now have a concrete yet implementation-agnostic blueprint for building the server & protocol. Happy fragging!
Of course. Let's orchestrate the high-level design into a practical, phased project plan. This approach is designed to deliver value incrementally, allowing for testing and validation at each stage. We'll structure this as a series of milestones, each building upon the last.
Project Orchestration Plan
Milestone 0: The Foundation (Pre-Production)
This is the "sharpen the axe" phase before you write the first line of game logic.
Task 0.1: Define the "Networked Structs" Contract.
• Create a shared Rust crate (e.g. game-protocol or shared). This crate will be a dependency for both the Rust server and (via bindings) the Godot client.
• struct InputCmd { ... } (e.g., movement axes, buttons as a bitmask, look angles).
• struct PlayerState { ... } (e.g., entity_id, position, velocity, health).
• enum ServerMessage { Snapshot(Vec<PlayerState>), Event(...), ... }
• enum ClientMessage { Input(InputCmd), ... }
• Use serde and a binary format like bincode for easy serialization (see the sketch below).
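A minimal sketch of what that shared crate could contain, assuming serde derive + bincode as suggested (field lists are abbreviated and purely illustrative):

```rust
// shared-protocol/src/lib.rs — illustrative only; field lists are abbreviated.
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Clone, Copy, Debug)]
pub struct InputCmd {
    pub tick: u32,
    pub buttons: u16,   // bitmask: jump, crouch, fire, …
    pub move_axes: [f32; 2],
    pub look: [f32; 2], // yaw, pitch
}

#[derive(Serialize, Deserialize, Clone, Copy, Debug)]
pub struct PlayerState {
    pub entity_id: u16,
    pub position: [f32; 3],
    pub velocity: [f32; 3],
    pub health: u8,
}

#[derive(Serialize, Deserialize, Debug)]
pub enum ServerMessage {
    Snapshot { tick: u32, players: Vec<PlayerState> },
    Event(String),
}

#[derive(Serialize, Deserialize, Debug)]
pub enum ClientMessage {
    Input(InputCmd),
}

// Both sides encode/decode through bincode:
pub fn encode(msg: &ClientMessage) -> Vec<u8> {
    bincode::serialize(msg).expect("serialization should not fail")
}
pub fn decode(bytes: &[u8]) -> Option<ClientMessage> {
    bincode::deserialize(bytes).ok()
}
```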
Task 0.2: Choose Your Core Libraries.
• Async runtime: tokio (the industry standard).
• ECS: bevy_ecs is a great choice. It's mature, data-oriented, and its scheduler is designed for parallelism. Alternatively, legion or shipyard.
• Networking: use tokio::net::UdpSocket directly for maximum control initially.
• Godot side: UDPServer / UDPPeer, or a C#/.NET UdpClient.
Task 0.3: Basic Project Scaffolding.
• ./rust-server/ (a Cargo workspace)
• ./godot-client/ (a Godot project)
• ./shared-protocol/ (the Rust crate from Task 0.1)
Milestone 1: The "Moving Cube" (Proof of Connection)
Goal: Prove that the client can connect to the server and see a single object move based on server state. No player input yet.
Task 1.1: Server - The Unblinking Eye.
• Spawn a single entity with a Position component.
• Each tick, move it (e.g. pos.x += 0.1) and send the new position to the connected client, as sketched below.
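A minimal sketch of that Milestone 1 server loop, assuming tokio for the UDP socket, a hard-coded client address, and a naive 12-byte wire format (all throwaway choices for this milestone):

```rust
// Milestone 1 sketch: one entity, moved every tick, position broadcast over UDP.
// Assumes tokio = { version = "1", features = ["full"] } in Cargo.toml.
use std::time::Duration;
use tokio::net::UdpSocket;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let socket = UdpSocket::bind("0.0.0.0:7777").await?;
    let client_addr = "127.0.0.1:7778"; // hard-coded for the prototype
    let mut pos = [0.0_f32, 1.0, 0.0];  // the cube's Position "component"
    let mut tick = tokio::time::interval(Duration::from_millis(16)); // ~60 Hz
    loop {
        tick.tick().await;
        pos[0] += 0.1; // the "unblinking eye" just slides the cube along +X
        // Naive wire format for now: 12 bytes of little-endian f32s.
        let mut buf = Vec::with_capacity(12);
        for v in pos { buf.extend_from_slice(&v.to_le_bytes()); }
        socket.send_to(&buf, client_addr).await?;
    }
}
```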
Task 1.2: Client - The Observer.
• Create a scene with a Node3D (representing the player) and a MeshInstance3D (the cube).
• In _ready, start a UDP listener on a background thread.
• In _process, update the cube's global_transform.origin to match the received position.
to match the received position.β Verifiable Outcome: You run the server, then the client. A cube smoothly slides across the screen in Godot. You have successfully built the fundamental client-server link.
Milestone 2: Player Control & Authoritative Movement
Goal: The player can control their cube. This introduces the core concepts of prediction and reconciliation.
Task 2.1: Client - Sending Inputs & Predicting.
• Each frame, fill an InputCmd struct (from Milestone 0) and send it to the server at 60 Hz; apply the same input locally right away so movement feels instant.
Task 2.2: Server - Accepting Inputs & Reconciliation.
• Receive and buffer InputCmd packets, apply them during the fixed simulation tick, and send back the authoritative result in the next snapshot.
Task 2.3: Client - Correction & Interpolation.
• When an authoritative snapshot arrives, correct the local player's predicted state if it diverged, and interpolate remote players between snapshots (e.g. lerp(old_pos, new_pos, delta)). This ensures other players move smoothly, hiding network jitter.
✅ Verifiable Outcome: You can move a character around. It feels instant. When you introduce artificial packet loss, your character might jitter and correct itself, while other players continue to move smoothly.
Milestone 3: Scaling to 128 Players (The Real Architecture)
Goal: Refactor the prototype to handle the target player count and map size.
Task 3.1: Server - Parallelize the ECS.
• Move the simulation into a bevy_ecs Schedule and World.
• Split the logic into Systems (e.g., apply_inputs_system, physics_system, broadcast_state_system).
Task 3.2: Server - Interest Management (Relevance).
• Implement the spatial-grid relevance filtering from section 5, so each client only receives nearby, visible entities.
Task 3.3: Server & Client - Delta Compression.
• Track the last snapshot_id that each client has acknowledged and diff new snapshots against that baseline (bookkeeping sketched after this milestone).
✅ Verifiable Outcome: Run a headless simulation on the server with 128 "bots" running around. Measure CPU usage (should be distributed across cores) and the size of outgoing packets (should be small). The architecture is now proven to scale.
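A sketch of the per-client baseline bookkeeping behind Task 3.3; the types and the diff function are placeholders for the real quantized-XOR encoding:

```rust
use std::collections::HashMap;

type SnapshotId = u16;
type ClientId = u32;
type WorldSnapshot = Vec<u8>; // serialized full state, placeholder

struct DeltaTracker {
    // Recent full snapshots we might still need as diff baselines.
    history: HashMap<SnapshotId, WorldSnapshot>,
    // Last snapshot each client has ACKed (None => must send a full baseline).
    acked: HashMap<ClientId, Option<SnapshotId>>,
}

impl DeltaTracker {
    fn on_ack(&mut self, client: ClientId, snapshot: SnapshotId) {
        self.acked.insert(client, Some(snapshot));
    }

    fn build_for(&mut self, client: ClientId, current_id: SnapshotId,
                 current: &WorldSnapshot) -> Vec<u8> {
        // Remember the current snapshot so it can serve as a future baseline.
        self.history.insert(current_id, current.clone());
        match self.acked.get(&client).copied().flatten()
            .and_then(|id| self.history.get(&id))
        {
            // Client has a known baseline: ship only the diff (SnapshotDelta).
            Some(baseline) => diff(baseline, current),
            // No ACKed baseline (new client or too much loss): ship SnapshotBaseline.
            None => current.clone(),
        }
    }
}

fn diff(_baseline: &WorldSnapshot, current: &WorldSnapshot) -> Vec<u8> {
    // Placeholder: real code XORs quantized fields and run-length encodes the zeros.
    current.clone()
}
```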
Milestone 4: Gameplay Implementation
Goal: Turn the tech demo into a game.
Task 4.1: Hit Detection.
• Client: on fire, send a PlayerFired RPC.
• Server: on PlayerFired, perform an authoritative raycast in the ECS world. If it hits another entity, reduce its Health component (server-side sketch below).
• Server: include a PlayerWasHit event in the next snapshot to all relevant clients.
• Client: on PlayerWasHit, play a blood splatter VFX and update the UI.
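A sketch of the server side of Task 4.1, validating a PlayerFired message with an authoritative raycast; raycast_world and the damage falloff are stand-ins, not a specific physics API:

```rust
// Server-side hit validation, independent of whatever the client claims it hit.
struct PlayerFired { shooter: u32, origin: [f32; 3], dir: [f32; 3], weapon: u8 }
struct Hit { entity: u32, distance: f32 }

fn handle_player_fired(msg: &PlayerFired) -> Option<(u32, u8)> {
    // Authoritative raycast against the server's ECS/physics world.
    let hit = raycast_world(msg.origin, msg.dir, /*max_range*/ 300.0)?;
    let damage = weapon_damage(msg.weapon, hit.distance);
    // The caller reduces the Health component of `hit.entity` and queues a
    // PlayerWasHit event into the next snapshot for relevant clients.
    Some((hit.entity, damage))
}

fn weapon_damage(weapon: u8, distance: f32) -> u8 {
    // Placeholder falloff curve.
    let base: f32 = match weapon { 0 => 35.0, 1 => 80.0, _ => 20.0 };
    (base * (1.0 - (distance / 300.0).clamp(0.0, 0.8))) as u8
}

// Stand-in: real code queries the physics/ECS world (e.g. a BVH or the spatial grid).
fn raycast_world(_origin: [f32; 3], _dir: [f32; 3], _max_range: f32) -> Option<Hit> {
    None
}
```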
Task 4.2: Game State & Objectives.
• Add server-side resources for global game state (e.g. MatchTimer, TeamScores).
✅ Verifiable Outcome: You can run around and shoot other players. Health bars go down. A game clock ticks down. It's a game!
Milestone 5: Production Readiness
Goal: Prepare the server for real-world deployment and operation.
Task 5.1: Matchmaking & Lobby Flow.
• Hook the game server up to the separate matchmaking/lobby micro-service described in section 8: receive a "SpawnBlob" (loadout, cosmetics) when a player joins, and flush match stats back over TCP at the end of the match.
Task 5.2: Observability.
• Add structured logging (the tracing crate). Log critical events like player joins, leaves, and server errors.
• Export metrics (e.g. via the prometheus crate) to track players online, tick duration, and bandwidth usage (sketch below).
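A small sketch of that observability wiring, assuming the tracing, tracing-subscriber and prometheus crates (metric and field names are placeholders):

```rust
use prometheus::{Histogram, HistogramOpts, IntGauge, Registry};

fn init_observability() -> (Registry, IntGauge, Histogram) {
    // Structured logging to stdout; a log collector can pick it up from there.
    tracing_subscriber::fmt().with_target(false).init();

    // Metrics registry scraped by Prometheus (expose it on an HTTP endpoint).
    let registry = Registry::new();
    let players_online = IntGauge::new("players_online", "Currently connected players").unwrap();
    let tick_seconds = Histogram::with_opts(
        HistogramOpts::new("tick_duration_seconds", "Server simulation tick duration"),
    ).unwrap();
    registry.register(Box::new(players_online.clone())).unwrap();
    registry.register(Box::new(tick_seconds.clone())).unwrap();
    (registry, players_online, tick_seconds)
}

fn on_player_join(players_online: &IntGauge, player_id: u32) {
    players_online.inc();
    tracing::info!(player_id, "player joined");
}
```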
Task 5.3: Deployment.
✅ Verifiable Outcome: You can run a command that automatically builds, packages, and deploys a new server version. You can view its logs and performance on a dashboard. The system is robust and manageable.
This phased plan takes you from zero to a fully-featured, scalable, and deployable 128-player game server, validating the architecture at each critical step.