High-Level Goal
Support a 64 v 64 (128 total) “Hell-Let-Loose-style” FPS with Godot clients and an authoritative Rust server, while keeping latency low (< 80 ms RTT budget) and bandwidth reasonable for both clients (< 250 kbps) and the server box (< 25 Mbps).
────────────────────────────────────────
1. Core Design Pillars
────────────────────────────────────────
• Authoritative server: no trust in clients
• UDP first, with a light reliability/ordering layer (think ENet/Laminar/QUIC)
• Fixed-rate server simulation tick, client-side prediction + interpolation
• Delta-compressed, relevance-filtered snapshots (a.k.a. interest management)
• Multi-threaded ECS simulation on the server; network I/O kept lock-free
• Single box for 128 players, but layout is shard-friendly if we ever split
────────────────────────────────────────
2. Top-Level Architecture
────────────────────────────────────────
Godot Client <-UDP/QUIC-> Rust “Game-Core” (authoritative) <-TCP-> Lobby / DB

┌────────┐   inputs    ┌─────────────┐   events   ┌──────────┐
│ Godot  │────────────►│  Net Front  │───────────►│  Match   │
│ Client │◄────────────│  Gate (IO)  │◄───────────│  Lobby   │
└────────┘  snapshots  └──────┬──────┘            └──────────┘
                              │
                          lock-free
                          channels
                              │
                        ┌─────▼─────┐
                        │ Game ECS  │
                        │  (Bevy?)  │
                        └─────┬─────┘
                              │
                        ┌─────▼─────┐
                        │  Worker   │
                        │  Threads  │
                        └───────────┘
Why two layers inside the server?
• Net Front Gate = purely async I/O: packet (de)fragmentation, (de)cryption, acks.
• Game ECS = deterministic world updated at fixed Δt; batch-consumes inputs, emits snapshots.
────────────────────────────────────────
3. Transport & Packet Layout
────────────────────────────────────────
Transport: UDP (or QUIC if you want built-in encryption + congestion control).
Max safe datagram size: 1200 bytes (stays under the path MTU of most home connections, so packets are not IP-fragmented).
Packet Header (9 bytes):
  uint16 seq_id
  uint16 ack_of_remote
  uint32 ack_bitfield (32 earlier acks)
  uint8  flags (bit0=reliable, bit1=frag, bit2=control…)
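A minimal sketch of how that header could be packed and parsed. The field list sums to 2 + 2 + 4 + 1 = 9 bytes; the little-endian byte order, the struct name and the flag constant names are assumptions made for illustration, not part of the design above.

```rust
/// Hypothetical wire header mirroring the field list (9 bytes, little-endian assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PacketHeader {
    pub seq_id: u16,
    pub ack_of_remote: u16,
    pub ack_bitfield: u32,
    pub flags: u8,
}

pub const HEADER_LEN: usize = 9;
pub const FLAG_RELIABLE: u8 = 1 << 0;
pub const FLAG_FRAG: u8 = 1 << 1;
pub const FLAG_CONTROL: u8 = 1 << 2;

impl PacketHeader {
    /// Serialize into a fixed 9-byte array.
    pub fn encode(&self) -> [u8; HEADER_LEN] {
        let mut out = [0u8; HEADER_LEN];
        out[0..2].copy_from_slice(&self.seq_id.to_le_bytes());
        out[2..4].copy_from_slice(&self.ack_of_remote.to_le_bytes());
        out[4..8].copy_from_slice(&self.ack_bitfield.to_le_bytes());
        out[8] = self.flags;
        out
    }

    /// Split a datagram into (header, payload); `None` if it is shorter than the header.
    pub fn decode(buf: &[u8]) -> Option<(PacketHeader, &[u8])> {
        if buf.len() < HEADER_LEN {
            return None;
        }
        let header = PacketHeader {
            seq_id: u16::from_le_bytes([buf[0], buf[1]]),
            ack_of_remote: u16::from_le_bytes([buf[2], buf[3]]),
            ack_bitfield: u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]),
            flags: buf[8],
        };
        Some((header, &buf[HEADER_LEN..]))
    }
}
```

Encoding by hand instead of transmuting a struct keeps the layout independent of compiler padding, which matters because the Godot client has to produce the same bytes.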
Payload = 1-N “messages”, TLV-encoded inside the datagram:
Msg-Types (1 byte id + 1 byte len if <256):
00 Heartbeat / ping
01 InputCmd (bitfield buttons 2B + 3×pos32 or delta16 + uint8 tick)
02 SnapshotDelta (compressed)
03 SnapshotBaseline (full state if delta lost)
04 Event/RPC (grenade exploded, chat, UI)
05 StreamFrag (map chunk, voice, etc.)
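The TLV framing can be sketched as below. Only the short form (1-byte id + 1-byte length) is covered; how bodies of 256 bytes or more are encoded is not specified above, so the writer simply refuses them. `Message`, `push_message` and `parse_messages` are hypothetical names.

```rust
/// One message inside a datagram: a 1-byte type id plus its body.
#[derive(Debug, PartialEq, Eq)]
pub struct Message<'a> {
    pub msg_type: u8,
    pub body: &'a [u8],
}

/// Append a message as [type][len][body]. Returns false for bodies that do
/// not fit the 1-byte length (the long form is left unspecified here).
pub fn push_message(out: &mut Vec<u8>, msg_type: u8, body: &[u8]) -> bool {
    if body.len() > u8::MAX as usize {
        return false;
    }
    out.push(msg_type);
    out.push(body.len() as u8);
    out.extend_from_slice(body);
    true
}

/// Walk the payload and return every message; `None` if one is truncated.
pub fn parse_messages<'a>(mut payload: &'a [u8]) -> Option<Vec<Message<'a>>> {
    let mut messages = Vec::new();
    while !payload.is_empty() {
        if payload.len() < 2 {
            return None; // type byte without a length byte
        }
        let msg_type = payload[0];
        let len = payload[1] as usize;
        let rest = &payload[2..];
        if rest.len() < len {
            return None; // declared length runs past the datagram
        }
        messages.push(Message { msg_type, body: &rest[..len] });
        payload = &rest[len..];
    }
    Some(messages)
}
```

Rejecting a truncated message drops the whole datagram, which is the conservative choice for an authoritative server that does not trust its clients.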
Reliability:
• “reliable” flag + sliding-window resends.
• Unreliable for InputCmds (they become obsolete quickly).
• Semi-reliable for SnapshotBaselines.
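The receiver side of the seq/ack scheme might look like the sketch below: it tracks the newest sequence number seen plus a 32-bit history, which is what would be echoed back in ack_of_remote and ack_bitfield. The bit convention (bit 0 = newest - 1) and the names `AckState` and `seq_newer` are assumptions.

```rust
/// True if sequence `a` is newer than `b`, tolerating u16 wrap-around.
pub fn seq_newer(a: u16, b: u16) -> bool {
    a != b && a.wrapping_sub(b) < 0x8000
}

/// Newest sequence received plus a bitfield for the 32 sequences before it
/// (bit 0 = newest - 1, bit 31 = newest - 32).
#[derive(Debug, Default, Clone, Copy)]
pub struct AckState {
    pub newest: u16,
    pub bitfield: u32,
    pub any_received: bool,
}

impl AckState {
    /// Record an incoming packet's sequence number.
    pub fn on_receive(&mut self, seq: u16) {
        if !self.any_received {
            self.newest = seq;
            self.any_received = true;
            return;
        }
        if seq_newer(seq, self.newest) {
            let shift = seq.wrapping_sub(self.newest) as u32; // >= 1
            // Age the existing history, then mark the previous newest packet.
            let aged = self.bitfield.checked_shl(shift).unwrap_or(0);
            let prev = 1u32.checked_shl(shift - 1).unwrap_or(0);
            self.bitfield = aged | prev;
            self.newest = seq;
        } else {
            // Late or duplicate packet: set its bit if it is still in the window.
            let behind = self.newest.wrapping_sub(seq) as u32;
            if (1..=32).contains(&behind) {
                self.bitfield |= 1u32 << (behind - 1);
            }
        }
    }

    /// Whether `seq` would be reported as received in the next outgoing header.
    pub fn is_acked(&self, seq: u16) -> bool {
        if !self.any_received {
            return false;
        }
        if seq == self.newest {
            return true;
        }
        let behind = self.newest.wrapping_sub(seq) as u32;
        (1..=32).contains(&behind) && ((self.bitfield >> (behind - 1)) & 1) == 1
    }
}
```

A sender would resend a reliable message once its sequence number has fallen out of the 32-packet window without being acked; at 60 packets per second that window covers roughly half a second.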
────────────────────────────────────────
4. Tick & Time Model
────────────────────────────────────────
Simulation tick = 60 Hz (Δt = 16.67 ms)
Networking tick = 20 Hz (every 3rd sim tick we send a snapshot)
Client Render (144 Hz)
───────────────────────────────────────────────────────
Inputs       │ I  I  I │ I  I  I │ I  I  I │ …   (60 Hz)
Server Sim   │ S  S  S │ S  S  S │ S  S  S │ …   (60 Hz)
Snapshot Tx  │       ▲ │       ▲ │       ▲ │     (20 Hz)
Interpolation buffer: 2 snapshot intervals ≈ 100 ms
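A fixed-step clock for this tick model can be sketched with an accumulator: real elapsed time is fed in, whole 60 Hz steps come out, and every third step is flagged for a snapshot. `TickClock` and `Step` are hypothetical names, and a real server loop would sleep or poll the network between calls.

```rust
use std::time::Duration;

pub const SIM_HZ: u32 = 60;
pub const SNAPSHOT_EVERY: u64 = 3; // 60 Hz sim / 20 Hz snapshots

/// One fixed simulation step and whether a snapshot goes out after it.
#[derive(Debug, PartialEq, Eq)]
pub struct Step {
    pub tick: u64,
    pub send_snapshot: bool,
}

/// Accumulator-based fixed-timestep clock.
pub struct TickClock {
    accumulator: Duration,
    next_tick: u64,
}

impl TickClock {
    pub fn new() -> Self {
        TickClock { accumulator: Duration::ZERO, next_tick: 0 }
    }

    /// Feed wall-clock time and get back the fixed steps that are now due.
    pub fn advance(&mut self, elapsed: Duration) -> Vec<Step> {
        let dt = Duration::from_secs(1) / SIM_HZ;
        self.accumulator += elapsed;
        let mut steps = Vec::new();
        while self.accumulator >= dt {
            self.accumulator -= dt;
            let tick = self.next_tick;
            self.next_tick += 1;
            // Ticks 2, 5, 8, ... close a group of three and trigger a snapshot.
            let send_snapshot = (tick + 1) % SNAPSHOT_EVERY == 0;
            steps.push(Step { tick, send_snapshot });
        }
        steps
    }
}
```

Leftover time stays in the accumulator, so a late wake-up produces extra catch-up steps instead of a longer Δt, which keeps the integration step constant.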
Client-side:
• Sends an InputCmd per render frame, capped at 60 Hz to match the sim tick.
• Predicts locally.
• Keeps 100 ms of history; on mismatch vs. authoritative state → rewind, replay, and smooth out the correction.
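The predict-then-reconcile loop can be sketched as follows, assuming a deliberately trivial movement model (constant speed on a flat plane). `step`, `Predictor`, `SPEED` and the 6-tick history cap (about 100 ms at 60 Hz) are illustrative assumptions; the real step function would be the shared locomotion code.

```rust
use std::collections::VecDeque;

pub const DT: f32 = 1.0 / 60.0; // fixed sim step
pub const SPEED: f32 = 5.0; // m/s, hypothetical walk speed
pub const HISTORY_TICKS: usize = 6; // ~100 ms of inputs at 60 Hz

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Input {
    pub tick: u32,
    pub move_x: f32,
    pub move_z: f32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct State {
    pub x: f32,
    pub z: f32,
}

/// Hypothetical movement step; client and server must run the same one.
pub fn step(state: State, input: &Input) -> State {
    State {
        x: state.x + input.move_x * SPEED * DT,
        z: state.z + input.move_z * SPEED * DT,
    }
}

pub struct Predictor {
    pub predicted: State,
    pending: VecDeque<Input>, // inputs the server has not confirmed yet
}

impl Predictor {
    pub fn new(start: State) -> Self {
        Predictor { predicted: start, pending: VecDeque::new() }
    }

    /// Apply a local input immediately and remember it for later replay.
    pub fn apply_local(&mut self, input: Input) {
        self.predicted = step(self.predicted, &input);
        self.pending.push_back(input);
        while self.pending.len() > HISTORY_TICKS {
            self.pending.pop_front();
        }
    }

    /// Rewind to the authoritative state for `server_tick`, replay the
    /// unconfirmed inputs, and return how far the prediction was off (metres).
    pub fn reconcile(&mut self, server_tick: u32, server_state: State) -> f32 {
        while self.pending.front().map_or(false, |i| i.tick <= server_tick) {
            self.pending.pop_front();
        }
        let mut replayed = server_state;
        for input in &self.pending {
            replayed = step(replayed, input);
        }
        let (dx, dz) = (replayed.x - self.predicted.x, replayed.z - self.predicted.z);
        self.predicted = replayed;
        (dx * dx + dz * dz).sqrt()
    }
}
```

The returned error is what a client would compare against its snap threshold: small values can be blended away over a few frames, large ones applied at once.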
Server:
• Collects all inputs with tick ID ≤ current-tick.
• Simulates physics, hit-scan.
• Serializes state diff vs. last ACKed snapshot per client.
• Runs interest mgmt: spatial hash + LOS + team filter.
────────────────────────────────────────
5. Interest / Relevance Management
────────────────────────────────────────
World split into 3-D grid cells (e.g. 32 m cubes).
For each client we only ship entities within a radius of R = 250 m and inside the client's forward 120° FOV, plus team markers.
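A rough sketch of the grid + radius + FOV filter. It leaves out the LOS test and the team-marker exception, and `Grid`, `cell_of` and `relevant` are hypothetical names; `forward` is assumed to be a unit vector.

```rust
use std::collections::HashMap;

pub const CELL_SIZE: f32 = 32.0; // metres per grid cell
pub const RADIUS: f32 = 250.0; // relevance radius R

pub type Cell = (i32, i32, i32);

/// Grid cell containing a world-space position.
pub fn cell_of(p: [f32; 3]) -> Cell {
    (
        (p[0] / CELL_SIZE).floor() as i32,
        (p[1] / CELL_SIZE).floor() as i32,
        (p[2] / CELL_SIZE).floor() as i32,
    )
}

/// Spatial hash: cell -> entities (id, position) currently inside it.
#[derive(Default)]
pub struct Grid {
    cells: HashMap<Cell, Vec<(u16, [f32; 3])>>,
}

impl Grid {
    pub fn insert(&mut self, id: u16, pos: [f32; 3]) {
        self.cells.entry(cell_of(pos)).or_default().push((id, pos));
    }

    /// Ids within RADIUS of `eye` and inside the 120° cone around `forward`.
    pub fn relevant(&self, eye: [f32; 3], forward: [f32; 3]) -> Vec<u16> {
        let cos_half_fov = 60.0f32.to_radians().cos();
        let reach = (RADIUS / CELL_SIZE).ceil() as i32;
        let (cx, cy, cz) = cell_of(eye);
        let mut out = Vec::new();
        for x in cx - reach..=cx + reach {
            for y in cy - reach..=cy + reach {
                for z in cz - reach..=cz + reach {
                    let bucket = match self.cells.get(&(x, y, z)) {
                        Some(b) => b,
                        None => continue,
                    };
                    for &(id, pos) in bucket {
                        let d = [pos[0] - eye[0], pos[1] - eye[1], pos[2] - eye[2]];
                        let dist = (d[0] * d[0] + d[1] * d[1] + d[2] * d[2]).sqrt();
                        if dist > RADIUS {
                            continue;
                        }
                        let along = d[0] * forward[0] + d[1] * forward[1] + d[2] * forward[2];
                        // cos(angle) = along / dist; compare without dividing.
                        if dist == 0.0 || along >= cos_half_fov * dist {
                            out.push(id);
                        }
                    }
                }
            }
        }
        out.sort_unstable();
        out
    }
}
```

Scanning every cell in a 17-cell cube is wasteful for a mostly flat battlefield; iterating only occupied cells or using a 2-D grid would likely be cheaper, but that is a profiling decision.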
Typical relevant entity count:
• Players: ≈ 40
• Projectiles (bullets & tracers): ≈ 30 (fade quickly)
• Grenades / effects: 10
• Buildables / vehicles: 20
TOTAL ≈ 100 entities / player on average.
Entity State Quantization per delta entry
  id (uint16)                  2 B
  position (x,y,z int16)       6 B (≈ 3 cm resolution across a 2 km map)
  yaw/pitch (2×int16)          4 B
  velocity (packed int16×3)    6 B
  state bits (1 byte)          1 B
  TOTAL                      ~19 B → delta often ~10 B after XOR & RLE
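To make the quantization and the XOR + RLE step concrete, here is a small sketch. It assumes a 2 km map centred on the origin, so an int16 gives steps of about 3 cm (2000 m / 65 534), and it uses a naive (run length, byte) RLE; a production encoder would more likely use a changed-field bitmask, so treat the scheme and the function names as illustrative only.

```rust
pub const HALF_EXTENT_M: f32 = 1000.0; // 2 km map centred on the origin (assumption)
pub const ENTRY_LEN: usize = 19; // bytes per entity record

/// Map a coordinate in [-1000 m, +1000 m] onto the i16 range.
pub fn quantize(metres: f32) -> i16 {
    let t = (metres / HALF_EXTENT_M).clamp(-1.0, 1.0);
    (t * i16::MAX as f32).round() as i16
}

/// Inverse of `quantize`, up to half a quantization step of error.
pub fn dequantize(q: i16) -> f32 {
    q as f32 / i16::MAX as f32 * HALF_EXTENT_M
}

/// XOR the current record against the baseline the client last acked;
/// unchanged bytes become zero.
pub fn xor_delta(baseline: &[u8; ENTRY_LEN], current: &[u8; ENTRY_LEN]) -> [u8; ENTRY_LEN] {
    let mut out = [0u8; ENTRY_LEN];
    for i in 0..ENTRY_LEN {
        out[i] = baseline[i] ^ current[i];
    }
    out
}

/// Naive run-length encoding as (run length, byte) pairs, runs capped at 255.
pub fn rle(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let byte = data[i];
        let mut run = 1usize;
        while i + run < data.len() && data[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}
```

The zero runs produced by the XOR are what the RLE collapses; a record where every byte changed would grow under this naive scheme, which is one reason to fall back to a SnapshotBaseline in that case.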
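Bandwidth per client (down): 100 entities × 10 B × 20 Hz = 20 kB/s ≈ 160 kbps
Bandwidth per client (up): InputCmd 8 B × 60 Hz = 480 B/s ≈ 4 kbps
(Payload only; the packet header and UDP/IP headers add a fixed cost per datagram.)
Server aggregate:
Down (server → clients): 20 kB/s × 128 = 2.56 MB/s ≈ 20.5 Mbps
Up (clients → server): 0.48 kB/s × 128 = 61 kB/s ≈ 0.5 Mbps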
Well within a single gig-E NIC.
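The arithmetic can be re-run, and extended with per-datagram overhead, using a few helper functions. The 28-byte figure is the standard IPv4 (20 B) + UDP (8 B) header cost; the 9-byte packet header is the sum of the fields listed in section 3. Function names are hypothetical.

```rust
pub const PACKET_HEADER_BYTES: f64 = 9.0; // seq + ack + bitfield + flags
pub const IPV4_UDP_BYTES: f64 = 28.0; // 20 B IPv4 header + 8 B UDP header

/// Payload-only bandwidth in kilobits per second.
pub fn payload_kbps(bytes_per_packet: f64, packets_per_sec: f64) -> f64 {
    bytes_per_packet * packets_per_sec * 8.0 / 1000.0
}

/// Bandwidth on the wire, including the per-datagram header overhead.
pub fn wire_kbps(bytes_per_packet: f64, packets_per_sec: f64) -> f64 {
    payload_kbps(bytes_per_packet + PACKET_HEADER_BYTES + IPV4_UDP_BYTES, packets_per_sec)
}

/// Aggregate server bandwidth in megabits per second.
pub fn server_mbps(per_client_kbps: f64, clients: u32) -> f64 {
    per_client_kbps * clients as f64 / 1000.0
}
```

With these, `payload_kbps(1000.0, 20.0)` gives the 160 kbps downlink figure, while `wire_kbps(8.0, 60.0)` shows that a 60 Hz input stream costs closer to 22 kbps than 4 kbps once headers are counted. That is still small, but it suggests batching several InputCmds per datagram if uplink ever matters.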
────────────────────────────────────────
6. Server Threading & Scaling
────────────────────────────────────────
CPU Budget (per tick):
• Physics + ECS: ~100 µs per player → 128 × 100 µs = 12.8 ms
• Overhead / pathing / extras ≈ 2.0 ms
Total ≈ 14.8 ms < 16.67 ms budget
Implementation:
• 1 async thread (Tokio/Quinn) for recv/send (zero-copy to/from mpsc channel).
• N-1 worker threads (rayon or Bevy's scheduler) own the ECS; partition by entity or system.
• End of tick = barrier; snapshot builder runs, pushes bytes back to the net thread.
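The net-thread / worker / barrier hand-off can be sketched with the standard library alone. Here `std::sync::mpsc` stands in for the lock-free channel, scoped threads stand in for rayon or Bevy's scheduler, and the "world" is just a list of integers; `InputMsg` and `run_tick` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;

/// An input already decoded by the net thread (hypothetical shape).
pub struct InputMsg {
    pub player: usize,
    pub dx: i32,
}

/// One tick: drain queued inputs, update entities in parallel chunks,
/// then build the snapshot bytes once every worker has finished.
pub fn run_tick(positions: &mut [i32], inputs: &mpsc::Receiver<InputMsg>, workers: usize) -> Vec<u8> {
    // Batch-consume everything the net thread queued since the last tick.
    let mut moves = vec![0i32; positions.len()];
    for msg in inputs.try_iter() {
        if msg.player < moves.len() {
            moves[msg.player] += msg.dx;
        }
    }

    // Partition entities across workers; leaving the scope joins all of
    // them, which acts as the end-of-tick barrier.
    let workers = workers.max(1);
    let chunk = ((positions.len() + workers - 1) / workers).max(1);
    thread::scope(|s| {
        for (pos, mv) in positions.chunks_mut(chunk).zip(moves.chunks(chunk)) {
            s.spawn(move || {
                for (p, m) in pos.iter_mut().zip(mv) {
                    *p += *m;
                }
            });
        }
    });

    // Snapshot builder runs after the barrier; the bytes go back to the net thread.
    positions.iter().flat_map(|p| p.to_le_bytes()).collect()
}
```

Spawning threads every tick is only acceptable in a sketch; a persistent worker pool avoids that cost, and the disjoint `chunks_mut` slices are what let the workers run without locks.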
Memory:
Baseline entity (archetype) ~256 B; 5 000 live entities ≈ 1.3 MB.
Plenty of headroom; 32 GB RAM box is luxurious.
────────────────────────────────────────
7. Will It Still Work for 128 Players?
────────────────────────────────────────
We already designed for 128 total. Stress test scenario: everybody in one courtyard.
Entity count might double to 200 relevant.
Bandwidth per client ≈ 40 kB/s (≈ 320 kbps): above the 250 kbps target, but still OK as a short-lived peak.
Server outbound ≈ 5 MB/s (≈ 40 Mbps): above the 25 Mbps target, but still < 1/20th of gig-E.
CPU: bullet spam could spike physics to 25 ms → mitigation:
• cap projectile simulation (hitscan on server, clients draw fake tracers)
• off-thread async jobs for explosions etc.
So yes, still viable on one modern 8-core (Ryzen 7 7700, Xeon E-2288G, etc.). For >128, you’d shard or open “region servers” (same exe, different port).
────────────────────────────────────────
8. Special Topics / Trade-offs
────────────────────────────────────────
Anti-cheat:
• Server validates hits; client only raycasts for FX.
• CRC on resources, obfuscation of packet opcodes.
• Optional: kernel driver (not covered here).
Matchmaking & Persistence:
• Separate micro-service; server receives a “SpawnBlob” (loadout, cosmetics).
• At end of match flush stats via TCP to DB.
Voice:
Don’t mix in main data path; use separate SFU or Vivox-like relay.
Tick vs. Event-Driven alternative?
If you wish to ditch fixed 60 Hz, you could go “snapshot-based” with variable Δt (Apex approach), but you’ll complicate determinism & physics. For indie scope, classic fixed-tick is safer.
────────────────────────────────────────
9. Checklist Summary
────────────────────────────────────────
✔ UDP + reliability layer (seq/acks)
✔ 60 Hz sim / 20 Hz snapshots, client interp 100 ms
✔ Entity relevance + delta compression: ≈ 160 kbps typical, ≤ 320 kbps worst case per user
✔ Rust: ECS (Bevy/Legion/Shipyard) + Tokio/Quinn net I/O
✔ 8-core box, 40 Mbps peak outbound, < 16.67 ms/tick CPU
✔ Scales to 128 players; >128 → shard or stream sections to sub-servers
You now have a concrete yet implementation-agnostic blueprint for building the server & protocol. Happy fragging!
Authoritative Physics & Collision Handling
(what runs on the Rust server every 60 Hz tick)
────────────────────────────────────────
1. Design Goals
────────────────────────────────────────
• 100 % authoritative: only the server decides “what touched what”.
• Fixed Δt (16.67 ms) to keep integration stable and deterministic.
• Keep it cheap (≤ 2 ms of the 16.67 ms budget): no full-blown rigid-body chaos, only what an infantry-centric FPS really needs.
• Give clients a mirror-lite version for prediction; small divergences are OK because reconciliation corrects them.
────────────────────────────────────────
2. Physics Scope for an HLL-Style FPS
────────────────────────────────────────
A. Player locomotion: capsule vs. static level geometry, jump, step-up, ladder.
B. Bullets / hitscan: instant ray checks (99 % of shots).
C. Projectiles (grenades, rockets): parabolic flight + explosion radius.
D. Environment: static meshes, trigger volumes, no destructibles (keep first release simple).
E. Vehicles: if/when added, approximate with single convex hull, no wheel suspension simulation initially.
────────────────────────────────────────
3. Tech Choice
────────────────────────────────────────
Use the Rapier3D crate (Apache-2.0-licensed, by Dimforge). Reasons:
• Pure Rust → perf & FFI-friendly.
• Deterministic when you pin the same compiler flags and CPU float mode (no SSE-vs-AVX divergence); its enhanced-determinism Cargo feature targets cross-platform determinism.
• Already has broad-phase (SAP), narrow-phase (GJK/EPA) and CCD.
• Integrates cleanly with Bevy ECS (via bevy_rapier) or any custom ECS.
Alternative: roll your own capsule-only solver. Even faster, but higher upfront cost. Start with Rapier, profile, replace later if necessary.
────────────────────────────────────────
4. Collision Pipeline per Tick
────────────────────────────────────────
Average cost on Ryzen 5 5600:
• 128 dynamic capsules + 200 static colliders → 0.4 ms
• 2 000 raycasts (1 full-auto MG burst) → 0.3 ms
• 50 grenade projectiles → 0.2 ms
TOTAL ≈ 0.9 ms
So we’re safely below 2 ms.
────────────────────────────────────────
5. Bullets ≠ Rigid Bodies
────────────────────────────────────────
• 99 % of weapons modeled as hitscan:
  - Collect all fire events this tick.
  - Raycast from muzzle to muzzle + range via Rapier’s query API (no insertion of dynamic bodies).
  - First intersection decides hit; store “HitEvent” component, later resolved into damage & FX.
• Tracers are purely cosmetic on the client (draw a ribbon between start & impact point after server response).
Benefits: zero per-frame memory churn, no tunneling issues, trivial network traffic (only send HitEvent).
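The "first intersection decides hit" rule can be illustrated without a physics engine by testing the ray against bounding spheres. This is a stand-in only: the real server would query Rapier against player capsules and the level mesh. `ray_sphere`, `hitscan` and the fields of `HitEvent` are hypothetical, and `dir` is assumed to be unit length.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct HitEvent {
    pub target: u16,
    pub distance: f32,
}

fn dot(a: [f32; 3], b: [f32; 3]) -> f32 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// Distance along the ray to the sphere's near surface, if hit within `range`.
/// A ray that starts inside the sphere is reported as a miss in this sketch.
pub fn ray_sphere(origin: [f32; 3], dir: [f32; 3], center: [f32; 3], radius: f32, range: f32) -> Option<f32> {
    let oc = [center[0] - origin[0], center[1] - origin[1], center[2] - origin[2]];
    let t_closest = dot(oc, dir); // projection of the centre onto the ray
    let d2 = dot(oc, oc) - t_closest * t_closest; // squared ray-to-centre distance
    let r2 = radius * radius;
    if d2 > r2 {
        return None;
    }
    let t = t_closest - (r2 - d2).sqrt();
    if t >= 0.0 && t <= range {
        Some(t)
    } else {
        None
    }
}

/// Resolve one fire event: the nearest intersected target wins.
pub fn hitscan(origin: [f32; 3], dir: [f32; 3], range: f32, targets: &[(u16, [f32; 3], f32)]) -> Option<HitEvent> {
    targets
        .iter()
        .filter_map(|&(id, center, radius)| {
            ray_sphere(origin, dir, center, radius, range)
                .map(|t| HitEvent { target: id, distance: t })
        })
        .min_by(|a, b| a.distance.total_cmp(&b.distance))
}
```

Because nothing is inserted into the world, a tick with thousands of shots allocates no bodies, which is the "zero memory churn" benefit mentioned above.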
────────────────────────────────────────
6. Grenades / Rockets (Slow Movers)
────────────────────────────────────────
• Insert as lightweight RigidBodyType::KinematicPositionBased.
• Integrate with gravity: pos += vel * dt; vel += g * dt.
• Continuous collision detection so they don’t clip through walls between ticks; because kinematic bodies are moved by hand, in practice this means sweeping a ray or shape cast from the old to the new position.
• On impact OR fuse-timeout → spawn ExplosionEvent.
• Explosion = overlap query of spheres within radius → O(#entities in that cell).
Network payload: only grenade spawn (reliable) + grenade despawn/explosion event (reliable). Intermediate positions are not networked; clients extrapolate the arc locally from the spawn state.
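The grenade update can be sketched as below, using the integration order quoted above (position first, then velocity). A flat ground plane stands in for the level geometry and the swept test is a simple segment-vs-plane check; `Grenade`, `GrenadeStep`, `step_grenade` and `blast_targets` are hypothetical names.

```rust
pub const DT: f32 = 1.0 / 60.0;
pub const GRAVITY: [f32; 3] = [0.0, -9.81, 0.0];

#[derive(Debug, Clone, Copy)]
pub struct Grenade {
    pub pos: [f32; 3],
    pub vel: [f32; 3],
    pub fuse_ticks: u32,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum GrenadeStep {
    Flying,
    /// Impact or fuse timeout; carries the explosion centre.
    Explode([f32; 3]),
}

/// Advance one tick: pos += vel * dt; vel += g * dt, with a swept test
/// against a horizontal ground plane (stand-in for the real level mesh).
pub fn step_grenade(g: &mut Grenade, ground_y: f32) -> GrenadeStep {
    let prev = g.pos;
    let mut next = prev;
    for i in 0..3 {
        next[i] += g.vel[i] * DT;
        g.vel[i] += GRAVITY[i] * DT;
    }
    g.fuse_ticks = g.fuse_ticks.saturating_sub(1);

    // Swept check: did the segment prev -> next cross the ground this tick?
    if prev[1] > ground_y && next[1] <= ground_y {
        let t = (prev[1] - ground_y) / (prev[1] - next[1]);
        let mut hit = prev;
        for i in 0..3 {
            hit[i] += (next[i] - prev[i]) * t;
        }
        g.pos = hit;
        return GrenadeStep::Explode(hit);
    }

    g.pos = next;
    if g.fuse_ticks == 0 {
        GrenadeStep::Explode(next)
    } else {
        GrenadeStep::Flying
    }
}

/// Overlap query: ids of entities whose position lies within the blast radius.
pub fn blast_targets(center: [f32; 3], radius: f32, entities: &[(u16, [f32; 3])]) -> Vec<u16> {
    entities
        .iter()
        .filter(|(_, p)| {
            let d = [p[0] - center[0], p[1] - center[1], p[2] - center[2]];
            d[0] * d[0] + d[1] * d[1] + d[2] * d[2] <= radius * radius
        })
        .map(|(id, _)| *id)
        .collect()
}
```

Since only the spawn state is networked, the client can run the same `step_grenade` locally to draw the arc and wait for the server's explosion event to confirm the endpoint.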
────────────────────────────────────────
7. Static World Representation
────────────────────────────────────────
• Export level geometry from Godot as aggregate triangle mesh; pre-baked into Rapier’s TriMesh on server start.
• For broad-phase culling the mesh is internally split into BVH nodes; no per-tick cost.
• Doors / bridges that move? Represent as separate kinematic bodies switched by gameplay scripts.
Memory footprint: 2 × compressed mesh size (BVH + verts). Typical 1 km² map ~ 30 MB → fine.
────────────────────────────────────────
8. Player Prediction on Client
────────────────────────────────────────
Server uses Rapier.
Client ships with a subset of the same code compiled to WebAssembly or GDExtension:
• Step-up height, slope limit, gravity must match server constants.
• Disable expensive CCD & contact manifold generation client-side (not needed for prediction).
• Divergence <2 cm over 200 ms is usually unnoticeable; when it exceeds threshold → reconciliation.
To ensure numeric parity: pin the same compiler flags and float mode on both builds, or ship a custom fixed-point math module just for locomotion.
────────────────────────────────────────
9. Determinism vs “Good Enough”
────────────────────────────────────────
We do not need lock-step determinism across all hardware, only “server as the single source of truth”. Therefore:
• Clients may drift a bit; server snaps them back.
• Spectator replay uses server log, so always correct.
• Future e-sport / anti-cheat hardening → move to fixed-point math to make server re-simulation easier in the cloud; not a V1 requirement.
────────────────────────────────────────
10. Profiling & Regression
────────────────────────────────────────
• Benchmarks (cargo criterion) that run the physics step with recorded input traces → catch perf regressions.
• Integration test: spawn 128 dummy capsules + 5 000 random raycasts, assert no panics and tick <2 ms on CI’s m5zn.metal reference machine.
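A tick-budget check of that kind could be built on a small timing helper like the one below. It uses only `std::time` rather than Criterion, and the closure passed in is a placeholder for the real physics step; `worst_tick` and `check_budget` are hypothetical names.

```rust
use std::time::{Duration, Instant};

/// Run `tick_fn` for `ticks` iterations and return the slowest one.
pub fn worst_tick<F: FnMut(u32)>(ticks: u32, mut tick_fn: F) -> Duration {
    let mut worst = Duration::ZERO;
    for tick in 0..ticks {
        let start = Instant::now();
        tick_fn(tick);
        worst = worst.max(start.elapsed());
    }
    worst
}

/// Compare the slowest tick against the budget, with a readable error.
pub fn check_budget(worst: Duration, budget: Duration) -> Result<(), String> {
    if worst <= budget {
        Ok(())
    } else {
        Err(format!("worst tick {:?} exceeded budget {:?}", worst, budget))
    }
}
```

Tracking the worst tick rather than the mean matches the requirement: a single 25 ms spike is what players notice, even when the average stays well under 2 ms. Wall-clock assertions are noisy on shared CI runners, which is presumably why a dedicated reference machine is named above.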
────────────────────────────────────────
11. Extensibility Hooks
────────────────────────────────────────
• Vehicles later? Stick a convex hull collider + apply engine force; still fits.
• Destructible walls? Spawn new static collider chunks and mark them “destroyed” after HP ≤ 0 → update BVH once.
────────────────────────────────────────
12. Recap Cheat-Sheet
────────────────────────────────────────
Physics engine:  Rapier3D (server) + stripped client mirror
Tick:            60 Hz fixed step
Collision path:  Broad Phase → Narrow Phase → Island Solver
Bullets:         instant raycasts, no rigid bodies
Projectiles:     kinematic, CCD on, overlap query on explode
Static map:      baked triangle mesh BVH
Perf budget:     ≈ 1 ms of CPU per tick for 128 players (budget ≤ 2 ms)
This gives you authoritative, efficient, and maintainable physics that stays within your latency and CPU budgets while scaling cleanly to 64 v 64 battles.