Unity FPSSample - Takeaways from the "Deep dive into networking" talk

Posted on January 27, 2019

Intro

I recently watched Peter Andreasen’s talk about the Unity FPSSample (Deep dive into networking for Unity’s FPS Sample game).

FPSSample is a reference implementation of a classic Quake-like multiplayer FPS game. The source code is available at https://github.com/Unity-Technologies/FPSSample.

It was super valuable to learn about networking techniques I had never even heard of. I also appreciated the many pieces of wisdom sprinkled throughout the session!

This post has a few purposes:

  • To highlight the techniques I haven’t seen discussed anywhere else
  • To summarize the talk
  • To note some of my own remaining questions - for later investigation

Overview

FPSSample uses the classic Quake model - where clients send only their inputs and the server sends regular snapshots of the game state.

The server updates the game state and acts on all players’ inputs simultaneously at a fixed rate - this approach was chosen for its simplicity relative to the one where the server manages a separate game state for each player.
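In pseudocode terms, the server's main loop boils down to something like the sketch below. This is a minimal illustration of a fixed-rate loop; world, input_queue and their methods are stand-ins of mine, not FPSSample's actual API.

import time

TICK_RATE = 60                  # server simulation rate in Hz
TICK_INTERVAL = 1.0 / TICK_RATE

def server_loop(world, input_queue):
    # Run the simulation at a fixed rate, applying all players' pending
    # inputs together on every tick.
    next_tick = time.perf_counter()
    while world.running:
        for player_id, command in input_queue.drain():
            world.apply_input(player_id, command)
        world.step(TICK_INTERVAL)    # advance the single, shared game state
        world.broadcast_snapshot()
        next_tick += TICK_INTERVAL
        time.sleep(max(0.0, next_tick - time.perf_counter()))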

New stuff

Here's the stuff I learned that I haven't seen discussed elsewhere.

Asymmetric tick rates

Asymmetric update rates between client and server aren't news, but Peter discusses the topic in depth.

So yeah, a client can have a tick rate that is different from the server’s, and a render rate that is different from its tick rate.

The next few sections expand on that.

Delta encoding using frame prediction

Glenn Fiedler has a few articles that explain delta encoding for snapshot compression.

The idea is to express all values that we need to replicate as a difference from the ones in the last snapshot that the client acknowledged. It reduces the magnitude of values involved so that smaller data types can be used (and eliminates values that didn’t change).

FPSSample takes this concept one step further. Instead of using the last acknowledged snapshot as the baseline for encoding, it extrapolates a snapshot from the past three snapshots and uses that as the baseline. Since the server and client share the prediction code, the client can construct the exact same baseline on its end.

This is a very powerful optimization. Consider just position and velocity replication for example: assuming velocity is unchanged, changes in position don’t need to be sent anymore!
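Here's a sketch of the idea for a single scalar field. The exact predictor FPSSample uses may differ; a simple second-order extrapolation from the last three snapshots already shows the effect:

def predict(s1, s2, s3):
    # Second-order extrapolation from the three most recent snapshot values.
    # For a position moving at constant velocity the prediction is exact,
    # so the delta to encode is zero.
    return 3 * s3 - 3 * s2 + s1

def encode(actual, history):
    # Server side: encode the field as a difference from the predicted baseline.
    baseline = predict(*history[-3:])
    return actual - baseline             # small (often zero), cheap to pack

def decode(delta, history):
    # Client side: rebuild the identical baseline from the same three
    # snapshots, then apply the received delta.
    return predict(*history[-3:]) + delta

Both sides run predict() over the same three snapshots, so the baseline itself never travels over the wire.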

Data stream structure

When the sample serializes data, the structure of the data is omitted. Only the values themselves get sent.

So for example a naive serialized structure for a position update sent by the server might look like:

Player {
    ID: 1
    position: (x, y, z)
}
Player {
    ID: 2
    position: (x, y, z)
}

But the client already knows about the existence of Players 1 and 2 from a previously received snapshot where they were created. So a serialized structure like the following is sufficient:

(x, y, z)
(x, y, z)

The client is able to reconstruct the structure of the data using its knowledge of the game state. Look at all the bandwidth that was saved!
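In code, the trick is simply that both sides agree on a deterministic ordering. A sketch of the idea (my own simplification, not the sample's actual serializer):

import struct

def serialize_positions(players):
    # Server: write only the values, in an order both sides can derive (by ID).
    buf = b""
    for p in sorted(players, key=lambda p: p.id):
        buf += struct.pack("<fff", *p.position)
    return buf

def deserialize_positions(buf, known_ids):
    # Client: it already knows which players exist, so IDs and field names
    # never need to be on the wire.
    positions = {}
    for i, player_id in enumerate(sorted(known_ids)):
        positions[player_id] = struct.unpack_from("<fff", buf, i * 12)
    return positions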

RTT measurement

I’m not 100% confident but here’s my understanding of the sample’s RTT measurement:

  • Host sends packet, saves local time
  • Peer receives packet, saves local time
  • Peer processes packet, sends a response packet which includes the elapsed time between reception and response (processing time)
  • Host receives the response, computes the raw round trip from the saved time, and subtracts the peer's reported processing time

The important detail is that the peer transmits the processing time to the host.
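In code, my understanding maps to something like this sketch (the sequence-number bookkeeping and send_response are stand-ins of mine):

import time

pending = {}    # host side: sequence number -> local send time

def host_on_send(seq):
    pending[seq] = time.perf_counter()

def host_on_response(seq, peer_processing_time):
    # The raw round trip includes however long the peer sat on the packet,
    # so subtract the processing time the peer reported back.
    raw_rtt = time.perf_counter() - pending.pop(seq)
    return raw_rtt - peer_processing_time

def peer_handle_packet(seq, send_response):
    received_at = time.perf_counter()
    # ... process the packet (possibly not before the next tick) ...
    processing_time = time.perf_counter() - received_at
    send_response(seq, processing_time)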

58/62 FPS trick

Players often run games at multiples of 30 FPS (30, 60, 120 are common).

If we take for example a client and a server both running at 60 FPS and ticking nearly in sync, we will observe flip-flopping behavior where one server tick receives no player input and the next receives two. The trick, as I understand it, is to run the client slightly off the server's rate, at 58 or 62 FPS instead of 60, so that the phase between the two clocks sweeps steadily past tick boundaries instead of hovering on one.
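A toy simulation of the effect (my own model, with arrival times jittered by up to ±1 ms, nothing from the sample):

import random

def inputs_per_server_tick(client_hz, server_hz=60, ticks=30, jitter=0.001, seed=1):
    # Count how many client inputs land inside each server tick window.
    rng = random.Random(seed)
    arrivals = [i / client_hz + rng.uniform(-jitter, jitter) for i in range(ticks * 4)]
    return [sum(t / server_hz <= a < (t + 1) / server_hz for a in arrivals)
            for t in range(ticks)]

# In sync at 60 Hz, every input races a tick boundary: 0s and 2s at random.
print(inputs_per_server_tick(60))
# At 62 Hz the phase sweeps steadily: mostly 1s, with a predictable extra
# input now and then instead of a random hole.
print(inputs_per_server_tick(62))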

Server sleep time

The team implemented a tick rate mechanism that gives ample time for the host machine to sleep (to save on compute costs).

It uses a low-resolution timer and nudges the next frame’s timing based on this frame’s performance.
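One common way to implement such a loop looks like the sketch below; the sample's exact scheme may differ, and server.tick() is a stand-in of mine.

import time

TICK_INTERVAL = 1.0 / 60

def run(server):
    target = time.monotonic() + TICK_INTERVAL
    while server.running:
        server.tick()
        # Sleep off whatever budget is left. A coarse timer is fine here
        # because the error gets corrected on the next frame.
        remaining = target - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)
        # Compute the next target from the previous target, not from "now":
        # oversleeping this frame automatically shortens the next one.
        target += TICK_INTERVAL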

Though I wonder how the irregularity in tick rate affects the smoothness on the clients. Theoretically it should be practically insignificant thanks to the interpolation buffers.

Clock drift

As of the time of the talk, the team had found that Unity's framerate limiter accumulated drift over long periods of time.

This may have been addressed by the framerate nudging system described in the previous paragraph, or fixed some other way.

Fractional ticks on client

I don’t understand this technique yet.

The overall idea is that when the client's tick rate differs from the server's, the client still tries to stay aligned with the server's timeline, for smoothness, by "catching up the small leftover".

Classic

Here are topics that have been well described on the net in popular writeups from Glenn Fiedler, Gabriel Gambetta, and Valve, to name just a few.

Reliable UDP implementation

Peter compares UDP and TCP, covers the dangers of UDP, and describes the sample's reliable UDP implementation (for example: sequence IDs and ack bitfields).
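As a refresher, here is a minimal sketch of the receiver's ack bookkeeping, in the style popularized by Glenn Fiedler's articles (not FPSSample's actual code):

latest_seq = None   # newest sequence number received so far
ack_bits = 0        # bit i set => packet (latest_seq - 1 - i) was received

def on_packet(seq):
    global latest_seq, ack_bits
    if latest_seq is None:
        latest_seq = seq
    elif seq > latest_seq:
        shift = seq - latest_seq
        ack_bits = ((ack_bits << shift) | (1 << (shift - 1))) & 0xFFFFFFFF
        latest_seq = seq
    elif 0 < latest_seq - seq <= 32:
        # An older packet arrived out of order; mark it as received.
        ack_bits |= 1 << (latest_seq - seq - 1)
    # latest_seq and ack_bits are echoed back so the sender learns which of
    # its last 33 packets made it through.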

Past input redundancy

Old but gold! The sample sends the past 3 input commands alongside the current one to minimize the impact of a dropped packet.
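A sketch of the scheme (Command, apply_command and the rest are stand-ins of mine):

from collections import deque, namedtuple

Command = namedtuple("Command", "tick buttons")

recent_commands = deque(maxlen=4)   # current command plus the 3 before it

def send_input(command, send):
    # Client: pack a short history alongside the newest command.
    recent_commands.append(command)
    send(list(recent_commands))     # commands for ticks N-3 .. N

last_applied_tick = -1

def apply_command(cmd):
    pass    # stand-in: feed the input into the simulation

def on_input_packet(commands):
    # Server: apply only the commands it hasn't seen yet, so a single lost
    # packet is papered over by the redundancy in the next one.
    global last_applied_tick
    for cmd in commands:
        if cmd.tick > last_applied_tick:
            apply_command(cmd)
            last_applied_tick = cmd.tick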

Lag compensation for hitscan

When the server checks for hit registration, it rewinds the world around the player by [RTT + player's interpolation time]. The client regularly updates the server about its interpolation time; that's how the server knows.
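A sketch of the rewind (the hitbox storage and intersection test are stand-ins of mine):

from collections import deque

HISTORY_TICKS = 60                     # ~1 second of history at 60 Hz
hitbox_history = deque(maxlen=HISTORY_TICKS)

def on_server_tick(world):
    # Record every entity's hitbox once per tick.
    hitbox_history.append({e.id: e.hitbox for e in world.entities})

def check_hitscan(shot, shooter_rtt, shooter_interp_time, tick_interval=1/60):
    # Rewind by the shooter's latency plus their interpolation delay, so the
    # shot is tested against the world the shooter was actually looking at.
    rewind = shooter_rtt + shooter_interp_time
    ticks_back = min(round(rewind / tick_interval), len(hitbox_history) - 1)
    past_hitboxes = hitbox_history[-1 - ticks_back]
    return [eid for eid, hitbox in past_hitboxes.items()
            if hitbox.intersects(shot.ray)]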

Client-side interpolation

Used for framerate-independent smooth rendering and to de-jitter incoming snapshots.
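A toy version for a single interpolated value; the 100 ms delay is an illustrative number of mine, not the sample's:

INTERP_DELAY = 0.1   # render this far in the past so two snapshots straddle us

snapshots = []       # (server_time, value) pairs, appended as packets arrive

def sample(render_time):
    t = render_time - INTERP_DELAY
    # Find the two snapshots straddling t and blend linearly between them.
    for (t0, v0), (t1, v1) in zip(snapshots, snapshots[1:]):
        if t0 <= t <= t1:
            alpha = (t - t0) / (t1 - t0)
            return v0 + (v1 - v0) * alpha
    # Ahead of the newest snapshot (e.g. after packet loss): hold the last value.
    return snapshots[-1][1] if snapshots else None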

Delta encoding compression

The talk summarizes the classic delta encoding technique, where values are encoded against the last snapshot acknowledged by the client. In the end, delta encoding through frame prediction was used instead.

Open questions

Here’s stuff that wasn’t mentioned or discussed at length but that I’m curious about.

Server reconciliation

Peter mentions that if the server doesn't receive a player's input command(s) for a given tick, then that player is assumed to have done nothing during that period. This is only one of the many ways client and server can de-sync.

It’s also mentioned that a full rollback is performed on the client at the beginning of every frame.

I’m curious if/how desyncs are smoothed out.

Client-side interpolation time

I’m curious how the client determines what its interpolation time should be, and how exactly it’s shared with the server.

Multiple client ticks for a single server tick

So given that client and server can have different tick rates, the client might send multiple inputs for a single server tick.

Are the client inputs “merged” together and executed in one shot? Or does the server step through them sequentially? Or something else entirely?

Fractional ticks

How does this work?