
| Networking: Bane of Networked Simulations |
A simulation running on a single machine has fine control over how objects interact and are displayed to the player: poll for input, advance the sprites, check for collisions, and draw the frame. Multithreaded engines are a bit more involved, but synchronization primitives are well understood and very fast. The same doesn't hold true for networked simulations. (Some might argue that RPC is sufficient for network games, but it's basically a wrapper on top of TCP and subject to the same limitations.)
Network programming itself is actually quite simple, just another API to learn. The problem isn't learning how to send data over the wire, but coping with unpredictable and relatively slow communication while trying to maintain a stable simulation. Some developers think they can just stick TCP streams underneath a deterministic or multithreaded engine and they will be done, but it's a well-documented fallacy.
| A basic file or web server compares to a real time networked simulation as the family sedan compares to a racecar. Both operate on the same principles of combustion and acceleration, but one is optimized for speed. Getting a driver's license does not mean that one can hop into a racecar and compete in a race. The same way that racing requires exceptional driving skills not taught in driver's training, designing a fast and responsive network simulation requires more than just knowing how to send and receive network data. |
There is no magic bullet, no function that just makes the simulation operate quickly and reliably over any network. Reducing latency is an integral part of game design, specifically choosing features that work well (or can at least cope) with poor communication. Rigid separation between game logic and network code is a recipe for poor performance.
The following should be the mantra of every multiplayer game designer:
Read it again. Tell a friend. Print some t-shirts. Latency affects games at the most basic level: any action that would change the state of the world must be communicated to the other players, ensuring (a) that the action is still possible and (b) that the appropriate consequences follow. If the round trip for a transaction is longer than 300 milliseconds, there will be noticeable jumps and pauses during gameplay.
Players today expect responsive controls, smooth graphics, and accurate collisions. A game simulation typically updates 20-30 times per second, or roughly 33 to 50 milliseconds between frames. Given that 100ms is a common Internet round trip time, and that modem compression can add another 200ms, a simulation could easily stall for 5 or more frames waiting for data. Add in some typical Internet packet loss and these delays can kill a poorly designed game.
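The arithmetic behind the stall estimate can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and the 100ms and 200ms figures are the ones quoted above, not measurements.

```python
import math

def frame_interval_ms(updates_per_second: float) -> float:
    """Milliseconds between simulation updates at the given rate."""
    return 1000.0 / updates_per_second

def frames_stalled(delay_ms: float, updates_per_second: float) -> int:
    """Whole simulation frames that pass while waiting delay_ms for data."""
    return math.floor(delay_ms * updates_per_second / 1000.0)

# Figures from the text: a 100 ms round trip plus 200 ms added by the modem.
total_delay_ms = 100 + 200

for rate in (20, 30):
    print(f"{rate} updates/s: {frame_interval_ms(rate):.1f} ms per frame, "
          f"{frames_stalled(total_delay_ms, rate)} frames stalled")
```

At 20 updates per second the 300ms delay spans 6 frames, and at 30 updates per second it spans 9, which is where "5 or more frames" comes from.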
There are four places where latency affects networked simulations. The first is the physical network connection between hosts. Electric and optical pulses only travel at a fraction of the speed of light, and the electronics that send and receive these pulses are even slower. Modems, ethernet cards, and routers all take time to process information. We call this link layer latency, and it affects every piece of data sent. This delay is measured as the round trip time of a ping packet (datagram) bounced off the remote host.
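A rough way to see this measurement in code is to bounce small datagrams off an echo service and time each round trip. The sketch below makes several assumptions: it uses a UDP echo rather than a real ICMP ping, it runs its own echo server on the loopback interface so it is self-contained, and every function name is made up for illustration. Over loopback the numbers will be tiny; against a remote host they would include the link layer delays described above.

```python
import socket
import threading
import time

def run_echo_server(sock: socket.socket, count: int) -> None:
    """Bounce `count` datagrams straight back to whoever sent them."""
    for _ in range(count):
        data, addr = sock.recvfrom(1024)
        sock.sendto(data, addr)

def measure_rtt_ms(server_addr, samples: int = 5, timeout_s: float = 1.0):
    """Return round trip times in ms for datagrams bounced off server_addr."""
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout_s)
        for seq in range(samples):
            payload = seq.to_bytes(4, "big")   # tiny sequence-numbered packet
            start = time.perf_counter()
            client.sendto(payload, server_addr)
            reply, _ = client.recvfrom(1024)   # raises on timeout (packet lost)
            elapsed = time.perf_counter() - start
            if reply == payload:
                rtts.append(elapsed * 1000.0)
    return rtts

def loopback_demo(samples: int = 5):
    """Start a local echo server and measure round trips to it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))              # port 0: let the OS pick one
    thread = threading.Thread(target=run_echo_server,
                              args=(server, samples), daemon=True)
    thread.start()
    try:
        return measure_rtt_ms(server.getsockname(), samples)
    finally:
        thread.join(timeout=2.0)
        server.close()

if __name__ == "__main__":
    print(["%.3f ms" % rtt for rtt in loopback_demo()])
```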
Retry latency comes from performing reliable communication over an inherently unreliable medium. Compensating for packet loss is a process of waiting a specified time and resending data until it is acknowledged. The problem is that many operations simply need to be reliable and fast so that a simulation can be synchronized between computers. This type of latency is very dependent on the quality of the network connection and the selected timeout values.
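The wait-and-resend loop, and the cost it adds, can be modeled without touching the network. The sketch below is a simplified model, not a protocol implementation: it assumes a fixed timeout, a fixed round trip time, and (for the average) that each attempt is lost independently with the same probability. The names `send_reliably` and `expected_delivery_ms` are hypothetical.

```python
def send_reliably(attempt_succeeds, rtt_ms: float, timeout_ms: float,
                  max_attempts: int = 10) -> float:
    """Model of wait-and-resend: return ms elapsed until the ack arrives.

    attempt_succeeds(n) says whether attempt n (0-based) gets through.
    Every lost attempt costs one full timeout before the resend; the
    attempt that succeeds costs one round trip.
    """
    elapsed = 0.0
    for attempt in range(max_attempts):
        if attempt_succeeds(attempt):
            return elapsed + rtt_ms
        elapsed += timeout_ms              # waited, no ack: resend
    raise TimeoutError("no acknowledgment after %d attempts" % max_attempts)

def expected_delivery_ms(loss_rate: float, rtt_ms: float,
                         timeout_ms: float) -> float:
    """Average delivery time if each attempt is lost independently.

    The mean number of lost attempts before the first success is
    loss_rate / (1 - loss_rate), and each one costs a timeout.
    """
    expected_lost = loss_rate / (1.0 - loss_rate)
    return rtt_ms + expected_lost * timeout_ms

# Two lost packets on a 100 ms link with a 250 ms timeout:
print(send_reliably(lambda n: n >= 2, rtt_ms=100, timeout_ms=250))
# Average cost on the same link at 10% loss:
print(expected_delivery_ms(0.10, rtt_ms=100, timeout_ms=250))
```

Even in this simple model a single lost packet more than triples the delivery time, which is why the choice of timeout matters as much as the quality of the link.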
Networking built on popular stream protocols, like TCP, is affected by another kind of latency, called flow control latency. These protocols use buffering techniques that are optimized for continuous streams of data, but are not suited for the fast exchange of short data bursts. This type of latency is very hard to measure, but can usually be reduced by disabling various options in the protocol (e.g., Nagle's algorithm).
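As one concrete case, Nagle's algorithm is turned off on a TCP socket by setting the standard `TCP_NODELAY` option. A minimal sketch using Python's `socket` module follows; the two helper function names are illustrative only.

```python
import socket

def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 sends small writes immediately instead of
    # holding them back to coalesce into larger segments.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock

def nagle_disabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0

if __name__ == "__main__":
    s = make_low_latency_socket()
    print("Nagle disabled:", nagle_disabled(s))
    s.close()
```

This removes only one source of buffering delay. The stream is still delivered in order, so a lost segment holds up everything queued behind it.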
Finally, transactional latency is the software overhead of buffering and processing network data. Once the message is buffered by the network card, the application is informed and will read it at the earliest chance. The delay may be small if the networking and game threads run preemptively, or large if messages are only processed between graphics frames. Transactional latency can be measured by posting an empty transaction to the authority and checking the elapsed time between sending the request and receiving the response.
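Both halves of that paragraph can be sketched in a few lines. The first function models the extra wait when messages are only read once per graphics frame; the second times an empty transaction. The names are hypothetical, and `post_transaction` stands in for whatever call the game uses to send a request to the authority and block until the response arrives.

```python
import math
import time

def wait_until_next_poll_ms(arrival_ms: float, frame_interval_ms: float) -> float:
    """Delay before a buffered message is read when the game polls the
    network only at the start of each frame (frames start at time 0)."""
    next_poll_ms = math.ceil(arrival_ms / frame_interval_ms) * frame_interval_ms
    return next_poll_ms - arrival_ms

def time_empty_transaction_ms(post_transaction) -> float:
    """Elapsed ms between sending an empty request and getting its response.

    post_transaction is an assumed callable that blocks until the
    response has been received.
    """
    start = time.perf_counter()
    post_transaction()
    return (time.perf_counter() - start) * 1000.0

# A message arriving 1 ms after a poll waits almost a whole 50 ms frame.
print(wait_until_next_poll_ms(arrival_ms=101.0, frame_interval_ms=50.0))
# Timing a stand-in transaction that does nothing:
print(time_empty_transaction_ms(lambda: None))
```

At 20 frames per second, polling between frames adds up to 50ms in each direction in the worst case, which is comparable to the link latency itself.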
While nothing you can do will eliminate latency, there are design strategies for coping with it. Techniques include testing performance (to measure latency), reducing dependencies (to avoid latency), and improving prediction algorithms (to hide latency). We'll examine some of these techniques in the next few articles about game design.
| Copyright (c) 1999-2003 Matt Slot and Ambrosia Software, Inc. |