Parsec Networking Architecture

© by Markus Hadwiger and Andreas Varga in 1998

1. Introduction

1.1 Goals

1.2 The Big Picture

1.3 Code Structure

1.4 Dynamic Subsystem Binding

2. Networking Subsystem Structure

2.1 Game-Code Interface Layer

2.2 Remote Events

2.3 Parsec Protocol Layer

2.3.1 Peer-to-Peer Protocol

2.3.2 Slotserver Protocol

2.3.2.1 Connecting to a Slotserver

2.3.2.2 Disconnecting from a Slotserver

2.3.2.3 Changing your name on a Slotserver

2.3.2.4 Client-list Updates

2.3.3 Gameserver Protocol

2.4 Packet API Layer

2.4.1 User Datagram Protocol (UDP)

2.4.2 Internetwork Packet Exchange (IPX)

3. Parsec Masterserver

3.1 Masterserver Protocol

4. Portability

1. Introduction

The purpose of this document is to describe the networking architecture of Parsec. Parsec is a 3-D space-fight computer game that we are currently developing. The main focus of Parsec is network game play and therefore network play performance and smoothness have been a very high priority design criterion right from the start. Another important technical design criterion for Parsec was that it should be easily portable between different host platforms. Our current implementation supports DOS, Win32, MacOS, and Linux; other platforms could be supported relatively easily, however, by just implementing the platform-specific subsystems for the desired platform.
This document describes the concepts and implementation of the networking architecture we have designed to support smooth game play on local area networks as well as on the Internet. The actual implementation is hidden behind several abstract layers each exporting a uniform interface. The lowest of these layers encapsulates the transport protocol provided by the host system. We currently support Novell's IPX for playing on LANs and the TCP/IP protocol suite for playing on the Internet (or any other network supporting TCP/IP, including LANs, of course). The protocols of the TCP/IP protocol suite we actually use are TCP for connection management, which is not time-critical and only performed when connecting to or disconnecting from a game server, and UDP for transmission of game state, which is performance-critical and done periodically with frequencies between five and sixty Hertz, depending on the available bandwidth.

In this document we first give an overview of the goals we wanted to achieve in the development of Parsec's networking architecture. Next, we describe how network game play in Parsec works in general. After that, we introduce the general coding convention we use to ensure flexibility, configurability, and portability. This encompasses the notion of abstract subsystems that can be bound dynamically and changed on-the-fly from within Parsec's command console.
Starting with section 2 (Networking Subsystem Structure) we describe the actual networking subsystem in detail.

1.1 Goals

This section names some of our major goals we wanted to achieve in the design and implementation of Parsec's networking subsystem. Some of these goals (say, smooth game play) are quite huge issues, while others (say, dynamic HTML page generation by servers) are rather small details that nevertheless are also very important.

The following list contains those goals that are directly visible to the user, i.e., player of Parsec:

Smooth game play even on low-bandwidth connections (i.e., 28.8kbps)
Easy to handle and set up peer-to-peer protocol for playing on small LANs (one subnet)
Client/server protocol for playing over the Internet and large LANs (more than one subnet)
Dedicated game servers for simultaneous handling of up to 16 players each (current version)
Master server for registration of all currently running game servers
Easy location of a nearby game server via HTML pages the master server generates dynamically
List of currently logged in players also available on HTML pages the game servers generate

The following goals are related to development, maintenance, and porting. Thus, they are not as important to users as they are to the developers. Nevertheless, they have a strong influence on a lot of things that also affect the average user, say, functionality, or number of supported platforms.

Easily portable to other host systems and networking protocols
Layered structure with abstract interface definitions
Encapsulation of system details behind these interfaces
Transparent on-the-fly switching of subsystems

The last requirement is a bit of an exception here, as this capability is exported in Parsec's command console where the user can switch protocols on-the-fly. This concerns both low-level protocols (IPX, UDP, etc.) and game protocols (peer-to-peer, client-server). Note that although this document almost exclusively mentions aspects of the networking subsystem, these goals equally apply to all other subsystems. For example, the video subsystem may also be changed dynamically.

The issues brought up by the design goals listed in this section will be elaborated upon in the remainder of this document.

1.2 The Big Picture

In this section we try to provide a basic understanding of how everything else that will be described in detail later on fits into the overall picture, that is, into the actual computer game. We describe what network game play is all about, how users interact with Parsec and its networking, and other related issues like client- and server-side simulation.

There are a number of issues and corresponding problems when integrating network game play into a computer game. Naturally, these depend tremendously on the kind of game it is, what players can and cannot do, how the world is represented and visualized, and so on. In Parsec, nearly all interaction is done via a single spaceship that the player navigates through outer-space. Apart from small objects that can be collected and do not move about by themselves (e.g., extras for more energy, a missile pack, etc.), there is a restricted number of different kinds of moving objects. The most important moving objects are the players' spaceships. Each player controls his/her spaceship and is able to create other moving (e.g., missiles) and static (e.g., mines) objects. All these things have to be replicated over the network at all the other players. Note that in later sections players will always be referred to as clients, whose actions are comprised by everything the player does as well as everything the networking subsystem of his Parsec process does.

Generally, the information created at each client can be categorized into state information and events. State information subsumes the current state of everything that all other clients always need to know about. That is, in the case of Parsec, the local player's position and orientation. These are mostly unconstrained, i.e., there are no walls the player can bump into or other artificial limits, apart from collision detection with other ships. The current state is part of every network packet Parsec sends to other players or the server, respectively.
Everything else is done via the notion of events (see the section on remote-events). First, this is economical since the player always has to do something special in order to create an event. This could be something like pressing the button of a joystick to fire the ship's laser, or using the 'name' command in the command console to change the player's name, for example. If the player doesn't do anything there will be no events. Second, it's also both very convenient and versatile since the game server can process events differently, say, according to different priorities with respect to latency, for instance.
State, on the other hand, will normally change every single frame if the ship is moving or rotating at all. State and events are fundamentally different, one is the continuous change of a certain parameter like position over time, and the other is the occurrence of, well, events at discrete instants in time. Nevertheless, bandwidth can also be saved when transmitting state information, for instance by using dead-reckoning.
In general, dead-reckoning describes an approach to state transmission where prediction of a client's actions in conjunction with extrapolation/interpolation is used to cut down on the amount of information that actually needs to be sent. In the case of a spaceship, we can send the current velocity and/or acceleration along with the current position and the server can extrapolate the position at any time later on, that is, predict the client's position, provided the client doesn't change its velocity. Normally, both client and server perform the same prediction algorithm in order to determine whether the deviation of the predicted position from the actual position is high enough to justify sending another state update.

In Parsec we use a very simple approach to dead-reckoning. Currently, state updates are only omitted from sending if the state hasn't changed at all. Apart from simplicity, there are two reasons for this. First, the maximum number of concurrent players in a single game (16) is currently very low. You can't play a death-match kind of game with a very high number of players anyway, since the game play from a user perspective simply won't work. The second, very important reason is that if a player does anything at all (i.e., he is not currently typing funny comments in the command console) state changes are nearly always too important not to be sent. That is, the error threshold would be constantly exceeded. And, furthermore, we already have to use interpolation and smoothing code to get a smooth appearance of a player flying (and navigating) at 40 frames per second but whose packet send frequency is below, say, 20 Hz.
To summarize, if the state changes at all, state updates are sent at the highest possible frequency, that is the frequency possible for the current bandwidth. This packet send frequency is user-selectable. This is necessary to achieve smooth game play which is a very important issue and should really be provided to the player, most importantly since the number of players is never very high (at least with respect to scenarios where a sophisticated dead-reckoning approach becomes crucially important).

Another very important issue is, simply put, who is responsible for what. That is, in a client-server game, who performs the actual simulation, the client, the server, or both a part of it? For example, a user presses a button to fire something nasty at one of his colleagues.
In, say, scenario one the client sends a request for the actual event to the server, the server creates the corresponding object (e.g., a missile) and sends the object's state back to the client. Every time the state of the missile changes (significantly) a state update is sent from server to client. The crucial thing about this approach is latency. If the latency is not extremely low the user will experience a perceptible delay between his action (pressing the button), and the event's occurrence (the missile rushing away at breakneck speed). We will dub this approach server-side simulation. The same problem applies to movement, that is, if the server determines whether a certain movement is allowed after performing collision detection the user will perceive considerable lag if the latency is too high. A possible remedy for the problem of perceived latency would be to allow the client to move in advance and try to recover if the server sends back something different later on, say, the movement hasn't actually been possible due to a collision with another spaceship.
In scenario two, which we will call client-side simulation, the client is basically allowed to do everything without consulting the server first and only has to inform the server of what he does/has done and gets information from the server about what the others do. In this case there is the problem of considerable inconsistencies since there is no central server ensuring race-conditions and other conflicts are resolved properly.
There also is the issue of who should calculate the actions of non-player-controlled objects like mines, energy-fields and the like. Both approaches of the scenarios above could be combined. Say, some things could be simulated on the server side and other things on the client side.

In Parsec we currently use a strict client-side simulation approach. That is, there is no perceptible delay between an action and the corresponding event. The client informs the server of what events occurred, the server forwards this information to the other clients who then replicate the events. Normally, at least some things would have to be simulated on the server and our simple approach is currently only feasible due to the simplicity of the world. In the future we are certainly going to introduce certain aspects of server-side simulation, although the local player's actions will almost certainly always be simulated on the client-side to avoid very disturbing lag. After all, a frame rate of 40 Hz is of no big use if user actions can only cause events a few hundred milliseconds later on.

1.3 Code Structure

All networking code is written in a C-like subset of C++. This means that we use things like C++ comments and inline variable declaration but we don't use classes, templates, and the like. In order to still be able to determine which subsystem a, say, function belongs to we encode this information in the function name itself.

We use a rather strict naming convention for module and function names. Every module name of the networking subsystem starts with the NET_ prefix (e.g., NET_RMEV.C, which contains all remote-event functions) if its implementation is the same for all target systems. If the module implements a specific functionality but the implementation differs from system to system its name starts with the Nx_ prefix where x denotes the target system. For example, UDP function implementations for Win32 are contained in NW_UDP.C, whereas the corresponding implementations for the MacOS are contained in NM_UDP.C. These modules export the exact same interface, that is, the caller always uses the same function name, although the actual implementation will be quite different for each system.

Every interface function of the networking subsystem starts with either the NET_ or the NETs_ prefix.
The implementation of NET_ functions is the same for every system, i.e., these functions are not system-dependent. Their implementation is portable code that is used for every target system. For this reason these functions are always bound statically. In C++ NET_ functions would be non-virtual member functions. For example, utility functions like NET_FetchPlayerName() belong to this category.
The implementation of NETs_ functions, however, differs from system to system and this fact is already announced by their function name. The caller doesn't know which implementation will actually be called at run-time. First, for each system the implementations will always be different, although the function name will be the same. Furthermore, if dynamic binding is enabled (via a compile-time switch) the implementation may be switched at run-time. For example, the abstract interface specifies a NETs_SendPacket() function whose implementation is different on, say, Win32 and Linux, and, moreover, whose implementation is different depending on whether a UDP or an IPX packet should be sent. Nevertheless, this is entirely transparent for the caller. Wherever a call to NETs_SendPacket() is specified this will work correctly on every system for whatever subsystem implementation may be currently active at run-time, say, Win32/UDP, Win32/IPX, or Linux/UDP. NETs_ functions are specifically named to announce this system-dependence to the caller. If dynamic binding is enabled NETs_ calls will be routed through a jump-table, although this may be disabled transparently at compile-time if on-the-fly subsystem switching is not desired and the overhead for the indirect function-calls is considered a performance issue. In C++ NETs_ functions would be virtual member-functions.

1.4 Dynamic Subsystem Binding

As already mentioned in the preceding section, subsystems in Parsec can be bound dynamically in order to allow their changing at run-time (on-the-fly). All interface functions of system-dependent subsystems are called through a jump-table (provided dynamic binding has been enabled at compile-time). For the networking subsystem there is exactly one such jump-table containing the addresses of the currently bound implementation for each function of the abstract interface. If a subsystem is bound or re-bound this jump-table will need to be updated.

On-the-fly networking subsystem switching is very easy:

1. Call NETs_KillAPI(): this de-inits the old system

2. Update jump-table: this re-binds the entire interface

3. Call NETs_InitAPI(): this inits the new system (implementation has already been switched!)

As will be detailed in section 2 (especially sections 2.3 and 2.4) two important parts of the networking subsystem are the Parsec Protocol Layer, whose implementation depends on the currently active high-level game protocol (the server-less peer-to-peer protocol, and the slotserver and gameserver protocols which are two different client-server protocols) and the Packet API Layer, which encapsulates the currently used transportation layer provided by the host system (say, UDP or IPX). These two components of the networking subsystem can be changed independently during run-time. That is, the connection to the server can be broken and the protocol can be reconfigured to be peer-to-peer instead of client-server, and so on. These issues will become more clear in the next section.

2. Networking Subsystem Structure

Parsec's networking subsystem basically consists of three distinct layers and their respective abstract programming interfaces. The relationship between these layers is not strictly one-on-top-of-the-other, as can be seen in Figure 1.

Network subsystem structure

Figure 1: Networking subsystem structure

First, there is the Game-Code Interface Layer that decouples the actual game-code from all networking-code details. Apart from a few general utility functions everything in this layer is done via the notion of remote-events. If the game-code creates an object, for instance, it invokes the corresponding function of the game-code interface to notify the networking-code of this event's occurrence. All game events that have to be replicated at other clients are dubbed remote-events in this context.
Second, there is the Parsec Protocol Layer that relates to the way clients communicate with each other. Details of whether this happens in a peer-to-peer or client-server fashion, for example, are entirely hidden in this layer. Since connection management (e.g., establishing and breaking connections) is contained in this layer it is both used directly by the game-code as well as by the game-code interface layer.
Third, there is the lowest-level layer that is responsible for actually sending already composed packets over the network, using UDP or IPX, for instance. We call this the Packet API Layer.

2.1 Game-Code Interface Layer

When developing a network game there is the problem of how the game-code should interface to the networking code. First, the game-code should be entirely decoupled from all network implementation issues. Second, there should be a uniform and abstract interface between the game-code and the networking-code. In Parsec this is done through the game-code interface layer. This layer decouples the game-code from all networking issues that are not directly related to game play. Note that, as can be seen in figure 1, the game-code also directly interfaces to the Protocol API which mostly lies beneath the game-code interface layer. Nevertheless, this is limited to a few functions the game-code invokes to establish and break a connection and join and unjoin a game, respectively. Everything else the protocol layer provides is exclusively used by the game-code interface layer itself.
The decoupling of game-code and networking-code is primarily achieved through the notion of remote-events. Each remote-event corresponds to an event in the game that needs to be replicated at the other clients (see section 2.2). The game-code interface mostly is a collection of functions each corresponding to a specific remote-event. One can imagine these functions as being callbacks from the game into the networking-code. Every time an event occurs inside the game, say, a missile is fired or a particle object is created, the game calls the corresponding remote-event function to inform the networking-code of this event's occurrence. These functions basically just record all events and the next time when a packet is sent the accumulated events are transmitted and thereafter replicated at the other clients. After events have been sent they can be deleted at the client that created them.
Apart from remote-event functions, the game-code interface also includes a couple of utility functions like NET_FetchPlayerName() to determine the name of a specific player (for example, to display the name in the heads-up display), and NET_FetchOwnersShip() to acquire a pointer to the spaceship of a player who owns another object (say, a missile or a mine). Ownership of objects is needed to determine who shot down whom, for instance.

2.2 Remote Events

Apart from a fixed-size header containing static game state - basically just position and orientation of the local player's spaceship - the content of every network packet is a list of remote-events. Remote-events describe everything happening in the game-engine that needs to be replicated at the other clients. The preceding section describes how the game-code creates these remote-events. This section describes how remote-events are handled by the networking-code.
All functions that create remote-events share the NET_RmEv prefix. If the player fires a missile, for example, and therefore indirectly creates a RE_CREATEMISSILE remote-event, the game-code invokes the NET_RmEvCreateMissile() function. This function creates the actual event. That is, it allocates and initializes the data structure and appends it to the global list of remote-events. This list contains all remote-events that have accumulated since the last time a packet has been sent in the order of their temporal occurrence. The low-level networking-code (Packet API Layer) retrieves the events stored in the global list and stores them into the network packet immediately before it is sent. For those remote-events that are time-critical (e.g., missile creation, RE_CREATEMISSILE) the position of the contained object is updated to the current position at the latest possible time (i.e., immediately before sending) to ensure that the position is up-to-date at the time of sending.The following remote-events are currently defined:

RE_CREATEOBJECT An object of general type has been created.

RE_CREATELASER A laser object has been created (i.e., fired by the local player).

RE_CREATEMISSILE A missile object has been created (i.e., fired by the local player).

RE_CREATEEXTRA An extra has been created (e.g., a package containing five missiles).

RE_KILLOBJECT A specific object should be removed at all other clients.

RE_SENDTEXT This can be used to send arbitrary text (usually entered in the console).

RE_PLAYERNAME The local player has changed his name, inform other clients.

RE_PARTICLEOBJECT A particle object has been created (e.g., an energy-field).

RE_PLAYERLIST list of currently connected clients (needed by peer-to-peer protocol).

RE_CONNECTQUEUE list of clients currently trying to connect (needed by peer-to-peer protocol).

The most important remote-events are those that create objects (RE_CREATEOBJECT, RE_CREATELASER, RE_CREATEMISSILE, RE_CREATEEXTRA, RE_PARTICLEOBJECT). When another client receives such a remote-event it will replicate the object described within the event structure. For example, it will create a missile at a certain position, flying with a certain speed in a certain direction.
Players can talk to each other by typing text in Parsec's command console. This text will be transmitted as part of the RE_SENDTEXT remote-event. Name changes create RE_PLAYERNAME events. The RE_PLAYERLIST and RE_CONNECTQUEUE remote-events are only used by the peer-to-peer startup phase (see Peer-To-Peer Protocol). They contain a list of all clients currently connected or currently trying to connect, respectively. These remote-events are needed to guarantee predictable behavior during startup when race conditions occur.
Remote-events can be processed by the server and be altered to suit the current bandwidth. For example, the server can drop non-critical events or buffer up events that don't need to have minimum latency in favor of the transmission of other events that are latency-critical.
An advantage of remote-events that is not directly related to game play is that they can be recorded to file and then be played back at a later time (demo recording and playback). If a stream of remote-events is played back everything will behave exactly the same as when the events were originally created. Since nearly everything is encapsulated within its corresponding remote-event it's quite easy to, say, also play back everything that the players 'said' to each other during the original session. This is not really a fundamental property of remote-events but they provide a nice, transparent way to do things like that.

2.3 Parsec Protocol Layer

This layer is not simply named 'protocol layer' because it has nothing to do with the networking protocols provided by the host system. These (e.g., UDP) are encapsulated in the Packet API Layer. The Parsec Protocol Layer, however, encapsulates Parsec's proprietary game protocols.
The most important functions of this layer are NETs_Connect() and NETs_Disconnect(), to establish and break a connection to a server, respectively, and NETs_Join() and NETs_Unjoin() to join and leave a game, respectively. In a sense these functions are game-code interface functions since they are also called directly by the game-code.
There is an important distinction between connecting/disconnecting and joining/unjoining, even when the peer-to-peer protocol is used. Whenever a client wants to join a game it first has to establish a connection. This doesn't necessarily mean an explicit connection in the sense of a client-server connection. First, it means that a client determines whether there are other players already playing. Second, by establishing a connection a client reserves a so-called slot. A slot is an allocated position in a specific game. All protocols have certain limits to the number of concurrent clients, that is, their number of available slots. If all slots are already assigned to other clients the new client cannot connect to this game. As soon as a connection is established (this may be an actual connection to a server but it need not be) the client has logged into the game. This means that the actions of all other players will be received periodically from this time on. However, the client has not yet already joined the game. The game can now be joined/unjoined at any time and the slot will stay allocated the entire time until the connection is broken. When a client disconnects it logs out of the corresponding game and its slot will become available for other clients. Also, although we have not yet implemented it, connecting to a game would allow to watch the game in a special spectator mode. Since a spectator cannot join the game it is invisible to the other players and only receives packets instead of sending and receiving them (apart from infrequent NODE_ALIVE packets to keep the connection up).
All protocols share the property that there is no explicit creation of a game session which all others need to join before the game can be started. Clients can join and unjoin a specific game at any time as long as the limit to the number of concurrent clients is not exceeded.

2.3.1 Peer-To-Peer Protocol

When this protocol is active all clients communicate in a strictly peer-to-peer manner. That is, there is no dedicated server (nor is there a client also acting as server). During startup a client broadcasts a connect request to find other clients that are already playing. If no other clients are found during this startup phase the client decides that it is on its own and enters the game alone. Others can then connect at any time later on. There cannot be more than four players in a single game at any one time, though.
Packets that transmit the current game-state to other clients are sent to all other clients in a unicast fashion. Theoretically, it would be ideal to use multicasting for this purpose, but in this way the underlying network is not required to support any type of multicasting. Since the number of clients for peer-to-peer play is limited to four, this is perfectly feasible.
The most demanding problem of the peer-to-peer protocol is the startup phase (this part of the protocol is referred to as the startup protocol). Since there is no server there are a lot of potential problems with race conditions when two or more clients start up at approximately the same time. We have designed a quite involved startup protocol to resolve all of these problems and achieve predictable and consistent results in all cases.

Basically, in the peer-to-peer protocol clients find each other by broadcasting an initial connect request (multiple times if necessary) and waiting whether they get a reply by other clients that are currently running and therefore listening to such connect requests. If after a certain amount of time no reply has been received a client decides that it is alone in its world and enters without any further ado.

The following state diagram illustrates the peer-to-peer startup protocol using pseudo-code to describe what is done in each state:

Figure 2: Peer-to-peer startup protocol state diagram

There are two fundamental approaches used here. First, conflicts of more than one client starting up at approximately the same time (which means at least on client is starting up during the startup phase of another client) are resolved by comparing the clients' node addresses. Since these should be unique, in case of conflict the client with the higher address wins. Second, clients that are starting up always queue connect requests they receive by other clients. When a client becomes the slave of another client that has just broadcast its connect request it forwards its queue to that client. This is necessary to ensure no connect requests get lost. Even though connect requests are sent multiple times if no reply has been received, this alone is not sufficient to guarantee the entire request cannot be lost when yielding to another client without queuing up requests and forwarding them when necessary.

There are three major loops that are of importance to us here. Two of them are shown in the state diagram above and are part of the startup protocol.
First, there is the connect-loop. Immediately after starting up a client enters this loop. It will be exited if this client either yields to another client with higher status (a client that is already connected, or a client also just starting up but with higher node address), or a certain amount of time has elapsed in which no clients of higher status have identified themselves. If this happens a client sends connect replies to all queued clients of lower status and enters the game-loop (the third loop, see below).
Second, there is the slot-request loop. A client enters this loop when it either implicitly decides for itself that it is a slave (a connect request of a client with higher node address has been received), or when it receives a SUBDUE_SLAVE notification, thus explicitly becoming a slave. A client being in the slot-request loop waits there until it receives a message by the master allocating a slot for the client. Normally, it is guaranteed that this will always happen for clients being in this loop; at least as long as the current (temporary) master doesn't crash in the middle of the startup phase while another client is in its slot-request loop. If this should happen, though, the client will time out, display an error message, and the user has to issue another connect command.

Either way, after the startup phase a client will enter the game-loop. (This loop is the actual game (playing, rendering, etc.), as far as the networking-code is concerned.) There are two ways in which this is achieved. Most clients enter the game-loop as slaves, that is, someone already running has assigned them a slot. A client can also enter the game-loop as temporary master. Temporary in this case means that there actually is no master/slave relation as soon as all clients are in the game-loop. Nevertheless, during startup a temporary master is elected who assigns a slot to itself and each of the slaves in order to ensure race conditions cannot occur. After all slots have been assigned all clients are equal.

There is a very good reason not to retain the already established master/slave relationship once everyone is in the game-loop. If the master exits another master has to be reelected. First, this is not quite trivial to do in a fail-safe way, and, second, there are all sorts of problems when clients try to start up while no new master has been elected yet.

Clients in the game-loop react to connect requests they receive in the following way. First, they look at their list of peers (including themselves) and establish a unique ranking order using the node addresses. They then wait their rank (starting with 0) times a certain wait-interval. If they are not able to snoop a connect reply of another client during this period, they will send a reply themselves. This approach ensures that somebody will reply to the connect request, even if clients that are still in the list have already exited without sending a notification (i.e., crashed, or their host has been turned off without exiting). It also ensures that there are no conflicts, since the replicated client list can only be inconsistent with respect to crashed clients, but there cannot be clients that are connected but not part of the list. Therefore, no two clients will try to reply at the same time.
In essence, this approach amounts to some kind of dynamic master-reelection protocol each time a new client tries to connect.

2.3.2 Slotserver Protocol

This protocol is basically a peer-to-peer protocol, although startup is handled by a dedicated server. This approach remedies the main problem of the peer-to-peer protocol: the need for broadcasting capabilities in the startup phase.
If all players are on a single LAN consisting of just one Ethernet segment, the peer-to-peer startup phase is a working but complex method to find other players. This is true with the IPX and the UDP packet API.
If however players are located on different networks, broadcasting is no longer possible with UDP, because it is limited to a single subnet. A solution to this problem is a method in which all clients query a certain well-known authoritative server to find other players. This server is called the slotserver. Its task is to allocate slots for each player that connects to the server, and to transmit the current list of connected clients to each new client. If a client disconnects from the slotserver, its allocated slot gets freed again, allowing another client to occupy this slot. Whenever a client is added to the list or removed from it, the list gets updated at each other client, so that every client knows each other client at any time.
The server generates a HTML file of this list, depicted in Figure 3, which can be viewed with any web browser. This allows the potential player to take a look at who is currently connected to the server without the need to start up the game, assuming that an http server is running on the same host as the slotserver.

Figure 3: Taking a look at the players connected to a server

The rest of the slotserver protocol is identical to the peer-to-peer protocol, which means that packets are exchanged between all clients. Note, that no game packets are received by or sent from the slotserver itself. Therefore the slotserver does not actually know what is happening in the game, and the speed of the connection between clients and server does not affect gameplay.

The current implementation of the slotserver protocol uses TCP/IP to let clients and servers communicate with each other. This was chosen because a reliable protocol is preferred (to avoid the same problems that occur in the peer-to-peer startup phase), and slotserver connections are not time-critical.

2.3.2.1 Connecting to a Slotserver

The slotserver process listens on TCP port 6582. After a TCP connection is established with the server, the client may send the following protocol string to register for a game:

joinline ::= 'version:' versionspec 'join me:' playername 'os:' osname

    versionspec ::= digit '.' digit
    playername   ::= 31 character string containing the playername
    osname       ::= 31 character string containing the name of the clients' operating system

for example:

"version: 1.0 join me: sid os: powerpc-apple-macos"

This will cause the server to allocate a slot for the player named "sid" that's is using the 1.0 version of the protocol to connect with the MacOS client of Parsec.

The server might then reply with the following message:

"join from sid accepted, slot: 2, server name: TestServer"

This is the acknowledgment for the client that the connection was accepted from the server and that a slot (in this example the slot with the number 2) was successfully allocated. The server also tells the client its name, which the client will display on the screen.
After a successful join, the server will send the client four message strings. These strings are configurable by the server admin, and can contain a welcome message, a special rule for the game, or some other text, that new players on the server should read.

However the client might also receive one of the following messages:

"server full"

No free slot was found, try to connect again later.

"you sent an invalid request"

The server could not parse the protocol string you sent. This should not happen normally. Try connecting again.

"server version incompatible"

The server uses a different protocol version than the client, and a connection is not possible. This means that either the server or the client are old and should be updated with a newer version.

"you're banned from this server"

The server does not allow connections from your IP address, because the server admin has banned you. This feature can be used to restrict access to the server for people that don't know how to behave.

2.3.2.2 Disconnecting from a Slotserver

If the player decides that he wants to leave the server he is connected to, he might either issue a DISCONNECT command on the Parsec command console, or he might quit the game. In both cases the client has to send a remove message to the server, to inform it of this change. It does this by simple sending the protocol string:

"remove me"

to the slotserver.
The server will in most cases reply with:

"removing you"

if no problems occurred.
However it also might reply with one of the following protocol strings:

"you sent an invalid request"

The server could not parse the protocol string you sent. This should not happen normally.

"you're not even joined"

This message will be sent if the server does not know that you are connected. This should not happen under normal circumstances, because it indicates an inconsistency in the client-list on the server.

2.3.2.3 Changing your name on a Slotserver

If a player wants to change his playername, he can simply send the following protocol string to the slotserver:

"my new name: maverick"

The server will send either of the following reply messages:

"new name accepted"

This is the expected acknowledgment.

"name already taken"

This is the reply string that indicates that another player already uses this name. There must not be two players with the same name on the server.

"you sent an invalid request"

The server could not parse the protocol string you sent. This should not happen normally.

If the name change was successful, the server will update its client-list and send a name update message (see next section) to all other clients.

2.3.2.4 Client-list Updates

After the client has connected to the slotserver and the connection was successfully accepted, the client will receive client-list updates from the server, if other players connect, disconnect or if one of them changes his/her name.

The server might send a protocol string like this:

"add ip: 192.168.1.1 name: msh in 0"

This will tell the client that a player named "msh" has connected to the server, from a host with the IP address 192.168.1.1, and was assigned slot 0 by the server.
The client now has to enter that information into its own local copy of the client-list, and will start to send game packets to that new client.

Later on, the player "msh" might disconnect from the server. In this case the server will send all other connected clients a remove update, which might look like this:

"remove ip: 192.168.1.1"

This will tell all clients that the client connected from 192.168.1.1 has left the server, and should no longer receive packets.

If a player decides to change his/her playername, the server will inform all other clients of this change by sending a name update:

"name ip: 192.168.1.1 name: maverick"

The player from 192.168.1.1 has changed his name to "maverick".
NOTE: since the slotserver shares common code with the peer-to-peer protocol, playername updates are also distributed via the RE_PLAYERNAME remote event between clients. The slotserver redundantly sends the name update via the TCP connection again. The only case were this makes sense is if the packet containing the remote event got lost accidentally.

2.3.3 Gameserver Protocol

This is Parsec's full-fledged client-server protocol.
The communication between client and server is exactly the same as in the slotserver protocol, so everything in section 2.3.2 also applies to the gameserver protocol, except for the fact that the gameserver process listens on port 6581 instead.

If a client is connected to a gameserver, it has to send all packets to the server. The server receives the packets, and decides which information should be forwarded to which of the other clients.
This method greatly reduces the necessary bandwidth requirement for the client's connection to the Internet. Especially if a player is connected using a standard analog modem and wants to play with many other people, it makes a big difference if packets have to be sent to n-1 other clients (as is the case with the peer-to-peer protocol), or if a packet has to be sent to just one other host, the gameserver.

Furthermore, the gameserver can use special algorithms to minimize bandwidth requirements even more, by filtering certain packets. For example, the gameserver keeps track of the distance between each pair of players, and only sends a gamestate update from player A to player B if the distance between the two players is smaller than a certain constant value. So if the two players are very far away, and they can't see each other anyway, no unnecessary packets are sent.

Another way in which the gameserver tries to reduce packet traffic, is by filtering PKTP_NODE_ALIVE packets. Node-alive packets are constantly sent by players that are connected but are not joined yet. Each client maintains a node-alive counter that is decreased constantly and reset to its initial value if a node-alive packet is received. This allows the client to determine if another client has crashed (because it is not sending node-alive packets anymore, and the counter has reached zero). Since the gameserver keeps track of each player in a gameserver-based game, there's no need to forward node-alive packets to all clients, and the node-alive counter is maintained at the gameserver, not at each client.

Tests have shown that packets often reach the gameserver in bursts. In a burst of packets both old and new remote events might be present. Certain remote events are more time-critical than others (for example a RE_CREATEMISSILE event has a much higher priority than a RE_SENDTEXT event) and therefore these remote events deserve special treatment by the gameserver. If a burst of packets is received, all remote events contained in these packets are buffered, and special time-critical remote events (like laser and missile creations) are sent first, while outdated ones will be dropped and others will be delayed.

2.4 Packet API Layer

This layer encapsulates the networking protocol (transport protocol) provided by the host system. First, this means abstract functions for sending and receiving packets. Second, this layer must also encapsulate the physical network address format. For example, for UDP this means IP addresses (even though these will be mapped to MAC addresses by the host system, e.g., ethernet addresses); IPX uses ethernet addresses directly.
Network addresses are encapsulated using the abstract node_t type. Variables of type node_t may be copied and stored, but their actual meaning is only known by functions of the packet API layer which take parameters of type node_t.

2.4.1 User Datagram Protocol (UDP)

We support UDP on Win32, MacOS, and Linux. Currently, the client-server connection also uses TCP for a limited number of critical tasks. Connections are established and broken using TCP. Mostly, though, we use connection-less unreliable transport for performance reasons, especially for everything that must be sent periodically (PKTP_GAME_STATE, PKTP_NODE_ALIVE). If UDP is selected as Packet API this actually means using the TCP/IP protocol suite. As already mentioned, we use both UDP and TCP. This slight inconsistency in terminology is due to the fact that UDP is used ninety-nine per cent of the time (i.e., after the connection is established until it is broken once again).
Clients are listening for UDP packets on port 6526, while the gameserver listens for UDP packets on port 6522.

2.4.2 Internetwork Packet Exchange (IPX)

Under DOS, Win32, and MacOS we support Novell's IPX (internetwork packet exchange) protocol. IPX uses a four-byte network address to identify the destination network along with the six-byte ethernet address of the destination host to identify a specific node. Currently it's only possible to play using IPX if all clients are connected to the same physical network (subnet).
A very important restriction of the IPX support is that it doesn't support client-server protocols. The peer-to-peer protocol must be used if IPX transport is desired.

3. Parsec Masterserver

Peer-to-peer is the protocol of choice if all players are located on the same Ethernet segment. In this case the peer-to-peer game will perform slightly better than the gameserver protocol.
If the players are not on the same Ethernet segment, but on a big LAN (for instance the TUNET), the slotserver protocol should be used, because it has the same performance properties as the peer-to-peer protocol, but does not suffer from the same limitations.
In these two scenarios all players are geographically close to each other. Scheduling a game beginning time, and finding other players for the game is usually not an issue.
However, the most common situation is a single player that just wants to play the game. He might be anywhere on the globe, and needs to find other players that want to play too. One possible solution to this problem would be the introduction of chatrooms or a public forum, for example Internet Relay Chat (IRC). However this would require the player to have an IRC client installed, and the necessary knowledge to use it.

A different solution that is commonly used is the concept of masterservers. These are special servers, that normal gameservers connect to, to announce their availability. The masterserver can therefore maintain a list of currently running gameservers, and provide potential players with this list. The player can then choose a gameserver from the list, and connect there to play with the other players on that server.
Parsec's network architecture includes a fully functional masterserver. It uses a TCP-based protocol that is very similar to the slot/gameserver protocol described in section 2.3.2. Whenever a gameserver starts up or shuts down it sends a message to the masterserver to inform it about its existence. It also tells the masterserver how many players are currently connected and how many players are allowed as a maximum.
The masterserver maintains an internal list of all gameservers, and creates an HTML file of this list, which is dynamically updated. An http server should be running on the masterserver host, to enable users to download the list with any standard web browser. Figure 4 shows a screenshot of this list page.

Figure 4: Finding a server with just a web browser

Since http is such a simple protocol, it would also be possible to download the masterserver list from within the game itself, or using a special stand-alone tool.

3.1 Masterserver Protocol

As described in the previous section, the masterserver provides a protocol that's very similar to the slot/gameserver protocol. In this context the slot- or gameservers are basically clients to the masterserver, and communicate with the masterserver via TCP, like the game clients communicate with slot- and gameservers.
The masterserver listens at TCP port 6502. When a slot- or gameserver starts up it connects to the masterserver at this port and sends a join protocol string, which is defined like this:

joinline ::= 'version:' versionspec 'join me:' servername 'os:' osname 'maxplayers:' maxplayercount url: urlspec

    versionspec    ::= digit '.' digit
    servername     ::= 31 character string containing the name of the server
    osname         ::= 31 character string containing the name of the servers' operating system
    maxplayercount ::= digit
    urlspec        ::= 127 character string containing a valid URL to a WWW page describing the server

for example:

"version: 1.0 join me: Gamma_Quadrant os: i586-pc-linux-gnu maxplayers: 16 url: http://gammaquadrant.parsec.org/"

The join string contains a protocol version just like the slot/gameserver protocol, followed by its name (a fictive name for the server should be chosen by the server administrator), its operating system, the maximum number of players that the server can handle and an URL pointing to a server status page (this URL will be used to create a link in the list page that the masterserver generates).

The server will then reply with one of the following messages:

"join accepted"

This indicates that the join was successful, and that the slot- or gameserver has been successfully entered into the masterservers list.

"masterserver full"

The masterserver has an internal limit on the number of servers it can handle (usually 256). If this reply message is received you should try to start the server again later.

"illegal request not processed"

The masterserver was not able to parse your request. This should not happen normally.

"masterserver version incompatible"

The masterserver is using an incompatible protocol version. Either the slot/gameserver or the masterserver is old, and should be updated.

If a slot/gameserver quits, it send a remove message to the masterserver to inform it about this event. This is achieved by sending the protocol string:

"remove me"

The masterserver will then most likely reply with:

"removing you"

If the masterserver receives a remove message from a server that is not in its server-list, it will reply with:

"you're not even joined"

This should not happen under normal circumstances.

If the masterserver was not able to parse the request it will reply with:

"illegal request not processed"

If a player connects to or disconnects from a slot/gameserver, the number of current players at the server must be sent to the masterserver, so that the masterserver can insert that information into its HTML list. The server does this by sending a message like:

"players: 3 max: 16"

This indicates that the server now has three players connected, out of a maximum of 16 players. The masterserver displays this as "3/16" in its HTML list.
After the players update has been received by the masterserver it will reply with:

"players acknowledged."

or:

"illegal request not processed"

if the request could not be parsed correctly.

4. Portability

One of the major technical properties of Parsec is its high portability. Using the code naming convention described in section 1.3 it is possible to separate system-dependent code from fully portable code. For example, the modules containing the implementation of the packet API for IPX are called ND_IPX.C for DOS, NW_IPX.C for Win32 and NM_IPX.C for MacOS. Since these three operating systems use different APIs to provide access to IPX services, all three modules implement the same functions, but differently.
This approach worked fine for IPX, because IPX is only used for the peer-to-peer protocol, and the IPX module for DOS was finished before it was ported to Win32 and MacOS.
However the other parts of Parsec's networking code that use the TCP/IP suite of protocols where written for Win32, MacOS and Linux simultaneously. These three operating systems use different APIs to provide access to IP services (WinSock on Win32, OpenTransport on MacOS and BSD sockets on Linux). For testing purposes during development it was necessary to have different OS versions of the network code available (the server code was written under Linux, and the clients where tested with Win32 and MacOS).
Maintaining three different versions of the packet and protocol API code during development would have been a very demanding task. Therefore we decided to use a common API for all IP-based code. We used a compatibility library that implements the WinSock API on MacOS, and a library containing wrapper functions to work around differences between BSD sockets and WinSock.

For example, the BSD sockets API uses the close() system call to close a socket, however WinSock requires the programmer to use the function closesocket() to do the same thing. A wrapper-function called Close() (note the upper-case character) was written, that transparently calls the correct function to close a socket.

Using this method we were able to write code that immediately runs on all three operating systems, with only very minor differences.

1.	Introduction
1.1	Goals
1.2	The Big Picture
1.3	Code Structure
1.4	Dynamic Subsystem Binding
2.	Networking Subsystem Structure
2.1	Game-Code Interface Layer
2.2	Remote Events
2.3	Parsec Protocol Layer
2.3.1	Peer-to-Peer Protocol
2.3.2	Slotserver Protocol
2.3.2.1	Connecting to a Slotserver
2.3.2.2	Disconnecting from a Slotserver
2.3.2.3	Changing your name on a Slotserver
2.3.2.4	Client-list Updates
2.3.3	Gameserver Protocol
2.4	Packet API Layer
2.4.1	User Datagram Protocol (UDP)
2.4.2	Internetwork Packet Exchange (IPX)
3.	Parsec Masterserver
3.1	Masterserver Protocol
4.	Portability

RE_CREATEOBJECT	An object of general type has been created.
RE_CREATELASER	A laser object has been created (i.e., fired by the local player).
RE_CREATEMISSILE	A missile object has been created (i.e., fired by the local player).
RE_CREATEEXTRA	An extra has been created (e.g., a package containing five missiles).
RE_KILLOBJECT	A specific object should be removed at all other clients.
RE_SENDTEXT	This can be used to send arbitrary text (usually entered in the console).
RE_PLAYERNAME	The local player has changed his name, inform other clients.
RE_PARTICLEOBJECT	A particle object has been created (e.g., an energy-field).
RE_PLAYERLIST	list of currently connected clients (needed by peer-to-peer protocol).
RE_CONNECTQUEUE	list of clients currently trying to connect (needed by peer-to-peer protocol).

Parsec Networking Architecture

© by Markus Hadwiger and Andreas Varga in 1998

Contents

1. Introduction

1.1 Goals

1.2 The Big Picture

1.3 Code Structure

1.4 Dynamic Subsystem Binding

2. Networking Subsystem Structure

2.1 Game-Code Interface Layer

2.2 Remote Events

2.3 Parsec Protocol Layer

2.3.1 Peer-To-Peer Protocol

2.3.2 Slotserver Protocol

2.3.2.1 Connecting to a Slotserver

2.3.2.2 Disconnecting from a Slotserver

2.3.2.3 Changing your name on a Slotserver

2.3.2.4 Client-list Updates

2.3.3 Gameserver Protocol

2.4 Packet API Layer

2.4.1 User Datagram Protocol (UDP)

2.4.2 Internetwork Packet Exchange (IPX)

3. Parsec Masterserver

3.1 Masterserver Protocol

4. Portability