Rollback Netcode

Brian Duncker (500782160) – Gameplay Engineering

Introduction

On June 22nd, 2020, a new feature was released for the emulated version of “Super Smash Bros. Melee”, a GameCube game originally released on November 21st, 2001, as part of a community project called Project Slippi. The feature improved the game’s netplay, which is local multiplayer played over an internet connection through an emulator. In other words, netplay lets players play with each other online while the game itself thinks they are playing together locally, as it has no online features of its own. The newest installment of the series, “Super Smash Bros. Ultimate”, which came out on December 7th, 2018, was criticised for its online multiplayer in a review of the game by The Guardian. The review describes the online experience as turning into a “slow moving-slideshow” because of the lag, at a time when Nintendo had just started charging players to use the online functionality at all.

The feature that gives the old game from 2001 a much better online experience than the new game from 2018 is called rollback netcode. Thanks to it, players felt like they were sitting next to each other on a couch, even though they were using an online connection with kilometers between them. Over the years, rollback netcode has become more and more popular in fighting games. It is not yet used in all of them because it is not easy to implement correctly. Still, rollback netcode has a lot of potential, even in genres outside of fighting games.

The goal of this research is to further explore what rollback netcode is, how it works and how to implement it in a simple demo of a MOBA game. There are not any premade options available for this, which means the netcode will have to be custom written. First, I will further look into rollback netcode until I understand how it works. Then, I will prepare a demo of a MOBA game, built with multiplayer functionality, and apply artificial lag to the network. Finally, the goal is to solve those connection issues with a custom implementation of rollback netcode. The research question I will be using is the following: “How can rollback netcode be applied to a MOBA game to improve the multiplayer experience during lag?”.

Preparation

The first step of answering my research question is to increase my knowledge of rollback netcode. I will first need to understand how the technology works, before I will be able to create a demo and implement it. To achieve that, I will search online for sources until I am able to explain what rollback netcode is, how it works and why to use it. It is also important to know how to properly use rollback netcode as well as what downsides it has compared to possible alternatives.

When explaining rollback netcode, it is often compared to delay-based netcode, which was used in fighting games before rollback netcode existed. In delay-based netcode, the game waits for the remote player’s input to arrive and delays the local player’s input by the same amount of time, so that both inputs are processed on the same frame. The problem is that networks are not stable, so this delay increases and decreases with the connection quality, which worsens the player’s experience. In a MOBA game, where there are usually more than two players involved, this becomes an even bigger problem. A second problem with delay-based netcode is that the game will start to stutter or even freeze altogether if the delay becomes too unstable. There is also a limit to how big the delay can be, otherwise the player would have to wait a second or two before their input is processed. To keep connections playable, this limits how widely the game can matchmake.
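The delay-based approach described above can be sketched as follows. This is a minimal illustration in Python, not code from the project: the class name `DelayBasedSession`, the `"idle"` default input and the delay of 3 frames are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DelayBasedSession:
    """Lockstep with input delay: frame N only runs once the remote input for N exists."""

    input_delay: int = 3  # frames of delay (assumed value for illustration)
    local_inputs: dict = field(default_factory=dict)
    remote_inputs: dict = field(default_factory=dict)
    frame: int = 0

    def submit_local(self, value: str) -> int:
        """Schedule the local input `input_delay` frames ahead and return its target frame."""
        target = self.frame + self.input_delay
        self.local_inputs[target] = value
        return target

    def receive_remote(self, frame: int, value: str) -> None:
        """Store a remote input together with the frame it was intended for."""
        self.remote_inputs[frame] = value

    def try_advance(self):
        """Run one frame, or return None (the game stalls) if the remote input is missing."""
        if self.frame not in self.remote_inputs:
            return None
        pair = (self.local_inputs.get(self.frame, "idle"), self.remote_inputs[self.frame])
        self.frame += 1
        return pair
```

Every call to `try_advance` that returns `None` is a frame in which the game visibly freezes, which is the stutter described above.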

Rollback tries to solve these problems by predicting the other player’s input when it has not been received yet. When the input arrives later, together with the number of the frame it was intended for, rollback restores the game state to how it was at that frame and then simulates the game again from that point up to the present frame. This all happens in the background within a single frame. Visually, the local player sees the rolled-back player make a small jump. However, such jumps are usually close to invisible to the naked eye. To the player, it feels like they are playing locally with no network latency at all. I created a diagram of this process to better illustrate how rollback netcode works:

However, there are some problems that come with rollback, especially if it is not properly implemented. In reality, input sent over the network will never arrive on the exact frame it was intended for. This means that a rollback would occur for every input, which would cost a lot of processing power. To solve this problem, rollback netcode should be combined with delay-based netcode. Adding a small delay to the input gives the network time to deliver the other player’s input. If the input has still not arrived by then, the game predicts it and a rollback occurs once the real input comes in.
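A minimal Python sketch of this combination is shown below. It assumes a common prediction strategy, repeating the last confirmed input, which is not necessarily what the project uses. The names `INPUT_DELAY`, `target_frame` and `RemoteInputBuffer` are hypothetical.

```python
INPUT_DELAY = 2  # assumed number of frames of local input delay


def target_frame(current_frame: int) -> int:
    """Frame on which a local input pressed now will actually be applied."""
    return current_frame + INPUT_DELAY


class RemoteInputBuffer:
    """Keeps confirmed remote inputs and predicts missing ones by repeating the last one."""

    def __init__(self, default="idle"):
        self.confirmed = {}   # frame -> input received over the network
        self.predicted = {}   # frame -> guess that was used for simulation
        self.default = default

    def get(self, frame):
        """Return (input, is_confirmed) for the given frame."""
        if frame in self.confirmed:
            return self.confirmed[frame], True
        earlier = [f for f in self.confirmed if f < frame]
        guess = self.confirmed[max(earlier)] if earlier else self.default
        self.predicted[frame] = guess
        return guess, False

    def confirm(self, frame, value) -> bool:
        """Store the real input; return True if a wrong guess was used, so a rollback is needed."""
        self.confirmed[frame] = value
        guess = self.predicted.pop(frame, None)
        return guess is not None and guess != value
```

With this design, an input that arrives within the delay window, or one that was guessed correctly, costs no rollback at all.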

Another problem that might occur with rollback netcode is desynchronization of the players. If the players are at different frames, bigger rollbacks than intended might occur, which can show up visually as another player ‘teleporting’ around. An easy way to solve this is to temporarily pause the player who is ahead for one or two frames, so the other can catch up. Other points of attention when implementing rollback netcode are object creation and destruction, as well as audio. Creating and destroying objects all the time is often unwanted for performance reasons. As for audio, when a rollback occurs and a certain action gets canceled, it is important to stop its audio from playing as well, to avoid audio bugs.
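The catch-up pause can be expressed as a small helper. This is a sketch under assumptions: the function name `frames_to_wait`, the tolerance of one frame and the cap of two frames are illustrative choices, and the remote frame number is assumed to be known from incoming messages.

```python
def frames_to_wait(local_frame: int, remote_frame: int,
                   tolerance: int = 1, max_wait: int = 2) -> int:
    """Return how many frames the local player should pause so the remote player can catch up.

    A small lead (up to `tolerance` frames) is ignored, and the pause is capped
    at `max_wait` frames so the stall stays barely noticeable.
    """
    ahead = local_frame - remote_frame
    if ahead <= tolerance:
        return 0
    return min(ahead - tolerance, max_wait)
```

A player who is behind never waits; only the player who is ahead pauses briefly.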

After this research I have a clearer view of what my implementation of rollback netcode is supposed to look like. I also know several points of attention to avoid making mistakes, and that I should add some form of input delay to save performance. With this information, I can start working on the demo of the MOBA game I want to use, implement multiplayer in it and add some artificial latency. After that, I can apply the knowledge I obtained here to implement my own version of rollback netcode.

Demo

The first step is to think about what features the demo needs. The most important thing to consider is time. With less than four weeks to spend, the actual rollback implementation will require most of it. This means that the demo will not be a complete game with a start and finish, but more of a simple sandbox to test the rollback in. In its most basic form, this needs a field to play on, an object to represent the player, the ability to move, and finally some interaction between players to see if the rollback system works with that.

Considering the “game” is meant to be a MOBA, controlling the player will be done through commands: clicking with the mouse will move the player or attack an enemy depending on where the mouse click occurred. When attacking, the player object will throw an object to another player. Because networking will not be part of this first step, a dummy object will serve as a target instead.

Building the objects takes a short amount of time: a simple green plane for the field, a player object built from rectangles, and a circle that indicates the position that has been clicked with the mouse. After that, I implement walking by moving the player object to the position of the mouse indicator after a position has been clicked. While the player is moving, a walking animation is also played.

With the walking portion finished, the next step is to create the attack action. For the target dummy I can reuse the same object that I used for the player. A small sphere object is used for the projectile. When the enemy is clicked with the mouse, the player plays a throwing animation, which spawns a projectile that moves to the targeted enemy.

With those two things the demo environment is finished. If there were more time the demo could be expanded to make it a game instead of a sandbox environment. The next step in the development process will be implementing the networking, which hopefully does not require too much refactoring work for the code I wrote so far.

Networking

Because time is limited, choosing a networking solution to use in Unity should not take too long. It should be easy to use, but also allow the freedom to send custom messages instead of only automatic synchronization, which the rollback system needs at a later stage. Usually Unity’s own UNet would be the option to go for in this scenario, but that has recently been marked as deprecated. So instead I will use the Mirror networking library for this project, as it is most similar to UNet, which I have worked with in the past.

The requirements for the networking are as follows: the players have to be able to connect to each other’s game, using a host-client architecture, so there is no server involved. For the rollback it does not matter if there is a server or not. So, there should be a menu for hosting or joining the game. Then, the actions that are executed by players should be executed by themselves immediately on their own instance of the game, and then sent to the other player. This is already part of the rollback system as was stated during the preparation step: players immediately execute their own actions to provide a latency-free experience. One last small thing to consider is to tell the players apart, as they all use the same model in the demo. An easy solution to that problem is to give the players a color based on the player’s number in the game.

First to connect the players I made a simple menu with a host and join button. After all players are connected, the host can start the game with a ‘play’ button. This is where the first challenge with custom messages came in: in order to later track the players to know which player to send the messages to, I needed to keep track of which player is which. Mirror does not really have a premade way of doing this, as usually you would just synchronize everything. Luckily, it did have a way to differentiate between different players, so I synchronized this information by sending messages during the process of moving the players from the lobby to the game environment. At this moment I also assign objects, their color and starting position to the players.

Even though the players are able to connect to each other now, they still cannot see each other’s actions in their own instance of the game. One of the rules of rollback is that the game sends the player’s input to the other players, and not the resulting state. It took a bit of effort to understand exactly what “sending the input” means, especially when it comes to input from the mouse. Initially I tried sending the position of the mouse when there was a click and made the clients process the input upon receiving the message, but this caused desynchronization problems with the raycasting of that mouse click into the 3D space. After several iterations I came to the conclusion that the solution is to send not the hardware input but the game’s input, or the action a player has done. Instead of sending “player 1 has clicked on position (x, y).”, the message should contain “player 1 moved to position (x, y).” This required refactoring the way I initially created the actions, as they should now be separated into their own classes, each with its own behaviour for the information that has to be sent and the way the action should be executed. One obstacle I still came across was receiving the messages from other players. Initially I let all the players, including the sender, execute the message, using the logic that the sender wouldn’t move as they had already finished the action. This caused some weird behavior on the sender’s side. After fixing this, the players can connect to each other and see each other’s actions.
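To illustrate the difference between sending hardware input and sending an action, here is a minimal Python sketch of an action message. It does not use Mirror’s API. The `MoveAction` class, the JSON encoding and the helper names are assumptions made for the example.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class MoveAction:
    """Game-level input: "player N moved to (x, y)", not "player N clicked at (x, y)"."""

    player_id: int
    x: float
    y: float


def encode(action: MoveAction) -> bytes:
    """Serialize the action into a payload that could be sent as a custom message."""
    return json.dumps({"type": "move", **asdict(action)}).encode("utf-8")


def decode(payload: bytes) -> MoveAction:
    """Rebuild the action on the receiving side."""
    data = json.loads(payload.decode("utf-8"))
    if data.pop("type") != "move":
        raise ValueError("unknown action type")
    return MoveAction(**data)


def should_execute(action: MoveAction, local_player_id: int) -> bool:
    """The sender already executed its own action locally, so it skips its own message."""
    return action.player_id != local_player_id
```

Because the raycast is resolved on the sender’s machine before the message is built, every receiver works with the same world position.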

The players are able to connect to each other and see each other’s actions being executed. The player sending an action immediately executes it themselves. Every player also gets their own color to differentiate them from each other. Testing with three (or four) players is also working as expected.

However, all of the testing I have done so far has been on my own machine, with no network latency whatsoever. The next step is to “break” the game by adding lag artificially and see what happens. It will also be necessary for testing if the rollback actually works after that is implemented.

Lag

To add artificial lag to the testing environment, I was told about a handy tool called clumsy, which makes it easy to add artificial latency to the network. To find out if this works, I will also need a way to see the ping in the game itself.

Using the clumsy tool was fairly straightforward: I ticked the Lag checkbox and entered the amount of lag in milliseconds in the textbox. Seeing the ping in game did not take any extra effort either, as Mirror has a ping display built in. One thing I did have to change was the network IP to connect to, from “localhost” to “127.0.0.1”, otherwise Mirror would not use the network to connect. Now I can test the game under conditions with network lag. Testing under lag, however, revealed another problem: certain actions were no longer executed properly.

After looking for the cause of this problem for a while, I found out that the underlying system for player actions and their execution was the reason. Unlike last time, where I only had to make small adjustments, this time I had to rework the entire system of the player actions. Even though this made me worry about the amount of time I would have left for the rollback system itself, it was important to fix this now so it would not cause problems for that part of the project. When the rework was finished, the demo was working properly again, even with lag.

With all these changes I was worried the version without lag would not work properly anymore, but luckily there were no problems there to be found.

Even though I can now test the game with lag, the step required much more time than I initially planned because of the faulty system I discovered. Now that is finished, there is one more step before I can finally implement the rollback, which is serialization.

Serialization

An important aspect of the rollback system is the ability to save and load game states on the spot. When the game gets rolled back, it should be able to go back and forth between game states, as well as process new states with the new input. To achieve this, I will have to keep track of the number of the frame where actions have occurred, as well as the current frame number in the game. To test this, I need the ability to pause the game and jump between these different states. However, I shouldn’t save too many frames as that will cause the game to use too much memory. It’s also unnecessary to have too many frames as the rollback should never realistically have a huge amount of frames to rollback.

When saving and loading states, it should not take much effort to add new objects to the serialization process, so a system where objects register themselves to the serialization sounds like a good idea. Not all objects and properties have to be serialized, just the ones that will get changed while the game is played. For example, the position of the players or their current action.

The first thing to do is to deal with the frame data. The challenge here is to keep track of the number of frames in the game, as I can’t just take the engine’s built-in frame count, because this might be different for each player depending on how much time they spend in the connection menu. To solve this, I simply have to keep track of the number of frames manually, from the moment the actual game scene starts. After this, adding the frame data to the action messages is easy to do.
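A manual frame counter could look like the Python sketch below. The source only states that frames are counted from the moment the game scene starts. Advancing the counter at a fixed tick rate is an assumption added here so that machines with different render frame rates count at the same pace. The `FrameClock` name and the tick rate are hypothetical.

```python
class FrameClock:
    """Counts simulation frames from the moment the game scene starts."""

    def __init__(self, tick_rate: int = 60):
        self.tick_seconds = 1.0 / tick_rate  # assumed fixed simulation step
        self.accumulator = 0.0
        self.frame = 0
        self.running = False

    def start(self) -> None:
        """Call when the game scene starts; time spent in menus before this is not counted."""
        self.frame = 0
        self.accumulator = 0.0
        self.running = True

    def update(self, delta_seconds: float) -> int:
        """Feed elapsed real time and return how many simulation frames to run now."""
        if not self.running:
            return 0
        self.accumulator += delta_seconds
        ticks = 0
        while self.accumulator >= self.tick_seconds:
            self.accumulator -= self.tick_seconds
            self.frame += 1
            ticks += 1
        return ticks
```

The value of `frame` at the moment an action is performed is what would be attached to that action’s message.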

For the serialization itself, I let the objects I want to serialize register themselves. Then, I serialize all of those objects every frame and add all that data to the game state of that frame. To go the other way around, all the registered objects also have a restore function.
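The self-registering approach can be sketched in Python as follows. The names `StateRegistry` and `Player`, the `serialize`/`restore` method names and the history size are assumptions for illustration. The sketch also assumes all objects register before the first state is saved.

```python
from collections import deque


class StateRegistry:
    """Objects register themselves; every frame their snapshots form one game state."""

    def __init__(self, max_frames: int = 60):
        self.objects = []
        # Bounded history: the oldest states are dropped automatically to limit memory use.
        self.history = deque(maxlen=max_frames)

    def register(self, obj) -> None:
        self.objects.append(obj)

    def save(self, frame: int) -> None:
        """Serialize every registered object and store the result under the frame number."""
        self.history.append((frame, [obj.serialize() for obj in self.objects]))

    def load(self, frame: int) -> bool:
        """Restore all objects to the given frame; return False if that state was dropped."""
        for saved_frame, snapshots in self.history:
            if saved_frame == frame:
                for obj, snapshot in zip(self.objects, snapshots):
                    obj.restore(snapshot)
                return True
        return False


class Player:
    """Example object that serializes only the properties that change during play."""

    def __init__(self, registry: StateRegistry, x: float = 0.0, y: float = 0.0):
        self.x, self.y, self.action = x, y, "idle"
        registry.register(self)

    def serialize(self) -> dict:
        return {"x": self.x, "y": self.y, "action": self.action}

    def restore(self, snapshot: dict) -> None:
        self.x, self.y, self.action = snapshot["x"], snapshot["y"], snapshot["action"]
```

Adding a new object type to the rollback only requires giving it `serialize` and `restore` methods and a call to `register`.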

Lastly, I created the functionality to “pause” the game, and jump back and forth between game states by loading those that were saved. This allows me to test if the serialization is working properly. This should only be used for testing purposes, as using this will greatly desynchronize the players in the game.

Now that the ability to save and load game states exists, and can easily be expanded with new objects in the future, I have all the components I need to finally build the rollback system itself and bring it all together.

Rollback

To fully apply the theory from the preparation step, connect all the systems I have made so far and finish the rollback system, the first question to ask is: “What is still missing?”. The answer is that I need to know whether a certain action was intended for a frame earlier or later than the current frame of the receiving player. Depending on that, I either have to execute it right away, delay it to execute it later, or perform a rollback to execute it in the past. At this point I am not quite sure what the result is supposed to look like, but the interesting part is that with rollback, if nothing seems off, it’s working, as was said in the GDC talk about rollback in Injustice.
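The three-way decision described above fits in a small function. This Python sketch uses hypothetical names (`Timing`, `classify`) and assumes both players count frames from the same starting point.

```python
from enum import Enum


class Timing(Enum):
    EXECUTE_NOW = "execute_now"
    DELAY = "delay"
    ROLLBACK = "rollback"


def classify(action_frame: int, current_frame: int) -> Timing:
    """Decide what to do with a received action based on the frame it was intended for."""
    if action_frame < current_frame:
        return Timing.ROLLBACK      # intended frame has already passed locally
    if action_frame > current_frame:
        return Timing.DELAY         # sender is ahead; hold the action until that frame
    return Timing.EXECUTE_NOW       # arrived exactly on its intended frame
```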

As expected, the rollback system proved to be quite the challenge. The first problem I came across showed up in the counting of frames: the standalone version was running at a different frame rate from the editor version, which caused huge rollbacks to happen all the time, and they became bigger over time as the gap between frame numbers grew. The first version using rollback broke the entire game for both the player performing the action and the player receiving it, as seen below. What happened is that the system was rolling back on the side of the player executing the action, even though that player should never have to roll back its own actions.

After the issues were fixed, my initial plan of how the system would work turned out to be correct. The preparation step of the project made developing this a lot easier. When input comes in, the system compares the frame number of the input with the receiving player’s current frame number to decide what to do with it. If the input came in too late, the game restores the game state of the frame where the input was supposed to come in, processes the input, and then re-simulates the game up to the frame where it left off. If the input is intended for a later frame, it is delayed until that frame is reached and then processed.
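The restore, process and re-simulate cycle can be sketched in Python as below. This is not the project’s implementation: `RollbackSim`, the `step` function and the velocity-based toy state are assumptions, and a real version would bound the snapshot history instead of keeping every frame. The sketch relies on `step` being deterministic.

```python
import copy


def step(state, actions):
    """Toy deterministic simulation: an action sets a player's velocity, then everyone moves."""
    pos, vel = dict(state["pos"]), dict(state["vel"])
    for player, velocity in actions:
        vel[player] = velocity
    for player in pos:
        pos[player] += vel[player]
    return {"pos": pos, "vel": vel}


class RollbackSim:
    def __init__(self, state, step_fn):
        self.state = state
        self.step_fn = step_fn
        self.frame = 0
        self.snapshots = {0: copy.deepcopy(state)}  # state at the START of each frame
        self.inputs = {}                            # frame -> list of actions
        self.rollbacks = 0

    def advance(self):
        """Simulate the current frame with whatever inputs are known for it."""
        self.state = self.step_fn(self.state, self.inputs.get(self.frame, []))
        self.frame += 1
        self.snapshots[self.frame] = copy.deepcopy(self.state)

    def receive(self, action_frame, action):
        """Handle a remote action tagged with the frame it was intended for."""
        self.inputs.setdefault(action_frame, []).append(action)
        if action_frame >= self.frame:
            return  # on time or early: it is picked up when that frame is simulated
        # Late: restore the state of the intended frame, then re-simulate to the present.
        present = self.frame
        self.state = copy.deepcopy(self.snapshots[action_frame])
        self.frame = action_frame
        while self.frame < present:
            self.advance()
        self.rollbacks += 1
```

After a late action is handled, the state matches what it would have been had the action arrived on time, which is why the correction only shows up as a small jump.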

With the rollback system finished, the time has come to test how far it can go. The first thing to do is to test it under extreme circumstances, using lag numbers which would realistically never appear. This shows the power of the rollback system quite well: even though the lagging character seems to make a small jump in the beginning, because of the extreme conditions, it almost seems like there is no lag at all.

Under more realistic amounts of lag, the small jump is still there, but barely visible to the naked eye. For the players themselves there is no input delay whatsoever, and in terms of gameplay it is as if the other player is playing locally instead of online.

Future Thoughts

Working on this project has taught me a lot about rollback netcode, how it works and how to use it. It also showed me why most developers still default to the classic delay-based netcode, as it is way easier to implement. Implementing rollback netcode takes a lot of time and I feel like the four weeks I had were not nearly enough to make the robust system I wanted to make. The demo is really small with only a few actions to worry about, and I wonder if the system could hold up in a scenario where there are many different actions with multiple players interacting with each other all the time, which would all need to be rolled back.

I think that rollback netcode has a lot of potential for use in MOBA games as long as the difference in ping doesn’t become too big, because even the small jumps I saw under the most extreme circumstances would become an irritation during fast-paced MOBA games. I had hoped to make a rollback system that was independent of the game or networking library, but was unable to do so given the limited amount of time for the project. I also often found myself struggling with using the networking library for sending custom messages, as much of the library’s documentation and functionality is built around the synchronization of variables and game states. That makes me feel like it’s better to build rollback on a custom networking solution for use in a game, or to use one with rollback already built in.

Another part I would have wanted to do better from the start is the base on which I built the player actions, as I had to rewrite that several times. One way to achieve this would be to plan it better ahead of time. Alternatively, I could have avoided reworking it a second time by networking everything as I made it, instead of starting with a single, offline version of the demo.

I should also remember to commit my code more often to the git repository when I am working on projects by myself. When working in a team of people I already do it often enough, but when working solo I tend to wait until the end of a big part of the project to save my work. This led to one occasion where I had to restore an earlier state of the project and lost work as a result.

In the future I would like to further look into TCP/UDP transports as Mirror supports both, even at the same time in the form of multiple channels. I did not spend too much time looking into this, but optimizing the use of these channels should allow the messages between players to be sent faster and require less use of rollback.

The preparation step of the project, where I gave myself a clear image of what exactly I was trying to make, turned out to help me a lot in developing the project. In the past I often did not spend a lot of time researching theory or creating a diagram, but rather went to work and figured things out as I went. I enjoyed working on something that has not often been done before and has no tutorials available that I can follow step by step.

Git repository

All the source code of this research can be found at https://gitlab.com/Luguos/rollback-netcode

Sources
