Saved game serialization

Well, it seems like another two months have gone by in a flash!  I was away visiting family for some of that time, and the work I’ve done on the game has frustratingly little to show for it, but here’s an update on what I’ve been up to anyway.

Saved game serialization

Up until now, saved games and constructions have been serialized via an unformatted binary stream, using BinaryReader and BinaryWriter directly.  This is fast and results in a compact file size, but has one huge disadvantage: a lack of version tolerance.  In other words, if I add, remove, or reorder any of the variables to be saved, then old saved game files will no longer load correctly.  To work around this I wrote code to check a version number in the saved file, and then convert things over for each added or removed variable.  This is a hack really, and has resulted in rather messy and hard-to-maintain code.
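As a rough illustration of the kind of version check described above, here is a minimal sketch.  The PartData class, its fields, and the version numbers are made up for the example and are not taken from the game.

```csharp
using System.IO;

// Hypothetical save data for a single part; names are illustrative only.
public class PartData
{
    public int Id;
    public float Mass;
    public bool IsFrozen;   // assume this field was added in file version 2
}

public static class BinarySave
{
    const int CurrentVersion = 2;

    public static void Write(BinaryWriter writer, PartData part)
    {
        // Values are written back to back with no keys, so order matters.
        writer.Write(CurrentVersion);
        writer.Write(part.Id);
        writer.Write(part.Mass);
        writer.Write(part.IsFrozen);
    }

    public static PartData Read(BinaryReader reader)
    {
        int version = reader.ReadInt32();
        var part = new PartData();
        part.Id = reader.ReadInt32();
        part.Mass = reader.ReadSingle();
        // Each added or removed field needs another branch like this one,
        // which is where the maintenance burden comes from.
        part.IsFrozen = version >= 2 ? reader.ReadBoolean() : false;
        return part;
    }
}
```

With only one optional field this looks harmless, but every format change adds another version branch to the reading code.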

This situation is bad enough with just the demo version of the game out there, with a cut-down feature set.  Maintaining saved game backwards compatibility will only get harder once the full version is released.

Ideally, I need a properly structured save file format, with some kind of key-value pairing that would allow for version tolerance, but wouldn’t bloat the file size too much.

BinaryFormatter

First I investigated using BinaryFormatter, because it allows for version tolerance via optional fields, but I couldn’t get it to work when deserializing MonoBehaviour classes.  I need to be able to instantiate the MonoBehaviour and then populate serialized values into it, not have the deserialization process itself try to allocate a new MonoBehaviour (which is not allowed by Unity).  I thought maybe using a serialization surrogate might allow for this, but couldn’t figure out a way to make it work.  The other downside of BinaryFormatter is that all the assembly and type data it saves out adds quite a bit to the file size.

Json serialization

So after looking around for other possible solutions, I decided to try Json.  This provides the key-value pairs I need in a fairly compact structured format.  I used Json.NET from Newtonsoft (provided via a Unity asset store package for ease of integration), which seemed really promising: it’s very easy to use and highly configurable.  In most cases there’s very little additional code to write, as you can just use the JsonProperty attribute to specify which class properties to serialize, and configure how they’re serialized.  Also, it allows for populating properties of a MonoBehaviour that has already been allocated, by using JsonSerializer.Populate() inside a JsonConverter.
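To make the JsonProperty and Populate() approach a bit more concrete, here is a minimal sketch.  The Part class is a plain stand-in for a MonoBehaviour so that the example has no Unity dependency, and the createPart factory is a hypothetical placeholder for whatever Unity call (Instantiate, AddComponent, and so on) would actually create the object.  It follows the same pattern as Json.NET’s own CustomCreationConverter, and is not the game’s actual code.

```csharp
using System;
using Newtonsoft.Json;

// Stand-in for a MonoBehaviour. OptIn means only members marked with
// JsonProperty are serialized.
[JsonObject(MemberSerialization.OptIn)]
public class Part
{
    [JsonProperty("id")] public int Id;
    [JsonProperty("mass")] public float Mass;
    // If this key is missing from an older file, the field keeps its default.
    [JsonProperty("frozen")] public bool IsFrozen;

    public string NotSaved = "runtime only";
}

public class PartConverter : JsonConverter
{
    // Hypothetical factory; in Unity this would create the component.
    readonly Func<Part> createPart;

    public PartConverter(Func<Part> createPart) { this.createPart = createPart; }

    public override bool CanConvert(Type objectType) { return objectType == typeof(Part); }

    // Only reading is customized; writing uses the default serializer.
    public override bool CanWrite { get { return false; } }

    public override object ReadJson(JsonReader reader, Type objectType,
                                    object existingValue, JsonSerializer serializer)
    {
        if (reader.TokenType == JsonToken.Null) return null;
        Part part = createPart();
        // Fill in the already-created object instead of letting Json.NET
        // construct a new one.
        serializer.Populate(reader, part);
        return part;
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        throw new NotSupportedException("CanWrite is false, so this is never called.");
    }
}
```

It could be used along the lines of JsonConvert.DeserializeObject<Part>(json, new PartConverter(() => new Part())).  With Json.NET’s default settings, keys that are missing from the file leave the field at its default, and keys the class no longer knows about are ignored, which is what gives the version tolerance.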

Still, it took me several weeks to get Json serialization working for both saved constructions and full saved games.  There were a few stumbling blocks along the way which took time to work around, as did just learning how to best use the Json.NET API.  The end result seemed great though: it solved the version tolerance problem, and the code was so much simpler and cleaner.

One issue was that the resulting file sizes of the text-based Json format were huge.  Given that the game uses the same serialization code path to send construction data between networked players, this was a problem.  So, I switched over to using Bson (the binary equivalent to Json), and also compressed the data via a DeflateStream.  This worked well, with the resulting file sizes actually ending up smaller than my original binary stream format.
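The combination of Bson and DeflateStream might look roughly like the sketch below.  The CompressedBson class and its method names are hypothetical.  It uses BsonWriter and BsonReader from the Newtonsoft.Json.Bson namespace, which is an assumption about the Json.NET version in use: newer releases mark these as obsolete in favour of BsonDataWriter and BsonDataReader in the separate Newtonsoft.Json.Bson package.

```csharp
using System.IO;
using System.IO.Compression;
using Newtonsoft.Json;
using Newtonsoft.Json.Bson;

public static class CompressedBson
{
    // Serializes data as Bson and compresses the bytes with Deflate.
    public static byte[] Save(object data)
    {
        using (var output = new MemoryStream())
        {
            using (var deflate = new DeflateStream(output, CompressionMode.Compress))
            using (var writer = new BsonWriter(deflate))
            {
                new JsonSerializer().Serialize(writer, data);
            }
            // The writer and DeflateStream are closed at this point, so all
            // compressed bytes have been flushed. ToArray still works on a
            // closed MemoryStream.
            return output.ToArray();
        }
    }

    // Reverses Save: decompresses, then deserializes the Bson document.
    public static T Load<T>(byte[] bytes)
    {
        using (var input = new MemoryStream(bytes))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var reader = new BsonReader(deflate))
        {
            return new JsonSerializer().Deserialize<T>(reader);
        }
    }
}
```

Note that the root value of a Bson document has to be an object or an array, so a bare number or string cannot be saved this way without wrapping it first.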

Performance and memory problems

At this point I thought I was good to go, but then I started profiling the Json serialization with large saved game files (more than a thousand parts), and realized I was in trouble.  Firstly, deserializing the saved game was more than twice as slow using Json vs. the old binary stream method.  This wasn’t a complete disaster as the load times weren’t terribly long in the first place.  The other more serious issue was that the Json deserialization did an enormous number of tiny GC allocations (as in millions of allocs, totalling hundreds of MB!).

I found that reducing the JsonProperty name lengths helped slightly with this, but not to any significant degree.  I also spent quite a lot of time restructuring the code that loads the various modules in the game (player, constructions, time of day, etc.) to try to deserialize from the file stream more efficiently, but unfortunately this made very little difference to performance or memory usage (the resulting code was cleaner than before though, so I guess the refactoring was worth doing anyway).

I’m annoyed with myself that I didn’t do enough tests to pick up on these problems before putting all the effort in to convert the game over to use Json.  If I’d known ahead of time, I probably wouldn’t have bothered.

So now I’m not sure what to do.  If I stick with the old binary stream solution, then all the Json serialization effort will have been wasted and I’m still stuck with difficult-to-maintain code for backwards compatibility.  But the Json serialization option as it stands isn’t acceptable, so I’d need to do something to resolve the memory and performance issues.  One possibility would be to manually serialize everything (i.e. use JsonReader / JsonWriter directly rather than JsonSerializer).  Supposedly this is the fastest way, as it avoids overhead from reflection and so on.
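For reference, manual serialization with JsonWriter and JsonReader could look something like this minimal sketch.  The PartState class and its key names are invented for the example.  Whether this approach would reduce the allocations enough is not something the sketch can show, and it would need profiling (reader.Value still boxes numeric values, for instance).

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical data class; names are illustrative only.
public class PartState
{
    public int Id;
    public float Mass;
    public bool IsFrozen;
}

public static class ManualPartJson
{
    public static void Write(JsonWriter writer, PartState part)
    {
        writer.WriteStartObject();
        writer.WritePropertyName("id");
        writer.WriteValue(part.Id);
        writer.WritePropertyName("mass");
        writer.WriteValue(part.Mass);
        writer.WritePropertyName("frozen");
        writer.WriteValue(part.IsFrozen);
        writer.WriteEndObject();
    }

    // Expects the reader to be positioned on the StartObject token.
    public static void Read(JsonReader reader, PartState part)
    {
        while (reader.Read() && reader.TokenType != JsonToken.EndObject)
        {
            string name = (string)reader.Value;   // current token is a PropertyName
            reader.Read();                        // advance to the property's value
            switch (name)
            {
                case "id": part.Id = Convert.ToInt32(reader.Value); break;
                case "mass": part.Mass = Convert.ToSingle(reader.Value); break;
                case "frozen": part.IsFrozen = (bool)reader.Value; break;
                // Unknown keys (e.g. from a newer version) are skipped,
                // including any nested objects or arrays.
                default: reader.Skip(); break;
            }
        }
    }
}
```

Because values are still looked up by key, this keeps the version tolerance of the JsonSerializer approach, at the cost of writing the read and write code for every class by hand.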

I’ve decided for now to put all this to one side, think about it some more, and come back to it later.  In the meantime I really need to get back to trying to make some positive progress with the rest of the game!