February 24, 2010 – Timo Kämäräinen, Bugbear Entertainment
About Bugbear Entertainment
Bugbear Entertainment is an award-winning, leading independent action entertainment studio located in Helsinki, Finland. It is focused on action driving and destruction with unrivalled physics and has been around since 2000.
About the Author
Timo Kämäräinen is the technical director at Bugbear Entertainment and oversees various aspects of Xbox 360, PlayStation 3 and PC development. He previously worked as the lead programmer on the PSP version of Sega Rally and as a programmer on the FlatOut series for PlayStation 2, Xbox, Xbox 360 and PC.
This article describes how Bugbear shifted from long-running nightly builds to continuous integration builds for multiple platforms – including Xbox 360, PlayStation 3 and PC – and what that change has meant for the whole team. As code builds have been covered in previous studies, the focus here is on the SCons-based asset builds with Incredibuild and on the overall importance of build times in continuous integration.
The Atmosphere of Fear
With the old build system, which consisted of long-running nightly builds, getting the latest build was always a gamble. Builds were often broken and had to be hotfixed manually, or they would stay broken at least until the next nightly build. Working on three major platforms simultaneously also meant that compiling and running code on a rarely used configuration would often produce mysterious errors that had to be debugged thoroughly before anything else could be done, so much more time was spent finding bugs than fixing them. Programmers and artists working closely together had to copy assets or executables to each other manually in order to sync their work, which often left people with “unofficial” builds that broke very easily when new assets or executables came along. All of this led to an atmosphere of fear in which no one wanted to update their working build until it broke down – and at some point it always did.
Something had to be done, and continuous integration seemed likely to address most of these issues. While the build times for the code would have to be improved, it was obvious that one of the biggest challenges in setting up company-wide continuous integration would be the build times for the assets – these would have to be reduced dramatically in order to run the builds many times a day. The old system was already distributing manually configured build processes on a per-track (or per-level) basis, which meant that the build always took as long as the biggest track, while the machines that had already finished their smaller tracks just sat idle.
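The cost of per-track distribution can be illustrated with a small scheduling sketch. The track sizes, the one-minute cost per file and the machine count below are invented for illustration and are not figures from our builds; network and scheduling overhead is ignored.

```python
import heapq


def makespan(job_minutes, machines):
    """Greedy longest-job-first scheduling.

    Returns the time at which the busiest machine finishes.
    """
    loads = [0.0] * machines
    heapq.heapify(loads)
    for minutes in sorted(job_minutes, reverse=True):
        lightest = heapq.heappop(loads)          # machine with the least work so far
        heapq.heappush(loads, lightest + minutes)
    return max(loads)


# Hypothetical project: four tracks made of files that each take one minute.
track_file_counts = [120, 40, 30, 10]
machines = 4

# Per-track distribution: one indivisible job per track.
per_track = makespan([float(n) for n in track_file_counts], machines)

# Per-file distribution: every file is its own job.
per_file = makespan([1.0] * sum(track_file_counts), machines)

print(per_track)  # 120.0, bounded by the biggest track
print(per_file)   # 50.0, work is spread evenly
```

With whole tracks as the unit of work, three of the four machines in this toy setup finish early and wait for the largest track, which is the idle time described above.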
As the processed build data for each platform consists of thousands of files as opposed to tens of tracks, it made sense to distribute the build on a per-file basis. It was also decided that the incremental build dependency checks would have to live in the build system itself, rather than having the individual tools perform timestamp checks as they did at the time. This would eliminate the need to send files that did not need processing over the network, which was a real concern with the processed assets totaling around 1 to 8 gigabytes of data depending on the phase of the project.
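A minimal sketch of what a content-based check in the build system looks like, as opposed to timestamp checks inside each tool. This is an illustration only, not our build code: the function names and the in-memory `signatures` dictionary are assumptions standing in for a real signature database.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def md5_of(path):
    """Return the MD5 hex digest of a file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def out_of_date(sources, signatures):
    """Return the sources whose content differs from the recorded signature.

    `signatures` maps path -> digest from the previous build and is
    updated in place.
    """
    stale = []
    for src in sources:
        digest = md5_of(src)
        if signatures.get(str(src)) != digest:
            stale.append(src)
            signatures[str(src)] = digest
    return stale


with tempfile.TemporaryDirectory() as tmp:
    road = Path(tmp, "road.tga")
    car = Path(tmp, "car.tga")
    road.write_bytes(b"road pixels")
    car.write_bytes(b"car pixels")

    signatures = {}
    first = out_of_date([road, car], signatures)   # nothing recorded yet

    os.utime(road, (0, 0))                         # new timestamp, same content
    second = out_of_date([road, car], signatures)

    car.write_bytes(b"repainted car pixels")       # real content change
    third = out_of_date([road, car], signatures)

    print(len(first), len(second), [p.name for p in third])
    # 2 0 ['car.tga']
```

Because the decision is made before any tool is launched, a file whose content has not changed never has to be sent to a remote machine at all.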
Once the requirements were sorted out, we looked at various build tools and distribution software. The most important requirements for us were that the build system should behave as identically as possible locally and in the distributed environment, and that code distribution for all three platforms and asset distribution should ideally be controlled by the same distribution system for maximum efficiency and easier maintenance. Because we wanted to set up the new system with minimum effort, we quickly ruled out creating and maintaining an in-house tool for this.
As a build tool SCons seemed to deliver what we needed: MD5-based dependency checks, support for parallel builds, Python-based extensibility and a cache system for build results. For Xbox 360 and PC code builds Incredibuild seemed like a viable choice, but what really made the decision for us was its out-of-the-box support for SCons and PS3 code builds via Incredibuild’s automatic interception interface. Another third-party tool was obtained for handling other aspects of the continuous integration process, such as version control integration and e-mail build failure reports, whereas Incredibuild would provide the raw processing power required for the builds.
Although we could have used the spare processing power of idle machines to speed up the builds, some computers still had to actually run the builds. For this purpose a separate server rack of 6 quad-core machines was set up. Each machine is responsible for running different builds in parallel, and the machines are also joined together via Incredibuild to form one big CPU farm. The setup allows us to run all of our builds continuously, with idle build machines acting as helpers for other builds to provide a total of ~64 GHz of processing power.
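As a rough sanity check of that aggregate figure, the arithmetic can be spelled out. The per-core clock below is derived from the numbers above, not a published hardware specification.

```python
machines = 6
cores_per_machine = 4
total_ghz = 64.0  # approximate aggregate figure quoted in the text

cores = machines * cores_per_machine
implied_clock_ghz = total_ghz / cores  # derived value, an assumption about the hardware

print(cores)                        # 24
print(round(implied_clock_ghz, 2))  # 2.67
```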
Setting up the code builds was quite straightforward. Each of the three platforms has its own machine for running the builds. Xbox 360 and PC code is distributed with Incredibuild’s native Visual Studio support, while the PS3 code build is distributed with Incredibuild’s automatic interception interface. After some profiling of the code build process, specifically for Xbox 360 and PC, we came to the conclusion that once enough cores were participating in the build, disk access started to become the bottleneck. As a result we ended up using 64-bit Windows with 16 gigabytes of memory and storing the intermediate build data on a ramdrive to accelerate the builds even further.
The old build batch was just that: a simple batch file. Changing it into SCons-based build scripts was a bit more complicated than the straightforward code builds. This was done gradually over time, and initially the old batch file was wrapped in a series of dummy builders in SCons. These dummy builders would run certain parts of the batch so that we could achieve an instant improvement in build distribution. Even with this rudimentary build distribution we saw an improvement in build times with Incredibuild. However, this approach didn’t provide proper dependency checks, and the distribution was still not happening on a per-file basis.
Looking at the typical contents of an asset build, the number of texture and geometry files relative to other files is often very high. For us they also happen to take the most time to process, with texture compression being the most time-consuming step of all.
As a first step towards a fully distributed build we converted the parts of the build process that handled the textures and geometry into proper SCons builders, which gave us an even bigger improvement in build times with Incredibuild. This meant that texture compression and geometry processing were distributed on a per-file basis.
Even though the geometry distribution was happening on a per-file basis, the files themselves contained geometry for the whole track and all objects in it. To further improve build times through distribution – and also to improve artists’ workflow – the geometry pipeline went through some major changes. Previously the workflow involved processing the whole track every time any part of the geometry changed. The new system allowed the use of an object library with separate files that could be instanced into the track. This let artists update only the object they were working on, and it also let the geometry build process handle each geometry file in a separate SCons builder for maximum build distribution.
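To give an idea of what a per-file builder looks like, here is a minimal SConstruct sketch. It is not our production script: the `texcompress` tool, the `Texture` builder name, the directory layout and the cache path are all hypothetical, and the Incredibuild side of the setup is not shown. An SConstruct is read by the `scons` command rather than run directly.

```python
# SConstruct (hypothetical sketch), read by the `scons` command.
# The explicit import is optional here; SCons provides these names itself.
from SCons.Script import Builder, Environment, SetOption

env = Environment()

# Rebuild a target only when the content signature of an input changes,
# not merely its timestamp (recent SCons releases name this decider 'content').
env.Decider('MD5')

# Reuse build results from a shared cache directory (placeholder path).
env.CacheDir('/path/to/shared/scons-cache')

# Let several independent targets build at once (same as `scons -j 8`).
SetOption('num_jobs', 8)

# One builder call per texture file; `texcompress` is a made-up tool name.
texture = Builder(action='texcompress $SOURCE $TARGET',
                  suffix='.dds', src_suffix='.tga')
env.Append(BUILDERS={'Texture': texture})

for src in env.Glob('textures/*.tga'):
    # The target name is derived from the source by swapping the suffix.
    env.Texture(src)
```

Because every texture is its own target with its own command line, each one can be checked, cached and scheduled independently of the others.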
Gradually the whole build system was changed to use proper SCons builders for all kinds of tasks, such as shader compilation and XML-based data generation. As an added benefit, dependency checks were now handled by the build system, so we could do proper incremental builds for the assets. Combined with the build time improvements provided by Incredibuild, this means the continuous integration system is able to handle any kind of change on the fly, even if the inherent build dependencies cause all textures to be compressed again in the middle of the day.
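The way a single change can ripple through the dependency graph can be sketched as follows. The file names and the graph are invented for illustration; a real build system derives this graph from its builders.

```python
from collections import deque


def affected_targets(changed, dependents):
    """Return every target reachable from the changed files.

    `dependents` maps a file to the targets built directly from it.
    """
    seen = set()
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for target in dependents.get(node, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# Hypothetical graph: three textures share one settings file and are
# packed into a single track archive.
dependents = {
    "texture_settings.xml": ["road.dds", "car.dds", "sky.dds"],
    "road.tga": ["road.dds"],
    "car.tga": ["car.dds"],
    "sky.tga": ["sky.dds"],
    "road.dds": ["track01.pak"],
    "car.dds": ["track01.pak"],
    "sky.dds": ["track01.pak"],
}

print(sorted(affected_targets(["car.tga"], dependents)))
# ['car.dds', 'track01.pak']
print(sorted(affected_targets(["texture_settings.xml"], dependents)))
# ['car.dds', 'road.dds', 'sky.dds', 'track01.pak']
```

Editing one source texture touches only its own output and the archive, while editing a shared input invalidates every texture, which is the mid-day full recompression case mentioned above.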
Even with the dedicated build farm hardware, achieving continuous integration at the current level would not have been possible without the build distribution provided by Incredibuild. Thanks to the sheer speed at which our builds are processed, we were able to add code configurations that would not normally be part of the builds, such as the debug builds, which benefit a lot from being compiled and tested after each code submit. Having the system build all assets on the fly as well means that we can run tests on the complete package after each code or asset submit.
Compared to the old build system, which ran one build per platform per day, the new system handles around 20 to 30 builds per day on average for each platform. Each build consists of three different unit-tested code configurations – debug, developer and release – plus assets and even simple functional tests to ensure that the new code or assets haven’t broken the build. All of this happens with a turnaround time of ~15 minutes. This has had a great impact on build stability, since broken builds are detected as soon as possible and can be fixed right away instead of waiting for the next day.
With the improved stability there has also been a clear improvement in team morale – people are no longer afraid to get the latest build and the atmosphere of fear has gradually disappeared.
Build turnaround time plays a crucial role in continuous integration. Incredibuild has provided us with the necessary tools to improve our build process and to take full advantage of continuous integration, so that every member of the team is constantly running a stable build even in the early stages of game development.