Darwinia+ deep and dirty Part I

The only place you'll ever hear the truth
User avatar
Byron
level2
Posts: 147
Joined: Tue May 13, 2008 3:48 pm

Darwinia+ deep and dirty Part I

Postby Byron » Fri Apr 17, 2009 4:57 pm

My mother-in-law used to do something that really annoyed me. Okay, admittedly, being a mother-in-law there were lots of things she did that annoyed me, but this one not only annoyed me but really frustrated me at the same time. Whenever you went into the kitchen the most commonly used items would always be at the back of the cupboard. It didn’t matter if you rearranged them to be at the front; they would always find their way to the back. So if you went to make a cup of instant coffee the granules would always be at the back, which meant unpacking the cupboard to get at said granules and, once finished, repacking it again. It turned what should have been a simple task into a logistical exercise. As somebody used to optimising processes and code this was very frustrating for me, and no matter what argument I put forward it would always be the same when we came back another day. Sometimes some things just don’t want to be optimised.

Darwinia+ is coming to the end of the project, and as usual Knuth’s law has come into effect and we have to improve the performance as much as possible. But like my mother-in-law, there are parts of this game that just flatly refuse to be optimised. I always liken this part of the job to working in the sewers of London – you are going to get deep and dirty dealing with other people’s mess.

The first task was to pick a selection of games in the Multiwinia part of Darwinia+ and produce baseline results for comparison. These baseline files are generated by the code and consist of the current frames-per-second figure, written out to a file once a second. In theory all we need to do is repeat the same test under exactly the same conditions, and we can graph the changes the optimisations make. It’s not a finely grained picture of the game, but it gives enough of an overview to see if anything is having an effect, and most importantly it is something that can be automated to a certain extent. This is precisely the reason Multiwinia was chosen: it is easier to rig the predictive physics system to play exactly the same game over and over again given the same player input, and to emulate that we just leave the controller alone and let the game play itself.
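
Conceptually the baseline generator is nothing more than a frame counter that writes one number per second. A minimal sketch of the idea (the class and member names here are illustrative, not taken from the actual Darwinia+ code):

Code: Select all

#include <cstdio>

// Minimal sketch of a per-second FPS baseline logger. Illustrative only.
class BaselineLogger
{
public:
    explicit BaselineLogger(const char *filename)
        : m_file(fopen(filename, "w")), m_frames(0), m_elapsed(0.0f) {}

    ~BaselineLogger() { if (m_file) fclose(m_file); }

    // Call once per frame with the frame time in seconds.
    void Update(float frameTimeSeconds)
    {
        ++m_frames;
        m_elapsed += frameTimeSeconds;
        if (m_elapsed >= 1.0f)
        {
            if (m_file) fprintf(m_file, "%d\n", m_frames);   // one FPS value per line
            m_frames = 0;
            m_elapsed -= 1.0f;
        }
    }

private:
    FILE *m_file;
    int   m_frames;
    float m_elapsed;
};


Two such files from identical self-playing runs can then be graphed side by side to see whether a change has had any effect.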

The second task was to pick which of the baselines was the worst performer, and in Darwinia+ that turned out to be the Multiwinia game Rocket Race on a 4-player map. This particular map is huge and the game can run for an undetermined period of time depending on how well the player or AI is doing – all the rest of the games use a time limit. For some reason this particular map was dipping to below 8 frames per second, which is well below the target of 30 frames per second.

In reality a more meaningful measure is the number of milliseconds a frame takes, but that is something I will be measuring when I have to go deeper and start to look at isolated sections of the game. This is when I put on my overalls, hard hat and spotlight and prepare to go deeper and even dirtier.
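
To put rough numbers on that: milliseconds per frame is just 1000 divided by the frames per second, which shows how far off target that dip really is. A quick sketch:

Code: Select all

// Frame time in milliseconds is simply 1000 divided by the frames per second.
inline float MsPerFrame(float fps)
{
    return 1000.0f / fps;
}

// MsPerFrame(30.0f) is roughly 33.3 ms (the target frame budget);
// MsPerFrame(8.0f)  is 125 ms (the Rocket Race dip).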

If I had had my way early on I would have changed the entire approach to optimisation. For my entire career I have had the following line from Donald Knuth quoted at me:

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."

Or, more accurately and ironically, a small part of that line:

“Premature optimization is the root of all evil."

Nearly every lead developer I have worked for has used that quote as an excuse to leave the optimization of the game to the last few months, when everything is complete. In part I agree with that statement, but there’s also a small part of my brain nagging away, screaming:

‘But he wrote that in 1974, when CPUs and computers in general were far less complex than they are today – not to mention the amount of code they would have had to deal with!’

And you know what, I am probably going to get hanged, drawn and quartered for saying this, but that nagging scream is right: the world of coding has changed a lot since 1974. We now have processors with caches, able to execute more than one instruction at a time and predict branches, running at blistering speeds that would have made Knuth’s head spin back in ’74. Our code bases are now measured in millions of lines; back then there simply would not have been enough storage space to keep it all, let alone RAM in which to fit it for compilation. So why do we still cling to this archaic notion? Because, having said all that, there is still some truth to it.

It’s here where I differ from most people’s interpretation of Knuth’s famous line. Rather than leave the optimization phase to the last few months of the entire project, do it early, on a module-by-module basis. One of the great things about C++ is that it forces us to work in a very modular way (at least it should!), so what we end up with is a game made up of lots of loosely coupled modules. It should therefore be possible to treat modules as separate entities to be optimized on a case-by-case basis, with a final interoperability optimization later on in the project lifecycle.

I can’t do this for Darwinia+ because the game has been out for quite some time in the form of Darwinia and Multiwinia for the PC. The game is modular, but given the quantity of code, going through each module one by one would be very costly in both time and money. So all that is left is to run the game, identify hot spots and do deep analysis to work out exactly what is killing the game – is it CPU or GPU bound? So back to the sewer worker analogy and getting down deep and dirty…

Unlike my mother-in-law, this one WILL be optimized!
User avatar
shinygerbil
level5
Posts: 4667
Joined: Wed Dec 22, 2004 10:14 pm
Location: Out, finding my own food. Also, doing the shinyBonsai Manoeuvre(tm)
Contact:

Postby shinygerbil » Fri Apr 17, 2009 5:24 pm

Good luck with that. ;)
User avatar
djspiewak
level0
Posts: 5
Joined: Mon Nov 20, 2006 9:47 pm
Contact:

Right and Wrong

Postby djspiewak » Fri Apr 17, 2009 5:44 pm

Knuth is both wrong and right to decry premature optimization. He is really referring to the obsessive-compulsive tendency of programmers to turn this:

Code: Select all

int do_something(int x)
{
    // call something_else twice with the same argument and return the product
    return something_else(x + 5) * something_else(x + 5);
}


...into this:

Code: Select all

int do_something(int x)
{
    // hoist the common subexpression into a local by hand
    int mid = x + 5;
    return something_else(mid) * something_else(mid);
}


Putting aside the fact that most modern compilers will do this anyway, optimizations of this sort often serve not only to obfuscate code, but also to really bog down the development process in trivialities.

On the flip side, we do some incredibly complex things with computers these days, involving algorithms that Knuth could never have conceived of at the time. When we design a system from a high level, efficiency must be part of the equation; otherwise we will often write ourselves into a corner through our own aversion to "premature" optimization. I'm not just talking about hashtables vs AVL vs splay trees; I'm talking about really deep and profound differences between one design path and another. Unfortunately, the most obvious design path is not always the best for long-term efficiency and maintainability, especially in C++.
In the beginning, there was Uplink. Uplink rocked. Then there was Darwinia. Darwinia rocked. Then there was Defcon. Defcon... Hey, I think there's a trend here!
MaW
level0
Posts: 7
Joined: Thu Oct 19, 2006 9:42 pm

Re: Right and Wrong

Postby MaW » Fri Apr 17, 2009 7:17 pm

djspiewak wrote:Unfortunately, the most obvious design path is not always the best for long-term efficiency and maintainability, especially in C++.


I've had a lot of reasons at work to agree with that statement lately. I seem to be doing a lot of design work for things written in C++, and we keep getting to implementation time and just sort of... splat. Coding ourselves into corners, even before somebody comes along and says 'oh, but what about this essential thing it can't do, which I didn't tell you about before because you're supposed to know by telepathy'.

But it is very easy to get things wrong if you just plough ahead on functionality, and sometimes you end up having to re-do all your inter-module interactions. Maybe that's the most important thing to get right first - how the modules work inside is, as Byron says, something you can leave for later. Hurrah for encapsulation!
Weatherproof
level1
Posts: 22
Joined: Thu Jul 17, 2008 5:18 pm

Postby Weatherproof » Fri Apr 17, 2009 11:34 pm

Yeah, like shinygerbil said: Good luck!!
jsgf
level0
Posts: 1
Joined: Sat Apr 18, 2009 1:09 am

Premature optimisation might be bad, but so is pessimisation

Postby jsgf » Sat Apr 18, 2009 2:17 am

The real problem with premature optimisation is that you just don't know where your performance problems are going to be until you have some real working code. If you've thought about it and implemented the code avoiding the problems you've thought of, then by definition, the remaining problems are the ones you haven't thought of.

That said, it's also quite possible to do your implementation in a bone-headed way which makes later optimisation very hard. You might choose a technique which puts a thousand 0.1% performance hits around the code, making them almost invisible to profiling (lots of dynamic memory allocation and/or copying). Or you might just choose an inflexible structure which turns out to be completely wrong (premature use of over-clever structures is a good way to get here).
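
A deliberately trivial sketch of that first trap (the function names are made up): a per-frame helper that hides allocation and copying next to one that doesn't. Neither ever shows up as a single hot spot in a profile, but the first pays a small heap tax on every call, all over the frame.

Code: Select all

#include <vector>

// Illustrative only: the same per-frame work written two ways.
int SumVisibleSlow(const std::vector<int> &ids)
{
    std::vector<int> visible;                 // temporary container
    for (size_t i = 0; i < ids.size(); ++i)
        if (ids[i] >= 0)
            visible.push_back(ids[i]);        // allocates and reallocates as it grows
    int sum = 0;
    for (size_t i = 0; i < visible.size(); ++i)
        sum += visible[i];
    return sum;
}

int SumVisibleFast(const std::vector<int> &ids)
{
    int sum = 0;                              // no heap traffic at all
    for (size_t i = 0; i < ids.size(); ++i)
        if (ids[i] >= 0)
            sum += ids[i];
    return sum;
}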

Knuth didn't mean "you don't need to think about performance". He meant "don't be an idiot".
User avatar
Byron
level2
Posts: 147
Joined: Tue May 13, 2008 3:48 pm

Re: Premature optimisation might be bad, but so is pessimisa

Postby Byron » Sat Apr 18, 2009 10:55 am

jsgf wrote:The real problem with premature optimisation is that you just don't know where your performance problems are going to be until you have some real working code. If you've thought about it and implemented the code avoiding the problems you've thought of, then by definition, the remaining problems are the ones you haven't thought of.

That said, it's also quite possible to do your implementation in a bone-headed way which makes later optimisation very hard. You might choose a technique which puts a thousand 0.1% performance hits around the code, making them almost invisible to profiling (lots of dynamic memory allocation and/or copying). Or you might just choose an inflexible structure which turns out to be completely wrong (premature use of over-clever structures is a good way to get here).

Knuth didn't mean "you don't need to think about performance". He meant "don't be an idiot".


Very true. The problem with Knuth is not what he said or meant; it's other people's interpretation of it. A lot of people live by the rule that you should not attempt to optimise your code until the very last moment, which is madness on a system that runs to millions of lines of code. It's very easy to measure the impact a module has on the I-cache and D-cache, for example, and there is no reason why you can't fix those issues on a module-by-module basis.

Your example of the over-clever structures is a very good one. The biggest issue we have with modern C++ usage is what happens when virtual functions or even class layout are misused, and how much of an impact that can have on performance. As an example, and I will be covering this in greater detail in part 2, we had a tight loop looking through all the buildings searching for one with a specific ID. The pointers to the buildings were held in an array, and the loop dereferenced each pointer to check the ID contained within the instance of the building class. This might seem innocent, but each dereference of the pointer caused another D-cache line to be pulled in from main memory just to look at one int. The solution was to store this ID in the array with the pointer to the building, since the array itself was already in the D-cache. That simple change saved over 5 million wasted cycles on a processor that can execute 2 instructions per cycle. Had the programmer been aware of what the compiler produces from such code and the impact it would have on the cache systems, things like this could have been avoided. The problem is that the programmers weren't aware, and the code base is full of such code. This is where early optimisation would have helped us.
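
As a rough sketch of the before and after (the class and member names below are invented for illustration, not lifted from the real code):

Code: Select all

#include <vector>

struct Building
{
    int m_id;
    // ... many other members, so each Building occupies its own cache lines
};

// Before: following every pointer drags a fresh D-cache line in from main
// memory just to read one int.
Building *FindBuildingSlow(const std::vector<Building *> &buildings, int id)
{
    for (size_t i = 0; i < buildings.size(); ++i)
        if (buildings[i]->m_id == id)         // dereference = likely cache miss
            return buildings[i];
    return 0;
}

// After: the ID is stored next to the pointer, so the search only touches the
// array itself, which is already streaming through the D-cache.
struct BuildingRef
{
    int       m_id;
    Building *m_building;
};

Building *FindBuildingFast(const std::vector<BuildingRef> &buildings, int id)
{
    for (size_t i = 0; i < buildings.size(); ++i)
        if (buildings[i].m_id == id)          // no pointer chase until we have a hit
            return buildings[i].m_building;
    return 0;
}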
User avatar
lorant
level1
Posts: 34
Joined: Thu Aug 07, 2008 8:49 pm
Location: France

Postby lorant » Sat Apr 18, 2009 11:50 am

I am currently reading the book "C++ Coding Standards" by Herb Sutter and Andrei Alexandrescu, and they talk about this exact problem. Their 8th guideline is "Don’t optimize prematurely", but their 9th guideline is "Don’t pessimize prematurely"...

Knuth's rule is no excuse to discard an efficient design, if that design isn't more difficult to understand than the non-optimized one and doesn't obfuscate the code. An obvious example is the use of the prefix increment operator instead of the postfix one. Avoiding algorithms with quadratic complexity is also something that should be done early.
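
For anyone wondering why the increment operators matter at all: for a plain int the two forms compile to the same thing, but for a class type such as an iterator the postfix form has to construct and return a copy of its old value. A minimal sketch:

Code: Select all

#include <list>

// ++it creates no temporary; it++ would construct and discard a copy of the
// iterator on every pass, a copy the caller never even looks at.
int CountPositive(const std::list<int> &values)
{
    int count = 0;
    for (std::list<int>::const_iterator it = values.begin(); it != values.end(); ++it)
    {
        if (*it > 0)
            ++count;
    }
    return count;
}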
User avatar
Shwart!!
level5
Posts: 1237
Joined: Sun Nov 12, 2006 1:36 am

Re: Premature optimisation might be bad, but so is pessimisa

Postby Shwart!! » Mon Apr 20, 2009 4:33 pm

Byron wrote:Your example of the over-clever structures is a very good one. The biggest issue we have with modern C++ usage is what happens when virtual functions or even class layout are misused, and how much of an impact that can have on performance. As an example, and I will be covering this in greater detail in part 2, we had a tight loop looking through all the buildings searching for one with a specific ID. The pointers to the buildings were held in an array, and the loop dereferenced each pointer to check the ID contained within the instance of the building class. This might seem innocent, but each dereference of the pointer caused another D-cache line to be pulled in from main memory just to look at one int. The solution was to store this ID in the array with the pointer to the building, since the array itself was already in the D-cache. That simple change saved over 5 million wasted cycles on a processor that can execute 2 instructions per cycle. Had the programmer been aware of what the compiler produces from such code and the impact it would have on the cache systems, things like this could have been avoided. The problem is that the programmers weren't aware, and the code base is full of such code. This is where early optimisation would have helped us.


If I understand you correctly, you're doing optimizations for D+; are any of these changes going to make it into Darwinia or Multiwinia?

Shwart!!
User avatar
Byron
level2
Posts: 147
Joined: Tue May 13, 2008 3:48 pm

Re: Premature optimisation might be bad, but so is pessimisa

Postby Byron » Mon Apr 20, 2009 4:37 pm

Shwart!! wrote:
Byron wrote:Your example of the over-clever structures is a very good one. The biggest issue we have with modern C++ usage is what happens when virtual functions or even class layout are misused, and how much of an impact that can have on performance. As an example, and I will be covering this in greater detail in part 2, we had a tight loop looking through all the buildings searching for one with a specific ID. The pointers to the buildings were held in an array, and the loop dereferenced each pointer to check the ID contained within the instance of the building class. This might seem innocent, but each dereference of the pointer caused another D-cache line to be pulled in from main memory just to look at one int. The solution was to store this ID in the array with the pointer to the building, since the array itself was already in the D-cache. That simple change saved over 5 million wasted cycles on a processor that can execute 2 instructions per cycle. Had the programmer been aware of what the compiler produces from such code and the impact it would have on the cache systems, things like this could have been avoided. The problem is that the programmers weren't aware, and the code base is full of such code. This is where early optimisation would have helped us.


If I understand you correctly, you're doing optimizations for D+; are any of these changes going to make it into Darwinia or Multiwinia?

Shwart!!


The optimisations I am doing are very Xbox 360 specific, and I doubt you would notice a difference in the PC version.
brog
level1
Posts: 22
Joined: Fri Jun 22, 2007 10:11 am

Postby brog » Mon Apr 27, 2009 6:52 pm

Knuth wrote:There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.


People always end the quote prematurely.
User avatar
The GoldFish
level5
Posts: 3961
Joined: Fri Mar 01, 2002 9:01 pm
Location: Bowl / South UK
Contact:

Postby The GoldFish » Tue Apr 28, 2009 8:33 am

My strategy has always been to target my data structures and make sure they're functional and the data is accessible in the ways I'm going to want to access it, *before* I write a ton of code using them in different ways. Not being funny or anything, but putting the building ID in with the pointer, and building a binary tree out of all the building IDs/pointers in the level, seems like what would have been there to begin with... or even accepting that the buildings could be limited to 1000 and having a fixed-size array sucking up memory. But really, those are my practical concerns rather than a realisation that there could be a performance issue involved.
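
For what it's worth, a minimal sketch of the ID/pointer tree idea (my own illustration, not IV's code): std::map is typically implemented as a red-black tree, so lookup by ID is O(log n) and never touches the Building objects themselves until a match is found.

Code: Select all

#include <map>

struct Building;   // defined elsewhere

// A small index kept alongside whatever list the rest of the game iterates over.
class BuildingIndex
{
public:
    void Add(int id, Building *building)  { m_byId[id] = building; }
    void Remove(int id)                   { m_byId.erase(id); }

    Building *Find(int id) const
    {
        std::map<int, Building *>::const_iterator it = m_byId.find(id);
        return it != m_byId.end() ? it->second : 0;
    }

private:
    std::map<int, Building *> m_byId;
};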

I'm not trying to say bad things or anything, but it has always seemed to me that IV's games are often poorly optimised or poorly implemented in some key areas. Darwinia's performance has never made sense, and from the impression I've been given, even some basic data reorganisation and rendering changes could have cut DEFCON frame times by something like a factor of 2. It seems to me like optimisation has never been a priority, and when it has been a priority, it's never been at the expense of having to rework the game in any major way...

Glad to hear D+ is coming to the end of the development cycle, though!
