Syndicate this site: (RSS)

Premature Optimization

The latest geek adventure. The browser, rather than happily rendering our web page, was instead complaining that it was taking too long to load (a valid complaint, it had been chewing for a couple of minutes already).

So I snapped off a copy of the html, and went to work....

Update: now we know what IE was looking at.

I grabbed the source, the images off of the server, and also the java script files, so that a test could be run independently of the server, and further changes to the back end. I then asked our automation lead to put a timer on the render of the page from disk. Yup, it's pig slow. The slowness is likely a function of the number of entries on the page, so a few lines of Perl give me many intermediate stages to measure.

Automation lead walks the gamut, and shows me the results. Looks rather parabolic.

With the UI lead at my shoulder, we begin an inspection of the code. We are presenting a tree, of sorts, but not in any particularly convenient way. The nodes are stored in insertion order, rather than in any way convenient or even representative of the data. As a result, the code for adding a single node to the tree ends up examining every node already present [in fairness to UI lead - the original requirements suggested that N would never be so large that it would matter].

OK, so we have the N^2 bottleneck. [those of you who have already caught on - shh. You'll spoil the surprise for everybody else]. What to do? Well, is there some reason that we can't store the nodes in a more appropriate order? We have some concerns about writing off the end of our array, but a quick experiment (suggested by code already present) confirms that the arrays are automatically grown prior to an insert.

This is excellent news, because it also allows the possibility of a sparse array. Pulling the memory of Knuth 2.3.2 from a dusty corner of my past, we are able to convert the tree to a binary tree using a pair of sparse arrays - one to track the first child, another to track the next sibling.

Even pairing, we make a few errors along the way, but eventually bring everything into line, and put a human counter on the test. Hmm, performance still seems to degrade as N^2, but it is certainly better. We cut a few versions, one seems about the same, another slightly slower, and pass them off to automation for a retest.

The retest reveals no significant change in performance. Yup, we had identified the wrong bottleneck, didn't confirm our guess, and didn't think to establish the baseline we were comparing against when doing our preliminary tests. All of the first order improvements we saw were hardware related.

As we had, at least, done some testing before our guessing began, the gods were not wholly unkind. I came away with a fair notion of what the "second" N^2 issue would be - rule one, look for string concatenation, and this time devised a test that could show that improved string performance would make a significant difference.

Oddly enough, the faster string version still threw up a dialog, where our original improvements had not seen it, even though they were much slower. I'm not yet sure what IE was tracking. So we needed one fix for the strings, and a second fix for the tree (which now is a quantifiable improvement, with the bigger issue out of the way).

Update: KB175500 claims that IE tracks the number of statements executed. The limit is set by a registry key. The exploratory tests we did here are consistent with that claim. I'm not sure how this information will be useful to anybody, but there you go.

September 17, 2003 12:13 AM | TrackBack

Comments
Post a comment




Who are you?