I recently changed jobs: from tester in an agile team at a web company to quality assurance in a team at an automotive company. At least in Germany, this industry is dominated by what I call “old-school” development techniques: V-Model, big requirements documents, manual regression test phases, …
As such I was pleasantly surprised when I was told that there’s a Jenkins server doing continuous integration. The pleasantness ended when I learned three things about this server:
– it’s actually a desktop computer under a desk (high availability doesn’t seem to be an issue, still this scares me)
– builds are triggered by a timer (7 am, 11 am, 3 pm, “daily build” at 10 pm)
– a build takes at least 90 minutes, more if the build is running in parallel for two branches
In their defense, it has to be said that the server had been worked on by any developer who had a few minutes to spare – there was never a single responsible person with a budget for it.
Now there’s me and the task to “improve build times to speed up feedback”.
The application I’m working on is based on Eclipse RCP, written in Java, and consists of a few dozen plugins. It’s distributed to our customers in an obfuscated package.
Jenkins performs the following steps in a regular build:
– compile the application code
– compile the test code
– compile debug extensions used internally
– obfuscate the application code
– run the tests
– gather static analysis data
All compilation steps are done for x86 in 32bit and 64bit.
The build isn’t just done for CI purposes; the same build is used to generate the packages which are delivered to our customers.
After talking to a few people (and more importantly: keeping my ears open to what’s being said around me) I learned the following:
– There is a 64bit version of the app, but it doesn’t work properly because the license manager we use is only available for 32bit
– There had been problems in the past with obfuscated code not working properly but there’s not a single test actually looking for those – they had all been found manually
– the debug extension is only needed for packages which had been shipped to customers
– the Java compiler is using only four threads because more didn’t speed up the build due to disk I/O limits
Being the new guy in the company I felt free to challenge many things and experiment a lot.
Let’s get rid of the obfuscation for CI. This had already been a parameter of the job; all I had to do was change its default from “true” to “false”. Whoops, build times are down from 91 minutes to 58. Isn’t that nice?
Since we didn’t want to give up testing with an obfuscated build completely, I enabled the obfuscation for our nightly build – no one cares how long that takes.
No more debug extensions. No one is using them in CI, so why are we building them?
Sadly, this brought the build time down only by a minute. Then again, a minute also adds up during the day…
No more 64bit builds. As mentioned before the 64bit version of our app isn’t working properly, so why would we want to build it in CI? Once the problems are sorted out I’ll add it again.
This took the build time down another 4 minutes – less than I expected, to be honest.
Luckily for us, our IT department is fast and maintains quite a stock of hardware in-house. My request for an SSD to put the Jenkins workspace on was fulfilled amazingly fast, and after some trouble with the Dell BIOS I could move the workspace.
What shall I say? Disk I/O definitely WAS an issue – the build time dropped by 19 minutes, taking us down to a final 34 minutes. That’s a bit more than a third of the original build time, so I consider it a success so far.
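The numbers above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the minute values are the ones reported in this post, while the labels and the function name `running_totals` are made up for illustration.

```python
# Build-time savings as reported in the post (all values in minutes).
ORIGINAL_MINUTES = 91

# (change, minutes saved), in the order the changes were made.
savings = [
    ("obfuscation off in CI", 33),  # 91 -> 58
    ("no debug extensions", 1),
    ("no 64bit build", 4),
    ("workspace on SSD", 19),
]


def running_totals(start, steps):
    """Return (label, minutes_after_step, share_of_total_saving) per step."""
    total_saved = sum(saved for _, saved in steps)
    current = start
    rows = []
    for label, saved in steps:
        current -= saved
        rows.append((label, current, saved / total_saved))
    return rows


rows = running_totals(ORIGINAL_MINUTES, savings)
for label, minutes, share in rows:
    print(f"{label:<24} {minutes:>3} min  ({share:.0%} of total saving)")

final_minutes = rows[-1][1]
ratio = final_minutes / ORIGINAL_MINUTES
print(f"final: {final_minutes} min, {ratio:.0%} of the original build time")
```

Laid out like this, it is easy to see that two of the four changes account for almost all of the saving.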
What did I learn from this experience?
First of all: Talk to people and keep your ears open. I wouldn’t have touched the 64bit builds had I not overheard a discussion about them not working.
Second learning: Be courageous and just do it! Everybody could have done what I did – yet somehow no one did. Getting the SSD in particular was a simple thing to do, yet everybody shied away from it because of the external dependency on our IT department.
Last learning: Measure, measure, measure! I was expecting the SSD to help us a lot – but now that I have hard numbers I can justify the cost. On the other hand I expected the 64bit build to be more expensive – good to know that it isn’t.
After going for the low-hanging fruit there are still things to do.
I plotted the test execution times by package against the number of tests in each package. The plot clearly shows that 10% of our tests take up 75% of the test execution time – which equals 17 minutes. Single test methods take up to a minute to execute!
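For anyone wanting to run a similar analysis, here is a minimal Python sketch. It assumes the test results are already available as (package, test name, seconds) rows; how to get them out of the test reports is not covered. The package names, test names and durations are invented for illustration and are not the project's real data.

```python
from collections import defaultdict


def per_package(results):
    """Aggregate (package, test, seconds) rows into {package: (count, seconds)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for package, _test, seconds in results:
        totals[package][0] += 1
        totals[package][1] += seconds
    return {pkg: (count, secs) for pkg, (count, secs) in totals.items()}


def slowest_share(results, time_share=0.75):
    """Fraction of tests (slowest first) needed to reach time_share of the runtime."""
    durations = sorted((seconds for _, _, seconds in results), reverse=True)
    total = sum(durations)
    running = 0.0
    for count, seconds in enumerate(durations, start=1):
        running += seconds
        if running >= time_share * total:
            return count / len(durations)
    return 1.0


# Invented sample data: ten tests in three hypothetical packages.
sample = [
    ("com.example.io", "testImportLargeFile", 60.0),
    ("com.example.io", "testExport", 15.0),
    ("com.example.ui", "testOpenEditor", 5.0),
    ("com.example.ui", "testCloseEditor", 5.0),
    ("com.example.ui", "testResize", 4.0),
    ("com.example.core", "testParse", 3.0),
    ("com.example.core", "testValidate", 3.0),
    ("com.example.core", "testMerge", 2.0),
    ("com.example.core", "testSort", 2.0),
    ("com.example.core", "testLookup", 1.0),
]

for package, (count, seconds) in sorted(per_package(sample).items()):
    print(f"{package:<18} {count:>2} tests  {seconds:>5.1f} s")
print(f"{slowest_share(sample):.0%} of the tests use 75% of the time")
```

Sorting by duration first keeps the question simple: how few tests would have to be sped up or moved to reclaim most of the time?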
I wonder if those tests can be sped up. If not, I might violate the “all tests, all green, all the time” principle and move the slow and not so important tests to the daily build.
Another thing to investigate is the number of parallel Java compiler threads. Right now there are four compiler threads – according to coworkers because more threads didn’t help due to disk I/O. Now that this bottleneck is gone, let’s see what we can do there.
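Such an experiment could be driven by a small timing harness like the sketch below. Everything about the build itself is an assumption here: `make_command` is a placeholder that only starts a trivial Python process, and it would have to be replaced by whatever actually launches the build with a given thread count.

```python
import subprocess
import sys
import time


def time_command(command):
    """Run the command once and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


def compare_thread_counts(make_command, thread_counts, repeats=3):
    """Return {threads: fastest of `repeats` runs} for each thread count."""
    results = {}
    for threads in thread_counts:
        command = make_command(threads)
        results[threads] = min(time_command(command) for _ in range(repeats))
    return results


def make_command(threads):
    # Placeholder (assumption): a real run would return the project's build
    # invocation with the thread count filled in.
    script = f"print('pretend build with {threads} threads')"
    return [sys.executable, "-c", script]


timings = compare_thread_counts(make_command, [2, 4, 8], repeats=1)
for threads, seconds in sorted(timings.items()):
    print(f"{threads} threads: {seconds:.2f} s")
```

Taking the fastest of several runs is one way to reduce noise from other jobs on the same machine. For a build server that is usually busy, the average may be the more honest number.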
By the way, the daily and weekly builds (with an extended testing scope) which existed before are still available. They still build the full package (incl. obfuscation). Moving the workspace was beneficial even for those: they came down to 59 minutes.
That’s it for now, I’ll post updates once I have more things to report.