Things to be aware of
- We patch GCC in ways that are of deep importance for the maintenance and correctness of the system. See GCC Modifications for descriptions of them and their behaviour.
- The system is shockingly sensitive to changes to the compiler in terms of correctness, reliability, and debuggability. The fact that a given version of GCC can have our patches cleanly or trivially merged into it says practically nothing about whether the resulting compiler will work at all, never mind work well enough for us to recommend its use.
Basic steps to take
- Find the GCC revision you think is a good candidate to merge to, and create tags reflecting the upstream releases involved (unfortunately, they do not come through the git mirror).
- Bootstrap and run the full set of tests for this GCC (make bootstrap; make check), and compare the results to those of the current GCC. Investigate any new failures. If any are catastrophic, restart from the first step, or choose to backport fixes should they exist (or create fixes, if necessary).
- Merge the existing GCC patches onto this branch (git rebase --onto <yourbranch> gcc-4.4.4 il-4_4_4). This merge is often complicated for large upgrades. You may find it useful to merge across the major version change first, and then the minor version changes individually and in order, to minimize the noise. If the merge is still difficult, you should seek out the changes which introduce the merge complexity and include those in your progressive set of revisions to rebase onto as well. Your goal is to perform the merges with the minimum amount of conflict, to get a better chance of successful, and correct, merges.
- Bootstrap and run the tests again, then compare the two sets of results: those of the patched GCC against those of the unpatched GCC. There should be no new failures.
- Build illumos using this new fixed GCC, fixing any new warnings and errors encountered along the way (this will be time-consuming). You should test and integrate these fixes separately: they're likely individual, unrelated bugs. It doesn't matter that the newer compiler found them; what matters is that they're fixed correctly and obviously in the source history.
- Fix any bugs which prevent this build from booting and functioning to a basic degree.
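The test-result comparisons in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's tooling: it assumes the results are DejaGnu .sum files (the format make check writes, with lines such as "FAIL: gcc.dg/foo.c"), and the function names parse_sum, new_failures, and vanished_tests are hypothetical.

```python
import re

# Result lines in a DejaGnu .sum file look like:
#   FAIL: gcc.dg/foo.c (test for excess errors)
RESULT_RE = re.compile(
    r"^(PASS|FAIL|XPASS|XFAIL|UNRESOLVED|UNSUPPORTED|UNTESTED|KFAIL|KPASS): (.+)$"
)

# Statuses treated as "bad" here; this choice is an assumption of the sketch.
BAD = {"FAIL", "XPASS", "UNRESOLVED"}


def parse_sum(text):
    """Map each test name to its status; a bad status wins if a name repeats."""
    results = {}
    for line in text.splitlines():
        m = RESULT_RE.match(line.strip())
        if not m:
            continue  # headers, summaries, and other noise
        status, name = m.groups()
        if results.get(name) in BAD and status not in BAD:
            continue
        results[name] = status
    return results


def new_failures(baseline_text, candidate_text):
    """Tests that are bad in the candidate but were not bad in the baseline."""
    base = parse_sum(baseline_text)
    cand = parse_sum(candidate_text)
    return [
        (name, base.get(name, "absent"), status)
        for name, status in sorted(cand.items())
        if status in BAD and base.get(name) not in BAD
    ]


def vanished_tests(baseline_text, candidate_text):
    """Tests that ran in the baseline but do not appear in the candidate."""
    return sorted(set(parse_sum(baseline_text)) - set(parse_sum(candidate_text)))
```

Feed it the contents of the matching .sum files from the two build trees (for example gcc/testsuite/gcc/gcc.sum from each). Tests that vanish deserve as much attention as tests that newly fail, since a test that no longer runs cannot report a failure.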
Things to verify after booting
Have any DTrace probes disappeared, or have the results of known problematic compiler features appeared?
We want to make sure that no DTrace probes have disappeared, and that any DTrace probes refer to sensible symbols and have not been impacted by compiler function cloning/versioning or anything similar (this is the check on line 3).
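As a separate, minimal sketch of this kind of comparison (independent of whatever script is used in practice), the following Python assumes two saved listings in the column layout printed by dtrace -l (ID, PROVIDER, MODULE, FUNCTION, NAME), one from a system built with the old compiler and one from the new. The suffix list in CLONE_RE is an assumed, non-exhaustive set of names GCC gives to cloned or split functions, and all function names here are hypothetical.

```python
import re

# Suffixes GCC appends to cloned, partially inlined, or split functions.
# Assumed and non-exhaustive; adjust for the compiler version under test.
CLONE_RE = re.compile(r"\.(isra|constprop|part|cold)(\.\d+)?$")


def parse_probes(listing):
    """Parse `dtrace -l` style text into (provider, module, function, name) tuples."""
    probes = set()
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # header or blank line
        provider, name = fields[1], fields[-1]
        # Module and function may be empty (e.g. dtrace:::BEGIN); with only
        # one middle field we assume it is the module.
        middle = fields[2:-1]
        module = middle[0] if len(middle) >= 1 else ""
        function = middle[1] if len(middle) >= 2 else ""
        probes.add((provider, module, function, name))
    return probes


def missing_probes(old_listing, new_listing):
    """Probes present with the old compiler but absent with the new one."""
    return sorted(parse_probes(old_listing) - parse_probes(new_listing))


def cloned_functions(listing):
    """Probe functions whose names look like compiler-generated clones."""
    return sorted(
        {p[2] for p in parse_probes(listing) if CLONE_RE.search(p[2])}
    )
```

Probe IDs are deliberately ignored, since they differ from boot to boot. Anything returned by either function is a lead to investigate, not a verdict: modules that simply were not loaded when a listing was taken will also show up as missing.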
Has the compiler optimized local functions in ways that impact DTrace and the debugger?
GCC, on x86 especially, will by default use a different non-ABI calling convention for local functions, which will harm DTrace and mdb. You should look through our source tree for a moderate number of static functions in both kernel and userland, and verify that any calls to them follow the ABI on all 3 platforms. You can do this by DTracing calls to them and their arguments and checking that the arguments are valid, and by reading the disassembly of their call sites. I would recommend doing both.
Does the debugger still have the ability to display function arguments on amd64?
We patch the compiler to spill function arguments to the stack on amd64 such that they are available to the debugger. You should run the saveargs functional tests in usr/src/common/saveargs/tests, and perform sensible testing of the same functionality with the running software. I'd recommend
and investigation of anything suspicious. Expect to continue investigating anything suspicious when debugging the rest of this whole effort.
You should also disassemble the entire system (I'd do the kernel and userland separately) and process the disassembly to look for changes which may affect the prologue matcher used for this (see usr/src/common/saveargs/saveargs.c). The libsaveargs test suite may also help you find issues here. Any issues you find should be added as tests (either to make sure the compiler doesn't do it, or that libsaveargs understands it, depending on who is at fault).
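One way to process the disassembly is to reduce every function to the "shape" of its first few instructions and list the shapes the new compiler emits that the old one never did. The Python below is a rough sketch under stated assumptions: the input is objdump -d style AT&T text (the output of dis would need a different parser), the names prologue_shapes and unseen_shapes are hypothetical, and it makes no attempt to model what saveargs.c actually accepts.

```python
import re

# objdump -d style lines (an assumption of this sketch):
#   0000000000001000 <foo>:
#       1000:<TAB>55<TAB>push   %rbp
FUNC_RE = re.compile(r"^[0-9a-f]+ <([^>]+)>:\s*$")
INSN_RE = re.compile(r"^\s*[0-9a-f]+:\t[0-9a-f ]+\t(.+)$")


def normalize(insn):
    """Drop comments and symbol annotations, and replace hex literals with N."""
    insn = insn.split("#", 1)[0]
    insn = re.sub(r"<[^>]*>", "", insn)
    insn = re.sub(r"-?0x[0-9a-f]+", "N", insn)
    return " ".join(insn.split())


def prologue_shapes(disassembly, depth=4):
    """Map each function name to a tuple of its first `depth` instructions."""
    shapes = {}
    current = None
    for line in disassembly.splitlines():
        m = FUNC_RE.match(line)
        if m:
            current = m.group(1)
            shapes[current] = []
            continue
        m = INSN_RE.match(line)
        if m and current is not None and len(shapes[current]) < depth:
            shapes[current].append(normalize(m.group(1)))
    return {name: tuple(insns) for name, insns in shapes.items()}


def unseen_shapes(old_disassembly, new_disassembly, depth=4):
    """Prologue shapes in the new build that never occur in the old one."""
    known = set(prologue_shapes(old_disassembly, depth).values())
    unseen = {}
    for name, shape in prologue_shapes(new_disassembly, depth).items():
        if shape not in known:
            unseen.setdefault(shape, []).append(name)
    return unseen
```

Each unseen shape comes with the functions that exhibit it, which makes a convenient starting list for reading against the matcher and for turning into new libsaveargs tests.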
Things to verify when everything works
How does performance look?
Run libmicro, sysbench, and any other benchmarks. Investigate the results thoroughly. You may wish to seek expert help here if results are anomalous or potentially anomalous. Do not place undue trust in results, whether positive or negative (or, for that matter, neutral).
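As a small illustration of why single numbers deserve little trust, the sketch below only calls a difference a regression or an improvement when it exceeds the run-to-run noise. It assumes each benchmark was run several times per compiler and that lower values are better (timings); the function name compare_runs and the two-sigma threshold are arbitrary choices for this example, and this is a crude heuristic rather than a proper statistical test.

```python
from statistics import mean, stdev


def compare_runs(old_runs, new_runs, sigmas=2.0):
    """Classify timings as 'regressed', 'improved', or 'within noise'.

    Assumes lower is better. The noise estimate is `sigmas` times the
    larger of the two sample standard deviations.
    """
    if len(old_runs) < 2 or len(new_runs) < 2:
        raise ValueError("need at least two runs per compiler to estimate noise")
    delta = mean(new_runs) - mean(old_runs)
    noise = sigmas * max(stdev(old_runs), stdev(new_runs))
    if abs(delta) <= noise:
        return "within noise"
    return "regressed" if delta > 0 else "improved"
```

A "within noise" answer is not evidence that nothing changed, and a clear difference still needs an explanation before it is believed.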
You should have been running bits built by your new GCC for quite a while now; continue to do so, and use them on all your systems
This should be obvious: there's a lot of software that is very hard to test in a canned fashion. Deploy your bits on all the systems you can, and investigate any problems that occur. Expect to do this over an extended period of time.
Run all the test suites you can find
These should include
- The ZFS test suite
- networking tests or benchmarks (be they STC, iperf, etc.)