I got swallowed up in March Madness, so my February bug story is a bit late.
Many software companies like to use their own products internally. Many places call this dogfooding. Micro Focus uses the much more sophisticated term of “self-residency.” This means that the COBOL compiler team writes their compiler in COBOL. It means that the SilkCentral team hosts their own test suites in their own SilkCentral repository. It also means DevPartner engineers use features of DevPartner to check and profile other features.
There are a few restrictions on using a profiler to profile a profiler, just as there would be on using a debugger to debug a debugger. For instance, I can use DevPartner Studio’s performance analyzer to profile DevPartner Java’s coverage or memory cores, but if I tried to profile its own performance analyzer, I would have two processes fighting to capture the singleton Quantum kernel driver, and one or both would lose out. I can also run Code Review against any other component written in a supported language, because it’s not a runtime tool. If you are not careful with runtime analyses, though, you can wind up in badly broken recursive states, if you manage to get the components to self-monitor at all.
That brings us to the February case: a regression bug report said that our memory analysis tool had sprung its own memory leak. Is that even possible? Certainly it is: the memory analysis tool needs to run its own code, allocate its own memory, and do all sorts of operations while injected into the process being analyzed, any of which could lead to leaks. As it is, the memory analyzer needs to elide all of its own memory impact; otherwise the session file would show all sorts of internal processing and cloud the end user’s results.
The creative step taken this month was not magic, but it did mean figuring out how to get around two architectural realities: for BoundsChecker, the DevPartner injection target cannot be itself, and for any process running under BoundsChecker, the core injection must come first, before other DLLs come up and start making allocations. Historically, even attempting this was labeled “impossible” by many who were intimately involved with this code tree. The stroke of genius was to realize that the whole shebang of BoundsChecker didn’t need to run as a fully operational product. Rather, the idea was to statically link the BC core into a unit test mule process that contained enough of BC’s internals to drive the suspected locations of the perceived leak. Once this linking and enough test code and scaffolding were in place to run the mule, within 15 seconds --BOOM!-- the leak was caught and nailed down, found lurking in BoundsChecker’s symbol engine.
A review of how this regression occurred showed that a latent defect, possibly dormant for a decade or longer, was exposed by other changes above the offending leak. It wasn’t caught in standard unit tests because the symbol engine is invoked only in certain modes of operation, and while the load on it is small and unremarkable in our quick functional unit test bank, the leak becomes very problematic for fully scaled-up processes with dozens or hundreds of DLLs.
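
For readers who want the shape of the mule idea in code, here is a toy sketch in Python. It is not the BoundsChecker harness, which was native code linked against the BC core. `ToySymbolEngine`, `resolve`, and `heap_growth` are hypothetical names invented for illustration, and the standard library's `tracemalloc` stands in for whatever actually observed the mule. The sketch only shows the general pattern: pull the suspect component into a small driver, hammer it, and compare the heap before and after.

```python
import gc
import tracemalloc


class ToySymbolEngine:
    """Hypothetical stand-in for the suspect component (not real BC code)."""

    def __init__(self, leaky):
        self._leaky = leaky
        self._retained = []  # grows without bound when the defect is active

    def resolve(self, module_index, address):
        # Build a per-lookup record, as a symbol engine might.
        record = "module%04d!sym_%08x" % (module_index, address)
        if self._leaky:
            # The illustrative defect: the record is kept and never released.
            self._retained.append(record)
        return len(record)


def heap_growth(engine, modules, lookups_per_module):
    """Drive the engine and return the net bytes still allocated afterwards."""
    started_here = not tracemalloc.is_tracing()
    if started_here:
        tracemalloc.start()
    try:
        gc.collect()
        before, _peak = tracemalloc.get_traced_memory()
        for module_index in range(modules):
            for address in range(lookups_per_module):
                engine.resolve(module_index, address)
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        if started_here:
            tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    # A "quick unit test" sized run versus a "hundreds of DLLs" sized run.
    print("small, leaky:", heap_growth(ToySymbolEngine(leaky=True), 2, 10))
    print("large, leaky:", heap_growth(ToySymbolEngine(leaky=True), 200, 50))
    print("large, clean:", heap_growth(ToySymbolEngine(leaky=False), 200, 50))
```

In the small run the retained memory is easy to overlook, which is roughly why a quick functional test bank would not flag it. Scaling the same driver up to hundreds of simulated modules makes the growth hard to miss.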
It felt really rewarding to use our own logic to find our own leak. It also shows that something is impossible only until it is not.