Anyone who has worked in quality assurance or is serious about testing knows a technique for defining test cases called boundary value analysis. Here is how Wikipedia describes it:
Boundary value analysis is a software testing technique in which tests are designed to include representatives of boundary values. [...] Since these boundaries are common locations for errors that result in software faults they are frequently exercised in test cases.
In short: faults like boundaries.
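To make the idea concrete, here is a minimal sketch of boundary value analysis applied to a hypothetical validator for an accepted range of 0–120. The function and the range are illustrative, not from any real project:

```javascript
// Hypothetical function under test: accepts ages in the range [0, 120].
function isValidAge(age) {
  return age >= 0 && age <= 120;
}

// Classic boundary value analysis picks values just below, at, and just
// above each boundary, since that is where off-by-one faults hide.
const boundaryCases = [
  { input: -1,  expected: false }, // just below the lower boundary
  { input: 0,   expected: true  }, // the lower boundary itself
  { input: 1,   expected: true  }, // just above the lower boundary
  { input: 119, expected: true  }, // just below the upper boundary
  { input: 120, expected: true  }, // the upper boundary itself
  { input: 121, expected: false }, // just above the upper boundary
];

for (const { input, expected } of boundaryCases) {
  if (isValidAge(input) !== expected) {
    throw new Error(`boundary case failed for input ${input}`);
  }
}
```

Note that the "interesting" inputs cluster tightly around 0 and 120; a value like 60 would almost never catch a boundary fault.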
This principle has been of great help to me lately, while investigating a curious bug that affected a customer of mine and, what's more, only one of her iPads. It was a nasty bug, of the kind that is not easily reproducible but keeps popping up from time to time.
Usually what you do in such cases is ask the customer for trace logs or, if none are available, ask her to describe how she got there, in an effort to shed some light on the apparent randomness of the behavior. Unfortunately, this approach only rarely succeeds, due to a multiplicity of factors: the logs are not meaningful or not available, the customer is not able to describe correctly what she did, and so on.
Such cases usually lead close to despair, and to many frustrating hours sitting in front of a monitor; code review is really the only way to go. Still, when faced with “perfect” code, where you can find no evident problem, what you really need are some criteria to guide your search. This is where boundary value analysis comes into the picture.
If you like details, here they are, so you can better grasp the situation I am describing: the app was a simple, animated, CSS3 clock. Its main feature was its unique, copyrighted design, which you can appreciate below, and it sported a continuous sweep of the hands and some specific lighting to reproduce a realistic shadow effect of the clock hands against the background. What happened is that sometimes, on launching the app, the seconds hand's shadow, and only that, behaved in a crazy way, while all of the other clock hands and their shadows continued to work correctly.
What was really striking was that all three hands and their shadows used the same CSS3 animation; so why was just one of them not working? It had to do with the specific position of the seconds hand's shadow at the moment the animation started; but, even more puzzling, that was exactly the same initial position as the seconds hand itself, which always worked smoothly. I was really lost.
When I started to review my code, I immediately noticed that there were indeed a few boundaries I could inspect. The animation was defined in terms of an interpolated rotation across four cardinal points. At each cardinal point, specified by its angle, the position of the clock hand and its shadow was defined in terms of a rotation and a translation with respect to the normal 12 o'clock position.
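That setup can be sketched roughly as follows. Since the article does not show the actual code, the cardinal angles, the shadow offsets, and the helper name below are made-up illustrative values, not the app's real ones:

```javascript
// Four cardinal keyframes: at each angle, the shadow's translation
// relative to the hand is fixed; in between, it is interpolated.
// The offsets are invented for illustration.
const cardinalPoints = [
  { angle: 0,   shadowShift: { x: 0,  y: 4 } },  // 12 o'clock
  { angle: 90,  shadowShift: { x: -4, y: 0 } },  // 3 o'clock
  { angle: 180, shadowShift: { x: 0,  y: -4 } }, // 6 o'clock
  { angle: 270, shadowShift: { x: 4,  y: 0 } },  // 9 o'clock
  { angle: 360, shadowShift: { x: 0,  y: 4 } },  // back to 12 o'clock
];

// Linearly interpolate the shadow shift for an arbitrary hand angle,
// the way a keyframed animation would between two cardinal points.
function shadowShiftAt(angle) {
  const a = ((angle % 360) + 360) % 360; // normalize to [0, 360)
  for (let i = 0; i < cardinalPoints.length - 1; i++) {
    const p = cardinalPoints[i];
    const q = cardinalPoints[i + 1];
    if (a >= p.angle && a <= q.angle) {
      const t = (a - p.angle) / (q.angle - p.angle);
      return {
        x: p.shadowShift.x + t * (q.shadowShift.x - p.shadowShift.x),
        y: p.shadowShift.y + t * (q.shadowShift.y - p.shadowShift.y),
      };
    }
  }
}
```

The boundaries worth inspecting are exactly the four cardinal angles, where the interpolation hands over from one segment to the next.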
Based on this, I designed a test to stress the animation when the starting time was close to those cardinal points. The test repeatedly started the animation, progressively advancing the start time a little each run. This approach turned out to be right: it quickly showed that there was a tiny interval around one of those cardinal points where the seconds hand and its shadow were almost overlapped. Almost: their distance was close to zero and definitely negligible, but definitely not zero (think 1.0e-16). This produced a number that was not valid CSS3, so the animation failed.
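A boundary-stress test of this kind can be sketched as below. The window, the step size, and the cosine-based offset are my own illustrative assumptions; the validity check reflects one plausible failure mode, namely that a tiny float such as 1e-16 serializes in scientific notation, which older CSS parsers rejected as a number:

```javascript
// Accept only plain decimal notation, as strict CSS number parsers did;
// "1e-16" fails this check while "0.5" and "-0.25" pass.
function isValidCssNumber(value) {
  return /^[+-]?(\d+(\.\d*)?|\.\d+)$/.test(String(value));
}

// Boundary-stress sketch: sweep the start angle in tiny steps around a
// cardinal point (here 90 degrees) and collect the angles whose computed
// offset would not serialize to a valid CSS number.
function findBadStartAngles(cardinalAngle, halfWindow = 1e-3, step = 1e-5) {
  const bad = [];
  for (let a = cardinalAngle - halfWindow; a <= cardinalAngle + halfWindow; a += step) {
    // In exact arithmetic this offset is zero at 90 degrees; floating-point
    // math leaves a tiny residue instead, which JS prints as e.g. "6e-17".
    const offset = Math.cos((a * Math.PI) / 180);
    if (!isValidCssNumber(offset)) {
      bad.push(a);
    }
  }
  return bad;
}
```

Run against the 90-degree boundary, the sweep does flag a narrow band of start angles, which is precisely the "tiny interval" behavior described above.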
Rounding that tiny number to zero fixed the bug.
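The fix amounts to snapping floating-point residue to zero before the value ever reaches the stylesheet. The epsilon and the helper name here are illustrative, not the article's actual code:

```javascript
// Any magnitude below this threshold is treated as floating-point noise.
const EPSILON = 1e-9;

// Snap near-zero residue to an exact zero so the serialized CSS value
// is a plain "0" rather than something like "1e-16".
function snapToZero(value) {
  return Math.abs(value) < EPSILON ? 0 : value;
}

const offset = snapToZero(1.0e-16); // → 0
```

Applying such a snap to every interpolated offset makes the boundary case indistinguishable from the exact cardinal position, which is what the geometry intended all along.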
Summing things up: when the only resort you have for hunting down a nasty bug is code review, boundary value analysis can be a valuable friend.