What if Code Doesn’t Pass the Smell Test?

At Click Here Labs, when our quality control team conducts a “first pass” at testing a website or mobile app, we use exploratory testing methods to gauge the level of effort that will be required for a given testing project and to uncover any possible weak spots in the application that may yield the greatest number of bugs. During the initial test sessions, our testers are armed with quality checklists, interactive functional requirements and wireframes, and a notepad application for taking notes. We spend time exploring the application under test just as a user would. We take notes regarding potential issues that we may want to explore further or issues that we may want to discuss with the programmers before we start logging any bugs in our bug tracker. Often during these explorations (and occasionally during regression testing), we’ll encounter some aspect of the application that doesn’t appear to be quite right. It doesn’t always appear to be a bug, it doesn’t appear to be reproducible, and, initially, it may even appear to be something trivial. The testing community refers to this potential quality issue or risk as a “test smell.”

The term “test smell” was appropriated from test-driven development (TDD) programmers Martin Fowler and Kent Beck, who used the term “code smells” in their book Refactoring: Improving the Design of Existing Code. According to Fowler, the identification of “bad smells” from code should motivate programmers to redesign or refactor their code. In the chapter describing the various types of code smells, Beck relates a quote from his grandmother who was once asked, “How do you know when to change a diaper?” Grandma Beck replied, “If it stinks, change it.”

In her blog post, “Testing and Bad Smells: When to Investigate Potential Bugs,” Penny Wyatt describes why test smells should not be ignored: “The bad smell is merely a symptom of a larger issue that was otherwise unnoticeable…or, at least unnoticed. By investigating the smell, you’ve prevented a much bigger issue from shipping – possibly even one relating to security or data loss.”

But Wyatt warns that testers often don’t have the time to stop and investigate every bad smell encountered in testing. According to Wyatt, test smells should be prioritized, and those with the highest priority should be investigated first. She cites three types of test smells that are worth investigating. The first is familiar smells: those that have been encountered on previous projects that shared the same product domain (experiential site, e-commerce site, etc.), technology (programming language), or environment (operating system, database, web browser, or API). The second is related to false equivalence, in which the tester incorrectly assumes that multiple instances of functionality are all controlled by the same code, when in reality the code is not identical in some parts of the application. The third is encountered in regression testing, when code is slightly modified and the programmer claims that there is no need to retest the application. On the surface, the changes to the application may appear slight, but without thorough regression testing, the risks they introduce could be critical.
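
For testers who keep their session notes in a structured form, the prioritization described above could be sketched roughly as follows. This is only an illustration: the category labels, the SmellNote class, and the triage function are hypothetical names chosen for this example and do not come from Wyatt's post or from any particular tool.

```python
from dataclasses import dataclass

# Hypothetical labels for the three high-priority classes of test smell.
HIGH_PRIORITY_KINDS = {"familiar", "false_equivalence", "untested_regression"}


@dataclass
class SmellNote:
    """One note taken during an exploratory test session."""
    note: str
    kind: str  # one of the labels above, or anything else (e.g. "other")


def triage(smells):
    """Return the notes with high-priority smells first.

    sorted() is stable, so notes keep their original order within
    the high-priority group and within the remaining group.
    """
    return sorted(smells, key=lambda s: s.kind not in HIGH_PRIORITY_KINDS)


session_notes = [
    SmellNote("odd spacing in the footer", "other"),
    SmellNote("dot shown in place of a user name", "false_equivalence"),
    SmellNote("cart timeout seen on a previous e-commerce project", "familiar"),
]

for smell in triage(session_notes):
    print(smell.kind, "-", smell.note)
```

A notepad and good judgment do the same job; the sketch is only meant to make the ordering rule explicit.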

As an example, I recently encountered a test smell while testing an application that was being updated with new content and changed to call a new version of an API in place of a deprecated one. I found what appeared to be an intermittent data issue: a dot appeared in place of some of the test users’ names. Two days later, after some discussion, additional testing, and consultation with the programmer over instant messenger, the test smell turned out to be of the false equivalence variety outlined above. Although the API calls generally seemed to be working, there was one path in the application that was still calling the deprecated version of the API.
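
A false equivalence of this kind can sometimes be confirmed by recording which API version each path in the application actually calls, rather than assuming they all match. The sketch below shows the idea under stated assumptions: the path names, the version labels, and the observed_calls data are all invented for illustration and are not taken from the application described above.

```python
# Hypothetical label for the API version that should no longer be called.
DEPRECATED_VERSION = "v1"

# Hypothetical log of (application path, API version called), as a tester
# might collect it from a proxy or from browser developer tools.
observed_calls = [
    ("/profile", "v2"),
    ("/leaderboard", "v2"),
    ("/share", "v1"),
    ("/profile", "v2"),
]


def paths_on_deprecated_api(calls, deprecated):
    """Return the sorted, de-duplicated paths seen calling the deprecated version."""
    return sorted({path for path, version in calls if version == deprecated})


stale_paths = paths_on_deprecated_api(observed_calls, DEPRECATED_VERSION)
print(stale_paths)
```

Any path that shows up in the result is a candidate for the deeper problem behind the symptom, and a concrete starting point for the conversation with the programmer.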

Wyatt’s blog post also offers advice regarding when, how, and to whom test smells should be reported. Through years of experience, I’ve learned to follow an approach similar to the one Wyatt recommends. When a test smell is first encountered, the best approach is to make notes on its details and continue with the remainder of the current round of testing. Before the test smell is logged as a bug, the issue should be discussed face-to-face with the programmer whenever possible. During this brief meeting, it should be emphasized that the test smell could be a symptom of a deeper problem. If investigating the test smell requires additional knowledge of the application’s requirements, the business analyst should also be invited to the meeting. Based on the information gathered at the meeting, the tester should explore the test smell further and identify the steps for reproducing the issue, if at all possible. At this point, the test smell should be logged as a bug report (including a statement that the bug report is for the suspected deeper problem and not the symptom). In addition, the tester should make other team members aware of the test smell at scrums or reviews, in case they’ve encountered the issue and have some additional insight.

If the programmer returns the bug to the tester as “Ignored” or “Unable to Reproduce,” the bug should not be sent to the backlog. Instead, it should be set to “Blocked,” with the explanation that the reproduction steps have yet to be discovered. I’ve found through experience that if the test cycle is scheduled for at least two or three days, the developer, the tester, or another team member can usually gain more insight into the test smell after sleeping on the problem, and this usually leads to a resolution.

In summary, project teams should be aware that not all quality issues can be easily identified and summarized with easy-to-follow reproduction steps. Sometimes it may take a keen sense of smell, a bit of detective work, and a team effort to achieve the desired quality outcome.

