Thursday, September 28, 2006

Custom Fields - Regression

Can you tell off the top of your head (or via a simple query in your issue tracker) what the regression ratio is in your product for a specific version, and where the regression areas are?
Chances are, the answer is no. The reason is that out of the box, most issue trackers don't indicate whether an issue is a regression, and don't provide causality links.

This means that when you look at a new bug, you must rely on the description/comments if you want to note the regression source. This, of course, limits your ability to manipulate the data.

What can you do? Well, there are two things.
First, you can provide a simple "Regression" custom field which will be true in case the issue is understood to be a regression, or more accurately, a new issue caused by another change in the system (and not an issue which was there all along, just detected through extended QA coverage).
This lets you know which issues are regressions, which usually points to issues you really want to deal with before releasing.

What it doesn't do is provide info as to the regression source. The only accurate way to track the regression source is to provide links from the regression back to its cause. This can be done via a "Caused by/Caused" link. Hopefully your issue tracker allows custom links (JIRA does...). In case you know which specific issue caused it, fill that in. If you don't know, or it's due to a large feature, just add a placeholder issue and link to that, even if it's just a build number (e.g. FeatureA, build1.2.1).
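To illustrate, here is a minimal Python sketch of the kind of report these two additions enable. It runs over a hypothetical export of issues as dictionaries; the 'fix_version', 'regression' and 'caused_by' field names are my own placeholders, not any specific tracker's schema.

    from collections import Counter

    def regression_report(issues, version):
        """Summarize regressions for one version, assuming each issue is a dict
        with hypothetical 'fix_version', 'regression' (bool) and 'caused_by' keys."""
        in_version = [i for i in issues if i.get("fix_version") == version]
        regressions = [i for i in in_version if i.get("regression")]
        ratio = len(regressions) / len(in_version) if in_version else 0.0
        # Group regressions by cause (an issue key or a placeholder like "FeatureA, build1.2.1")
        sources = Counter(i.get("caused_by", "unknown") for i in regressions)
        return ratio, sources

    # Example usage with made-up data:
    issues = [
        {"fix_version": "1.2", "regression": True, "caused_by": "FeatureA"},
        {"fix_version": "1.2", "regression": False},
        {"fix_version": "1.2", "regression": True, "caused_by": "PRJ-101"},
    ]
    ratio, sources = regression_report(issues, "1.2")
    print(f"regression ratio: {ratio:.0%}", dict(sources))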

Let's assume this information is actually filled in correctly most of the time (not a trivial assumption actually - those experienced in trying to convince all stakeholders to fill in data they don't really think is useful to THEM will probably nod in agreement here). Now you can look at the SOURCES of regression and try to see if there is any intelligent conclusion to be made. Is it the small stupid stuff that you felt would be trivial? Is it the hard fixes where you don't do enough code review and integration testing? Are the regressions local, or can an issue in one area cause a chain effect in different modules altogether? Are certain teams producing fewer/more regressions? Are certain modules Pandora's boxes for new bugs/regressions whenever they are touched?
These insights should be leveraged to decide where you want to improve your development process.

NOTE: Even if you are afraid the data won't be collected, try to think along these lines via a less formal review of the regressions in your last big version. Hopefully you can draw some conclusions with what you have at the moment.


Custom Fields - Detected In Field

This is the first in a series of short suggestions on things you might want to track in your issue tracker.

One of the important ways to measure the effectiveness of your quality effort is to understand the ratio of issues detected in the field (versus the whole issue count).

To track this, add a custom field that will be True/Checked whenever an issue ORIGINATED in the field. Note you should NOT include issues which were detected internally, waived through by a PM decision, and later detected/experienced by someone in the field. This is a different type of issue which doesn't reflect on the quality of your "detection" effort, but more on the quality of your decision-making process.

An alternative to this simple field is to provide a link to a trouble ticket in some CRM system, and to create the link only when the issue originated in the field. Of course, a reverse link from the CRM to the issue is always recommended, both for issues that originated in the field and those that were detected internally.
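For illustration only, a tiny sketch of computing that ratio from an issue export, assuming a hypothetical boolean 'detected_in_field' custom field (the field name is a placeholder):

    def field_detection_ratio(issues):
        """Share of issues that ORIGINATED in the field, assuming each issue is a
        dict with a hypothetical boolean 'detected_in_field' custom field."""
        if not issues:
            return 0.0
        field_issues = sum(1 for i in issues if i.get("detected_in_field"))
        return field_issues / len(issues)

    # e.g. 3 field-originated issues out of 120 total -> 2.5%
    print(f"{field_detection_ratio([{'detected_in_field': True}] * 3 + [{}] * 117):.1%}")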


Tuesday, September 12, 2006

Edible versions - Tips for implementation

So you read my Edible versions post and want to get the good stuff on how to make it happen in your organization. Well, to be honest, it's not that difficult once all the parties sit together, talk about their expectations and design the protocols between the groups. See my earlier post for some general pointers.

Having said that - maybe I CAN provide some tips that I have seen work in the past:

  • Ensure all content in a delivery is tracked as a change request (bug/feature/other) in an issue tracker.
  • Provide an "Impact Level" for each change, so QA can easily focus on the high impact changes first.
  • For complex changes or large builds, get used to holding delivery meetings where DEV and QA discuss the changes and exchange ideas on how to proceed with covering the build. Be effective - know when the changes are small and the process can be lighter.
  • Try to establish an environment which automatically generates release notes for a version. At a minimum, generate them as a report on whatever the issue tracker says. If possible, base them on the actual deliveries to the SCM system. Use something like the integration between JIRA and Quickbuild/LuntBuild (a rough sketch follows below).
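To make the last tip a bit more concrete, here is a rough sketch of generating internal release notes from an issue export. It is not tied to any specific tracker; the 'fix_version', 'key', 'summary', 'type' and 'impact' field names are assumptions for illustration, with 'impact' assumed to be numeric.

    def internal_release_notes(issues, version):
        """Build plain-text internal release notes for one delivery. Each issue is
        assumed to be a dict with hypothetical 'fix_version', 'key', 'summary',
        'type' and 'impact' fields ('impact' assumed numeric, higher = riskier)."""
        included = [i for i in issues if i.get("fix_version") == version]
        lines = [f"Internal release notes - {version}", ""]
        # List the highest impact changes first so QA can focus its coverage there.
        for issue in sorted(included, key=lambda i: i.get("impact", 0), reverse=True):
            lines.append(f"- [{issue['key']}] ({issue.get('type', 'change')}, "
                         f"impact {issue.get('impact', '?')}): {issue['summary']}")
        return "\n".join(lines)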

Edible versions

All you QA people out there -

  • How often does your QA group "choke" on versions delivered by the development group?
  • Are you used to "inedible" versions which just don't taste right?
  • How about versions which simply come as a black box, where you have no idea what changed, and therefore no idea what to do with the version or what to expect of it?
Now all of you DEV people - think about the times where you installed 3rd party products/updates which caused you the same digestion problems...

Those inedible deliveries cause a variety of problems. Let's start with the fact that whoever gets the delivery wastes a lot of time chewing it up, in the meantime not only delaying coverage of the new delivery but also NOT making progress on previous deliveries (the classic question of when to commit your QA organization to a new build delivered by R&D and risk coverage progress for the earlier but known build. Especially tasteful when delivered to your plate a few hours before the weekend).

When the contents are unclear, QA people can only do general coverage, and the time it takes to verify regression concerns and make sure whatever we intended to fix was indeed fixed grows longer.

What is the point here? Just as a sane person would refuse to swallow unmarked pills from unmarked bottles, refuse to accept a version/build/delivery that is not documented sufficiently. I'm not aware of many good reasons not to mandate internal release notes.

And DEV guys - consider some dogfooding on each delivery, even if it's "just" to the friendly QA people next door. Lots of work, you say? Well then, time to introduce Continuous Integration and Smoke Testing...

Thursday, September 07, 2006

Orcanos Product Life-cycle Management

A friend referred me to Orcanos QPack. This appears to be another candidate in the Product Life-cycle Management segment. The company is based in Israel, and so far I've only briefly glanced at their documentation, and it seems interesting. Not many references on Google though...

If anyone has looked into this tool and can compare it to the other tools I mentioned here, please come forward...

Thursday, August 31, 2006

Rosie has the test management blues as well...

Rosie Sherry writes a very interesting blog focused on software testing.
One of these days I'll point to some of the interesting blogs I'm reading regularly.

In any case, one of her recent posts was "Hunt for Test Case Management System", where she discusses the lack of a killer test management solution, but tries to outline some alternatives.

Those interested should go over there and take a look...

Tuesday, August 29, 2006

QA Effort Effectiveness

How do you know your QA effort is effective?

Based on the expectations of the different stakeholders who require input from QA, a typical answer might be that product quality is high when released to customers.

Assuming that is indeed more or less what someone expects (I'd say effective QA needs to answer to some other requirements as well), how does one go about checking whether the product quality is indeed high?

Those who have reached a fairly intermediate level of QA understanding would easily point out that the percentage of "QA misses" (namely, the number of issues missed by QA and detected in the field) should be below a certain threshold. A high number here simply means that too many issues/bugs are not detected during the entire QA coverage, only to be embarrassingly detected by a customer.

If one naively optimizes just for this variable, the obvious result is a prolonged QA effort, aiming to cover everything and minimize the risk. If no reasonable threshold is set, there is a danger of procrastinating and avoiding the release.
See The Mismeasure of Man for a cool article on abusing measurements in the software world...

Of course, a slightly more "advanced" optimization is to open many many bugs/issues so the miss ratio becomes smaller due to the larger number of bugs found in QA, not due to missing fewer bugs. This can result in a lot of overhead for the QA/PM/DEV departments as they work on analyzing, prioritizing and processing all those bugs.
Did I forget to factor in the work to "resolve/close" those issues? NO! Several of those issues might indeed be resolved and verified/closed, but those are probably issues that were not part of the optimization but part of a good QA process (assuming your PM process manages the product contents effectively and knows how to enforce a code-freeze...).

My point is that there are a lot of issues that are simply left there to rot as open issues, as their business priority is not high enough to warrant time for fixing them or risking the implications of introducing them to the version.

A good friend pointed this phenomenon out to me a couple of years ago, naming it "The Defect Junk Factory" (translated from Hebrew). He meant that bugs which are not fixed for the version they were opened on indicate that the QA effort was not focusing on the business priorities. The dangers of this factory are the waste of time processing these bugs, and the direct implication that either the QA effort took longer because it spent time on them, or that it missed higher business priority bugs while focusing on these easy ones.
It is kind of like the argument regarding speed cameras being placed "under the streetlight" to easily catch speed offenders (with doubtful effect on overall safety), all the while missing the really dangerous offenders.

So what can be done? My friend suggested measuring the rate of defects that are NOT fixed for the version they were opened for. The higher this number, the more your QA effort is focusing on the wrong issues.
Just remember that this is a statistical measure. Examining a specific defect might show that it was a good idea for QA to focus there, and the fix was avoided for other reasons. But when looking across a wide sample, it's unreasonable that a high number of defects are simply not relevant. If it isn't a QA focus issue, something else is stinking, and is worth looking at in any case.

Another factor of an effective QA is fast coverage. What is fast? I don't have a ratio of QA time relative to development time. It's probably a factor of the type of changes (infrastructure, new features, integration work) done in the new version, as each type has a different ratio of QA to DEV effort (e.g. a kernel upgrade usually requires much more QA compared to DEV effort).
Maybe one of the readers has a number he's comfortable with - I'd love to hear.
What I do know is that version-to-version the coverage time should become shorter, and that the QA group should always aim to shorten this time further without significantly sacrificing overall quality. I expect QA groups to do risk-based coverage, automation for regression testing, and whatever else assists them in reducing the repeatable cost of QA coverage at the end of each version. The price/performance return on reducing the QA cycle is usually worth it to some extent.

To sum up, a good QA effort should:

  • Minimize QA misses
  • Minimize the defect junk factory
  • Minimize QA cycle time without compromising quality
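As a rough sketch (with made-up field names, not any particular tracker's schema), the first two measures above could be computed from an issue export along these lines; cycle time would come from delivery and sign-off dates.

    def qa_effectiveness(issues, version):
        """Rough QA effectiveness measures for one version. Each issue is assumed
        to be a dict with hypothetical 'detected_in_field', 'opened_for_version'
        and 'fixed_in_version' fields."""
        opened = [i for i in issues if i.get("opened_for_version") == version]
        if not opened:
            return {}
        misses = sum(1 for i in opened if i.get("detected_in_field"))
        junk = sum(1 for i in opened if i.get("fixed_in_version") != version)
        return {
            "qa_miss_ratio": misses / len(opened),    # share detected in the field - aim to minimize
            "defect_junk_ratio": junk / len(opened),  # share not fixed for the version they were opened for
        }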
What do you think is a good QA effort? How are you measuring it?

Severity and Priority - The Debate

There are a couple of alternatives for managing severity and priority in the Issue Tracker.
Although there are many resources out there on this subject (see http://del.icio.us/yyeret/priority_severity) I'll try to consolidate them and provide my 2c on the matter, as I think it's an important subject.

Single-field Priority
The first, seemingly simpler, alternative is a single-field priority representing customer impact.
The idea here is to have only a single priority/severity field. The reporter assigns it according to his understanding of the customer impact (severity, likelihood, scenario relevance, etc.). Product Management or any other business stakeholder can shift the priority according to the current release state and their understanding of the customer impact from the described scenario. The developers prioritize work accordingly.
The strength of this approach is in its simplicity, and the fact that several issue trackers adopt this methodology and therefore support it better “out of the box”.
The weakness is that once in the workflow the original reasoning for the priority can get lost, and there is no way to discern between the customer impact and other considerations such as version stability, R&D preferences, etc.

An example why this is bad? Let's say Keith opened bug #1031 with a Major priority. Julie the PM later decided that since there is some workaround and we are talking just about specific uses of a rarely used feature, the business priority is only Normal or Minor. Version1 is released with this bug unresolved. When doing planning for Version2, Julie misses this bug since its priority is lower.
Even if the feature it's related to is now the main focus of the version. Even if not missed, looking for this bug and understanding its roots and history is very hard, especially considering the database structure of issue trackers. History is available, but it's not as easy to get at as fields on the main table of issues…

Brian Beaver provides a clear description of this approach at Categorizing Defects by Eliminating "Severity" and "Priority":
I recommend eliminating the Severity and Priority fields and replacing them with a single field that can encapsulate both types of information: call it the Customer Impact field. Every piece of software developed for sale by any company will have some sort of customer. Issues found when testing the software should be categorized based on the impact to the customer or the customer's view of the producer of the software. In fact, the testing team is a customer of the software as well. Having a Customer Impact field allows the testing team to combine documentation of outside-customer impact and testing-team impact. There would no longer be the need for Severity and Priority fields at all. The perceived impact and urgency given by both of those fields would be encapsulated in the Customer Impact field.

Johanna Rothman in Clarify Your Ranking for System Problem Reports talks about single-field risk/priority:

Instead of priority and severity, I use risk as a way to deal with problem reports, and how to know how to fix them. Here are the levels I choose:
o Critical: We have to fix this before we release. We will lose substantial customers or money if we don't.
o Important: We'd like to fix this before we release. It might perturb some customers, but we don't think they'll throw out the product or move to our competitors. If we don't fix it before we release, we either have to do something to that module or fix it in the next release.
o Minor: We'd like to fix this before we throw out this product.

bugzilla-style Severity+Priority

Here, the idea is to use a severity field for the technical risk of the issue, and a priority field for the business impact. The reporter assigns severity according to the technical description of the issue, and also provides all other relevant information - frequency, reproducibility, likelihood, and whether it's an important use-case/test-case or not. Optionally, the reporter can assign priority based on the business impact of the issue on the testing progress. For example, if it's a blocker to significant coverage, suggest a high priority. If he thinks this is an isolated use case, suggest a lower priority. A business stakeholder, be it PM, R&D Management, etc., assigns priority based on all technical and business factors, including the version/release plan. Developers work by priority. Severity can be used as a secondary index/sort only.
Developers/testers/everyone working on issues should avoid working on high-severity issues with unset or low priority. This is core to the effectiveness of the triage mechanism and the issue lifecycle process.
Customers see descriptions in release notes, without priority or severity. The roadmap communicated to customers reflects the priority, but not in so many words.
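A tiny sketch of the "work by priority, severity as a secondary sort only" rule; the level names and numeric ordering here are illustrative assumptions, not a prescribed scale:

    # Lower numbers mean more urgent/severe; the scales are illustrative only.
    PRIORITY = {"Now": 0, "P1": 1, "P2": 2, "P3": 3}
    SEVERITY = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

    def work_queue(issues):
        """Order issues by business priority first, technical severity second."""
        return sorted(issues, key=lambda i: (PRIORITY.get(i.get("priority"), len(PRIORITY)),
                                             SEVERITY.get(i.get("severity"), len(SEVERITY))))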

Strengths of this approach are:
* Clear documentation of the business and technical risks, especially in face of changing priorities.
* Better reporting on product health when technical risk is available and not hidden by business impact glasses.
* Less drive for reporters to push for high priority to signify they found a critical issue. It’s legitimate to find a critical issue and still understand that due to business reasons it won't be high priority.
* Better accommodation of issues that transcend releases - where the priority might change significantly once in a new release.

The weaknesses are that it’s a bit more complex, especially for newbies, and might require some customization of your issue tracker, although if your tool cannot do this quite easily, maybe you have the wrong tool…
In addition, customers have trouble understanding the difference between their priority for the issue and the priority assigned within the product organization. The root cause here is probably the lack of transparency regarding the reasoning behind the business priority. I'd guess that if a significant part of the picture is shared, most customers would probably understand (if not agree with) the priorities assigned to their issues. It's up to each organization to decide where it stands on the transparency issue. (see Tenets of Transparency for a very interesting discussion on the matter in the wonderful weblog of Eric Sink)

To see how our example works here – Keith will open the bug, assign a major severity, and a low priority since the bug blocks just one low-priority test case. Julie the PM sees the bug and decides to assign a low priority value, so the bug is left for future versions for all practical purposes. When planning V2, Julie goes over high-severity issues related to the features under focus for the version, and of course finds this issue as it has Major severity.

See the following resources for this approach:
* http://c2.com/cgi/wiki?DifferentiatePriorityAndSeverity
* Priority is Business; Severity is Technical:

business priority: "How important is it to the business that we fix the bug?" technical severity: "How nasty is the bug from a technical perspective?" These two questions sometimes arrive at the same answer: a high severity bug is often also high priority, but not always. Allow me to suggest some definitions.

Severity levels:
o Critical: the software will not run
o High: unexpected fatal errors (includes crashes and data corruption)
o Medium: a feature is malfunctioning
o Low: a cosmetic issue

Priority levels:
o Now: drop everything and take care of it as soon as you see this (usually for blocking bugs)
o P1: fix before next build to test
o P2: fix before final release
o P3: we probably won't get to these, but we want to track them anyway

* Corey Snow commented on Clarify Your Ranking for System Problem Reports:
Comment: Great subject. This is a perennial topic of debate in the profession. The question at hand is: Can a defect attribute that is ultimately irrelevant still serve an important function? Having implemented and/or managed perhaps a dozen different defect tracking systems over the years, I actually prefer having both Priority and Severity fields available for some (perhaps) unexpected reasons. Priority should be used as the 'risk scale' that the author describes. 3 levels, 5 levels, or whatever. Priority is used as a measure of risk. How important is it to fix this problem? Label the field 'Risk' if that makes it more clear. Not so complicated, right? So what good is Severity? Psychology! Its very existence makes the submitter pause to consider and differentiate between the Priority and Severity of the defect. In other words, without Severity, the submitter might be inclined to allow Severity attributes to influence the relative Priority value. Example 1: Defect causes total system meltdown. Only users in Time Zone GMT +5.45 (Kathmandu) are affected on leap years. There is one user in that time zone, but there is a manual workaround, and a year to fix it besides. Priority=Super Low, Severity=Ultra High Severity gives a place for the tester to 'vent' about their spectacular meltdown, without influencing the relative Priority rating. Example 2: Defect is a minor typo. Typo in on the 'Welcome to Our Product' screen, which is the first thing every user will see. Priority=Ultra High, Severity=Super Low Again, Severity gives a place for the tester to express how unimportant the defect is from a functional perspective, without clouding their Priority assessment. I once managed a defect tracking system with only a Priority field. This frequently led to a great deal of wasted time in defect discussion meetings as one side would argue about Severity attributes while another would argue about Priority attributes, but the parties were not even aware of the distinction that was actually dividing them. Having both fields serves to head off this communication problem, even if Severity is completely irrelevant when fix/no fix decisions are actually made. ~ Corey Snow (03/11/03)

Author's Response: Corey, Great counterpoint to my argument. ~ Johanna Rothman (03/12/03)

Personal Favorite
As can probably be understood by now, my personal favorite is the Severity+Priority approach. I confess I don't have much experience with the single priority approach, but I really feel the Severity+Priority way is very effective, without significant costs once every stakeholder understands it.

What is your favorite here?

Favorite resources - round I

As those who read my posts have probably noticed already, I'm quite a heavy user of del.icio.us. I won't go into what it is; I'm sure those interested can go there or google it to see whether they like it or not.
I'm playing around with Google Notebook as an alternative, with better google integration obviously, albeit less taxonomy/tagging capabilities.

In any case, I highlighted some of my favourite resources under the rndblog_resources tag, and provided some notes to accompany the links and explain why I find them essential in the favorites list of anyone interested in the contents of this blog (and probably for some people who are NOT that interested in this blog, but then again, they won't be here...)

There are more gems in my account, so probably expect future rounds based on existing and new resources I find. I'd love to hear about more resources along those lines - either comment or suggest them to me via the del.icio.us network.
Anyone interested in tracking my favorites is welcome to join my network.

Now for the resources themselves (copy paste from a del.icio.us linkroll page)

Thursday, August 24, 2006

David V. Lorenzo posts people's favorite interviewing questions on his Career Intensity Blog.

Here is his post about mine...

At the risk of tipping off the people I will interview in the future, also check out my interviewing tag on del.icio.us for a lot of resources on the matter.

Why am I open about this?
One of my main beliefs in interviewing, btw, is to try and understand behavioural aspects in addition to skills. Someone might get a head start on the skills questions if he prepares, but I think in that area, if someone is diligent enough to research his interviewer, go and read multiple resources, and learn enough to know the subject, he's getting extra credit right out of the gate...
For the behavioural aspects the discussion is more free-flowing, and no preparation can really help you there.

I cannot finish a post about interviewing without mentioning Johanna Rothman. She's writing the Hiring Technical People blog, and wrote the Hiring The Best Knowledge Workers, Techies & Nerds: The Secrets & Science Of Hiring Technical People book. Check it out.

Sunday, August 20, 2006

QA/DEV Protocols - Opening high quality bugs

In another post in the series about QA/DEV protocols, I'll talk about opening high quality bugs, why it's important, what forces operate on each side of the trench here, and try to describe an approach that might improve the state of affairs a bit.

First - a definition. What is a high quality bug? To be clear, we are talking about a bug report, of course. The quality here refers to the accuracy of the scenario, describing exactly what is necessary to reproduce, not more, not less. It refers to providing all the auxiliary information required to analyze the bug and start working toward a resolution. It also aims to report a single bug, not several issues.

It might be easier to convey the point by showcasing some examples for low quality bug reports:

  • Missing logs
  • Logs of different components are not time-synched, with no way to understand the time-space relationship. (This is relevant mainly for distributed systems )
  • Errors happened, but are not mentioned explicitly in the bug report
  • Bug report focuses on analysis, not on reporting the facts. Analysis is a bonus for QA engineers, only relevant AFTER reporting the full details.
  • Much happened on the system, a couple of different scenarios, and the bug is hidden somewhere in piles of logs/information.
  • Unclear bug report, making it difficult for business people (PM) and DEVs to prioritize and understand.
  • A complex long scenario is reported while the bug is reproducible via a simple short one.
  • The reported severity doesn't match what really happened, leading to "cry wolf" or serious issues masked as trivialities.
  • Multiple bugs in the same report
  • Numbers - Avoid using statements like "very large" or "a lot of time". Always include the numbers you are talking about. What seems large to you may seem small to someone else, or vice versa.
Also check out FogBugz - The Basics of Bug Tracking

Now that we have established what a high quality bug report is, we can try to understand the forces influencing the people opening bugs and why low quality bug reports sometimes do happen:
  • When QA people find a bug, they want to report it and move on. Sometimes they feel they are metered by quantity not quality, sometimes they actually are...
  • Especially for hard cases, the scenario is not that clear, and indeed there is some mix of events (including a full moon on a Friday the 13th for the real nut cases) that cannot be easily reduced to a simple scenario. Trying to do this without the internal understanding of a DEV guy might take a very long time without being very effective.
  • QA engineers are human. When the test setup/teardown is complex and requires attention to many small details (clear logs, sync time, grep for patterns in logs, etc.), things will get lost from time to time.
  • In some cases, the QA group or a specific engineer is not aware of the price of low quality bug reports. (point him here...). DEV guys might not be able to put a finger on it either, or are just entrenched and prefer to point fingers and exchange emails instead of working to establish a protocol.
So what can be done?
  • Discuss and educate - like I hinted, sometimes the most important step is to talk, map the expectations and root causes, and agree on a protocol, with the relevant SLAs.
  • Assist QA by providing small automated snippets that help with test setup/teardown/analysis, guide them through the steps to a high quality report, and really leave them with the important step of reducing the scenario to the minimum (see the sketch after this list for one example). (btw, it's possible to do the scenario reduction in automated testing harnesses as well, by retracting steps and verifying health and expected results very frequently)
  • Work with very granular test cases - minimizing the scenario length. Still, combining different test cases in parallel will add complexity, but when the building blocks are small, it's better than nothing.
  • The issue tracking system should guide the reporter through the important information/steps to a high quality report.
  • DEVs should provide constructive feedback - when bug reports are below par quality, and when they are above. Do it privately when below quality, and publicly when above.
  • Do "peer review" of bug reports when relevant - for rookie QA engineers, for difficult bugs, etc.
  • In hard cases call in a DEV and get his advice on what needs to be done to make sure the report has its best chance to become a high quality one.
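As one example of the kind of small helper snippet mentioned in the list above, DEV could hand QA something like the following to collect the error lines worth pasting into a report. The log paths and error patterns are placeholders - adapt them to your own components and log format.

    import re
    from pathlib import Path

    # Placeholder paths/patterns - not a real system's layout.
    LOG_FILES = ["/var/log/compA.log", "/var/log/compB.log"]
    ERROR_PATTERN = re.compile(r"ERROR|FATAL|Traceback")

    def collect_errors(since_marker=None):
        """Gather error lines from component logs so they can be pasted into a bug
        report instead of attaching piles of raw logs."""
        findings = []
        for log in LOG_FILES:
            path = Path(log)
            if not path.exists():
                continue
            lines = path.read_text(errors="replace").splitlines()
            if since_marker and since_marker in lines:
                # keep only the current scenario (everything after the marker line)
                lines = lines[lines.index(since_marker) + 1:]
            findings += [f"{log}: {line}" for line in lines if ERROR_PATTERN.search(line)]
        return findings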
Any other ideas?

QA/DEV Protocols - Calling developers to the lab

I'm going to dedicate a couple of posts to the relationships between QA and Development (DEV) organizations.

Anyone who's ever been in either of those organizations knows that sometimes there seems to be a conflict of interest between QA and DEV, which can lead to friction between the groups and the people. Obviously when both organizations are running under the same roof, there must be some joint interest/goal, but the challenge is to identify the expectations of each group in order to work toward their goal and accomplish their mission effectively.

The difficult cases are those that put more strain on one party in order to optimize the effectiveness of another. Example - developers are asked to unit/integration/system test their software before handing over a build to QA. Some developers might say that this is work that can be done by QA, and their time is better spent developing software. The QA engineers will say that they need to receive stable input from the DEVs in order to streamline the coverage progression, and that the sooner issues are found, the lower the cost to fix them.

One way to look at these "protocols" between the groups is via the glasses of TOC (Theory of Constraints), identify the bottlenecks of the overall system/process, and fine-tune the protocol to relieve the bottleneck. People in those groups, and especially the leaders, should be mature enough to know that sometimes doing the "right thing" might be to take on more work, sometimes even not native work for their group.

One example is the issue of when to ask DEV guys to see problems the QA engineers have discovered.
Reasons for calling DEV might be:
  • Wish to reopen a bug
  • Bug was reproduced and a developer is interested in seeing the reproduction.
  • New severe bug
There are a couple of forces affecting this issue:
  • QA wishes to finish the context of the specific problem/defect, open the bug, and get on with their work.
  • DEV wishes to finish the context of their specific task, and to avoid the "context switch" of looking at the QA issue.
  • In general, both QA and DEV have learned to wave the "Context Switching Overhead" flag quite effectively. (A more pragmatic conclusion is that some context switching overhead is unavoidable, and sometimes the alternative is more expensive...)
  • In some cases, "saving" the state of the problem for asynchronous later processing by DEV is difficult or takes too many resources to be a practical alternative.
A possible compromise between all those forces is to define some sort of SLA between the groups, stating the expected service provided by DEV to QA according to the specific situation (Reopen, Reproduced, New Severe, etc.). This SLA can provide QA a scope of time they can expect answers in, without feeling they are asking for personal favours or "bothering" the DEVs. The developers get some reasonable time to finish up the context they were in without feeling they are "avoiding" QA. The SLA can also cover the expected actions to be taken by QA before calling in the DEV, or in parallel to waiting for them. This maximizes the effectiveness of the DEV person when he does free up for looking at the issue, while better utilizing the time of the QA while waiting. (for example - fill the bug description, look for existing similar bugs, provide the connectivity information into the test environment, log excerpts/screenshots, etc.)
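For illustration, such an SLA could be captured as a simple agreed-upon table; the situations, response times and preparation steps below are made-up examples, not a recommendation:

    # Hypothetical QA/DEV SLA: how quickly DEV is expected to come to the lab per
    # situation, and what QA prepares in the meantime. Values are examples only.
    QA_DEV_SLA = {
        "reopen":     {"dev_response": "1 workday", "qa_prepares": ["bug description", "log excerpts"]},
        "reproduced": {"dev_response": "4 hours",   "qa_prepares": ["connectivity info", "environment kept as-is"]},
        "new_severe": {"dev_response": "2 hours",   "qa_prepares": ["similar-bug search", "screenshots"]},
    }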

Another question is who to call on when QA needs help. The options here depend on the way the DEV group/teams share responsibility on the different modules of the system.
  • In case there are strict "owners" for each module, and they are the only ones capable of effectively assisting QA, the only reasonable choice is to call on them... this requires everyone to always be available at some level.
    I have to say though that I strongly advise against such an ownership mode. Look at code stewardship for a better alternative (in my opinion) and see below how it looks better for this use case and in general...
  • In case there is a group of people that can look at each issue, one alternative is to have an "on call" cycle where people know they have QA duty for a day/week. In this case there will be issues which require some learning on their part, and perhaps assistance from the expert on a specific area. This incurs overhead, but is worth its weight in gold when the time comes and you need to support that area in real time, need to send the DEVs to the field to survive on their own, or when the owner/expert moves on...

To sum up, like in many h2h (human-to-human) protocols, understanding the forces affecting both sides of the transaction is key to creating a win-win solution. A pragmatic view that tries to minimize the prices paid and shows the advantages of the solution to both sides and to the overall organization can solve some hard problems, as long as people are willing to openly discuss their issues and differences. I've seen this work in my organization; hopefully it helps others as well.

Some greenpepper

As I previously hinted in "Building a test case management solution", I'm personally of the opinion that the holy grail in test case management is in finding the way to manage tests via an issue tracker database.

In the time since that post I didn't find much information about this, and didn't see tools taking this approach.

Therefore, it was great to stumble upon Agile06 - François Beauregard - GreenPepper Software - a podcast discussion where I learned about greenpepper, a test automation and management system developed by Pyxis Technologies which closely integrates with JIRA to create an issue tracker solution for managing test cases, and adopts a Fitnesse-like approach (on drugs) to table-driven testing, over the Confluence wiki. I liked the choice of tools to integrate with as well as the pragmatic simple ideas.
Last week Frank and Christian demonstrated the system to me and a colleague. I was impressed, and would recommend anyone with interest in the test management or issue tracking domain to track those guys. I know I will.

One issue I've been thinking over regarding both Fitnesse and GreenPepper is how to take those tools that are focused on one-shot automated testing, and adapt them to track manual testing documentation, results, etc. Finding a good way to solve this problem might assist with adoption of those tools/frameworks in environments which are not the classic agile web development environment I suspect are the majority of adopters at this point.

Monday, August 14, 2006

Tracking Issues for Multiple Releases

Pattern:     TrackingIssuesForMultipleReleases
Context:     Multiple versions are being actively developed/maintained. New issues are discovered on one version, and their status and progress needs to be tracked on multiple versions, with minimal overhead but maximal accuracy.
Active versions – active branches where a build was already issued and documented, and new builds are planned.
Problem:     When a new issue is discovered, one needs to understand its start of applicability (when it was introduced into the product) and end of applicability on each version branch (when it was solved). Due to branching, the state might be complex. Tracking the workflow for solving the issue on each branch/version is also complex when working with a naïve model. In addition, managing this can add considerable overhead – unchecked, this can lead to an explosion of bureaucracy and tracking overhead, leading to a lack of faithful representation as a backlash.
Forces:     

  • Want to accurately visualize the status of each issue on each active version, and whenever new versions are created.

  • Want to preserve a relationship between the same issue in the various versions, so progress/understanding from previous work can be reused (reproduction efforts/success, solution, workarounds, etc.)

  • Assist project/product management in tracking issues that are active in each version.

  • Allow workflow to proceed for each version on a standalone basis (e.g. QA wants to verify issue is closed on all active versions)
Solution:     
When applicability is determined (usually when R&D analyzes the issue), use Cloning capability in issue tracker to create a new issue for each active version. The clone has all of the data of the original issue, including a clone link to the original issue. The “applicable in version” for the clone should be the active version that the clone was created for. The “Fix for version” should be a milestone/planned version on the same version branch.
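A minimal sketch of the cloning step, working on a generic in-memory issue record rather than any particular tracker's API (the 'key', 'clone_of', 'applicable_in_version' and 'fix_for_version' field names are illustrative):

    import copy

    def clone_for_active_versions(issue, active_versions):
        """Create one clone per active version, each linked back to the original.
        'issue' is assumed to be a dict; field names are illustrative only."""
        clones = []
        for version in active_versions:
            clone = copy.deepcopy(issue)
            clone["clone_of"] = issue["key"]          # clone link back to the original issue
            clone["applicable_in_version"] = version  # the active version this clone tracks
            clone["fix_for_version"] = None           # to be set to a milestone on that branch
            clones.append(clone)
        return clones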

Resulting Context:     
  • List of open issues for each version lists all issues. No need to calculate list of issues based on input from other versions.

  • Workflow can proceed on each version in parallel.

  • Naively, even if an issue is about to be solved, clones are still created, and will be resolved independently even if based on the same promoted changeset.

Variants: (see below)
Related Patterns:


Pattern:     JustInTimeCloning
Context:     See TrackingIssuesForMultipleReleases
Problem: When working with TrackingIssuesForMultipleReleases, cloning overhead is substantial, and not always necessary. Unchecked, this can lead to an explosion of bureaucracy and tracking overhead, leading to lack of faithful representation as a backlash.
Forces:     

  • Allow TrackingIssuesForMultipleReleases

  • Want to minimize overhead for reporters, developers, QA verification. Aim to avoid O(N) processing overhead per the number of active versions, whenever possible.

  • Separate workflow is needed only for versions which were already delivered to QA.

  • Visualizing version contents at this level is needed only when delivering to QA and beyond.

  • Need to track which versions contain a fix and which don’t.

  • Motivation to merge original fix to all applicable version branches as soon as possible while still in context

  • Motivation to focus on version branch you are working on, and avoid the overhead of merging/integrating/testing to other versions.

  • For each version, the following might be the case regarding original fix applicability:

  • It might apply cleanly or with minor modifications, in which case the motivation is to apply it as soon as possible while still in context, and in which case the need for QA verification is lower (while still required depending on the version state)

  • It might not be applicable, and require a whole new solution. In this case the motivation is usually to track the issue as open for the version, and leave it to the appropriate time.

  • It might not be applicable, due to irrelevance of the issue on the version (e.g. feature cancelled, whole new behaviour).
Solution:     
Add a “Next version state” field in the issue tracker, with the following options:
  • OPEN – issue is open for the next version

  • INTEGRATED – a fix for the issue was applied in the next version

  • CLONED – the issue was already cloned to the next version, so no need to track it here

  • UNKNOWN – state in the next version is unknown

  • N/A – issue is non-applicable in next version due to irrelevance (see forces above)

  • CLOSED – optional. In case QA/others want to signify that the solution was not only integrated but already verified/closed, so there is no need to do verification once it's cloned to the new version.


Let's assume 2 version branches – V1 and V2, with V1 currently at V1.10. V2's first build will be V2.1.
When applicability is determined (usually when R&D analyzes the issue), decide how to proceed with marking/cloning based on the following criteria:
  • If the issue was detected on V1 branch, BEFORE V2.1 was created (meaning the version is still only being developed, QA didn’t see it yet, no release notes, etc.), mark the issue as OPEN in next version, but don’t clone yet.

  • If the issue was detected on V1 branch, AFTER V2.1 was created (meaning the version is being actively tested, and the contents of each build are being tracked, regression is monitored, etc.), clone the issue to V2, and mark it as CLONED

  • If the issue was detected on V2 branch, clone it to V1, since V1 is already being tracked closely.

  • Whenever an issue is not applicable to the next version, mark it as N/A in the next version.

NOTE: Most issues on the V1 branch will be detected BEFORE V2.1 is created, but several will indeed be detected while both versions are being actively maintained (hopefully the 80/20 rule applies here).
NOTE: Issues detected on the newer V2 branch before they were seen on V1 are usually a result of additional QA coverage, or a stroke of luck (another type of QA coverage). This is the minority case here.

When a solution is being integrated to V1, aim to promote it to V2 branch as well. If the issue was cloned, do the relevant workflow for the clone as well. If only marked as OPEN, mark the issue as INTEGRATED in next version.
If a solution was found for V1 but its integration is delayed due to CCB approval or any other process which is heavier for a frozen branch, integrate it to V2 and mark it as INTEGRATED.

When delivering the first V2.1 build to QA go over all issues marked as OPEN on next version (those for which a fix wasn’t already integrated on both V1 and V2) and clone them to V2.
When QA wants to reverify all issues that were integrated, clone all INTEGRATED issues as well, but avoid cloning CLOSED issues (optional)
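The criteria above can be summarized in a short sketch; the function below only illustrates the rules, with hypothetical arguments:

    def next_version_state(detected_on, next_version_tracked, applicable, fix_integrated):
        """Illustrative decision for the 'Next version state' field.
        detected_on: branch the issue was found on (e.g. 'V1' or 'V2')
        next_version_tracked: has the next version's first build (e.g. V2.1) been delivered to QA?
        applicable: is the issue relevant on the next version at all?
        fix_integrated: was the fix already promoted to the next version's branch?"""
        if not applicable:
            return "N/A"
        if fix_integrated:
            return "INTEGRATED"
        if detected_on == "V2" or next_version_tracked:
            return "CLONED"   # clone immediately - the other branch is already tracked closely
        return "OPEN"         # defer cloning until the next version is delivered to QA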

Resulting Context:     
  • Versions already being tracked show the full list of applicable issues and their state

  • Versions yet to be tracked will show the full list of issues once tracking is started.

  • Overhead of cloning is minimized to the periods of time when two or more versions are being tested/tracked concurrently.



Process Flow Patterns

Following up on my "patterns for issue tracking" post, here is Deeper documentation of some of the Process Flow patterns. I will try to follow up from time to time with documentation of the patterns. Once the knowledge base is more or less complete I will probably consolidate it into an article/whitepaper/wiki format.

Pattern: Configuration Control Board
Aliases: Change Control Board, Configuration Management Board, CCB
Context: A development group is trying to control content/configuration of its product
Problem:
Conflicts between different stakeholders (PM, QA, DEV, etc.) and their motives can make the answer to “what is best” for the product version a complex one, and the group needs to provide the best business answer considering all aspects.
Forces:
• Every Change Request (CR) has a price – some sort of regression risk depending on scope and delicacy of the change. The risk is accompanied by the testing effort needed to verify/close the CR.
• Some CRs are required to meet release criteria.
• Change Requests (CRs) for Bug fixes potentially improve stability
• CRs for enhancement/features potentially increase user satisfaction or open new markets.
• Time to market tries to force fastest possible implementation of each CR.
• Developers want to implement CRs according to “good architecture practices”
Solution:
Classical CCB (Need to find the authoritative definition…)
But in general:
Define a board comprised of stakeholders for the product, including engineering, business/user (PM), QA, and management. Stakeholders should be knowledgeable and have enough authority in their domain. This is the CCB.
CRs will be submitted to the CCB by engineering. They will be discussed by the CCB, in either periodical or ad-hoc meetings, and a decision will be made and communicated to the relevant parties.
Decisions take into consideration the pros and cons of each CR, the context of the product/version, and make a business decision.
Resulting Context:
• CRs cannot be committed/completed but need to be queued
• Once approved CRs should be completed/committed by either the original engineer or a RE (Release Engineer)
• Rejected CRs will be completed/committed for future versions or dropped altogether.
An issue tracking system enables streamlined CCB operation and tracking of its decisions.

Pattern - Distributed Configuration Control Board
Aliases - CCB Proxy
Context -
A development group is trying to control content/configuration of its product, without slowing down or losing context too much
Problem -
In classic CCB the latency between submitting an issue to CCB and its approval/rejection takes significant time (there is a limit to the feasible frequency of CCB meetings, even when willing to be ad-hoc).
During this time the CR is not integrated, losing ContinuousIntegration time and conflicting with “Merge Early and Often”.
In addition, the engineer gets farther and farther away from the context of the CR as he takes on other work.
Finally, the time necessary for discussing all CRs in CCB meetings is expensive, considering the number of members and the depth required to make intelligent decisions.
Forces:
• Wish to minimize time between CR readiness and commit time:
o Meet other possibly conflicting CRs as soon as possible (Merge Early and Often)
o Deal with issues as close to their context as possible (minimize context switch cost)
o Raise engineers' satisfaction with “completed” work. Minimize “friction”.
• Many issues are “no brainer” decisions that don’t require a full CCB
• Wish to minimize time spent on CCB meetings
• Wish to minimize mistaken judgment calls due to lack of the full picture or mature consideration.
Solution:
Train/Assign CCB Proxies which should be aware of the CCB criteria for decision and should be able to either reach a decision or know when to wait for full CCB.
These CCB Proxies should monitor the queue of CRs submitted to CCB and dispatch CRs according to the CCB criteria, or converse with the CR owner to get more information, or other stakeholders in case necessary.
CCB Proxy effectiveness should be reviewed periodically according to the following criteria:
• Adherence to CCB Criteria
• Results – How many regressions, whether CCB would have made a different decision
• Intimate review of random interesting decisions.

Variants:
• Dispatch the CR queue according to engineering domain – a proxy for each domain, usually a manager in that domain.
• Dispatch the CR queue using a peer system – a peer proxy for each domain, to avoid the situation where a manager approves his own group work. (sort of “peer review” system)
• PM is the CCB Proxy
• Lead QA stakeholder is the CCB Proxy
Resulting Context
• 80% of CRs should be dispatched/approved very quickly (decide on SLA). 20% will be according to classic CCB frequency.
• CCB Meetings will be shorter and more focused (to the relief of the attendees…), and potentially the frequency can be increased.
Related Patterns - CCB, Merge Early and Often (SCM)

Pattern: Hierarchical Triage for Incoming Issues
Context: New issues (bugs/feature requests) are opened by interested stakeholders (QA, Customer Support, DEV, PM). Since resources are limited some business intelligence should be applied to decide which issues should be accepted into the work queue of which version (if any), and with what priority compared to other issues.
Problem: Cannot rely on engineering alone to come up with the business decision; OTOH waiting for PM or some sort of CCB committee introduces much latency/bureaucracy into the process.
Forces:
• Wish to start working on high priority issues soon, to avoid working on lower priority issues while waiting for processing.
• Wish to have correct priorities and control the version contents (See CCB)
• Wish to minimize time in the decision queue.
Solution:
Priority decision should be assigned to the CCB process, using the same CCB Proxies described in “Distributed CCB” to dispatch the incoming issues queue.
Criteria for priority and version contents should again be decided and documented beforehand. They constitute part of the “values” for decisions made by the proxies.
Issues which require more elaborate discussion shall be discussed in a periodical “Triage” meeting (can be in CCB meeting, or separate meeting)
Resulting Context:
• 80% of issues should be prioritized very quickly (decide on SLA). 20% will be according to “triage” meeting frequency.
• Minimum numbers of issues enter the work queue by mistake.
• Minimized feeling of bureaucracy among issue reporters and assignees.
Related Patterns: Distributed CCB, CCB

Pattern: UnderstandBeforeSchedule
Context: In classic issue tracking environments, issues are reported, and then scheduled for work (in a version). Some of the aspects of an issue include scope of change, estimated effort, impact on stability. This pattern deals with having sufficient input into the scheduling decision.
Problem: When scheduling is done without sufficient information regarding scope/estimated effort/impact, time will be spent on handling them, only to understand later on that they cannot be committed to the version (mainly due to CCB criteria). This is a waste of resources, and a source for frustration among the staff.
Forces:
• Scheduling effectively requires considerable input, which might require actual investigation/analysis by an engineer/developer
• Investigation/Analysis by engineers/developers is usually part of the work done AFTER scheduling the issue for one of the versions.
• Engineers/Developers apply pressure to commit issues they already solved, even to the detriment of the project health. Part of human nature.
• Tracking issues which require analysis is difficult when they are all in the same “unscheduled/new” state/queue.
Solution:
Add an “investigating” state/queue to the workflow. Issues should be in this state when they are pending an investigation by their owner. The exit criterion from this state is having the required input for the scheduling process.
New issues can go to this state when insufficient scheduling input is available. When the scheduling input is available (either when reported, due to analysis, etc.) the next step is to schedule. Who schedules and according to what flow is out of the context of this pattern.
“Investigation” work stops when scheduling input is available, unless the work necessary to solve the issue is only another minimal step, in which case the work can be done all the way up to “resolve” (committing depends on the codeline policy and whether CCB approval is required).
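A compact way to picture the added state is as a sketch of the workflow transitions; the state names other than “investigating” are generic placeholders, not a specific tracker's workflow:

    # Generic workflow sketch - 'investigating' sits between reporting and scheduling.
    WORKFLOW = {
        "new":           ["investigating", "scheduled"],  # schedule directly only if scope/effort/impact are already clear
        "investigating": ["scheduled", "resolved"],       # exit once scheduling input exists; resolve only if the remaining work is trivial
        "scheduled":     ["in_progress"],
        "in_progress":   ["resolved"],
        "resolved":      ["closed"],
    }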
Resulting Context:
• Added “investigating” state/phase/queue in the issue workflow
• Use either custom fields or comments to track the relevant scheduling input, according to the level of formality/tracking required.
• Engineers/Developers are comfortable with providing the analysis/investigation data, without going all the way to resolve the issue, knowing that the aim of the process is to utilize their time effectively.
• Shortcuts can be made whenever investigation is redundant.

Tuesday, August 08, 2006

Patterns for issue tracking

I recently spent some time on devising methodologies for software development lifecycle in my company, dealing with SCM (Version Control) and Issue Tracking.

I'm a big fan of patterns. My first encounter with them was with the POSA series (Pattern-Oriented Software Architecture, Volume 1: A System of Patterns / http://www.cs.wustl.edu/~schmidt/patterns-ace.html) when working on distributed systems.

As a fan of reuse, this was quite an important finding.

Later I encountered the SCM patterns. I read http://www.cmcrossroads.com/bradapp/acme/branching/ by Brad Appleton and understood, yet again, that much of what we were doing well was a pattern, and what we were doing wrong and were looking to improve was an anti-pattern. I also read his book, Software Configuration Management Patterns: Effective Teamwork, Practical Integration.

Software Reconstruction Patterns (http://www.cmcrossroads.com/bradapp/acme/repro/SoftwareReconstruction.html) are a related useful family of patterns.

I also encountered organizational/process patterns, but I admit to not grokking the concept fully so far (it's in the todo list...). See http://www.ambysoft.com/processPatternsPage.html#FAQ%20What%20are%20Process%20Patterns.

Now while trying to devise the Issue Tracking methodology, starting with a baseline documentation of how each group (recall we are one R&D group acquired by another) does its work, I felt the need for patterns in this domain, and wasn't able to find any so far.
So, I decided that while keeping on the lookout for a pattern repository for this domain, I will start documenting patterns on my own, and try to come up with a draft of the issue tracking pattern family. I'm sure it will be useful to myself in the future. Hopefully via discussion in the right community, it can evolve into a public body of knowledge.

Anyhow - the patterns I've thought of so far are below. I now see that one of the greatest challenges is naming them right - so they are generic enough and still specific to the context you are talking about. I'm trying to take some guidelines from the Gang of Four definition (see http://en.wikipedia.org/wiki/Design_pattern_%28computer_science%29).

Taxonomy - Categories

  • Deliverables Generation
    o AutoInternalReleaseNotes
    o AutoApplicableWorkaroundsList
  • Process Flow
    o Hierarchical Triage for Incoming Issues
      Aim to make a distributed decision on 80% of the issues according to pre-discussed policies, but have a streamlined process for tracking and reaching a wise decision on the remaining 20%.
    o Resolve -> Integrate -> Release completed issues
    o Understand scope and impact before committing to schedule
      Be able to track issues which need work in order to be scheduled, but are NOT to be solved unless they are really trivial; instead they are to be raised for a schedule decision/discussion.
    o Close everything
      Have a closure phase for completed (fixed) issues as well as duplicates, invalids, wontfixes, etc.
    o Match and document the actual workflow between people
      - Give leads/managers the ability to review work by their people and sign off on it (or reject it)
      - QA Lead confirms new bugs from QA
      - DEV Lead integrates fixes resolved by his people
    o Commit approval / CCB activity / code review process should be enabled by the issue tracker workflow
    o Ownership is NOT a state. Current action phase IS.
      Waiting for QA reproduction - state or ownership?
  • Relationships between issues
    o Track symptoms separately from change tasks?
    o How/when to divide issues
    o Issue equivalent of a "release branch"
      How to deal with issues that are relevant on multiple versions, where their state might be different for each version, but most of the data is shared?
  • Issue Meta-Data
    o Track "resolved in" version automatically
    o Establishing priority based (among other things) on severity
    o Track the stage at which the bug was opened
      Allows understanding of QA/DEV effectiveness at developing/releasing quality software.
    o Track reproducibility and reproductions of the issue
      Reproduce cases - via a link to the test management? Via sub-issues linked to the parent?
    o Keywords might be better than custom fields
    o Discern "introduced in" from "detected in"
  • Interface to other processes
    o Interface to SCM
      - Integration with Task-Level Commit
    o Interface to test case management
      - Track a test case for each issue
        · if a test case opened the issue - to know what to run to test/verify/close
        · if from the field or exploratory testing - to track the process of adding this to the regression suite
    o Interface to Project Management / use as project management
    o Interface to CRM
  • Usability
  • Useful Metrics
    o AutoCustomerReleaseNotes
    o FaultFeedbackRatio (regression rate)
    o Rate of bugs fixed in the version they were opened for (>70%)
    o Rate of bugs detected in the field (<5%?)
  • Anti-patterns
    o Metering people via bug counts
    o Overloading states/fields for multiple purposes
    o Over-centralization of decision making
      Don't let a workflow with many steps fool you into thinking that it requires many people. Use steps to track where you are. Use assignment to track who holds the issue. Don't assign upwards unless necessary.
    o Tracking and metering by components?
      See http://www.anyware.co.uk/2005/2006/07/27/jira-issue-tracking-meets-tagging/



This is still very much a work in progress, but any comment or help is very much welcome.

What the other guys brought into the party...

As I mentioned earlier, before we were able to finalize our new development environment we were gobbled up (acquired) by another company, about 4 times the size of our group.

In the area of issue tracking, the bigger company was using TestDirector, with some customizations, but actually their processes were quite simplistic, and weren't enabling a truly effective R&D process.
For Test case management, they are using plain old Word/Excel but are now open to other options.

In the area of SCM, btw, both companies were using good ol' CVS and were quite sick of it. More on that later...


With this as the baseline, upcoming posts will try to describe the process for integrating SCM, Issue Tracking and choosing a Test Case Management solution agreeable and effective to all.

Test Automation!

At some point we understood we must have a robust test automation harness which can at least cover our smoke tests and regression suite. This will help us feel more confident in our releases in less time, and allow us to meet the business needs.

As we are an appliance-based file-system product, in essence an IT infrastructure product, all of the commercially available harnesses from CA, Mercury and the like are useless, as they focus on GUI/web automation, and we need API automation and the ability to run and control file system operations and file system testing tools.
We considered home-grown approaches but decided that the time-to-market is too long for our needs.
We considered adopting STAF/STAX (http://staf.sourceforge.net/index.php) but again the custom work needed around it was estimated to take too long and required human resources and expertise we didn't have and that weren't available in the neighbourhood.

What we eventually chose was a testing automation harness called Aqua (http://www.aquasw.com/) and we are very satisfied with it.

It still requires significant customization/development in a project mode, rather than being an off-the-shelf product, but for some situations it's the best and only available approach today, and is much better than developing your own from scratch, or testing manually.