Thursday, August 31, 2006

Rosie has the test management blues as well...

Rosie Sherry writes a very interesting blog focused on software testing.
One of these days I'll point to some of the interesting blogs I'm reading regularly.

In any case, One of her recent posts was "Hunt for Test Case Management System" where she discusses the lack of a killer test management solution, but tries to outline some alternatives.

Those interested should go over there and take a look...

Tuesday, August 29, 2006

QA Effort Effectiveness

How do you know your QA effort is being effective ?

Based on the different stakeholders which require input from the QA a typical answer might be that Product quality is high when released to customers.

Assuming that is indeed more or less what someone expects (I'd say effective QA needs to answer to some other requirements as well) how does one go about checking whether the product quality is indeed high?

Those who reached a fairly intermediate level of QA understanding would easily point out that the percentage "QA Misses" (namely, the number of issues missed in QA and detected in the field) should be below a certain threshold. A high number here means simply that too many issues/bugs are not detected during the entire QA coverage only to be embarrassingly detected by a customer.

If one naively optimizes just for this variable, the obvious result is a prolonged QA effort, aiming to cover everything and minimize the risk. If no reasonable threshold is set, there is a danger of procrastinating and avoiding the release.
See The Mismeasure of Man of a cool article on abusing measurements in the software world...

Of course, a slightly more "advanced" optimization is to open many many bugs/issues so the miss ratio will become smaller due to the larger bugs found in QA, not due to missing less bugs. This can result in a lot of overhead for the QA/PM/DEV departments as they work on analyzing, prioritizing and processing all those bugs.
Did I forget to factor in the work to "resolve/close" those issues? NO! Several of those issues might indeed be resolved and verified/closed, but those are probably issues that were not part of the optimization but part of a good QA process (assuming your PM process manages the product contents effectively and knows how to enforce a code-freeze...).

My point is that there are a lot of issues that are simply left there to rot as open issues, as their business priority is not high enough to warrant time for fixing them or risking the implications of introducing them to the version.

A good friend has pointed this phenomena to me a couple of years ago, naming it "The Defect Junk Factory" (translated from hebrew). He meant that bugs which are not fixed for the version on which they were opened on, indicate that the QA effort was not focusing on the business priorities. The dangers of this factory is a waste of time processing them, and the direct assumption that either the QA effort took longer because it spent time on these bugs, or that it missed higher business priority bugs when focusing on these easy ones.
Kind of the argument regarding speed cameras being placed "under the streetlight" to easily catch speed offenders (with doubtful effect on overall safety), but all the while missing the really dangerous offenders.

So what can be done? my friend suggested measuring the rate of defects that are NOT fixed for that version. The higher this number, the more your QA effort is focusing on the wrong issues.
Just remember that this is a statistical measure. Examining a specific defect might show that it was a good idea for the QA to focus there, and the fix was avoided due to other reasons. But when looking across a wide sample, its unreasonable that a high number of defects are simply not relevant. If not a QA focus issue, something else is stinking, and is worth looking at in any case.

Another factor of an effective QA is fast coverage. What is fast? I don't have a ratio of QA time related to development time. Its probably a factor of the type of changes (Infrastructure, new features, Integration work) done in the new version as each type has a different ratio of QA to DEV effort. (e.g. kernel upgrade usually requires much more QA compared to DEV effort )
Maybe one of the readers has a number he's comfortable with - I'd love to hear.
What I do know is that version-to-version the coverage time should become shorter, and that the QA group should always aim to shorten this time further without significantly sacrificing overall quality. I expect QA groups to do risk-based coverage, automation for regression testing, and whatever measures which assist them in reducing the repeatable cost of QA coverage at the end of each version. The price/performance return on reducing the QA cycle is usually worth it to some extent.

To sum up, a good QA effort should:

Minimize QA misses
Minimize the defect junk factory
Minimize QA cycle time without compromising quality

What do you think is a good QA effort? How are you measuring it?

Severity and Priority - The Debate

There are a couple of alternatives for managing severity and priority in the Issue Tracker.
Although there are many resources out there on this subject (see http://del.icio.us/yyeret/priority_severity) I’ll try to consolidate them and provide my 2c on the matter, as I think its an important subject.

Single-field Priority
First, seemingly simpler alternative, is Single-field priority – representing Customer Impact
The idea here is to only have a single priority/severity field. The reporter assigns it according to his understanding of the customer impact (severity, likelihood, scenario relevance, etc.). Product Management or any other business stakeholder can shift priority according to current release state, his understanding of the customer impact according to the described scenario. The developers prioritize work accordingly.
The strength of this approach is in its simplicity, and the fact that several issue trackers adopt this methodology and therefore support it better “out of the box”.
The weakness is that once in the workflow the original reasoning for the priority can get lost, and there is no discerning between the customer impact and other considerations such as version stability, R&D preferences, etc.

An example why this is bad? Lets say Keith opened bug #1031 with a Major priority. Julie the PM later decided that since there is some workaround and we are talking just about specific uses of a feature rarely used, the business priority is only Normal or Minor. Version1 is released with this bug unresolved. When doing planning for Version2 Julie missed this bug since its priority is lower.
Even if the feature its related to is now the main focus of the version. Even if not missed, looking for this bug and understanding the roots and history is very hard, especially consider the database structure of issue trackers. History is available, but its not as easy as fields on the main table of issues…

Brian Beaver provides a clear description of this approach at Categorizing Defects by Eliminating "Severity" and "Priority":
I recommend eliminating the Severity and Priority fields and replacing them with a single field that can encapsulate both types of information: call it the Customer Impact field. Every piece of software developed for sale by any company will have some sort of customer. Issues found when testing the software should be categorized based on the impact to the customer or the customer's view of the producer of the software. In fact, the testing team is a customer of the software as well. Having a Customer Impact field allows the testing team to combine documentation of outside-customer impact and testing-team impact. There would no longer be the need for Severity and Priority fields at all. The perceived impact and urgency given by both of those fields would be encapsulated in the Customer Impact field.

Johanna Rothman in Clarify Your Ranking for System Problem Reports talks about single-field risk/priority:

Instead of priority and severity, I use risk as a way to deal with problem reports, and how to know how to fix them. Here are the levels I choose:
o Critical: We have to fix this before we release. We will lose substantial customers or money if we don't.
o Important: We'd like to fix this before we release. It might perturb some customers, but we don't think they'll throw out the product or move to our competitors. If we don't fix it before we release, we either have to do something to that module or fix it in the next release.
o Minor: We'd like to fix this before we throw out this product.

bugzilla-style Severity+Priority

Here, the idea is to use a severity field for technical risk of the issue, and a priority field for the business impact. The Reporter assigns severity according to the technical description of the issue, and also provides all other relevant information - frequency, reproducability, likelihood, and whether its an important use-case/test-case or not. Optionally, the Reporter can assign priority based on the business impact of the issue to the testing progress. e.g. if its a blocker to significant coverage, suggest a high priority. If he thinks this is an isolated use case, suggest a lower priority. A business stakeholder, be it PM, R&D Management, etc. assigns priority based on all technical and business factors, including the version/release plan. Developers work by Priority. Severity can be used as a secondary index/sort only.
Developers/Testers/Everyone working on issues should avoid working on high-severity issues with unset or low priority. This is core to the effectiveness of the Triage mechanism and the Issue LifeCycle Process
Customers see descriptions in release notes, without priority or severity. Roadmap communicated to customers reflects the priority, but not in so many terms.

Strengths of this approach are:
* Clear documentation of the business and technical risks, especially in face of changing priorities.
* Better reporting on product health when technical risk is available and not hidden by business impact glasses.
* Less drive for reporters to push for high priority to signify they found a critical issue. It’s legitimate to find a critical issue and still understand that due to business reasons it won't be high priority.
* Better accommodation of issues that transcend releases - where the priority might change significantly once in a new release.

The weaknesses are that it’s a bit more complex, especially for newbies, and might require some customization of your issue tracker, although if your tool cannot do this quite easily, maybe you have the wrong tool…
In addition, customers have trouble understanding the difference between their priority for the issue and the priority assigned within the product organization. The root cause here is probably the lack of transparency regarding the reasoning behind the business priority. I’d guess that if a significant part of the picture is shared, most customers would probably understand (if not agree) with the priorities assigned to their issues. Its up to each organization to decide where it stands on the transparency issue. (see Tenets of Transparency for a very interesting discussion on the matter in the wonderful weblog of Eric Sink)

To see how our example works here – Keith will open the bug, assign a major severity, and a low priority since the bug blocks just one low-priority test case. Julie the PM sees the bug, and decides to assign a low priority value, so the bug is left for future versions for all practical matters. When planning V2 Julie goes over high-severity issues related to the features under focus for the version, and of course finds this issue as it’s a Major severity.

See the following resources for this approach:
* http://c2.com/cgi/wiki?DifferentiatePriorityAndSeverity
* Priority is Business; Severity is Technical:

business priority: "How important is it to the business that we fix the bug?" technical severity: "How nasty is the bug from a technical perspective?" These two questions sometimes arrive at the same answer: a high severity bug is often also high priority, but not always. Allow me to suggest some definitions.

Severity is levels:
o Critical: the software will not run
o High: unexpected fatal errors (includes crashes and data corruption)
o Medium: a feature is malfunctioning
o Low: a cosmetic issue

Priority levels:
o Now: drop everything and take care of it as soon as you see this (usually for blocking bugs)
o P1: fix before next build to test
o P2: fix before final release
o P3: we probably won't get to these, but we want to track them anyway

* Corey Snow commented on Clarify Your Ranking for System Problem Reports:
Comment: Great subject. This is a perennial topic of debate in the profession. The question at hand is: Can a defect attribute that is ultimately irrelevant still serve an important function? Having implemented and/or managed perhaps a dozen different defect tracking systems over the years, I actually prefer having both Priority and Severity fields available for some (perhaps) unexpected reasons. Priority should be used as the 'risk scale' that the author describes. 3 levels, 5 levels, or whatever. Priority is used as a measure of risk. How important is it to fix this problem? Label the field 'Risk' if that makes it more clear. Not so complicated, right? So what good is Severity? Psychology! Its very existence makes the submitter pause to consider and differentiate between the Priority and Severity of the defect. In other words, without Severity, the submitter might be inclined to allow Severity attributes to influence the relative Priority value. Example 1: Defect causes total system meltdown. Only users in Time Zone GMT +5.45 (Kathmandu) are affected on leap years. There is one user in that time zone, but there is a manual workaround, and a year to fix it besides. Priority=Super Low, Severity=Ultra High Severity gives a place for the tester to 'vent' about their spectacular meltdown, without influencing the relative Priority rating. Example 2: Defect is a minor typo. Typo in on the 'Welcome to Our Product' screen, which is the first thing every user will see. Priority=Ultra High, Severity=Super Low Again, Severity gives a place for the tester to express how unimportant the defect is from a functional perspective, without clouding their Priority assessment. I once managed a defect tracking system with only a Priority field. This frequently led to a great deal of wasted time in defect discussion meetings as one side would argue about Severity attributes while another would argue about Priority attributes, but the parties were not even aware of the distinction that was actually dividing them. Having both fields serves to head off this communication problem, even if Severity is completely irrelevant when fix/no fix decisions are actually made. ~ Corey Snow (03/11/03)

Author's Response: Corey, Great counterpoint to my argument. ~ Johanna Rothman (03/12/03)

Personal Favorite
As can probably be understood by now, My personal favorite is the Severity+Priority approach. I confess I don’t have much experience with the single priority approach, but I really feel the Severity+Priority way is very effective, without significant costs once every stakeholder understands it.

What is your favorite here?

Favorite resources - round I

As those who read my posts probably noticed already, I'm quite a heavy user of del.icio.us. I won't go into what it is, am sure those interested can go there or google it to see whether they like it or not.
I'm playing around with Google Notebook as an alternative, with better google integration obviously, albeit less taxonomy/tagging capabilities.

In any case, I highlighted some of my favourite resources under the rndblog_resources tag, and provided some notes to accompany the links and explain why I find them essential in the favorites list of anyone interested in the contents of this blog (and probably for some people who are NOT that interested in this blog, but then again, they won't find be here...)

There are more gems in my account, so probably expect future rounds based on existing and new resources I find. I'd love to hear about more resources along those lines - either comment or suggest them to me via the del.icio.us network.
Anyone interested to track my favorites is welcome to join my network

Now for the resources themselves (copy paste from a del.icio.us linkroll page)

» Articles - James Bach - Satisfice, Inc.
Articles and other resourcers from one of the thought leaders in the software testing/QA world.
» Bret Pettichords Software Testing Hotlist
compilation of many interesting testing resources
» Brian Marick - Writings
Articles and other resourcers from one of the thought leaders in the software testing/QA world.
» Can we ship yet
Very interesting view on the multiple streams/versions problem, counting on SCM capabilities to ease the problem.
» Cem Kaner - Publications
Articles and other resourcers from one of the thought leaders in the software testing/QA world.
» CM Crossroads - Home
The ultimate independent resource for CM material
» Cultural Fits and Starts
From the hiring expert, a focus on the cultural aspects (too often ignored)
» Data Driven Test Automation Frameworks
Thought-inducing paper on how testing harnesses should look like. Very relevant when thinking about Automation
» DefectTrackingPatterns
Initial work on defect tracking patterns
» Eric Sinks Weblog
A remarkable blog mixing development and business/marketing, especially applicable to all of us in small companies - both ISVs and startups.
» Grove Consultants Software Testing
Great material, the highlight being an interview test for testers Ive been using successfully.
» Hacknot
Anonymous writer provides cynic homurous view on issues which will sound right at home for any manager of developers/testers
» Hacknot - Interview With The Sociopath
As usual, a cynic but beneficial view, this time a spin on interviewing.
» High-level Best Practices in SCM
Interesting practices/patterns provided by Perforce but with quite a generic appeal
» Interviews
Highlights of Interviews with leading people in the testing world. Use as pointers - if someone is interesting go read their works...
» Joel on Software
» Quality Tree Software, Inc. - Publications
A single location for the works of one of my favorite content-providers on stickyminds.com
» Rothman Consulting Group, Inc. - Writings & Presentations
Articles and other resourcers from one of the thought leaders in the Product Management / Software development/Testing / Agile world
» Slashdot | Bug Tracking Across Multiple Code Streams?
Interesting view on how to handle the multiple code streams issue tracking problem. Not groundbreaking, but a good summary on what approaches are being used out there
» Software Development, Computer Programming, Software Design - developer.* - DeveloperDotStar.com
a repository for lots of development-related essays/articles, focusing on a pragmatic view of the complex world we work in
» Software Reconstruction: Patterns for Reproducing Software Builds
very interesting pragmatic work on the relatively uncharted domain of software builds
» Software Testing and Quality Assurance Online Forums
Used mainly for looking for tool-related answers, quite legacy-focused in my oppinion.
» Source Control HOWTO
basic concepts of SCM from the developer of SourceVault ( A VSS replacement system)
» StickyMinds Home Page
Stickyminds is a vast repository of essays from various thought-leaders in the development/QA world. It’s the web presence of “Better Software Magazine” which is an interesting magazine (but probably not worth the subscription these days)
» Streamed Lines: Branching Patterns for Parallel Software Development
the groundbreaking pattern work for SCM world by Brad Appleton
» The Guerrila Guide to interviewing
a classic from Joel on interviewing

Thursday, August 24, 2006

David V. Lorenzo posts favorite interviewing questions of people on his Career Intensity Blog

Here is his post about mine...

At the risk of hinting the people who I interview in the future, also check out my interviewing tag on del.icio.us for a lot of resources on the matter.

Why am I open about this?
One of my main beliefs in interviewing btw is to try and understand behaviourial aspects in addition to skills. Someone might get a head start for the skills questions if he prepares, but I think in that area if someone is diligent enough to research his interviewer, go and read multiple resources, learn enough to know the subject, he's getting extra credit right out of the gate...
For the behaviourial aspects the discussion is more flowing, and no preparation can really help you there.

I cannot finish a post about interviewing without mentioning Johanna Rothman. She's writing the Hiring Technical People blog, and wrote the Hiring The Best Knowledge Workers, Techies & Nerds: The Secrets & Science Of Hiring Technical People book. Check it out.

Monday, August 21, 2006

Fogbugz best practices and other resources

I intend to post some reference to resources I'm fond of in the area of R&D, QA, methodology and the like.

In the meantime, anyone who's interested in what I have to say will probably see some value in looking at FogBugz Online Documentation. I was referred there by Zeljko Filipin's blog which is aggregated by http://www.testingreflections.com.

In addition, my del.icio.us links might provide good resources. see
DefectTracking QA SCM accurev bugtracking bugzilla build ccb continuous_integration continuousintegration defect_management defect_tracking defecttrackingmultiplebranches developement development eclipse issue_tracking jira keyword_driven_automation methodologies methodology multiple_versions patterns perforce priority_severity product_management software subversion test_automation test_labs test_management testing testing_labs vendorbranches

Sunday, August 20, 2006

QA/DEV Protocols - Opening high quality bugs

In another post in the series about QA/DEV protocols, I'll talk about opening high quality bugs, why its important, what are the forces operating on each side of the trench here, and try to describe an approach that might improve the state of affairs a bit.

First - a definition. What is a high quality bug? To be clear, we are talking a bug report of course. The quality here refers to the accuracy of the scenario, describing exactly what is necessary to reproduce, not more, not less. It refers to providing all the auxiliary information required to analyze the bug and start working to a resolution. It also aims to report A single bug, not several issues.

It might be easier to convey the point by showcasing some examples for low quality bug reports:

Missing logs
Logs of different components are not time-synched, with no way to understand the time-space relationship. (This is relevant mainly for distributed systems )
Errors happened, but are not mentioned explicitly in the bug report
Bug report focuses on analysis, not on reporting the facts. Analysis is a bonus for QA engineers, only relevant AFTER reporting the full details.
Much happened on the system, a couple of different scenarios, and the bug is hidden somewhere in piles of logs/information.
Unclear bug report, leading to difficulty to prioritize and understand by business people (PM) and DEVs.
A complex long scenario is reported while the bug is reproducable via a simple short one.
The reported severity doesn't match what really happened, leading to "cry wolf" or serious issues masked as trivialities.
Multiple bugs in the same report
Numbers - Avoid using statements like "very large" or "a lot of time". Always include the numbers you are talking about. What seem large to you may seem small to someone else, or vice versa.

Also check out FogBugz - The Basics of Bug Tracking

Now that we have deducted what a high quality bug report is, we can try to understand the forces influencing the people opening bugs and why sometimes low quality bug reports do happen:

When QA people find a bug, they want to report it and move on. Sometimes they feel they are metered by quantity not quality, sometimes they actually are...
Especially for hard cases, the scenario is not that clear, and indeed there is some mix of events (including a full moon on a friday the 13th for the real nut cases) that cannot be easily reduced to a simple scenario. Trying to do this without the internal understanding of a DEV guy might take very long without being very effective.
QA engineers are human. When the test setup/teardown is complex and requires attention to many small details (clear logs, sync time, grep for patterns in logs, etc.), things will get lost from time to time.
In some cases, the QA group or a specific engineer is not aware of the price of low quality bug reports. (point him here...). DEV guys might not be able to put a finger on it either, or are just entrenched and prefer to point fingers and exchange emails instead of working to establish a protocol.

So what can be done?

Discuss and educate - like I hinted, sometimes the most important step is to talk, map the expectations and root causes, and agree on a protocol, with the relevant SLAs.
Assist QA by providing small automated snippets that can assist with test setup/teardown/analysis, guides them thru the steps to a high quality report, and really leaves them with the important step of reducing the scenario to the minimum. (btw, its possible to do the scenario reduction in automated testing harnesses as well, by retracting steps and verifying health and expected results very frequently)
Work with very granular test cases - minimizing the scenario length. Still, combining different test cases in parallel will add complexity, but when the building blocks are small, its better than nothing.
Issue tracking system should guide the reporter thru the important information/steps to a high quality report.
DEVs should provide contructive feedback - when bug reports are below par quality, and when they are above. Do it privately when below quality, and publicly when above.
Do "peer review" of bug reports when relevant - for rookie QA engineers, for difficult bugs, etc.
In hard cases call in a DEV and get his advice on what needs to be done to make sure the report has its best chance to become a high quality one.

Any other ideas?

QA/DEV Protocols - Calling developers to the lab

I'm going to dedicate a couple of posts to the relationships between QA and Development (DEV) organizations.

Anyone who's ever been in either of those organizations knows that sometimes there seems to be a conflict of interest between QA and DEV, which can lead to friction between the groups and the people. Obviously when both organizations are running under the same roof, there must be some joint interest/goal, but the challenge is to identify the expectations of each group in order to work toward their goal and accomplish their mission effectively.

The difficult cases are those that put more strain on one party, in order to optimize the effectiveness of another. Example - developers are asked to unit/integrate/system test their software before handing over a build to QA. Some developers might say that this is work that can be done by QA, and their time is better spent developing software. The QA engineers will say that they need to recieve stable input from the DEVs in order to streamline the coverage progression, and that the sooner issues are found, the lower the cost to fix them.

One way to look at these "protocols" between the groups is via the glasses of TOC (Theory of Constraints), identify the bottlenecks of the overall system/process, and fine-tune the protocol to relieve the bottleneck. People in those groups, and especially the leaders, should be mature enough to know that sometimes doing the "right thing" might be to take on more work, sometimes even not native work for their group.

One example is the issue of when to ask DEV guys to see problems the QA engineers have discovered.
Reasons for calling DEV might be:

Wish to reopen a bug
Bug was reproduced and a developer was interested to see the reproduction.
New severe bug

There are a couple of forces affecting this issue:

QA wishes to finish the context of the specific problem/defect, open the bug, and get on with their work.
DEV wishes to finish the context of their specific task, and wish to avoid the "context switch" of looking at the QA issue.
In general, both QA and DEV have learned to wave the "Context Switching Overhead" flag quite effectively. (A more pragmatic conclusion is that some context switching overhead is unavoidable, and sometimes the alternative is more expensive...)
In some cases, "saving" the state of the problem for asynchronous later processing by DEV is difficult or takes too many resources to be a practical alternative.

A possible compromise between all those forces is to define some sort of SLA between the groups, stating the expected service provided by DEV to QA according to the specific situation (Reopen, Reproduced, New Severe, etc.). This SLA can provide QA a scope of time they can expect answers in, without feeling they are asking for personal favours or "bothering" the DEVs. The developers get some reasonable time to finish up the context they were in without feeling they are "avoiding" QA. The SLA can also cover the expected actions to be taken by QA before calling in the DEV, or in parallel to waiting for them. This maximizes the effectiveness of the DEV person when he does free up for looking at the issue, while better utilizing the time of the QA while waiting. (for example - fill the bug description, look for existing similar bugs, provide the connectivity information into the test environment, log excerpts/screenshots, etc.)

Another question is who to call on when QA needs help. The options here depend on the way the DEV group/teams share responsibility on the different modules of the system.

In case there are strict "owners" for each module, and they are the only ones capable of effectively assisting QA, the only reasonable choice is to call on them... this requires everyone to always be available at some level.
I have to say though that I strongly advise against such an ownership mode. Look at code stewardship for a better alternative (in my oppinion) and see below how it looks better for this use case and in general...
In case there is a group of people that can look at each issue, one alternative is to have an "on call" cycle where people know they have QA duty for a day/week. In this case there will be issues which will require some learning on their part, and perhaps assistance from the expert on a specific area. This incurs overhead, but is worth its price in gold, when the time comes and you need to support that area in real time, need to send the DEVs to the field to survive on their own, or when the owner/expert moves on...

To sum up, like in many h2h (human-to-human) protocols, understanding the forces affecting both sides of the transaction is key to create a win-win solution. A pragmatic view trying to minimize the prices paid and showing the advantages of the solution to both sides and to the overall organization can solve some hard problems, as long as people are willing to openly discuss their issues and differences. I've seen this work in my organization, hopefully it helps others as well.

Some greenpepper

As I previously hinted in "Building a test case management solution", I'm personally of the opinion that the holy grail in test case management is in finding the way to manage tests via an issue tracker database.

In the time since that post I didn't find much information about this, and didn't see tools taking this approach.

Therefore, it was great to stumble upon Agile06 - François Beauregard - GreenPepper Software - a podcast discussion where I learned about greenpepper, a test automation and management system developed by Pyxis Technologies which closely integrates with JIRA to create an issue tracker solution for managing test cases, and adopts a Fitnesse-like approach (on drugs) to table-driven testing, over the Confluence wiki. I liked the choice of tools to integrate with as well as the pragmatic simple ideas.
Last week Frank and Christian demonstrated the system to me and a colleague. I was impressed, and would recommend anyone with interest in the test management or issue tracking domain to track those guys. I know I will.

One issue I've been thinking over regarding both Fitnesse and GreenPepper is how to take those tools that are focused on one-shot automated testing, and adapt them to track manual testing documentation, results, etc. Finding a good way to solve this problem might assist with adoption of those tools/frameworks in environments which are not the classic agile web development environment I suspect are the majority of adopters at this point.

Monday, August 14, 2006

Tracking Issues for Multiple Releases

Pattern:     TrackingIssuesForMultipleReleases
Context:     Multiple versions are being actively developed/maintained. New issues are discovered on one version, and their status and progress needs to be tracked on multiple versions, with minimal overhead but maximal accuracy.
Active versions – active branches where a build was already issued and documented, and new builds are planned.
Problem:     When a new issue is discovered, need to understand start of applicability (when it was introduced into the product) and end of applicability on each version branch (when it was solved). Due to branching, the state might be complex. Tracking the workflow for solving the issue on each branch/version is also complex when working with a naïve model. In addition, managing this can add considerable overhead – unchecked, this can lead to an explosion of bureaucracy and tracking overhead, leading to lack of faithful representation as a backlash.
Forces:

Want to accurately visualize the status of each issue on each active version, and whenever new versions are created.

Want to preserve a relationship between the same issue in the various versions, so progress/understanding from previous work can be reused (reproduction efforts/success, solution, workarounds, etc.)

Assist project/product management in tracking issues that are active in each version.

Allow workflow to proceed for each version on a standalone basis (e.g. QA wants to verify issue is closed on all active versions)

Solution:
When applicability is determined (usually when R&D analyzes the issue), use Cloning capability in issue tracker to create a new issue for each active version. The clone has all of the data of the original issue, including a clone link to the original issue. The “applicable in version” for the clone should be the active version that the clone was created for. The “Fix for version” should be a milestone/planned version on the same version branch.

Resulting Context:

List of open issues for each version lists all issues. No need to calculate list of issues based on input from other versions.

Workflow can proceed on each version in parallel.

Naively, even if issue is about to be solved, clones are still created, and will be resolved independently even if based on the same promoted changeset.

Variants: (see below)
Related Patterns:

Pattern:     JustInTimeCloning
Context:     See TrackingIssuesForMultipleReleases
Problem: When working with TrackingIssuesForMultipleReleases, cloning overhead is substantial, and not always necessary. Unchecked, this can lead to an explosion of bureaucracy and tracking overhead, leading to lack of faithful representation as a backlash.
Forces:

Allow TrackingIssuesForMultipleReleases

Want to minimize overhead for reporters, developers, QA verification. Aim to avoid O(N) processing overhead per the number of active versions, whenever possible.

Separate workflow is needed only for versions which were already delivered to QA.

Visualizing version contents at this level is needed only when delivering to QA and beyond.

Need to track which versions contain a fix and which don’t.

Motivation to merge original fix to all applicable version branches as soon as possible while still in context

Motivation to focus on version branch you are working on, and avoid the overhead of merging/integrating/testing to other versions.

For each version, the following might be the case regarding original fix applicability:

It might apply cleanly or with minor modifications, in which case the motivation is to apply it as soon as possible while still in context, and in which case the need for QA verification is lower (while still required depending on the version state)

It might not be applicable, and require a whole new solution. In this case the motivation is usually to track the issue as open for the version, and leave it to the appropriate time.

It might not be applicable, due to irrelevance of the issue on the version (e.g. feature cancelled, whole new behaviour).

Solution:
Add a “Next version state” field in the issue tracker, with the following options:

OPEN – issue is open for the next version

INTEGRATED – a fix for the issue was applied in the next version

CLONED – the issues was already cloned to the next version so no need to track it here

UNKNOWN – state in the next version is unknown

N/A – issue is non-applicable in next version due to irrelevance (see forces above)

CLOSED – optional. In case QA/others want to signify that the solution was not only integrated but already verified/closed, so no need to do verification once its cloned to the new version.

Lets assume 2 version branches – V1, V2, with V1 currently at V1.10. V2 first build will be V2.1.
When applicability is determined (usually when R&D analyzes the issue), decide how to proceed with marking/cloning based on the following criteria:

If the issue was detected on V1 branch, BEFORE V2.1 was created (meaning the version is still only being developed, QA didn’t see it yet, no release notes, etc.), mark the issue as OPEN in next version, but don’t clone yet.

If the issue was detected on V1 branch, AFTER V2.1 was created (meaning the version is being actively tested, and the contents of each build are being tracked, regression is monitored, etc.), clone the issue to V2, and mark it as CLONED

If the issue was detected on V2 branch, clone it to V1, since V1 is already being tracked closely.

Whenever an issue is not applicable to the next version, mark it as N/A in the next version.

NOTE: Most issues on V1 branch will be detected BEFORE V2.1 is created, but several will indeed be detected while both versions are being actively maintained. (Hopefully 80%/20% rule applies here).
NOTE: Issues detected on the newer V2 branch before they were seen on V1 are usually a result of additional QA coverage, or a stroke of luck (another type of QA coverage). This is the minority case here.

When a solution is being integrated to V1, aim to promote it to V2 branch as well. If the issue was cloned, do the relevant workflow for the clone as well. If only marked as OPEN, mark the issue as INTEGRATED in next version.
If a solution was found for V1 but its integration is delayed due to CCB approval or any other process which is heavier for a frozen branch, integrate it to V2 and mark it as INTEGRATED.

When delivering the first V2.1 build to QA go over all issues marked as OPEN on next version (those for which a fix wasn’t already integrated on both V1 and V2) and clone them to V2.
When QA wants to reverify all issues that were integrated, clone all INTEGRATED issues as well, but avoid cloning CLOSED issues (optional)

Resulting Context:

Versions already being tracked show the full list of applicable issues and their state

Versions yet to be tracked will show the full list of issues once tracking is started.

Overhead of cloning is minimized to the periods of time when two or more versions are being tested/tracked concurrently.

Process Flow Patterns

Following up on my "patterns for issue tracking" post, here is Deeper documentation of some of the Process Flow patterns. I will try to follow up from time to time with documentation of the patterns. Once the knowledge base is more or less complete I will probably consolidate it into an article/whitepaper/wiki format.

Pattern: Configuration Control Board
Aliases: Change Control Board, Configuration Management Board, CCB
Context: A development group is trying to control content/configuration of its product
Problem:
Conflicts between different stakeholders (PM, QA, DEV, etc.) and their motives can make the answer to “what is best” for the product version a complex one, and the group needs to provide the best business answer considering all aspects.
Forces:
• Every Change Request (CR) has a price – some sort of regression risk depending on scope and delicacy of the change. The risk is accompanied by the testing effort needed to verify/close the CR.
• Some CRs are required to meet release criteria.
• Change Requests (CRs) for Bug fixes potentially improve stability
• CRs for enhancement/features potentially increase user satisfaction or open new markets.
• Time to market tries to force fastest possible implementation of each CR.
• Developers want to implement CRs according to “good architecture practices”
Solution:
Classical CCB (Need to find the authoritative definition…)
But in general:
Define a board comprised of stakeholders for the product including engineering, business/user (PM), QA, management. Stakeholders should be knowledgeable and with enough authority in their domain. This is the CCB
CRs will be submitted to the CCB by engineering. They will be discussed by the CCB, in either periodical or ad-hoc meetings, and a decision will be made and communicated to the relevant parties.
Decisions take into consideration the pros and cons of each CR, the context of the product/version, and make a business decision.
Resulting Context:
• CRs cannot be committed/completed but need to be queued
• Once approved CRs should be completed/committed by either the original engineer or a RE (Release Engineer)
• Rejected CRs will be completed/committed for future versions or dropped altogether.
An issue tracking system enables streamlined CCB operation and tracking of its decisions.

Pattern - Distributed Configuration Control Board
Aliases - CCB Proxy
Context -
A development group is trying to control content/configuration of its product, without slowing down or losing context too much
Problem -
In classic CCB the latency between submitting an issue to CCB and its approval/rejection takes significant time (there is a limit to the feasible frequency of CCB meetings, even when willing to be ad-hoc).
This time the CR is not integrated, losing ContinuousIntegration time and conflicting with “Merge Early and Often”.
In addition the engineer gets farther and farther away from the context of the CR as he assumes other work.
In addition the time necessary for discussing all CRs in CCB meetings is expensive when considering the number of members and the depth required to make intelligent decisions.
Forces:
• Wish to minimize time between CR readiness and commit time:
o Meet other possibly conflicting CRs as soon as possible (Merge Early and Often)
o Deal with issues as closer to context as possible (minimize context switch cost)
o Raise engineers satisfaction of “completed” work. Minimize “friction”.
• Many issues are “no brainer” decisions that don’t require a full CCB
• Wish to minimize time spent on CCB meetings
• Wish to minimize mistaken judgment calls due to lack of the full picture or mature consideration.
Solution:
Train/Assign CCB Proxies which should be aware of the CCB criteria for decision and should be able to either reach a decision or know when to wait for full CCB.
These CCB Proxies should monitor the queue of CRs submitted to CCB and dispatch CRs according to the CCB criteria, or converse with the CR owner to get more information, or other stakeholders in case necessary.
CCB Proxy effectiveness should be reviewed periodically according to the following criteria:
• Adherence to CCB Criteria
• Results – How many regressions, whether CCB would have made a different decision
• Intimate review of random interesting decisions.

Variants:
• Dispatch the CR queue according to engineering domain – a proxy for each domain, usually a manager in that domain.
• Dispatch the CR queue using a peer system – a peer proxy for each domain, to avoid the situation where a manager approves his own group work. (sort of “peer review” system)
• PM is the CCB Proxy
• Lead QA stakeholder is the CCB Proxy
Resulting Context
• 80% of CRs should be dispatched/approved very quickly (decide on SLA). 20% will be according to classic CCB frequency.
• CCB Meetings will be shorter and more focused (to the relief of the attendees…), and potentially the frequency can be increased.
Related Patterns - CCB, Merge Early and Often (SCM)

Pattern: Heirarchical Triage for Incoming issues
Context: New issues (bugs/feature requests) are opened by interested stakeholders (QA, Customer Support, DEV, PM). Since resources are limited some business intelligence should be applied to decide which issues should be accepted into the work queue of which version (if any), and with what priority compared to other issues.
Problem: Cannot rely on engineering alone to come up with the business decision, OTOH waiting for PM or some sort of CCB committee introduces much latency/bureaucracy into the process
Forces:
• Wish to start working on high priority issues soon, to avoid working on lower priority issues while waiting for processing.
• Wish to have correct priorities and control the version contents (See CCB)
• Wish to minimize time in the decision queue.
Solution:
Priority decision should be assigned to the CCB process, using the same CCB Proxies described in “Distributed CCB” to dispatch the incoming issues queue.
Criteria for priority and version contents should again be decided and documented beforehand. They consist part of the “values” for decisions made by the proxies.
Issues which require more elaborate discussion shall be discussed in a periodical “Triage” meeting (can be in CCB meeting, or separate meeting)
Resulting Context:
• 80% of issues should be prioritized very quickly (decide on SLA). 20% will be according to “triage” meeting frequency.
• Minimum numbers of issues enter the work queue by mistake.
• Minimized feeling of bureaucracy among issue reporters and assignees.
Related Patterns: Distributed CCB, CCB

Pattern: UnderstandBeforeSchedule
Context: In classic issue tracking environments, issues are reported, and then scheduled for work (in a version). Some of the aspects of an issue include scope of change, estimated effort, impact on stability. This pattern deals with having sufficient input into the scheduling decision.
Problem: When scheduling is done without sufficient information regarding scope/estimated effort/impact, time will be spent on handling them, only to understand later on that they cannot be committed to the version (mainly due to CCB criteria). This is a waste of resources, and a source for frustration among the staff.
Forces:
• Scheduling effectively requires considerable input, which might require actual investigation/analysis by an engineer/developer
• Investigation/Analysis by engineers/developers is usually part of the work done AFTER scheduling the issue for one of the versions.
• Engineers/Developers apply pressure to commit issues they already solved, even to the detriment of the project health. Part of human nature.
• Tracking issues which require analysis is difficult when they are all in the same “unscheduled/new” state/queue.
Solution:
Add an “investigating” state/queue to the workflow. Issues should be in this state when they are pending an investigation by their owner. Exit criteria from this state is to have the required input into the scheduling process.
New issues can go to this state when insufficient scheduling input is available. When the scheduling input is available (either when reported, due to analysis, etc.) the next step is to schedule. Who schedules and according to what flow is out of the context of this pattern.
“Investigation” work stops when scheduling input is available, unless the work necessary to solve the issue is another minimal step, in which the work can be done all the way up to “resolve” (commit depends on the codeline policy and whether CCB approval is required).
Resulting Context:
• Added “investigating” state/phase/queue in the issue workflow
• Use either custom fields or comments to track the relevant scheduling input, according to the level of formality/tracking required.
• Engineers/Developers are comfortable with providing the analysis/investigation data, without going all the way to resolve the issue, knowing that the aim of the process is to utilize their time effectively.
• Shortcuts can be made whenever investigation is redundant.

Tuesday, August 08, 2006

Patterns for issue tracking

I recently spent some time on devising methodologies for software development lifecycle in my company, dealing with SCM (Version Control) and Issue Tracking.

I'm a big fan of patterns. my first encounter with them was with the POSA series (Pattern-Oriented Software Architecture, Volume 1: A System of Patterns
/ http://www.cs.wustl.edu/~schmidt/patterns-ace.html) when working on distributed systems.

As a fan of reuse, this was quite an important finding.

Later I encountered the SCM patterns. I read http://www.cmcrossroads.com/bradapp/acme/branching/ by Brad Appleton and understood, yet again, that much of what we were doing good was a pattern, and what we were doing wrong and were looking to improve was an anti-pattern. I also read his book Software Configuration Management Patterns: Effective Teamwork, Practical Integration

Software Reconstruction Patterns (http://www.cmcrossroads.com/bradapp/acme/repro/SoftwareReconstruction.html) are a related useful family of patterns.

I also encountered organizational/process patterns, but I admit to not grepping the concept fully so far (in the todo list...). See http://www.ambysoft.com/processPatternsPage.html#FAQ%20What%20are%20Process%20Patterns.

Now while trying to devise the Issue Tracking methodology, starting with a baseline documentation of how each group (recall we are one R&D group acquired by another) does its work, I felt the need for patterns in this domain, and wasn't able to find any so far.
So, I decided that while keeping on the lookout for a pattern repository for this domain, I will start documenting patterns on my own, and try to come up with a draft of the issue tracking pattern family. I'm sure it will be useful to myself in the future. Hopefully via discussion in the right community, it can evolve into a public body of knowledge.

Anyhow - the patterns I've thought of so far are below. I now see that one of the greatest challenges are naming them right - so they can be generic enough and still specific to what the context you are talking about. I'm trying to take some guidelines from the Gang of Four definition (see http://en.wikipedia.org/wiki/Design_pattern_%28computer_science%29)

Taxonomoy - Categories

v Deliverables Generation

Ø AutoInternalReleaseNotes

Ø AutoApplicableWorkaroundsList

v Process Flow

Ø Heirarchical Triage for Incoming issues

aim to make a distributed decision on 80% of the issues according to pre-discussed policies, but have a streamlined process for tracking and reaching a wise decision on the rest 20%.

Ø Resolve->Integrate->Release completed issues

Ø Understand scope and impact before committing to Schedule

Be able to track issues which need work in order to schedule, but are NOT to be solved unless are really trivial, instead are to be raised for a schedule decision/discussion

Ø Close everything

Having a closure phase for completed (fixed) issues as well as duplicates, invalids, wontfixes, etc.

Ø Match and document the actual workflow between people

§ Give leads/managers the ability to review work by their people and sign off on it (or reject it)

§ QA Lead confirms new bugs from QA

§ DEV Lead integrates fixes resolved by his people

Ø commit approval / CCB activity / code review process should be enabled by the issue tracker workflow

Ø Ownership is NOT a state. Current action phase IS.

Waiting for QA Reproduction - State or Ownership?

v Relationships between issues

Ø Track symptoms separately from change tasks ?

Ø How/When to divide issues

Ø Issue equivalent of "Release branch"

How to deal with issues that are relevant on multiple version, where their state might be different for each version, but most of the data is shared?

v Issue Meta-Data

Ø Track "resolved in" version automatically

Ø Establishing priority based (among other things) on Severity

Ø Track stage at which the bug was opened

Allows understanding of QA/DEV effectiveness at developing/releasing quality software.

Ø Track reproducability and reproduces of the issue

reproduce cases - via link to the test management ? Via sub-issues linked to the parent?

Ø Keywords might be better than Custom Fields

Ø Discern "introduced in" from "detected in"

v Interface to other processes

Ø Interface to SCM

§ Integration with Task-Level Commit

Ø Interface to Test case management

§ Track test case for each issue

· if test case opened the issue - to know what to run to test/verify/close

· if from field or exploratory - to track process of adding this to regression suite

Ø Interface to Project Management / Use as project management

Ø Interface to CRM

v Usability

v Useful Metrics

Ø AutoCustomerReleaseNotes

Ø FaultFeedbackRatio (Regression rate)

Ø Rate of bugs fixed in version they were opened for (>70%)

Ø Rate of bugs detected in the field (<5%?)

v Anti-patterns

Ø Metering people via bug counts

Ø Overloading states/fields for multiple purposes

Ø Over centralization of decision making

Don't let a workflow with many steps fool you into thinking that it requires many people. Use steps to track where you are. Use assignment to track who holds the issue. Don't assign upwards unless necessary.

Ø Tracking and metering by Components?

see http://www.anyware.co.uk/2005/2006/07/27/jira-issue-tracking-meets-tagging/

This is still very much a work in progress, but any comment or help is very much welcome.

What the other guys brought into the party...

As I mentioned earlier, before we were able to finalize our new development environment we were gobbled up (acquired) by another company, about 4 times the size of our group.

In the area of issue tracking, the bigger company was using TestDirector, with some customizations, but actually their processes were quite simplistic, and weren't enabling a truly effective R&D process.
For Test case management, they are using plain old Word/Excel but are now open to other options.

In the area of SCM btw both companies were using good ol' CVS and were quite sick of it. more on that later...

With this as the baseline, upcoming posts will try to describe the process for integrating SCM, Issue Tracking and choosing a Test Case Management solution agreeable and effective to all.

Test Automation!

At some point we understood we must have a robust test automation harness which can at least cover our smoke test and regression. This will help us feel more confident in our releases in less time, and allow meeting the business needs.

As we are an appliance based file-system product, in essence an IT infrastructure product, all of the commercially available harnesses from CA, Mercury and the like are useless, as they focus on GUI/Web automation, and we need API automation and the ability to run and control file system operations and file system testing tools.
We considered home-grown approaches but decided that the time-to-market is too long for our needs.
We considered adopting STAF/STAX (http://staf.sourceforge.net/index.php ) but again the custom work needed around it was estimated to be too long and required human resources and expertise we didn't have, and weren't available in the neighbourhood.

What we eventually chose was a testing
automation harness called Aqua
(http://www.aquasw.com/) and we are very satisfied with it.

Its still requires significant customizations/development in a project mode, rather than an off-the-shelf product, but for some situations its the best and only available approach today, and is much better than developing on your own from scratch, or testing manually.

Test Case Management progress - TD?

About 8 months ago we chose TD (TestDirector) for Test Case Management as we needed to employ something fast and the Director of QA I brought in had experience with it, while the open source tools we looked at didn't convince us at that point, and seemed a risk we didn't want to take, considering the other work necessary in other QA areas at the time.

NOTE: I wonder how much money Mercury made just from people in this situation...
Anyhow we are now documenting test cases and managing testing progress with the tool, with moderate satisfaction. The lack of integration to our issue tracker (both Bugzilla and JIRA) is a concern, as well as the high admission price per user, and the feeling that we are working with a glorified MSAccess inside a web browser (I wonder why...).

Lack of support for Linux machines, Firefox, and in general the fact that Mercury (sorry, HP now... http://www.mercury.com/us/company/pr/press-releases/072506-hp-acquires-mercury.html) is evidently considering itself more of an IT Governance / BTO company, and outgrew its QA roots, doesn't make this move seem very strategic.

I'm not saying TD is bad, its just not the right solution for small dynamic groups, which want to integrate new solutions when they solve business problems, and where much of the environment is open-source or small vendors, not the gorillas of the Enterprise Tools segment.

We made a tactical move knowingly, but there might be room for change, as we are not fully invested in the tool at this point.

JIRA Progress - an easy decision...

Sometime before the acquisition, we decided on JIRA as the issue tracker and started evaluating it, including history migration, a bit of customization, and trying to understand what processes need to be in place in order to be effective and deal with the complexities we were seeing regarding real world software development and maintenance.
This was an easy decision, as we really really like JIRA the more we look at it. The main advance in the JIRA space this year was in my view the plethora of plugins that became available, and provide functionality we needed, as well as convince us of the strategicness of the JIRA/Atlassian ecosystem.
Another easy decision was migrating to Confluence, the atlassian enterprise Wiki, for our knowledge base and documentation platform.

Another thing we are seeing is commercial vendors in the development tools space (SCM, Build management, etc.) integrating with atlassian. This is very positive and not surprising, considering the openness of the platform, the reach into many customers, and I guess the BizDev efforts of the atlassian guys, which I think try to be Best of Breed for issue tracking and wiki, while relying on very strong partnerships with every sexy major player/community in the space (not necessarily the heavy-weight guys like Mercury/CA) to provide other functionality people expect.

Update on many fronts... JIRA, Aqua, org changes and the like...

Long time no post.

I'll try to recap where I stand regarding the issues I started talking about a year ago, at least for a bit of closure regarding the issue tracking and test case management areas.
Lets see if I can keep it up this time...

In any case, my company was gobbled up by another software company, and so somewhere in the middle of upgrading our development environment infrastructure, our plans were disrupted, but one cannot complain... Now the new company R&D department needs to have an issue tracking and test case management, with the added challenge of integrating two methodologies, histories, and views on the world. I was tasked with this project. in the following posts I will try to address several aspects of this project.