Jaffa's Avatar
Posts: 2,535 | Thanked: 6,681 times | Joined on Mar 2008 @ UK
#41
Originally Posted by fms View Post
Yes, I remember the meeting. But I do not seem to remember any of the points proposed by Flandry being agreed at that meeting. This is kinda troubling, too.
For cross-reference, the summary of the meeting, and the discussion with the most involved developers (i.e. maemo-developers) is here:

http://lists.maemo.org/pipermail/mae...er/022381.html

As VDVsx says, he's going to push through the agreed changes and improvements; but the person most familiar with the code (X-Fade) has been involved in other things (which hopefully are now mostly in the past).
__________________
Andrew Flegg -- mailto:andrew@bleb.org | http://www.bleb.org
 

Texrat's Avatar
Posts: 11,700 | Thanked: 10,045 times | Joined on Jun 2006 @ North Texas, USA
#42
Originally Posted by fms View Post
Well it has long been agreed that 5 votes is usually sufficient. In fact, it was agreed in an IRC meeting months ago.
The number of testers should be driven by the degree of confidence you want vis-a-vis results. I can look at the statistical tables later and provide a good guideline. But regardless, I don't think it should be a guess or good feeling (not saying 5 or 10 are).

But roughly (this is for example only and in no way scientific) it would look something like 1 tester = 50% confidence that test results sufficiently address likely defects; 2 testers = 70%; 3 testers = 80%; 4 testers = 87%; 5 testers = 91%, etc (tends to be logarithmic).
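To make the shape of that curve concrete, here is a minimal Python sketch. It does not reproduce the figures above; it assumes a much simpler model in which every tester independently has the same chance of catching a given defect, so the confidence with n testers is 1 - (1 - p)^n. The function name, the `p_single` parameter, and its 0.5 default are made up for illustration.

```python
def confidence(testers: int, p_single: float = 0.5) -> float:
    """Chance that at least one of `testers` independent testers
    catches a given defect, if each one catches it with p_single.

    Assumption: testers are independent and equally thorough,
    which real testers are not.
    """
    if testers < 0:
        raise ValueError("testers must be non-negative")
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    # Probability that every tester misses the defect is (1 - p)^n.
    return 1.0 - (1.0 - p_single) ** testers


if __name__ == "__main__":
    for n in range(1, 6):
        print(f"{n} tester(s): {confidence(n):.1%}")
```

Under this assumption each extra tester adds less than the one before, which is the same diminishing-returns pattern as the rough figures above, even though the exact percentages differ.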
__________________
Nokia Developer Champion
Different <> Wrong | Listen - Judgment = Progress | People + Trust = Success
My personal site: http://texrat.net
 

Jaffa's Avatar
Posts: 2,535 | Thanked: 6,681 times | Joined on Mar 2008 @ UK
#43
Originally Posted by Texrat View Post
But roughly (this is for example only and in no way scientific) it would look something like 1 tester = 50% confidence that test results sufficiently address likely defects; 2 testers = 70%; 3 testers = 80%; 4 testers = 87%; 5 testers = 91%, etc (tends to be logarithmic).
That's fascinating. Do you know of any articles/papers on this WRT software quality?
 
Posts: 5,795 | Thanked: 3,151 times | Joined on Feb 2007 @ Agoura Hills Calif
#44
I'd just like to say that anyone who sees the fMMs thread sees a thrilling example of how great software can be and is being developed here. It seems to me that it has almost nothing to do with officially established procedures, but is due to the common sense of one developer. The same goes for mymenu and, some time ago, the work of the liqbase guy.

I hope these are the models you are using for deciding how best to handle software here. Maybe you see more sides of this issue than I do, but these are shining examples. Of course, the developers I mentioned above aren't the only heroes out there, but I want this message short.
 

The Following 5 Users Say Thank You to geneven For This Useful Post:
Flandry's Avatar
Posts: 1,559 | Thanked: 1,786 times | Joined on Oct 2009 @ Boston
#45
FWIW both the required karma and quarantine are stored in fields of a repository record. Here is where the check is made. This is the schema for the data record.

Should be easy enough to change those two values should someone with access actually care to do so...
__________________

Unofficial PR1.3/Meego 1.1 FAQ

***
Classic example of arbitrary Nokia decision making. Couldn't just fallback to the no brainer of tagging with lat/lon if network isn't accessible, could you Nokia?
MAME: an arcade in your pocket
Accelemymote: make your accelerometer more joy-ful
 

Texrat's Avatar
Posts: 11,700 | Thanked: 10,045 times | Joined on Jun 2006 @ North Texas, USA
#46
Originally Posted by Jaffa View Post
That's fascinating. Do you know of any articles/papers on this WRT software quality?
I'll look for some. I've only applied it to product quality (the concept is called AQL or Acceptable Quality Level) but I'll see if I can find something relevant to our use.
 

VDVsx's Avatar
Posts: 1,070 | Thanked: 1,604 times | Joined on Sep 2008 @ Helsinki
#47
Originally Posted by Flandry View Post
FWIW both the required karma and quarantine are stored in fields of a repository record. Here is where the check is made. This is the schema for the data record.

Should be easy enough to change those two values should someone with access actually care to do so...
What are the benefits of changing these values?
You get a lot more apps in Extras, but the quality will be lower for sure.
In the beginning I was also a bit against the quarantine time, but had to change my mind after seeing skilled testers find big blockers in apps with 10+ thumbs during the quarantine period.
__________________
Valério Valério
www.valeriovalerio.org

Last edited by VDVsx; 2010-01-19 at 01:41. Reason: typo
 
Texrat's Avatar
Posts: 11,700 | Thanked: 10,045 times | Joined on Jun 2006 @ North Texas, USA
#48
ooo... found good stuff on software quality!

Typical metrics:
http://www.scribd.com/doc/7010681/So...uality-Metrics

Formal software testing (outside the scope of most projects here, but I found good material in its 732 pages):
http://digi.physic.ut.ee/tanel/books...g.eBook-KB.pdf

So far I've come across vague statements asserting that more testers can increase confidence levels in results, but nothing matching what I suggested yet. However, the following describes a methodology for building a software test plan and may be useful:

http://www.lucas.lth.se/publications...erssonCdoc.pdf
 
Flandry's Avatar
Posts: 1,559 | Thanked: 1,786 times | Joined on Oct 2009 @ Boston
#49
Originally Posted by VDVsx View Post
What are the benefits of changing these values?
You get a lot more apps in Extras, but the quality will be lower for sure.
In the beginning I was also a bit against the quarantine time, but had to change my mind after seeing skilled testers find big blockers in apps with 10+ thumbs during the quarantine period.
It leads to better software, faster.

Going back to the original point of this thread--we should do it because it was what was proposed and, i gather, agreed upon at the IRC meeting. In any case, the cases you describe show that the present system doesn't work, not that it does. There should not have been 10 thumbs up if there were blockers. I would venture to guess that the 10 thumbs up were popularity votes and not from the "testers group" that i advocated we adopt.

Anyway, my interest in this is that when an update for an app is ready, especially a trivial update, and it improves upon the current Extras version, making it go through the same level of scrutiny as the original version is a waste of time and discourages thorough testing in the cases where it really is called for. In other words, it leads to people being careless and cavalier in testing and missing blockers.

I would even go so far as to say that requiring 10 tests may be less secure than requiring 5, for three reasons: each tester is more likely to be complacent when there are 9 others to pick up the slack; there are more tests to get done; and the dev is more likely to get fed up with the process and recruit the "testers" in less than helpful ways (which seems to be a fairly common practice). I have resisted telling people to "try and thumb up" because i believe in following the mandated procedure, but i don't happen to believe that this one is very effective.

I'd be fine with just reducing the karma and quarantine to 50% in the case of an app already in Extras. 5 days and 5 tests is more than enough, especially if the dev and the testers are both coming into it as a positive thing (a chance to scare out bugs) and not an onerous and unrealistic burden. Five or even one real test is immensely better than ten cursory tests.
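As a rough sketch of what that rule could look like in code: this is not the actual repository code, the class, field, and function names are invented for illustration, and the 10/10 defaults are simply the karma and quarantine figures discussed in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionRule:
    """Hypothetical stand-in for the two fields on the repository record."""
    required_karma: int = 10   # thumbs needed (assumed current value)
    quarantine_days: int = 10  # days in testing (assumed current value)


def rule_for(already_in_extras: bool,
             base: PromotionRule = PromotionRule()) -> PromotionRule:
    """Halve both thresholds when the app already has a version in Extras."""
    if not already_in_extras:
        return base
    return PromotionRule(
        required_karma=max(1, base.required_karma // 2),
        quarantine_days=max(1, base.quarantine_days // 2),
    )


def can_promote(karma: int, days_in_testing: int,
                already_in_extras: bool) -> bool:
    """True once both the karma and the quarantine thresholds are met."""
    rule = rule_for(already_in_extras)
    return (karma >= rule.required_karma
            and days_in_testing >= rule.quarantine_days)


if __name__ == "__main__":
    # An update with 5 thumbs after 5 days passes; a new app does not.
    print(can_promote(5, 5, already_in_extras=True))
    print(can_promote(5, 5, already_in_extras=False))
```

Keeping the halving in one place means the base values can still be tuned on the record itself without touching the check.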
Last edited by Flandry; 2010-01-19 at 02:44.
 

RevdKathy's Avatar
Posts: 2,173 | Thanked: 2,678 times | Joined on Oct 2009 @ Cornwall, UK
#50
Reading this and the karma thread has made me wonder if we need a smarter tool than 'thumbs up' for these things, especially now we have a lot more people on board who don't actually know what a 'thumbs up' should signify. I'm going to be a pain and open a thread on that, which will relate to both these issues.
__________________
Hi! I'm Kathy and I'm a Maemo Greeter! Welcome.
Useful links for newcomers: New members say hello , New users start here, Community subforum, Beginners' wiki page, Maemo5 101, Frequently Asked Questions (FAQ)
Did you know Meego.com has forums too?
 

Tags
extras-tesing, finishing the job, quality assurance, quarantine, software quality, user testing
