Re: The Testing is half empty
Quote:
http://lists.maemo.org/pipermail/mae...er/022381.html As VDVsx says, he's going to push through the agreed changes and improvements; but the person most familiar with the code (X-Fade) has been involved in other things (which hopefully are now mostly in the past).
Re: The Testing is half empty
Quote:
But roughly (this is for example only and in no way scientific) it would look something like: 1 tester = 50% confidence that test results sufficiently address likely defects; 2 testers = 70%; 3 testers = 80%; 4 testers = 87%; 5 testers = 91%; etc. (the gain tends to be logarithmic).
Re: The Testing is half empty
I'd just like to say that anyone who reads the fMMs thread sees a thrilling example of how great software can be, and is being, developed here. It seems to me that it has almost nothing to do with officially established procedures, but is due to the common sense of one developer. The same goes for mymenu and, some time ago, the work of the liqbase guy.
I hope these are the models you are using when deciding how best to handle software here. Maybe you see more sides of this issue than I do, but these are shining examples. Of course, the developers I mentioned above aren't the only heroes out there, but I want to keep this message short.
Re: The Testing is half empty
Quote:
You get a lot more apps in Extras, but the quality will be lower for sure. In the beginning I was also a bit against the quarantine time, but I had to change my mind after seeing skilled testers find big blockers, during the quarantine period, in apps with 10+ thumbs.
Re: The Testing is half empty
ooo... found good stuff on software quality!
Typical metrics: http://www.scribd.com/doc/7010681/So...uality-Metrics
Formal software testing (outside the scope of most projects here, but I found good material in its 732 pages): http://digi.physic.ut.ee/tanel/books...g.eBook-KB.pdf
So far I've come across vague statements asserting that more testers can increase confidence in results, but nothing matching what I suggested yet. However, the following describes a methodology for building a software test plan and may be useful: http://www.lucas.lth.se/publications...erssonCdoc.pdf
Re: The Testing is half empty
Quote:
Going back to the original point of this thread: we should do it because it was what was proposed and, I gather, agreed upon at the IRC meeting. In any case, the examples you gave show how the present system doesn't work, not that it does. There should not have been 10 thumbs up if there were blockers. I would venture to guess that the 10 thumbs up were popularity votes and not from the "testers group" that I advocated we adopt.
Anyway, my interest in this is that when an update for an app is ready, especially a trivial update, and it improves upon the current Extras version, making it go through the same level of scrutiny as the original version is a waste of time and discourages thorough testing in the cases where it really is called for. In other words, it leads to people being careless and cavalier in testing and missing blockers. I would even go so far as to say that requiring 10 tests may be less secure than requiring 5, for the threefold reason that the tester is more likely to be complacent when there are 9 others to pick up the slack, there are more tests to get done, and the dev is more likely to get fed up with the process and recruit the "testers" in less than helpful ways (which seems to be a fairly common practice).
I have resisted telling people to "try and thumb up" because I believe in following the mandated procedure, but I don't happen to believe that this one is very effective. I'd be fine with just reducing the karma and quarantine to 50% in the case of an app already in Extras. 5 days and 5 tests is more than enough, especially if the dev and the testers are both coming into it as a positive thing (a chance to scare out bugs) and not an onerous and unrealistic burden. Five or even one real test is immensely better than ten cursory tests.
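The 50% reduction proposed above is easy to pin down concretely. A hypothetical sketch: the 10-day/10-vote baseline matches the process discussed in this thread, but the function itself is not part of any existing maemo.org tool:

```python
# Hypothetical rule from the post above: an update to an app already in
# Extras needs half the quarantine days and half the votes of a new app.
def promotion_requirements(already_in_extras, base_days=10, base_votes=10):
    """Return (quarantine_days, votes_needed) for a package."""
    if already_in_extras:
        return base_days // 2, base_votes // 2
    return base_days, base_votes

print(promotion_requirements(False))  # new app: (10, 10)
print(promotion_requirements(True))   # update:  (5, 5)
```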
Re: The Testing is half empty
Reading this and the karma thread has made me wonder if we need a smarter tool than 'thumbs up' for these things, especially now that we have a lot more people on board who don't actually know what a 'thumbs up' should signify. I'm going to be a pain and open a thread on that, which will relate to both these issues.
Re: The Testing is half empty
Quote:
No matter what happened to your wallpapers, if someone is using a copyrighted image without permission then that is a blocker.
Re: The Testing is half empty
Quote:
Maybe the community thinks that using, for instance, a Google logo in a free app is not that bad, but this thought doesn't change the fact that there is a copyright infringement. Personally I don't think that this will bring down Google's business, but I also think it's in the community's interest to stay away from potential trouble with lawyers. This is why community projects like Debian are very, very careful with these topics, even if many times it looks like it's not worth the hassle.
About Load Applet: I love the app and it's one of the first ones I install. However, nobody can deny that it's causing confusion to maemo.org users:
Bug 5780 - Screencast video is shown as completely black when played back. Filed on 2009-10-25, with two duplicate bugs filed as well.
Two Talk threads: "[Load Applet]: Remove from Extras" and "Working video/audio screencasting in load-applet".
A fork showing only the status area icon was created by another developer: http://maemo.org/downloads/product/M...cpumem-applet/
Certainly not the end of the world, but it wouldn't be the end of the world either if the app was pulled back to extras-testing when a blocker is found and the developer can't provide a quick fix.
Re: The Testing is half empty
A simple way to raise awareness of what the thumbs up should signify would be to, like the bugtracker, include the checklist pasted by default into the comment form, and also maybe clearly couple the comment box with the thumbing.
This won't do anything about the lack of people testing or improve the process, but it might alleviate the popularity contest issue.
Re: The Testing is half empty
IMHO the extras-testing warning should read differently than "beware, here be dragons".
Maybe something like Quote:
Re: The Testing is half empty
Quote:
First, on the page there should be a listing showing the overall checklist status for a package. Below is a mockup (-2/5 shows how many testers still need to be "smiling" for that task to be finished; in this case I assumed that a smiley is +1 and an unhappy face is -3):

Tasks Done [4/8]
...
3. [1/1 :)] Announced features available.
4. [-2/5 :):mad:] Working provided features.
* FAIL: When exporting a file the program crashes (see bug: http://url/456)
5. [1/1 :)] No performance problems.
...

The tester would have a list of unfinished tasks with a voting interface. By pressing a [+] on top of the list, all tasks would be shown so the tester is able to give an additional confirming or denying vote. The tester would see her previous vote and should be able to change it. Additionally, the tester should be able to write a comment for each testing task - this would be required for an unhappy face. The tester should also be able to give a general comment.

Testing [+] show all tasks
4. [tested ok |v] Working provided features. [Concentrated on importing functionality]
7. [not tested |v] No known security risks.
...

When all tasks were done, the application would go automatically to Extras, provided the quarantine time was over and there were no unhappy faces. If there were an unhappy face, some über-tester should decide whether the issue is a blocker or not. If a package which was accepted before were sent to the testing queue for a bugfix/update release, it would have all tasks marked as done except for the features and some randomly selected task. If the application description were the same as before, the announced-features task would be done. This way the testing would concentrate on functionality, while acceptance in other areas would still be checked from time to time. I would remove the word karma from the testing - people have karma due to their activities; applications have acceptance testing tasks.
As in this case, all tasks should be finished, not just "there are 10 smileys for 'no performance problems' so it's all OK".
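The vote weighting in that mockup can be made concrete. Below is a sketch assuming the poster's weights (+1 for a smiley, -3 for an unhappy face); nothing here reflects how maemo.org actually counts thumbs:

```python
# Weights assumed in the mockup above: happy vote +1, unhappy vote -3.
HAPPY, UNHAPPY = +1, -3

def task_score(votes):
    """Net score for one checklist task; votes are True (happy) or False."""
    return sum(HAPPY if v else UNHAPPY for v in votes)

def task_done(votes, required):
    # In the proposal an unhappy vote also requires a written comment and
    # review by an "uber-tester"; that part is not modelled here.
    return task_score(votes) >= required

# The "[-2/5 :):mad:]" line in the mockup: one happy and one unhappy vote,
# with a net score of five required before the task counts as done.
votes = [True, False]
print(task_score(votes), task_done(votes, 5))  # -2 False
```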
Re: The Testing is half empty
Testing should have its own forum on this site where developers can make a topic about their app, and we can leave quick feedback etc. there. Also, developers could ask people on the forum to test certain things. Keeping direct contact between devs and testers!
Re: The Testing is half empty
I've read most of this thread, and I'm still confused as to how to vote to demote an app that is currently in Extras.
Using your example of Load Applet: that should definitely not be in Extras because it doesn't have the functionality it should. Another one that I have found is MasterGear, which is in Extras-testing. For me and loads of others, this emu doesn't ever open any ROMs. In my eyes that is failed testing, and it should be moved back. However, it's not clear how or what we as community members need to do for this to happen. Thanks
Re: The Testing is half empty
Quote:
If an app goes up it must be possible to come down :p
Re: The Testing is half empty
A 10 day quarantine may be too long for some, and I appreciate that developers need to see their package in Extras quickly, but in the real world I don't see how 10 days goes against the 'release early, release often' mantra. 10 days before an app reaches a large number of users is not too long IMO. Perhaps the time can be reduced a bit, and maybe more for updates. But on the other end, 5 days for new apps is not enough. We must provide a 'weekend opportunity', since most of us use our free time to contribute towards testing. Reducing it to 7 days would be preferable and give testers at least one weekend to come around to the app.
I agree with the assessment that the current process has too many 'open to interpretation' areas that have brought the process very close to abuse.
1) Packages without bug-trackers. These are the most common. But again, this is a grey area depending on the size of the project; for a small project, like wallpapers etc., one may not be all that necessary. It's not difficult to find many of these apps with >10 thumbs up (e.g. Easy-chroot).
2) Optification. These are easily caught, but again there is a grey area on how much space an app may take (including/excluding dependencies) before it is categorised as not optified.
The sad part is that both the above checks can be easily automated (at build? at promotion?) to save energy downstream. A requirement for a bug-tracker or a mailto link would be easy to check, and I am sure some simple rules can be applied for checking optification.
The process as it is today is good enough, but it's not properly written up. Lack of clarity in the QA checklist doesn't help in justifying a thumbs-down against a popular app. The idea of having a dedicated super-testers group to "override" the judgement of testers is also a good one. It complicates the process further IMO, but it is perhaps a necessary step as more people begin to rate without a firm understanding of the goals of extras-testing. Maemo community apps stand for reliability and authenticity. I hope we can iron out the process and come to a consensus quickly.
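Both of those checks are indeed straightforward to automate. A rough sketch follows; the XSBC-Bugtracker control field and the /opt install convention match how Maemo packaging generally handled these, but the exact field names, exempt paths, and thresholds here are assumptions for illustration, not a spec of the real promotion tool:

```python
# Two autoqa-style checks sketched from the post above. Field names and
# path conventions are assumptions, not how the promotion queue works.
def has_bugtracker(control_text):
    """True if debian/control declares a bug tracker link."""
    return any(line.startswith(("XSBC-Bugtracker:", "Bugtracker:"))
               for line in control_text.splitlines())

def unoptified_bytes(files):
    """Bytes a package installs to rootfs, i.e. outside exempt paths."""
    exempt = ("/opt/", "/etc/", "/usr/share/applications/")
    return sum(size for path, size in files
               if not path.startswith(exempt))

control = "Package: example\nXSBC-Bugtracker: http://bugs.example.org/\n"
files = [("/opt/example/bin/example", 200_000),
         ("/usr/bin/example-launcher", 1_000)]
print(has_bugtracker(control), unoptified_bytes(files))  # True 1000
```

A promotion gate could then simply refuse packages where `has_bugtracker` is false or `unoptified_bytes` exceeds whatever rootfs budget the community agrees on.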
Re: The Testing is half empty
Quote:
Note that two of the most popular Linux distros use the current packaging format, and forking away from that is a really bad idea. The Maemo build system is already too different from Debian's (i.e. optify).
Re: The Testing is half empty
As it happens, I made a Brainstorm this morning about this issue. I had no idea that this was already being discussed, but RevdKathy kindly pointed me to this thread. In my defence, Talk was down when I did it, so I could not search the forum. ;)
Anyway, since this is being discussed, let me offer my opinion too. These are the issues I have with the current system (some of them have already been mentioned):
1. A new version in Extras-testing resets the package's karma/quarantine time. For example, I have an app in Extras-testing. I know that there is a little mistake in the actual help file: it says "time you want to countdown from" instead of "count up from". Normally I would just correct this in a few minutes, but with the current system I would lose the karma/quarantine time of the app. I already lost them once, so I'm not going to do that again. Granted, it won't affect the functionality of the app, but it might confuse some end-users. However, I still think that the user would much rather have the app sooner than have a grammatically correct help file 10 days later (that is, if it would even get so many votes, which is highly unlikely, so the real wait might be several weeks). In other words, the current system discourages updates!
2. Why can't the bug reporting page be automated? The preferred place for bug reporting is Bugzilla. Fine, I have nothing against that. However, with the current method, I have to first release my package, send an email to request a Bugzilla page for my package, wait for the creation of that page, and finally release another package with the correct reference to the Bugzilla page. Why could there not be a simple way to auto-create the proper Bugzilla place, like the Optify - Auto setting? If some additional pages or settings are needed, only then would a request be sent.
3. There are no testers in Extras-devel. A developer needs testers, but those are not available in Extras-devel, especially since the user is warned that his device will explode if he activates this repository filled with malicious apps from mad developers who are secretly planning to take over the world. :D So the only real testing can happen in Extras-testing.
Therefore there should be two stages:
Beta testing - a stage where the developer can get valuable feedback from users and thus improve the app. Updates that add features are allowed.
Release candidate - a stage that is initiated by the developer himself. Yes, not all developers are clinically insane megalomaniacs! Some of them actually take pride in the quality of their work and don't want to release a product that is not working well. ;) In this stage, only bug fixes and some minor needed functionality changes would be allowed.
For these two stages, there should be some kind of unified voting system. What system? I have no idea and I'm tired of typing. :p Anyway, this was only my opinion. Not that great, I know, but at least it's mine. :D
Re: The Testing is half empty
Quote:
Quote:
Here's what happens in the rare case that one of the solicited user testers tries to jump through the requested hoops rather than just going and giving a thumbs up: Quote:
Quote:
Re: The Testing is half empty
Quote:
In your first comment you mentioned an "obvious" case, and it's this "obvious" level that the community can assume without legal training. My only point is that legal problems are troublesome for the community just as they are for Nokia, while in your comment it seemed that you were putting all the responsibility and reasons to be concerned on Nokia alone.
Re: The Testing is half empty
It would be interesting to know the number of downloads per app in Extras-testing. This would give some indication of how many of those who download an app actually bother to vote for it.
Re: The Testing is half empty
Back to the quarantine issue... in my brief study of software QA materials over the past couple of days, one thing that stuck was the emphasis on lines of code (LOC). Granted, sheer LOC doesn't tell you everything you need to know about an app, but it's a decent, rough indicator of complexity.
So maybe LOC and/or compiled file size could be *one* factor that drives quarantine length. We could create demarcation points every so many kilobytes, for instance, and relate them to days in quarantine (e.g., 1 day per 25K of file size or 2500 LOC). Does anyone know of any best practices in this regard? We're not the first to travel this path...
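As a starting point for discussion, those example rates translate into something like the sketch below; the floor and ceiling values are my own guesses, not from any QA reference:

```python
import math

# Illustrative mapping from the rates suggested above: one quarantine day
# per 25 KB of compiled size or per 2500 LOC, whichever is larger, with
# an assumed floor and ceiling so trivial and huge packages stay sane.
def quarantine_days(size_kb=None, loc=None, min_days=3, max_days=14):
    estimates = []
    if size_kb is not None:
        estimates.append(math.ceil(size_kb / 25))
    if loc is not None:
        estimates.append(math.ceil(loc / 2500))
    days = max(estimates) if estimates else min_days
    return max(min_days, min(max_days, days))

print(quarantine_days(size_kb=120))   # 5
print(quarantine_days(loc=40000))     # capped at 14
```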
Re: The Testing is half empty
What if these Super Testers (or moderators... whatever they should be called) had the ability to override the quarantine? I mean, if an app or update had no votes from these Mega Beings, the normal quarantine would be in place. But, provided that the app already has enough voters (which, btw, should be 5 ;)), one of the Celestial Creatures would be able to give it a green light if he feels that no further testing is needed.
Re: The Testing is half empty
There's merit to that, Sasler, but my thinking is we need to first step all the way back to the entry point of the process and start applying rational methodology, as opposed to numbers and actions driven by warm fuzzies. "Jailbreaking" quarantine might just ensure more bad apps get out. We need to add more meaning to the process steps... and especially do what we can to ensure the proposed 5 testers aren't thumbing up or down based on like or dislike.
Re: The Testing is half empty
Quote:
When a new app reaches a point where it's functional enough to give a general idea, a Talk thread is opened for it. Next, say, 5 testers are selected. These would then communicate with the developer about any issues and ideas. They would also fill in the "Good Quality Check-list". When the developer is happy with the app and all the testers agree that all the points are adequately met, they would then unanimously vote for promotion to Extras. No quarantine would be needed.
Of course, there should be a proper reward to encourage this kind of commitment. For example, karma based on the stars and downloads: the developer would get half of it and the testers would each get a fifth of the remaining half. Or something like that. The important thing is that this karma would only be given after the app has been released, so it would encourage active involvement from start to end. Also, the star multiplier would encourage quality apps.
However, to avoid any hasty releases just for the sake of some easy karma, there should be some kind of penalty system in place. That is, if a critical fault is found after the app has been promoted to Extras, it will be demoted again until it is fixed. And, depending on the gravity of the fault and how obvious it should have been, part of the received karma would be removed. A part of it would be returned when the problem is fixed.
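The arithmetic of that reward split is simple to pin down. A sketch; the 50/50 split and the five testers are the poster's example, and none of this is an existing maemo.org karma rule:

```python
# Reward split proposed above: the developer keeps half of the app's
# earned karma; each of the five testers gets a fifth of the other half.
def split_karma(total, testers=5):
    """Return (developer_share, per_tester_share)."""
    dev_share = total / 2
    tester_share = (total - dev_share) / testers
    return dev_share, tester_share

dev, each = split_karma(100)
print(dev, each)  # 50.0 10.0
```

With five testers, each ends up with a tenth of the total, so the shares sum exactly to the karma the app earned.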
Re: The Testing is half empty
Quote:
I like seeing all the different suggestions here about what to do for testing. I feel like those Frosted Miniwheats... the professional side of me wants to see a great amount of testing done on every app... and the fun side of me wants it all ASAP!
Re: The Testing is half empty
Quote:
*I* do not see a problem here... make backups, though.
Re: The Testing is half empty
I added the following points to the QA improvement wiki page:
Is this the right place for these suggestions?
Re: The Testing is half empty
Quote:
Can you elaborate a bit more on the first point? I don't know if I'm understanding it well. Do you mean something like use cases for the app? Who should provide this list?