woody14619's Avatar
Posts: 1,455 | Thanked: 3,309 times | Joined on Dec 2009 @ Rochester, NY
#1
For those not aware, some users are having an issue with the graphics driver for the N900. It started back in PR1.1, and while changes have been made to try to prevent the problem, it's still occurring for some people in PR1.3. (I'm one of them, lucky me.)

It seems to be triggered by a combination of heavy graphics usage, high CPU usage, and possibly GPS usage, though the latter isn't always the case. Users who play intense Flash games in the browser, or use demanding programs such as mapping apps (modRana, Mappero, etc.), seem to see this more often. It's listed in the bug tracker here. Feel free to vote for it.

My biggest issue with this is that it can happen just about any time, even when CPU use is low. The end result is a dead phone, as the process jumps silently to 98% CPU usage, burns through the battery, and the device shuts down, usually without a warning signal. Short of carrying a spare battery, you can't even turn it back on before charging it.

To help stop this until Nokia fixes it, I made a little script that checks whether the process (a kernel-level driver process in this case) is eating a lot of CPU. It only does this every 30 seconds or so, and is rather non-invasive. If it detects activity, it watches a little more closely and issues warnings via espeak if things are going bad. I added the warnings just in case you're actively using the phone (on a call, etc.). After a few warnings, it reboots the device to prevent the system from draining the battery.
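For readers who want the general shape of such a watchdog, here is a minimal sketch. It is not the attached sgx_tracker.sh; it only illustrates the check/warn/reboot idea described above. The process name sgx_misr is the one reported later in this thread. The thresholds, the 2-second sampling window, the 5-second "closer watch" delay, the assumed 100 clock ticks per second, and every variable and function name (PROC_DIR, SAMPLE, SGX_WATCH_RUN, find_pid, and so on) are assumptions made for illustration.

```sh
#!/bin/sh
# Illustrative watchdog sketch; NOT the attached sgx_tracker.sh.
PROC_NAME="${PROC_NAME:-sgx_misr}"  # kernel thread named later in this thread
PROC_DIR="${PROC_DIR:-/proc}"       # overridable so the helpers can be tested
INTERVAL="${INTERVAL:-30}"          # seconds between routine checks
SAMPLE="${SAMPLE:-2}"               # seconds over which CPU use is measured (assumed)
THRESHOLD="${THRESHOLD:-90}"        # percent CPU treated as "stuck" (assumed)
MAX_WARNINGS="${MAX_WARNINGS:-3}"   # warnings before rebooting (assumed)
HZ="${HZ:-100}"                     # clock ticks per second (assumed)

# Print the PID of the first process whose name in its stat file matches $1.
find_pid() {
    for f in "$PROC_DIR"/[0-9]*/stat; do
        [ -r "$f" ] || continue
        case "$(cat "$f" 2>/dev/null)" in
            *"($1)"*) basename "$(dirname "$f")"; return 0 ;;
        esac
    done
    return 1
}

# Print utime + stime (in clock ticks) for PID $1.
# After stripping "pid (name) ", they are fields 12 and 13.
cpu_ticks() {
    sed 's/^.*) //' "$PROC_DIR/$1/stat" 2>/dev/null | awk '{ print $12 + $13 }'
}

# $1 = ticks before, $2 = ticks after, $3 = seconds elapsed; prints percent CPU.
cpu_percent() {
    echo $(( ($2 - $1) * 100 / ($3 * HZ) ))
}

watch_loop() {
    warnings=0
    delay=$INTERVAL
    while :; do
        sleep "$delay"
        pid=$(find_pid "$PROC_NAME") || continue
        before=$(cpu_ticks "$pid")
        sleep "$SAMPLE"
        after=$(cpu_ticks "$pid")
        [ -n "$before" ] && [ -n "$after" ] || continue
        if [ "$(cpu_percent "$before" "$after" "$SAMPLE")" -ge "$THRESHOLD" ]; then
            warnings=$((warnings + 1))
            delay=5   # watch more closely once trouble is suspected
            espeak "Warning, graphics driver is stuck. Rebooting soon." 2>/dev/null
            if [ "$warnings" -ge "$MAX_WARNINGS" ]; then
                reboot   # needs root
            fi
        else
            warnings=0
            delay=$INTERVAL
        fi
    done
}

# Only start looping when explicitly asked to.
if [ "${SGX_WATCH_RUN:-0}" = 1 ]; then
    watch_loop
fi
```

Saved under a name of your choosing, it could be started as root with something like `SGX_WATCH_RUN=1 sh ./watch_sketch.sh`. Measuring over a short window and requiring several consecutive hits is one way to avoid rebooting on a brief, legitimate CPU spike.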

Currently there's no way to repair this other than rebooting. It would be great if there were a way to re-init the chipset/driver involved, but it's not a module from what I can see. If the custom kernel folks could compile this driver as a module, that would rock, since the script could then just re-init (or unload/reload) the graphics driver, and we could restart X or whatnot instead of rebooting the whole device.

For now, this script handles the immediate issue of reaching into one's pocket and finding a dead device with no battery left. :P I placed it in /usr/sbin/ on my device and made an RC script to auto-start it at boot. It needs to be run as root to pull off the reboot, so keep that in mind. Hope this helps those having this issue frequently.

PS: The script assumes you have espeak installed. If you don't, you may want to install it, or replace the espeak lines with whatever notification mechanism you want to have. Or just delete those lines if you don't care about being warned.
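As one possible way to make the espeak dependency optional, a small wrapper can speak the warning when espeak is installed and fall back to printing it otherwise. This is only a sketch: the function name warn_user and the stderr fallback are assumptions, and the attached script may handle its notifications differently.

```sh
#!/bin/sh
# Speak the message if espeak is on the PATH; otherwise print it to stderr.
# warn_user is a hypothetical helper, not part of the attached script.
warn_user() {
    if command -v espeak >/dev/null 2>&1; then
        espeak "$1"
    else
        echo "WARNING: $1" >&2
    fi
}
```

Each espeak line in a script could then be written as a call such as `warn_user "Rebooting soon"`, and the fallback branch swapped for whatever notification mechanism you prefer.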
Attached Files
File Type: gz sgx_tracker.sh.gz (1.0 KB, 212 views)
 

The Following 6 Users Say Thank You to woody14619 For This Useful Post:
Posts: 1,042 | Thanked: 430 times | Joined on May 2010
#2
I encountered this problem in the past, where it raised my CPU usage to 100%... I overclocked my N900 to 1150 MHz and it still hit the maximum... I then removed QBW v1.3, and I have not seen it again in conky since.
 

The Following User Says Thank You to Radicalz38 For This Useful Post:
Posts: 306 | Thanked: 106 times | Joined on Feb 2010
#3
I had this issue just a few minutes ago for the first time. I have compiled my own kernel which has patches for iphb and ppp_async, and NAT enabled. Thanks for the script.
__________________
------------------------------------------------------------------
Voice choppy on sip calls
Please vote for bug number 10388
 

The Following User Says Thank You to rajil.s For This Useful Post:
woody14619's Avatar
Posts: 1,455 | Thanked: 3,309 times | Joined on Dec 2009 @ Rochester, NY
#4
I'm pretty sure this is a driver issue, which means anyone could see it. I think it just happens more often when one or more of the following is going on:
  • Program is doing CPU intensive work
  • System free memory is low
  • Graphic redraw is high (high frame rate apps)
  • GPS is active
  • Heavy Wifi/3G/GPRS data activity, and/or mode switching.

If you have an app doing several of these things at once, like a flash game (CPU/graphics/wifi), or a navigation app (GPS/CPU/memory), it's just more likely to happen.

My phone woke me at 5am today, warning of a reboot... Totally inactive, but sure enough sgx_misr was pegging the CPU and the screen wouldn't turn on. It rebooted itself and everything was fine. Had I not had the script, it would have drained the battery (even while plugged in), and I'd have had a dead/off phone this morning when I awoke.
 

The Following 2 Users Say Thank You to woody14619 For This Useful Post:
Posts: 100 | Thanked: 408 times | Joined on Aug 2009 @ Helsinki
#5
Originally Posted by woody14619 View Post
To help stop this until Nokia fixes it, I made a little script that checks to see if the process, a kernel level driver process in this case, is eating a lot of CPU. It only does this every 30 seconds or so, and is rather non-invasive. If it detects activity, it watches it a little closer, and issues warnings via espeak if things are going bad, then reboots. I did this just in case you're actively using it (on a call, etc). After a few warnings, it reboots the device, to prevent the system from draining the battery.
Waking up the device every 30 seconds is not very power-friendly either. Maybe there is some way to tune the N900 watchdog?
__________________
Do you like mappero? Consider contributing with a donation!
So far, 673€ were donated by 26 people.
 

The Following 2 Users Say Thank You to mardy For This Useful Post:
F2thaK's Avatar
Posts: 4,365 | Thanked: 2,467 times | Joined on Jan 2010 @ Australia Mate
#6
Rufff..................!!!
 
Posts: 12 | Thanked: 2 times | Joined on May 2010
#7
Originally Posted by woody14619 View Post

To help stop this until Nokia fixes it, I made a little script that checks to see if the process, a kernel level driver process in this case, is eating a lot of CPU. It only does this every 30 seconds or so, and is rather non-invasive. If it detects activity, it watches it a little closer, and issues warnings via espeak if things are going bad, then reboots. I did this just in case you're actively using it (on a call, etc). After a few warnings, it reboots the device, to prevent the system from draining the battery.
The only issue I see with this script is the fact that I've only ever seen the bug after manually issuing a reboot. That said, it's possible there is no longer an issue with that.
 
woody14619's Avatar
Posts: 1,455 | Thanked: 3,309 times | Joined on Dec 2009 @ Rochester, NY
#8
Originally Posted by Vertikar View Post
The only issue I see with this script is the fact that I've only ever seen the bug after manually issuing a reboot. That said, it's possible there is no longer an issue with that.
While I'm glad that you only see this after a reboot, you are not the common case. In the bug tracker there's no indication that this happens after a manual reboot, and I can verify that I've had this happen on many occasions where I had not manually rebooted.

In fact, I've often had to replace the battery because this bug drained it to the point that the system asked for a time/date. On changing to a spare battery, I've booted directly (no reboot involved) only to have it go back into this loop within an hour because I was then using it as a navigation aid.

Even if it did occur only after a reboot, devices need to be able to autonomously reboot themselves and return to a stable state. If they can't, then there's a flaw in the driver, which needs to either put the chipset in a known state before a system reboot, or be able to handle/reset the state it's in. (If it were properly programmed to do so, I suspect it wouldn't get into this state in the first place.)
 
Posts: 12 | Thanked: 2 times | Joined on May 2010
#9
Originally Posted by woody14619 View Post
Even if it did occur only after a reboot, devices need to be able to autonomously reboot themselves and return to a stable state. If they can't, then there's a flaw in the driver, which needs to either put the chipset in a known state before a system reboot, or be able to handle/reset the state it's in. (If it were properly programmed to do so, I suspect it wouldn't get into this state in the first place.)
IIRC (I'm not awake enough yet), one of the issues with the driver was that on a system reboot the chipset wouldn't properly clear the memory, leaving it in a state where it was possible to see this issue without a huge amount of load. That said, I haven't rebooted since PR1.3 came out.

Edit: Basically the driver needs fixing, either by Nokia or by someone else who's worked on it.

Last edited by Vertikar; 2010-12-07 at 19:10.
 
woody14619's Avatar
Posts: 1,455 | Thanked: 3,309 times | Joined on Dec 2009 @ Rochester, NY
#10
Originally Posted by mardy View Post
Waking up the device every 30 seconds is not very power-friendly either. Maybe there is some way to tune the N900 watchdog?
There are plenty of things that wake the system more often than this script. And again, this is a stop-gap measure, not a solution. I find this much more power friendly than reaching into my pocket to find my battery 100% drained because of a kernel bug...

I would love to be able to tune the watchdog to look for this. And I'm sure writing it as a native app in C would lower its processing time further. If I could, I would have it hook into the frequency scaler to only bother checking when the CPU is running at top speed. But I don't know how to do that from a shell script. Is there a dbus message I can hook against to only wake when this is happening? I'm very open to changes.
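One partial approach from a shell script, sketched here under assumptions, is to read the standard Linux cpufreq files in sysfs and skip the expensive check unless the CPU is already at its current upper limit. This is still polling, not an event hook, so it does not stop the script from waking up; it only makes each wake-up cheaper. It also assumes the N900 kernel exposes scaling_cur_freq and scaling_max_freq under the usual cpu0 path; the CPUFREQ_DIR variable and the cpu_at_max name are made up for this example.

```sh
#!/bin/sh
# Assumed location of the cpufreq interface; overridable for testing.
CPUFREQ_DIR="${CPUFREQ_DIR:-/sys/devices/system/cpu/cpu0/cpufreq}"

# Succeeds (exit status 0) when the current frequency has reached the
# current scaling limit, and fails otherwise or when the files are missing.
cpu_at_max() {
    cur=$(cat "$CPUFREQ_DIR/scaling_cur_freq" 2>/dev/null) || return 1
    max=$(cat "$CPUFREQ_DIR/scaling_max_freq" 2>/dev/null) || return 1
    [ "$cur" -ge "$max" ] 2>/dev/null
}
```

A check loop could then start with something like `cpu_at_max || continue`, so the per-process CPU measurement only runs while the governor has the CPU pinned at full speed.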
 

The Following User Says Thank You to woody14619 For This Useful Post: