Posts: 109 | Thanked: 91 times | Joined on Dec 2007
#1
I'm running the latest version of OS2008 on my N810, and having consistent problems booting from the internal mmc card.

The boot process hangs at the "Nokia" splash screen--the progress bar fills the screen, but nothing happens after that. The problem is consistent with or without the charger in place, and the tablet fails to continue booting after several hours.

I don't think there's any problem with the mmc format or data--booting from the internal flash allows me to access the mmc card with no errors.

Here's what I've done to try to trace the boot process:
  • Changed the bootmenu.conf script to save the "dmesg" output to /tmp. There are no errors shown in the dmesg output.

  • Added an "echo" statement to each of the /etc/init.d scripts, logging each script name to a file in /tmp, in order to determine if the boot process was hanging in a script. Nothing is logged, but this may be because files written to mmc2:/tmp when booting from mmc2 are not available when I later reboot from internal flash to view the logfile.
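A minimal sketch of a logging helper that might avoid the lost logfile, assuming /tmp is RAM-backed and that a path on the card's root filesystem survives the reboot. The function name log_step and the variable BOOTTRACE_LOG (with its default path) are hypothetical, not part of Maemo:

```shell
# Hypothetical helper: append a timestamped marker to a log file kept on
# persistent storage, then flush it to the card.
# BOOTTRACE_LOG is an assumed name and path, not a Maemo setting.
BOOTTRACE_LOG="${BOOTTRACE_LOG:-/var/log/boottrace.log}"

log_step() {
    # $1 = name of the script or step being started
    echo "$(date '+%H:%M:%S') $1" >> "$BOOTTRACE_LOG"
    sync    # push the write out before a possible hang or reset
}
```

Each init script could then call `log_step "$0"` near its top, and the file can be read after mounting the card from the working system.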

So, are there any suggestions for additional ways to trace and troubleshoot the boot process? What happens in the Maemo boot process after the init scripts complete that can cause the process to hang, and how can I get more detail on what step is hanging?

Thanks!

Last edited by z2n; 2008-02-02 at 08:32.
 

Posts: 2,152 | Thanked: 1,490 times | Joined on Jan 2006 @ Czech Republic
#2
Instead of editing each script in /etc/init.d, you can edit /etc/init.d/rc and add debug code to the startup() function, like this:

Code:
# Show the given text on one line of the device screen.
dbgout() {
    chroot /mnt/initfs text2screen -x 0 -y 60 -w 800 -h 20 -c
    chroot /mnt/initfs text2screen -s 2 -H center -y 60 -T 0 -t "$@"
}

#
# Start script or program.
#
startup() {
    dbgout "$@"    # added: display the script name before running it
    case "$1" in
        *.sh)
            $debug sh "$@"
            ;;
        *)
            $debug "$@"
            ;;
    esac
}
Then you can see what runs at startup and where it stops.

You can also install syslog and look at /var/log/syslog after an unsuccessful boot.

To install something on the nonbooting system, first boot the working system, connect to the network, mount the nonbooting system and chroot into it:
Code:
mount /dev/mmcblk0p2 /opt
chroot /opt
Then you can 'apt-get install sysklogd' from the maemo repository, exit the shell, unmount the filesystem and try to boot it again. Wait until it hangs or reboots, boot the working system, mount the bad system and look at the end of /var/log/syslog.

If there is no /var/log/syslog, you may need to create it the first time (not sure about this).
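The "mount the bad system and see the end of syslog" step can be sketched as a small function. The name show_boot_log is hypothetical, and the default of 40 lines is an arbitrary choice:

```shell
# Print the last lines of syslog from a root filesystem mounted at $1.
# $2 is the number of lines to show (default 40).
show_boot_log() {
    root="$1"
    lines="${2:-40}"
    log="$root/var/log/syslog"
    if [ ! -f "$log" ]; then
        echo "no syslog found at $log" >&2
        return 1
    fi
    tail -n "$lines" "$log"
}
```

With the nonbooting system mounted on /opt as above, this would be called as `show_boot_log /opt`.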
__________________
Newbies click here before posting. Thanks.

If you really need to PM me with troubleshooting question please consider posting it to the forum instead. It is OK to PM me a link to such post then. Thank you.

Last edited by fanoush; 2008-03-05 at 08:21. Reason: sysklogd not syslog
 

Posts: 109 | Thanked: 91 times | Joined on Dec 2007
#3
Thanks for responding so quickly!

Originally Posted by fanoush View Post
instead of editing each script in /etc/init.d you may try to edit /etc/init.d/rc and add debug code to startup() function like this
OK, that'll be a big help. Editing the individual scripts was no problem...I scripted the process.


The debugging shows that all of the scripts are called...now the display freezes with a screen that shows:


Booting from mmcint1 ...
/etc/rc2.d/S99zzinitdone


In a change from the boot process before editing /etc/init.d/rc, there is no blue progress bar visible at the bottom of the screen, and no "Nokia" graphic.

Do you have any suggestions about what comes next in the startup process...

Is there any way to boot the tablet in text mode (runlevel 3), or to get more debugging on the steps that follow S99zzinitdone?

SNIP!
Originally Posted by fanoush View Post
you can also install syslog and see /var/log/syslog after unsuccessful boot
There doesn't seem to be a syslog package built for chinook...or at least it's not in any of the gronmayer repositories.

Thanks!

Last edited by z2n; 2008-02-03 at 04:55.
 
Posts: 2,152 | Thanked: 1,490 times | Joined on Jan 2006 @ Czech Republic
#4
Originally Posted by z2n View Post
In a change from the boot process before editing /etc/init.d/rc, there is no blue progress bar visible at the bottom of the screen, and no "Nokia" graphic.
It is not related. On my device with the same change it boots normally, except that one line of the screen is overwritten with the message.
Originally Posted by z2n View Post
Do you have any suggestions about what comes next in the startup process...

Is there any way to boot the tablet in text mode (runlevel 3), or to get more debugging on the steps that follow S99zzinitdone?
There is no text mode (except the serial console). zzinitdone is the last script. Try the same change on your booting system. The hands show at S50, then the desktop shows, and many scripts are still started when the desktop is already running.
Originally Posted by z2n View Post
There doesn't seem to be a syslog package built for chinook...or at least it's not in any of the gronmayer repositories.
It is called sysklogd.
 
Posts: 13 | Thanked: 6 times | Joined on Feb 2008 @ UK
#5
I'm getting a virtually identical set of symptoms when booting off the internal MMC on my N810.

I've updated /etc/init.d/rc and added a number of logging statements to help identify where booting freezes and the watchdog process reboots. Here's the tail of syslog after a failed boot:

Code:
Feb  3 21:34:10 Nokia-N810-50-2 user: Starting temp-reaper-startup.sh
Feb  3 21:34:10 Nokia-N810-50-2 DSME: Accepted new client connection
Feb  3 21:34:10 Nokia-N810-50-2 DSME: Closed a client connection
Feb  3 21:34:10 Nokia-N810-50-2 user: Starting dbus-sessionbus.sh
Feb  3 21:34:10 Nokia-N810-50-2 DSME: Accepted new client connection
Feb  3 21:34:10 Nokia-N810-50-2 DSME: Closed a client connection
Feb  3 21:34:10 Nokia-N810-50-2 user: Waiting for X
Feb  3 21:34:10 Nokia-N810-50-2 user: Starting sapwood
Feb  3 21:34:10 Nokia-N810-50-2 DSME: Accepted new client connection
Feb  3 21:34:10 Nokia-N810-50-2 DSME: Closed a client connection
Feb  3 21:34:10 Nokia-N810-50-2 user: Starting matchbox
Feb  3 21:34:10 Nokia-N810-50-2 DSME: Accepted new client connection
Feb  3 21:34:10 Nokia-N810-50-2 DSME: Closed a client connection
Feb  3 21:34:10 Nokia-N810-50-2 user: Waiting for D-BUS
Feb  3 21:34:10 Nokia-N810-50-2 waitdbus[932]: trying to connect to the system bus
Feb  3 21:34:10 Nokia-N810-50-2 waitdbus[932]: got connection
Feb  3 21:34:10 Nokia-N810-50-2 user: Starting media server
Feb  3 21:34:10 Nokia-N810-50-2 DSME: Accepted new client connection
Feb  3 21:34:12 Nokia-N810-50-2 init: Switching to runlevel: 6
Feb  3 21:34:12 Nokia-N810-50-2 DSME: Closed a client connection
Feb  3 21:34:13 Nokia-N810-50-2 exiting on signal 15
This section of the log was generated by /etc/osso-af-init/real-af-services and suggested to me that the lockup is caused by the media server failing to start cleanly. However, I get the same reboot at the same point even when I comment out that section of the script.

Is it possible that a process started by a previous init script has hung and caused the watchdog to restart? If so, has anybody got any pointers to tracking it down?

Thanks
 
Posts: 2,152 | Thanked: 1,490 times | Joined on Jan 2006 @ Czech Republic
#6
Originally Posted by charlie View Post
Is it possible that a process started by a previous init script has hung and caused the watchdog to restart.
Well, yes, sort of, but not exactly. The hardware (Retu) watchdog reboots the device only if the kernel hangs completely or dsme dies, because then nobody pings the watchdog anymore (the Retu watchdog timeout is 63 seconds; dsme pings it every ~5 seconds). Then there is dsme with its own policy, which is the problem here. DSME tries to start up system services, and when a service that dsme considers critical fails to start, or starts but exits later, dsme gives up and switches to runlevel 6. So perhaps something dies here. Maybe the X server?
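To see what happened just before dsme gave up, the lines leading up to the runlevel switch can be pulled out of syslog. This is a rough sketch, assuming awk is available on the system where the log is read; the function name before_reset is hypothetical:

```shell
# Print the N lines leading up to (and including) the first
# "Switching to runlevel: 6" message in a syslog file.
# $1 = log file, $2 = number of context lines (default 5).
before_reset() {
    awk -v n="${2:-5}" '
        { buf[NR] = $0 }
        /Switching to runlevel: 6/ {
            start = NR - n
            if (start < 1) start = 1
            for (i = start; i <= NR; i++) print buf[i]
            exit
        }
    ' "$1"
}
```

For example, `before_reset /opt/var/log/syslog 20` would show the last 20 messages logged before the switch, assuming the bad system is mounted on /opt.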

One can also boot to usb networking recovery mode, log in, leave the shell running and then continue booting and examine the system (via ps or whatever) when it hangs somewhere. This needs a modification of bootmenu.sh so that it does not shut down usb networking. Here is the change (in bold) for binding it to the menu key:
Code:
while true ; do
        key=`evkey -u -t 100000 /dev/input/${EVNAME}`
        [ "$key" = "$KEY_ESC" ] && break
        [ "$key" = "$KEY_MENU" ] && break
done
${T2S} -c
if [ "$key" = "$KEY_ESC" ] ; then
        killall dropbear
        killall utelnetd
        #sleep 1
        ifconfig usb0 down
        umount /dev/pts
        rmmod g_ether.ko
fi
Root filesystem gets switched for this shell too so don't be confused, it should still work.

Also, if the system reboots, one also needs a modification to stop it from doing so (/etc/init.d/minireboot).

Last edited by fanoush; 2008-09-02 at 07:51.
 

Posts: 13 | Thanked: 6 times | Joined on Feb 2008 @ UK
#7
Thanks, that approach is working and letting me see further into the boot sequence.

I've updated bootmenu.sh (but not disabled minireboot yet). Here's the process list just before (within 1 sec of) the boot failing and restarting:

Code:
  PID  Uid        VSZ Stat Command
    1 root       1468 SW  init [5]
    2 root            SWN [ksoftirqd/0]
    3 root            SW  [watchdog/0]
    4 root            SW< [events/0]
    5 root            SW< [khelper]
    6 root            SW< [kthread]
   16 root            SW< [dvfs/0]
   67 root            SW< [kblockd/0]
   68 root            SW< [kseriod]
   81 root            SW< [OMAP McSPI/0]
   88 root            SW< [ksuspend_usbd]
   91 root            SW< [khubd]
  115 root            SW  [pdflush]
  116 root            SW  [pdflush]
  117 root            SW< [kswapd0]
  118 root            SW< [aio/0]
  121 root            SW< [mipid_esd]
  246 root            SW  [mtdblockd]
  287 root            SW< [kondemand/0]
  288 root            SW< [kmmcd]
  300 root            SW< [krfcommd]
  313 root            SW< [mmcqd]
  345 root       1084 SW< dsme -d -l syslog -v 4 -p /usr/lib/dsme/libstartup.so
  350 root        564 SW  /usr/sbin/kicker
  355 root        776 SW  /usr/bin/bme_RX-44
  576 root        152 RW  /usr/sbin/utelnetd -l /bin/sh -d
  585 root        376 SW  /usr/sbin/dropbear -d /tmp/dropbear_dss_host_key -r /
  741 root       1044 SW  /bin/sh
 1406 root            SW< [cx3110x]
 1459 root       1576 SW< /sbin/udevd --daemon
 1660 root       1540 SW  /sbin/syslogd
 1714 root       1468 SW  /sbin/klogd
 1773 messagebus   1916 SW< /usr/bin/dbus-daemon --system
 1779 haldaemon   3980 SW  /usr/sbin/hald
 1780 root       2800 SW  hald-runner
 1787 root       2436 SW  /usr/lib/hal/hald-addon-omap-gpio
 1788 root       2436 SW  /usr/lib/hal/hald-addon-omap-gpio
 1789 root       2436 SW  /usr/lib/hal/hald-addon-omap-gpio
 1790 root       2436 SW  /usr/lib/hal/hald-addon-omap-gpio
 1791 root       2436 SW  /usr/lib/hal/hald-addon-omap-gpio
 1792 root       2436 SW  /usr/lib/hal/hald-addon-omap-gpio
 1793 haldaemon   2508 SW  hald-addon-usb-cable: listening on /sys/devices/plat
 1794 root       2940 SW  hald-addon-input: Listening on /dev/input/event2 /dev
 1795 root       2436 SW  /usr/lib/hal/hald-addon-mmc
 1796 root       2436 SW  /usr/lib/hal/hald-addon-mmc
 1798 root       2952 SW  /usr/lib/hal/hald-addon-cpufreq
 1825 root       3636 SW< /sbin/mce --force-syslog
 1828 messagebus   3324 SW  /usr/lib/gconf2/gconfd-2
 1875 user       1312 SW< /usr/sbin/temp-reaper
 1879 user       1916 SW< /usr/bin/dbus-daemon --session
 1885 user       6776 SW< /usr/lib/sapwood/sapwood-server
 1890 user       5760 SW< /usr/bin/matchbox-window-manager -theme echo -use_tit
 1903 root            SW< [dsp/0]
 1906 root            SW< [dsp/0]
 1909 root       2952 SW  /usr/sbin/dsp_dld -p --disable-restart -c /lib/dsp/ds
 1917 root       2792 SW< /usr/bin/bme-dbus-proxy -N
 1980 root       4804 SW  /usr/sbin/multimediad
 1987 root       2176 SW< /usr/bin/esd
 1994 root       1960 RW  ps
Being a newcomer to maemo, I'm not sure what should be running at this point and no obvious problems stand out to me - any advice welcomed!

I'll disable minireboot next and see what results that gives.
 
Posts: 2,152 | Thanked: 1,490 times | Joined on Jan 2006 @ Czech Republic
#8
There is no X server running (/usr/bin/Xomap), and this is fairly critical. The matchbox window manager is already started, so the X server should already be up too. I think this is the line
Code:
Feb  3 21:34:10 Nokia-N810-50-2 user: Waiting for X
in the previous syslog output. It waits and after some time (perhaps when matchbox times out trying to connect to the display) it gives up. Try running /etc/init.d/x-server by hand to see possible errors.
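Spotting a missing process in a long ps listing is easy to get wrong by eye. A rough sketch of a check, where the function name missing_procs is hypothetical and the list of names to look for is whatever you expect to be running:

```shell
# Read `ps` output on stdin and report which of the named processes
# are absent. Prints one "MISSING name" line per absent process and
# returns non-zero if anything is missing.
missing_procs() {
    listing="$(cat)"
    status=0
    for name in "$@"; do
        if ! printf '%s\n' "$listing" | grep -q -- "$name"; then
            echo "MISSING $name"
            status=1
        fi
    done
    return $status
}
```

From the usb networking shell this could be run as `ps | missing_procs Xomap dbus-daemon matchbox-window-manager`.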
 
Posts: 13 | Thanked: 6 times | Joined on Feb 2008 @ UK
#9
I can see how that might be classed as fairly critical!

I'll see if manually restarting X gives any clues.

Thanks.
 
Posts: 13 | Thanked: 6 times | Joined on Feb 2008 @ UK
#10
Well, sure enough X is exiting - here's the evidence from syslog:

Code:
Feb  5 20:28:06 Nokia-N810-50-2 DSME: Closed a client connection
Feb  5 20:28:06 Nokia-N810-50-2 DSME: process '/usr/bin/Xomap -mouse tslib -nozap -dpi 96 -wr -nolisten tcp' with pid 1014 exited with return value: 1
Feb  5 20:28:06 Nokia-N810-50-2 DSME: '/usr/bin/Xomap -mouse tslib -nozap -dpi 96 -wr -nolisten tcp' exited with RESET policy -> reset
Feb  5 20:28:06 Nokia-N810-50-2 DSME: Here we will request for sw reset
Feb  5 20:28:06 Nokia-N810-50-2 DSME: Here we could do some bookkeeping..
Feb  5 20:28:06 Nokia-N810-50-2 user: Starting temp-reaper-startup.sh
Feb  5 20:28:06 Nokia-N810-50-2 DSME: Accepted new client connection
Feb  5 20:28:06 Nokia-N810-50-2 DSME: Closed a client connection
I've attached a zip containing the full syslog from this boot (truncated at the point where matchbox is repeatedly restarted).
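For anyone digging through a similar log, the DSME "exited with return value" messages can be filtered out with sed. This is only a sketch based on the message format shown above; the function name dsme_exits is hypothetical:

```shell
# List processes that DSME reported as exited, one per line, as
# "<return value> <command line>".
# $1 = syslog file.
dsme_exits() {
    sed -n "s/.*DSME: process '\(.*\)' with pid [0-9]* exited with return value: \([0-9]*\).*/\2 \1/p" "$1"
}
```

It only matches lines in exactly that format, so other DSME messages (such as the RESET policy line) are ignored.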

Manually starting X (executing "/usr/bin/Xomap -mouse tslib -nozap -dpi 96 -wr -nolisten tcp") produces the following:

Code:
The XKEYBOARD keymap compiler (xkbcomp) reports:
> Warning:          Multiple names for keycode 138
>                   Using <I138>, ignoring <PROP>
> Warning:          Multiple names for keycode 140
>                   Using <I140>, ignoring <FRNT>
> Warning:          Multiple names for keycode 211
>                   Using <I211>, ignoring <AB11>
Errors from xkbcomp are not fatal to the X server
X appears to continue running, however there are no visible changes on the tablet screen at this point and nothing further is reported to the shell connection.

Any other suggestions, or should I give up on this installation and roll back to a previous backup?
Attached Files
File Type: zip BootResults.zip (8.3 KB, 245 views)
 