N900, ohmd, syspart, VM & swap tweaks
Hi all,
this post is intentionally kept light and clear so that it can serve as a scratch page, updated only with confirmed findings and conclusions. My goal is to find the best configuration (if any) of the system files for a specified use case. When something is explicitly confirmed to work (I mean, lots of people report that it works for their use case), we can put that single item in a wiki.

Here's the usual WARNING FOR EVERYBODY: MANY OF THESE TWEAKS COULD LEAD TO A NON-WORKING N900 AND REQUIRE A REFLASH TO RESTORE IT IN CASE OF MISTAKES, SO BE AWARE!!! Now you can go on reading, you've been warned.

STILL NO DATA CONFIRMED NOR AVAILABLE

Things under investigation:
- VM tweaks
- SYSPART / OHMD tweaks
- SCHEDULER / SYSTEM BLOCK tweaks
- OTHER tweaks
|
Re: N900, ohmd, syspart, VM & swap tweaks
First of all, if a single document covering the aforementioned matters exists, please point me in the right direction; it would be very nice of you! Having said that, be warned: this post is a long one.
Lots of comments have been posted during the last year and a half, together with scripts, script collections, and programs later completed with a GUI (thank you DeBernardis & Saturn!!! and all the others who contributed; I hope I thanked all of you when I found something useful), but until today I have not found a summary with explanations and conclusions on the many details named in the subject.

One thing we could all agree on, I think, is that most of the lagginess problems that arise on our beloved 900s are due to memory constraints. Probably with 512M of RAM, the number of threads full of people wanting to smash the phone against a wall would be 1/10 or 1/100 of what it is today, knowing that the VSYNC problem will anyway make our 900s feel slow perhaps forever (or until Stskeeps changes his mind and finally decides to try to compile vsync against 2.6.28 kernels... :) I know, I know, you have said many times that it is almost impossible because of too many changes required in the source, but we can always hope!).

That said, in a year I have tried really a LOT of these tweaks. I am a curious person, and I learnt a lot about the kernel internals of VM and the like. It has been rather rewarding, sometimes frustrating, but for a week now I have really felt that 'urge to change something' going down. To verify that it is not a placebo effect, this morning I used my 900 for half a day in its original configuration (not from scratch: with the full bloat of software I have installed, but with no tweaks applied except the overclock), and now I can rather firmly claim that for my use case the difference does exist. I have now happily reverted to the tweaked configuration. I also made some rough benchmarks, and often their results left me without clear ideas. So I decided the final judge would be the use I make of the phone day by day.
Such a decision is, on one side, very important because, in spite of everything we could say, we have an N900 to USE it, not only to hack on it. On the other side, it is rather hard to stay objective when travelling in the realm of feelings, and I decided to write this post in order to find some other testers willing to compare opinions. I installed some test tools, also from the tools repository, iostat being perhaps the most important. I modified Conky in order to have dirty pages, writebacks and uninterruptible processes updated once per second.

I started working in a systematic manner around a month ago, but decided not to share anything until I had some clear ideas about what I wanted and was looking for in those experiments. By working, I don't mean only changing something and saying: "yes, it feels better". I mean methodically changing 1 parameter, firing a test script with a 128M dd to and from the swap partitions, and firing at least three memory hogs (browser with standard pages with Flash ads, maps and mobilestellarium, for example), keeping htop and my modified Conky running in the background all day. When testing, this modified Conky + htop alone keeps the system load around 1 when the screen is on and Xorg is working, so I hope the stress test is good enough. The battery never got to the 4-hour mark while testing, with the phone always warm... I cannot tell how many times my phone rebooted under such high loads with modified parameters, and especially the number of phone calls I lost while doing those tests :)!!!

I think a definitive conclusion will be almost impossible to achieve, because VM organization and prioritizing is not a simple matter, and a good part of it is pure math. But achieving at least some confidence that, if I use my N900 as a media server, for example, some modifications will be helpful: I think that's a reasonable goal! The next post will summarize my tests. The idea is to keep the thread as clear as possible in order to collect all the info in the first post.
So please, if somebody would like to join and share their experiences, please try to follow the scheme I am explaining:

SETUP:
- N900
- mmc yes/no, which one
- stock/modified kernel, which one
USAGE: short description of your use case
TWEAKS APPLIED: divided by the area they affect
WHY THOSE TWEAKS? Here is the trickiest part. It would be nice to explain WHY and HOW you got to the conclusion that the modifications make the N900 feel better: technical background and kind of response (feeling, stress test, benchmark...)

So let's start, hoping somebody will follow me in this crazy job. After all the work I did, my feeling is that the Nokia engineers did a very good job on their part, in spite of some comments stating the opposite. Keep in mind that they have to provide a resilient machine rather than a top-performing one, optimizing what they have, and, considering the kind of device the 900 is, I think they really did a great job |
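The once-per-second monitoring described in this post (dirty pages, writeback, uninterruptible processes) can be approximated without Conky. The sketch below is not the author's Conky configuration; the function names are made up for illustration, and it only assumes the standard `/proc/meminfo` and `/proc/<pid>/status` layouts.

```sh
#!/bin/sh
# Print one field (in kB) from a meminfo-style file.
# $1 = field name (e.g. Dirty), $2 = file to read (defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Count processes currently in uninterruptible sleep (state D).
count_d_state() {
    grep -l '^State:[[:space:]]*D' /proc/[0-9]*/status 2>/dev/null | wc -l | tr -d ' '
}

# One sample line; call it from a loop with "sleep 1" for a per-second log.
vm_sample() {
    echo "dirty_kB=$(meminfo_kb Dirty) writeback_kB=$(meminfo_kb Writeback) d_procs=$(count_d_state)"
}

# Take a single sample if /proc/meminfo is available.
if [ -r /proc/meminfo ]; then
    vm_sample
fi
```

Logging `vm_sample` to a file while changing one parameter at a time gives something to compare afterwards, instead of relying on feel alone.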
Re: N900, ohmd, syspart, VM & swap tweaks
That said, here follows a summary of my experiences so far.
SETUP:
- N900
- 8GB class 6 uSD card
- Power kernel, std overclock 850 (sometimes I go up to 1100 while watching a quick video or using Gnumeric heavily)
USAGE: Browser (maemo.org, home banking, other forums, no Flash video and Flash adverts blocked), 4 online IM accounts OR (mutually exclusive) Bluetooth tethering for my PC, mediaplayer for OGGs, Sygic maps, games from time to time, calendar and obviously PHONE
TWEAKS:
VM---------------
MODS EXPLANATION
FINAL COMMENTS
With the settings reported here and my use case, I have seen 14 uninterruptible processes in the queue and the processor at 100% @850, with a reaction time always less than 3/4s maximum. During the last week I only once had to leave the N900 to settle for some minutes before it became responsive again. Try to launch, without waiting between your clicks, microb, contacts (my list is over 580 buddies), mediaplayer, angry birds, calendar, bounce evolution, mobilestellarium, gnumeric, and panorama: you get it!

But the best thing happened this morning. I was testing with tons of apps active, going back and forth between them; system load was over 4, with 12 D processes and the processor ranging from 50 to 100%. I was messaging and the phone rang. OK, I thought, let's see who I will have to call back now... and 1 second later the phone interface appeared! At that point I decided it was time to post on TMO :)

So that's all folks! I hope this is only the start of a constructive process of trying to better understand the internals of the 900, and at the same time the start of a good 'optimization based on use cases' wiki or, better, of some CSSU package adaptations based on use cases, which any user could then choose!

Cheers, everybody.

PS: please don't blame me too much for grammar and English mistakes: English is not my native language!

EDIT: and thank you for your patience if you read everything. I tried to clean up the formatting a little after vi_'s suggestion |
Re: N900, ohmd, syspart, VM & swap tweaks
Quote:
|
Re: N900, ohmd, syspart, VM & swap tweaks
Tried to clean up a little bit; a lot of text meant a lot of formatting. TY for the suggestion, I didn't think about that
|
Re: N900, ohmd, syspart, VM & swap tweaks
Whoa. Mind. Blown.
|
Re: N900, ohmd, syspart, VM & swap tweaks
amazing post jurop88, should implement it to swappolube :)
|
Re: N900, ohmd, syspart, VM & swap tweaks
Quote:
Great finds here with the modifications to ohmd process priorities. I've had pulseaudio et al. media at a high stack priority in order to reduce jittering for a few months now, but never felt the need to play around more. Thanks for the testing you've done. Seriously. |
Re: N900, ohmd, syspart, VM & swap tweaks
Moving hildon-sv-notification-daemon out of [mediasrc] closes the socket and doesn't allow any sound?
|
Re: N900, ohmd, syspart, VM & swap tweaks
Made some of these changes and will see how it pans out over the next few days. I am a fairly heavy user so it will be interesting.
|
Re: N900, ohmd, syspart, VM & swap tweaks
Quote:
What I can affirm is that on my machine in its current state, the balloons are now delayed (even by 5 or 10 seconds) while chatting; I don't know about emails, but I hear both vibration and notification sounds. |
Re: N900, ohmd, syspart, VM & swap tweaks
if your configurations really make the n900 snappier and more responsive without sacrificing anything else, it should really be in a wiki or included via swappolube or something...
This is just a great work that you've done. Marvellous |
Re: N900, ohmd, syspart, VM & swap tweaks
As for the kernel reporting mmcb blocksize as "512k", it's not. It's saying the logical blocksize is 512 bytes. This is meaningless for your purposes though; it only tells you the smallest request size that the mmc will accept. Internally it then translates a 512-byte write into a read-modify-erase-write cycle of 128k or 256k, whatever its true block size is.

This brings us to the "noop" scheduler issue. You are correct that there are no moving parts, but the huge blocksize calls for scheduling writes close to each other anyway, to minimize the number of read-modify-erase-write cycles the mmc/uSD has to do. Imagine if the kernel sends a request for writing 4k at position 2M, and then 4k at position 8M, and 4k at position 2M+4k, 4k at 8M+4k, and so on. Each request makes the uSD/emmc internally read 128k (assuming that's the true erase block size), change 4k of that 128k, erase another 128k block, and write 128k to that block. A write amplification factor of 32. You can divide your raw write rate of a nominal 6Meg/s for Class 6 by 32 to get an estimated 192 kilobytes/sec...

So ideally we'd want an elevator that knows about the special properties of flash, but we don't have one, so we use CFQ, which at least has some heuristics for distributing IO "fairly" between processes.

Incidentally, this is also where the explanation for why moving swap to uSD seems to improve performance begins. The heaviest loads for the emmc are swap and anything that uses databases like sqlite. That includes dialer and conversations, calendar, and many third-party apps. Why is this a heavy load? Because these things typically write tiny amounts of data and then request fsync() to ensure the data is on the disk. This triggers the writeout of all unwritten data in memory, and the updating of all the filesystem structures. Remember that a tiny amount of data spread out randomly triggers a massive amount of writing internally in the emmc. Worse, while this goes on, all other requests are blocked. And what else besides /home and swap is on emmc? /opt. Containing, these days, both apps and vital parts of the OS. The CPU is starved for data, waiting for requests to be written out so that the requests for the executable demand-paged code of apps can complete. Btw, for Harmattan I'm told sqlite will be using a more optimized db that essentially works like one gigantic journal.

Sequential writing is fast and good on flash; random in-place updates are bad. Moving swap to uSD gives swap a path that is always free (well, almost always, unless you do heavy accesses to the uSD by other means), and offloading swap from the emmc means less random IO load on the emmc. |
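The write amplification estimate in the post above can be reproduced with shell arithmetic. This is only a worked version of the numbers already given (128k erase block, 4k requests, nominal 6 MiB/s for Class 6); the 128k figure is the post's own assumption, not a measured value, and the function names are illustrative.

```sh
#!/bin/sh
# Write amplification when every small write forces a full erase-block rewrite.
# $1 = erase block size in KiB, $2 = request size in KiB
write_amp() {
    echo $(( $1 / $2 ))
}

# Effective throughput in KiB/s: the nominal rate divided by the amplification.
# $1 = nominal sequential write rate in MiB/s, $2 = amplification factor
effective_kib_s() {
    echo $(( $1 * 1024 / $2 ))
}

amp=$(write_amp 128 4)
echo "amplification: ${amp}x"                                # amplification: 32x
echo "effective rate: $(effective_kib_s 6 "$amp") KiB/s"     # effective rate: 192 KiB/s
```

With a 256k erase block the factor doubles to 64, which is why the true block size of a given card matters so much for scattered small writes.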
Re: N900, ohmd, syspart, VM & swap tweaks
@jurop88
lots of respect and thanks.. that's fantastic, and a mind-blowing amount of effort you have put in. It took me 3 reads just to understand the things you have tried out.. very impressive.. hope you do some more R&D so we can make the N900 even better. Thanks |
Re: N900, ohmd, syspart, VM & swap tweaks
Hi Shadowjk,
thank you for your participation. Quote:
1) why would 512k mean 512 bytes? Can you point me somewhere, even to the kernel source? I have just started digging into the matter and found relevant code in the mmc driver (I hope I am on the right path to understanding something), but I must admit my C knowledge is rather rusty.
2) where can I find the true HW block size? Is there a place where it is reported, or do I have to get it directly from the uSD manufacturer?
The 128k size, though, explains why the Nokians chose to set page-cluster to 5: 32*4=128 and that's it. Quote:
Quote:
Wikipedia again, Quote:
After having used the settings from the first page for some days, I have to say that with NOOP the fragmentation is probably bigger, but the feeling is that it works faster AS LONG AS IT WORKS. Another member of the forum (I don't remember precisely who) set up a swap rotation during the night in order to avoid this fragmentation, and I can confirm that after two days my N900 started 'choking' and a swapon/swapoff/swapon/swapoff let it fly again, which is consistent with the issue being due to swap fragmentation. Quote:
What we ideally need is a scheduler saying:
Code:
- kernel: we need some free room.
I have already found an example of a NOOP scheduler written in C on the internet, and it does not look too hard to implement. Here we are speaking of brute force, not high math ;) A simple modified NOOP algorithm good for flash could look like:
Code:
- check if the page to be unloaded is already cached and not dirty or in the current queue
Quote:
On a side note, I am digging into the ohmd & cgroups realm and I am happy to have learnt a lot of things :) The parameters on the first page will probably be tuned again after some days of usage, once I have looked at the patterns that arise in terms of load and memory used. EDIT: oh, and I forgot to report this https://bugs.maemo.org/show_bug.cgi?id=6203 where many hints on ohmd & syspart are given! |
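The page-cluster arithmetic in this post (32*4=128) follows from the kernel reading 2^page-cluster pages per swap readahead. A minimal sketch, assuming the 4 KiB page size of the N900's ARM kernel; the function name is made up for illustration.

```sh
#!/bin/sh
# Swap readahead size in KiB for a given vm.page-cluster value.
# $1 = page-cluster (the kernel reads 2^page-cluster pages at once)
# $2 = page size in KiB (defaults to 4, an assumption that holds for 4 KiB pages)
swap_cluster_kib() {
    echo $(( (1 << $1) * ${2:-4} ))
}

swap_cluster_kib 5      # prints 128

# Show the value for the running system, if the sysctl file is readable.
if [ -r /proc/sys/vm/page-cluster ]; then
    swap_cluster_kib "$(cat /proc/sys/vm/page-cluster)"
fi
```

So a page-cluster of 5 lines up with a 128k erase block, while the mainline default of 3 would give 32k.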
Re: N900, ohmd, syspart, VM & swap tweaks
hehe, it looks like I mixed up kswapd and the IO scheduler - still learning a lot in this illness period :)
|
Re: N900, ohmd, syspart, VM & swap tweaks
Hi jurop88,
I've spent at least 20 minutes trying to find again this thread as I'm doing some experiments with information that is split across multiple threads:
And this one ;) Have you made any more progress? |
Re: N900, ohmd, syspart, VM & swap tweaks
Quote:
Since I wrote the original post I have made some slight modifications, but I have not yet updated them here. Perhaps I will do it over the weekend |
Re: N900, ohmd, syspart, VM & swap tweaks
Quote:
|
Re: N900, ohmd, syspart, VM & swap tweaks
> partition desktop memory-limit 70M
With cgroups mounted, I noticed that the desktop group only needs 25M. So it's better to write partition desktop memory-limit 25M or echo "25M" > /dev/cgroup/cpu/desktop/memory.limit_in_bytes. |
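Before lowering a limit like this, it may help to check what the group has actually peaked at. The sketch below assumes the memory controller exposes `memory.max_usage_in_bytes` next to `memory.limit_in_bytes` (true for mainline cgroup v1, not verified on the N900 kernel) and uses the mount point from the post above. `set_group_limit` and the `APPLY` switch are hypothetical names; nothing is written unless `APPLY=1`.

```sh
#!/bin/sh
# Convert a byte count to whole MiB, rounding up.
bytes_to_mib() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

# Report the peak usage of a cgroup and, only if APPLY=1, write a new limit.
# $1 = cgroup directory, $2 = new limit (e.g. 25M)
set_group_limit() {
    peak_file="$1/memory.max_usage_in_bytes"
    if [ -r "$peak_file" ]; then
        echo "peak usage: $(bytes_to_mib "$(cat "$peak_file")") MiB"
    fi
    if [ "${APPLY:-0}" = "1" ]; then
        echo "$2" > "$1/memory.limit_in_bytes"
    else
        echo "dry run: would write $2 to $1/memory.limit_in_bytes"
    fi
}

# Dry run against the path used in the post.
set_group_limit /dev/cgroup/cpu/desktop 25M
```

If the reported peak is already close to the new limit, the group will start reclaiming or swapping as soon as it grows, so some headroom above the observed peak is probably wise.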
Re: N900, ohmd, syspart, VM & swap tweaks
Has anyone ever tried the deadline scheduler and:
Code:
echo 1 > /sys/block/mmcblkX/queue/iosched/fifo_batch |
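For anyone wanting to try this, a slightly fuller sketch: `fifo_batch` only exists while deadline is the active elevator, so the scheduler has to be switched first. The device name `mmcblk0` is an assumption (the post leaves it as `mmcblkX`), `use_deadline` is a made-up helper, and the third argument exists only so the function can be exercised against a scratch directory instead of the real sysfs.

```sh
#!/bin/sh
# Switch one block device to the deadline elevator and set fifo_batch.
# $1 = device name (e.g. mmcblk0), $2 = fifo_batch value
# $3 = sysfs block root (defaults to /sys/block; override for testing)
use_deadline() {
    q="${3:-/sys/block}/$1/queue"
    [ -w "$q/scheduler" ] || { echo "cannot write $q/scheduler" >&2; return 1; }
    echo deadline > "$q/scheduler" || return 1
    echo "$2" > "$q/iosched/fifo_batch"
}

# Example (needs root, and a kernel with deadline built in or as a module):
# use_deadline mmcblk0 1
```

A `fifo_batch` of 1 makes deadline dispatch one request per batch, trading throughput for latency; whether that helps on flash is exactly the open question here.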
Re: N900, ohmd, syspart, VM & swap tweaks
I'd also like to try the anticipatory scheduler, a lot of the Android guys have been switching over to it...
|
Re: N900, ohmd, syspart, VM & swap tweaks
I considered anticipatory, but it was taken out of the kernel altogether as of 2.6.33 and supposedly CFQ replaced it after 2.6.18. Also I'm never sure whether heuristics are a good idea or not.. I suppose it wouldn't hurt to add it to the config as a module, though.
|
Re: N900, ohmd, syspart, VM & swap tweaks
Ok, I've built the latest BFS from git tree, with both deadline and anticipatory enabled as modules within the config. Everything runs as smoothly as before, with no added overhead. To enable either one of them, you have to echo deadline or anticipatory to /sys/block/mmcblk0/queue/scheduler - this seems to automagically insert the corresponding module (probably best to rmmod if you then change back to noop or cfq at a later date, though, as they stay loaded). I did find a thread regarding anticipatory on Android and found this snippet to be quite interesting:
Quote:
|
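To confirm which elevator is active after echoing a name into the `scheduler` file described above, the bracketed entry in that file can be parsed. A minimal sketch; `active_scheduler` is a made-up helper, and the module names in the comment (`deadline-iosched`, `as-iosched`) are assumed from mainline naming and not checked against this particular build.

```sh
#!/bin/sh
# Extract the active elevator from a line such as "noop anticipatory deadline [cfq]".
active_scheduler() {
    echo "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

active_scheduler "noop anticipatory deadline [cfq]"    # prints cfq

# On a device, read the real file if it exists.
f=/sys/block/mmcblk0/queue/scheduler
if [ -r "$f" ]; then
    active_scheduler "$(cat "$f")"
fi

# After switching back to noop or cfq, the unused module could be removed, e.g.
#   rmmod deadline-iosched     (module name is an assumption)
```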
Re: N900, ohmd, syspart, VM & swap tweaks
Quote:
|
Re: N900, ohmd, syspart, VM & swap tweaks
Quote:
/j |
Re: N900, ohmd, syspart, VM & swap tweaks
Quote:
|
Re: N900, ohmd, syspart, VM & swap tweaks
Hi all,
I've also been trying to tweak /usr/share/policy/etc/current/syspart.conf. One thing I noticed, though, is that sometimes the values are applied for a little while and then get overridden again by ohmd. For example, I tried changing the desktop cpu-shares to, say, 4096, then ran "stop ohmd" and "start ohmd". For a short time, I can see that the value 4096 is set by cat-ing /syspart/desktop/cpu.shares. But then after a while, ohmd seems to write the original value 6144 back to it. Any ideas? Thanks. |
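To pin down when ohmd rewrites the value, the file can be polled and every change logged with a timestamp. This is only a diagnostic sketch; `watch_value` is a made-up helper and the path is the one quoted in the post above.

```sh
#!/bin/sh
# Poll a file and print a timestamped line whenever its content changes.
# $1 = file, $2 = number of samples, $3 = seconds between samples
watch_value() {
    last=""
    i=0
    while [ "$i" -lt "$2" ]; do
        cur=$(cat "$1" 2>/dev/null)
        if [ "$cur" != "$last" ]; then
            echo "$(date +%H:%M:%S) $1 = $cur"
            last=$cur
        fi
        i=$((i + 1))
        sleep "$3"
    done
}

# Example: sample once a second for a minute right after restarting ohmd.
# watch_value /syspart/desktop/cpu.shares 60 1
```

Comparing the timestamps of the changes with what was happening on the device at that moment (an app coming to the foreground, a call, the screen blanking) may show which policy event triggers the rewrite.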
Re: N900, ohmd, syspart, VM & swap tweaks
I don't remember exactly now (I worked on that a long time ago), but there is a policy file written and compiled in Prolog (?) somewhere that is read by ohmd, and it's responsible for that.
Long story short, what I remember is that I gave up after a lot of searching. I am really short on time for the next month, but if you are interested in looking at that I can dig out what I collected and send everything to you: some material found on wikis, irc logs and the like |
Re: N900, ohmd, syspart, VM & swap tweaks
Quote:
Thanks. |