maemo.org - Talk

maemo.org - Talk (https://talk.maemo.org/index.php)
-   MeeGo / Harmattan (https://talk.maemo.org/forumdisplay.php?f=45)
-   -   Lots of ECC errors in dmesg (https://talk.maemo.org/showthread.php?t=91175)

sponka 2013-08-28 22:24

Lots of ECC errors in dmesg
 
Hello,

I ran dmesg and got quite a lot of errors like this:

Code:

correctable ECC error = 0x5555, addr1 0xa, addr8 0x0

Complete output is here: https://dl.dropboxusercontent.com/u/1420887/dmesg.txt

I rebooted about three times and the output is always like this.
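To check whether every report really is identical, the saved dmesg text can be tallied with a few lines of Python. This is only a rough sketch: the pattern is modelled on the single line quoted above, the name `summarize_ecc` is made up for illustration, and the sample text simply repeats that quoted line rather than reproducing the real log.

```python
import re
from collections import Counter

# Pattern modelled on the one line quoted in this post (an assumption:
# other kernels or drivers may word the message differently).
# \b keeps a hypothetical "uncorrectable ..." line from matching.
ECC_LINE = re.compile(
    r"\bcorrectable ECC error = ((?:0x)?[0-9a-fA-F]+), "
    r"addr1 ((?:0x)?[0-9a-fA-F]+), addr8 ((?:0x)?[0-9a-fA-F]+)"
)

def summarize_ecc(dmesg_text):
    """Count correctable-ECC lines, grouped by (status, addr1, addr8)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = ECC_LINE.search(line)
        if match:
            key = tuple(int(group, 16) for group in match.groups())
            counts[key] += 1
    return counts

# Illustrative input only: the quoted line twice plus an unrelated line.
sample = """\
correctable ECC error = 0x5555, addr1 0xa, addr8 0x0
correctable ECC error = 0x5555, addr1 0xa, addr8 0x0
some unrelated kernel message
"""

summary = summarize_ecc(sample)
for (status, addr1, addr8), n in summary.most_common():
    print(f"{n:5d} x status={status:#06x} addr1={addr1:#x} addr8={addr8:#x}")
```

On the real file, replace `sample` with the contents of the saved dmesg output; a single key with a large count would mean the same location is being reported over and over.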

My N9 is a bit more than a year old, and I really hope this doesn't mean it's failing.

Thanks,
b.

rainisto 2013-08-29 08:24

It's a feature, just ignore those lines :)

Fuzzillogic 2013-08-29 16:45

I've had them too, but they disappeared. Flash memory will fail eventually; that's why the ECC is there. I guess at some point the controller swaps the faulty block for a fresh spare one, so the errors go away.

wicket 2013-08-30 01:50

Quote:

Originally Posted by Fuzzillogic (Post 1370472)
I've had them too, but they disappeared. Flash memory will fail eventually; that's why the ECC is there. I guess at some point the controller swaps the faulty block for a fresh spare one, so the errors go away.

Those errors relate to main memory, not flash memory. It basically means that a bit flip was detected and corrected. It does not mean the memory is failing, and it won't affect performance unless you are getting somewhere in the region of tens of thousands of errors a day. There are a number of reasons why correctable memory errors may occur. They can even be caused by cosmic rays! Don't worry about them.
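For anyone wondering how a bit flip can be "detected and corrected", here is a toy sketch using the textbook Hamming(7,4) code. It is not claimed to be the scheme the N9's memory uses (real memory ECC typically works on wider words and different codes); the function names are made up for illustration.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits as the 7-bit codeword p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(codeword):
    """Return (data_bits, position); position 0 means no error was seen."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three syndrome bits spell out the 1-based position of a single flip.
    position = s1 + 2 * s2 + 4 * s3
    if position:
        c[position - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], position

data = [1, 0, 1, 1]
stored = hamming74_encode(data)
stored[4] ^= 1  # simulate a single bit flip at codeword position 5
recovered, position = hamming74_decode(stored)
print("recovered:", recovered, "corrected position:", position)
```

The reader gets the original data back and also learns that a correction happened, which is the kind of event a "correctable" log line reports.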

Fuzzillogic 2013-08-30 12:20

Correct me if I'm wrong, but I looked for those errors in code and they originated from a piece of code used for Samsung's OneNAND, which is flash memory.

AFAIK there's OMAP's 512MiB internal/embedded flash, and the 16/64GB "external". Is that what you meant?

I've read that worn out flash cells could be revitalized by heating them. You can try putting your device in the oven :D (kidding here ofc. But flash-heating is a valid way to fix it.)

mikecomputing 2013-08-30 22:41

Quote:

Originally Posted by Fuzzillogic (Post 1370618)
Correct me if I'm wrong, but I looked for those errors in code and they originated from a piece of code used for Samsung's OneNAND, which is flash memory.

AFAIK there's OMAP's 512MiB internal/embedded flash, and the 16/64GB "external". Is that what you meant?

I've read that worn out flash cells could be revitalized by heating them. You can try putting your device in the oven :D (kidding here ofc. But flash-heating is a valid way to fix it.)

That technology will never appear, simply because manufacturers do not want to sell products that last until the "end of human civilization"; that would kill their business. They need us to buy new products all the time...

juiceme 2013-08-31 11:54

Quote:

Originally Posted by mikecomputing (Post 1370765)
That technology will never appear, simply because manufacturers do not want to sell products that last until the "end of human civilization"; that would kill their business. They need us to buy new products all the time...

Well, I'd say this does not relate to flash memory technologies, fortunately. :D
See, densities are growing anyway, so manufacturers will keep offering larger-capacity devices, making the smaller ones obsolete. There's no need to obsolete devices by building in faults...

wicket 2013-08-31 20:57

Quote:

Originally Posted by Fuzzillogic (Post 1370618)
Correct me if I'm wrong, but I looked for those errors in code and they originated from a piece of code used for Samsung's OneNAND, which is flash memory.

Thanks for pointing that out. They do indeed originate from OneNAND. I should really have looked at the attached dmesg output before posting. My post came from previous experience having seen main memory ECC errors in hundreds of servers (before flash memory was commonplace) and it never occurred to me that ECC would now be available in flash memory devices.

The same ECC principles should still apply though regardless of whether the memory is volatile or non-volatile.

Interestingly enough, OneNAND is actually known as "fusion" memory which not only consists of flash memory but also includes a 5KB SRAM buffer (as well as controller logic and hardware ECC) on the same chip so it's possible (but not likely) that the errors come from the SRAM buffer.
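As a rough illustration of what the 0x5555 in the logged line might encode: as far as I recall, the Linux OneNAND driver headers define a mask of 0x5555 for 1-bit (correctable) errors and 0xAAAA for 2-bit (uncorrectable) errors, which suggests the status word is made of eight 2-bit fields. Treat those masks, the field layout and the function names below as assumptions to check against the datasheet or kernel source.

```python
# Assumed masks (recalled from the Linux OneNAND driver, not verified here):
ECC_1BIT_ALL = 0x5555  # low bit of each 2-bit field: 1-bit error, corrected
ECC_2BIT_ALL = 0xAAAA  # high bit of each 2-bit field: 2-bit error, not corrected

def classify_ecc_status(status):
    """Classify a 16-bit ECC status word under the assumed masks."""
    if status & ECC_2BIT_ALL:
        return "uncorrectable"
    if status & ECC_1BIT_ALL:
        return "correctable"
    return "clean"

def split_ecc_fields(status):
    """Split the word into eight 2-bit fields, least significant first.

    Which area of the page each field refers to is not assumed here.
    """
    return [(status >> (2 * i)) & 0b11 for i in range(8)]

status = 0x5555  # the value from the dmesg line in the first post
print(classify_ecc_status(status), split_ecc_fields(status))
```

Under that reading, 0x5555 means every field reports a single corrected bit and none reports an uncorrectable error, which would match the "correctable" wording of the message.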

