Posts: 535 | Thanked: 598 times | Joined on Apr 2011 @ Republic of the Philippines
#21
@joerg_rw
wow! information overload! i like it! thanks for sharing.
 

The Following User Says Thank You to vetsin For This Useful Post:
joerg_rw's Avatar
Posts: 2,222 | Thanked: 12,651 times | Joined on Mar 2010 @ SOL 3
#22
if the smoking analogy didn't help, think about an incandescent bulb built to last 10.000h @ 110V. You can operate it at 140V, even at 220V. And of course it doesn't matter much as long as you switch it on only 5s a day. But how long it lasts is all up to your usage pattern, your kids/software (and the mistakes they commit that might leave the light on, errr, the CPU at 100% load and thus at max clock without you even noticing), and your luck with this particular bulb (in Livermore, California there's a bulb in a fire house that has been shining for some 100 years without getting switched off or needing any replacement; maybe your particular CPU is the same miracle).
One thing's for sure: at doubled load neither the bulb nor the CPU will live for even 50% of their regular active lifespan; it's more like each 10% increase in clock speed halves the expected lifespan.
Let's do the math:
500MHz + 2*10% =~ 600MHz; 50.000h / 2 / 2 = 12.500h (datasheet says 10.000h)
600MHz + 3*10% =~ 800MHz; 10.000h / 2 / 2 / 2 = 1250h
800MHz + 2*10% =~ 1000MHz; 1250h / 2 / 2 =~ 300h
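
For anyone who wants to replay the arithmetic above, here is a minimal Python sketch of the rule of thumb (each 10% clock increase halves the expected lifespan). The function names are made up for illustration, and the starting figures are simply the ones used in the post, not values checked against the datasheet.

```python
def clock_after_steps(base_mhz, steps, step=0.10):
    """Clock speed after `steps` successive increases of `step` (10% each by default)."""
    return base_mhz * (1.0 + step) ** steps


def hours_after_steps(base_hours, steps):
    """Expected lifespan after `steps` increases, halving once per step (rule of thumb)."""
    return base_hours / (2 ** steps)


if __name__ == "__main__":
    # (start clock in MHz, start lifespan in hours, number of 10% steps), as in the post
    for mhz, hours, steps in [(500, 50_000, 2), (600, 10_000, 3), (800, 1_250, 2)]:
        print(f"{mhz} MHz -> {clock_after_steps(mhz, steps):.0f} MHz: "
              f"{hours} h -> {hours_after_steps(hours, steps):.1f} h")
```

Compounding 10% steps gives 605, about 799 and 968 MHz, so the "=~" rounding in the post is slightly generous at the top end; the halving itself matches the figures given.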

I'm actually suspecting the real figures are worse

/j
__________________
Maemo Community Council member [2012-10, 2013-05, 2013-11, 2014-06 terms]
Hildon Foundation Council inaugural member.
MCe.V. foundation member

EX Hildon Foundation approved
Maemo Administration Coordinator (stepped down due to bullying 2014-04-05)
aka "techstaff" - the guys who keep your infra running - Devotion to Duty http://xkcd.com/705/

IRC(freenode): DocScrutinizer*
First USB hostmode fanatic, father of H-E-N

Last edited by joerg_rw; 2011-10-19 at 01:03.
 

The Following 2 Users Say Thank You to joerg_rw For This Useful Post:
Posts: 105 | Thanked: 4 times | Joined on Sep 2011
#23
thanks a lot sir joerg
 
Posts: 3,074 | Thanked: 12,960 times | Joined on Mar 2010 @ Sofia,Bulgaria
#24
Originally Posted by joerg_rw
sorry I have to disagree, for two reasons: first your equation is missing the true nature of the problem which is electromigration and not voltage or power or energy.
Of course it is electromigration which is the main problem we are discussing; did I say something else? But EM does not appear out of nowhere, it has its physical explanation, more on that later on.

And second it's missing SmartReflex(R) set of measures in OMAP, which mostly defeats the only possible purpose of your undervolting which was to reduce current density, the true cause of electromigration. Higher clock rates cause steeper edges in level changes, cause faster charging of parasitary capacitors, cause higher current density surges, cause more electromigration.
Actually SmartReflex function blocks inside OMAP, which are taking care about virtually every single gate and transistor and adjust their individual working points (basically the quiescent current) accordingly to match the clock frequency might even cause your undervolting to result in a worse situation regarding electromigration. I for sure don't know enough of the particular details of what's going on inside the chip's gates to rule it out. If you actually know more... I'm listening.
So no, you're NOT safe.
Sure, I am not aware of exactly which technology was used to produce the 3430 SoC. But that does not render the laws of physics invalid.

How did you come to the conclusion that higher clock rates lead to steeper edges?!? Is there something in the datasheet about that? I failed to find such a thing; will you point me to it? I really wonder how you came to that, knowing perfectly well that it is a square waveform (in the core), not sine or triangle. Yes, it is not a perfect square, and this square is derived from a sine, but the only places where it is a sine are the quartz crystals (or whatever clock generator is used, which runs at relatively low frequencies) and the PLLs. By raising the frequency (i.e. switching more often) we do not affect the clock pulse edges; it is the pulse duration and period which are affected. So we are not charging/discharging the gate's parasitic capacitance more quickly, just more often. And this capacitance is related to voltage because of the structure of the MOSFET, i.e. the higher the voltage, the bigger the capacitance we have when the transistor's source-drain channel is closed. Of course it is not linear and is highly dependent on the physical structure of the transistor, but the effect remains. That is one of the reasons why lowering the voltage reduces EM (because of reduced gate capacitance, and thus reduced transient current needed to switch the CMOS). The other reason why reducing voltage leads to reduced EM comes from the transient response of a series RC circuit:

i(t)=(V0/R)*e^(-t/RC)

so the lower the voltage (i.e. charge) we have at the start of the transition, the lower the initial current is (and so is the current density).
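
A small Python sketch of that formula, to make the scaling visible. The R and C values are arbitrary placeholders chosen for illustration, not OMAP3430 figures, and the model keeps R and C fixed while the voltage changes, which a real gate would not do exactly.

```python
import math


def rc_current(t, v0, r, c):
    """Current i(t) = (V0/R) * exp(-t / (R*C)) in a series RC circuit."""
    return (v0 / r) * math.exp(-t / (r * c))


# Assumed values, for illustration only (not taken from any datasheet).
R_OHM = 1_000.0
C_FARAD = 1e-15

if __name__ == "__main__":
    for v0 in (1.35, 1.00):
        peak = rc_current(0.0, v0, R_OHM, C_FARAD)
        print(f"V0 = {v0:.2f} V -> initial current {peak * 1e3:.2f} mA")
```

In this simple model the initial current i(0) = V0/R scales linearly with the starting voltage, which is the point being made above.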

And I have never said that SmartReflex should be turned on while overclocking. I don't believe it can do better than manually adjusting OPPs for each particular chip.

Please don't first turn my words into their opposite to accuse me being a liar then! :-( What of >>2000h total of actually running at that clock speed (see my prev posters about dynamic clock speed and idle)<< makes you think I told something about CPU being permanently locked at max clockspeed?
Sorry if you understood me like that; I was just saying that someone without a technical background MAY misunderstand you. I didn't want to be offensive.

Sure the parameters come from TI, as not even Nokia can afford the needed tests or had the mandatory insight in and knowledge about the chip's internals, nor the tools like electron microscopes etc to examine chips suffering EM after test runs. It's however a weird idea to think of those parameters as "optimized for TI eval-boards". And it's silly to think you could optimize those for N900, as it's just the chip and only the chip that's relevant here and that is determining those parameters.
No, it is not only the chip; you are an EE, you should know that. It is the stability/noise on the power lines, it is the PCB material used, it is the shielding, etc. The fact that AFAIK all n900s work reliably at 600 MHz when supplied with a voltage 30%-40% lower than stock means that somewhere someone didn't do their homework. And the fact that SmartReflex was disabled in PR1.2 (not sure about PR1.3) supports my conclusion that the SoC in the n900 is far more capable; it is just that the SW team responsible for frequency/voltage management didn't do the best they could.

And you can look at the Harmattan DSP bridge driver for more proof. If I read the code correctly, the driver does not allow the 3630 DSP to go above 600 MHz. Which is funny, because it is capable of going up to 870MHz

Any design particulars like heat dissipation of N900 are absolutely irrelevant for that, they only where relevant if the main problem of OC was overheating which it definitely is not.
And overheating is not a problem in n900 because of the thermal dissipation design of PCB/case. If it wasn't so good then OC to almost double of what is stated to be overdrive frequency (600 MHz) would be impossible, thus my point.

I am assuming that your statement that heat dissipation is unrelated to EM and OC is caused by a lack of coffee. Two players in EM are (search for Black's equation for more info):

1. current density
2. junction temperature

Current density is the major player here, but this does not mean that temperature can be ignored. I don't want to go into the details of the physics behind EM here, but put simply:

electromigration is the process of metal atoms being moved by electrons striking those atoms. It appears at the junction between metal wires and semiconductor. A really simple explanation, but there is enough info on the internet if someone wants to read more.

So how is temperature related to EM? The answer is: the higher the temperature, the bigger the vibration amplitude of the atoms in the crystal lattice, and the higher the probability that an electron (or group of electrons, one is not enough) strikes with enough energy to move an atom to a different place. So by lowering the junction temperature we reduce the EM effect. The junction temperature is tightly related to the thermal resistance inside the silicon and the thermal resistance between SoC and PCB. So with a good thermal design (which seems to be the case with the n900) you are lowering the junction temperature and reducing EM.
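
To put rough numbers on the two players, here is a hedged Python sketch of Black's equation, MTTF = A * J^(-n) * exp(Ea / (k*T)), expressed as a ratio so the unknown constant A cancels. The exponent n = 2 and activation energy Ea = 0.7 eV are typical textbook-style assumptions, not values for the OMAP3430, and the function name is made up for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def mttf_ratio(j_ratio, temp_ref_k, temp_new_k, n=2.0, ea_ev=0.7):
    """MTTF_new / MTTF_ref under Black's equation.

    j_ratio    -- new current density divided by reference current density
    temp_*_k   -- junction temperatures in kelvin
    n, ea_ev   -- assumed model parameters (illustrative, not from a datasheet)
    """
    current_term = j_ratio ** (-n)
    thermal_term = math.exp(
        (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / temp_new_k - 1.0 / temp_ref_k)
    )
    return current_term * thermal_term


if __name__ == "__main__":
    # Same temperature, doubled current density.
    print(mttf_ratio(2.0, 350.0, 350.0))
    # Same current density, junction 10 K cooler.
    print(mttf_ratio(1.0, 350.0, 340.0))
```

With these assumed parameters, doubling the current density at constant temperature cuts the expected lifetime to a quarter, while a cooler junction lengthens it. How much either effect matters on the real chip depends on parameters this sketch only guesses at.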

All this shows again why there's so much nonsense around OC, everybody is starting with arbitrary random assumptions (like OC problem was heat) and then gets involved in sophistic developments and theories based on those false assumptions (here e.g. undervolting, maybe even dynamic, based on ambient temperature, eh?)

Sorry if the above maybe sounds a bit harsh, but it really annoys me since almost 2 years now, and all the info has been given over and over and over again, it's all there for everybody to read and understand. But no, OC is cool, and WFM, and of course every EE that tells something different is just a fool, no matter if it's Igor of Nokia, or me, or the guy writing the OMAP3430 datasheet. N900 community is so smart they know best, no doubt.

cheers
jOERG
I have absolutely no problem with how you express yourself, it is your way of explaining things

Last edited by freemangordon; 2011-10-19 at 11:32.
 

The Following 13 Users Say Thank You to freemangordon For This Useful Post:
Posts: 1,427 | Thanked: 2,077 times | Joined on Aug 2009 @ Sydney
#25
If we talk probability here, and assume that people "generally" don't use their "high-end" smart phone (which even the N900 was at the time) for longer than 4 years these days... it is more safe than dangerous to overclock within the boundaries of what Kernel Power on the N900 allows. This is a known fact. If it weren't, many, I mean MANY people here would have had their N900 die, crash, burn and explode in their faces. You also cannot deny that the majority of people are much more willing to report what is bad than what is good when it comes to their experiences. So to me, overclocking the N900 is safe no matter what the naysayers say.

"Safe" is a very vague term by the way. There is always a risk in everything. Even just turning on your N900 reduces its life span. Some CPUs in the N900 also have higher yield than others. So on some samples, running at 1000MHz might actually cause about the same stress as another N900 running at the stock max of 600MHz. Some at 600MHz might even be more stressed and have a shorter life span than some "overclocked" to 1000MHz. But both could have a life span of over 5 years or just 5 months. We just don't know. What we do know is that more people than not have had no issues "so far" with overclocking up to their "stable" limits. This is a fact, like it or not. So for now, I'm saying it's safe. How safe? That is up to pure luck.
 
Posts: 1,033 | Thanked: 1,013 times | Joined on Jan 2010
#26
Going by the given hours for each frequency, it can be concluded that overclocking will have about the same effect on lifetime in normal use as stock clocks.

Let's say 600MHz gives a lifetime of 20000h, while 800MHz gives 10000h. All these chips run on-demand clocks unless tinkered with. When overclocking the N900 we are definitely lowering the voltage at 800MHz compared to stock, since apparently all N900s are capable of such a feat. This most certainly has a (positive) impact on ultimate lifetime.

Here is my experiment:

600MHz takes 7s to compute a task and go to rest
800MHz takes 4s to compute a task and go to rest

Going purely on this and not considering lower voltages, lifetime during normal use equals out.
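
A quick Python check of that comparison, as a sketch under simplifying assumptions: wear accrues only while the CPU is busy at the given clock, it accrues linearly, and the 20000h/10000h lifetimes and 7s/4s task times are just the example figures from this post. The function names are hypothetical.

```python
def life_fraction_per_task(busy_seconds, lifetime_hours):
    """Fraction of the total lifetime consumed by one task at a given clock."""
    return busy_seconds / (lifetime_hours * 3600.0)


def break_even_seconds(stock_seconds, stock_lifetime_hours, oc_lifetime_hours):
    """Busy time at the overclocked speed that costs the same wear as the stock run."""
    return stock_seconds * oc_lifetime_hours / stock_lifetime_hours


if __name__ == "__main__":
    stock = life_fraction_per_task(7, 20_000)   # 600MHz example
    oc = life_fraction_per_task(4, 10_000)      # 800MHz example
    print(f"wear per task, 800MHz vs 600MHz: {oc / stock:.3f}x")
    print(f"break-even task time at 800MHz: {break_even_seconds(7, 20_000, 10_000)} s")
```

With these example numbers the 800MHz run costs 8/7 of the wear of the 600MHz run, so the lifetimes come out close but not exactly equal; the task would have to finish in 3.5s at 800MHz to break even.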

Also, let's not forget, TI produced A8 running at 800MHz under the 3440, optimized for performance. Do you really think, it's a completely different SoC as opposed to 3430? I doubt 800MHz clock has such a dramatic impact as stated by TI. They surely wanted to sell the 3440.

It's simple logic; there is no need to get into as many details as freemangordon did to prove a point, even though I appreciate it a lot and had a flashback to being in my physics class
 
joerg_rw's Avatar
Posts: 2,222 | Thanked: 12,651 times | Joined on Mar 2010 @ SOL 3
#27
@jakiman: err, sorry - how are you able to make statements about 4 year stability of a device that has been available for less than 2 years? I'm not inclined to do a per-sentence reply here, but let me say your whole chain of conclusions is once more an example of "based on false assumptions" - you CAN NOT state something is safe based on a limited-duration random field test where you don't have control over *any* of the critical parameters like usage patterns, hell, not even over feedback about problems.
Your arguments are like "sure this car's engine will last >1000 miles at full speed, as we've seen no defects here at the 250 miles checkpoint yet" - does this sound honest or reassuring?
What you *might* say to those asking "is it safe?": If you are using the commonly used 800MHz+undervolt setup, and your usage pattern is that of an average user then your chances that your device will survive at least 12 months after enabling that OC are for sure better than 50%.

@freemangordon: I mostly agree with you, even partially with your chain of evidence about edge steepness. You're missing just one detail about SmartReflex: it's not only the automatic control of regulators in twl4030, SmartReflex is also a whole bunch of measures *on the OMAP chip*, one of which is (AIUI) that there are individual micro "regulators" for voltage/current/working-point of each gate or even transistor. Those can't get switched on or off, they are always active. And again AIUI the twl4030 regulators and all the undervolting just determine the *input* voltage to those micro regulators. So if you really want to undervolt the logic gates, you'd need to lower the VDD input voltage to those regulators to a level where they cease to function, which can't be a sane thing in my book. All the SmartReflex that you can enable or disable in twl4030 is just meant to operate those micro regulators on a voltage drop as low as possible, to save energy (which btw only makes sense because the switching regulators in twl4030 are of better efficiency than those SmartReflex linear micro regulators).
Regarding temperature, while we're talking energy consumption, I agree that it for sure makes a lot of difference. Nevertheless the main factor is current density and the fact that it has a certain threshold above which EM goes through the roof while below it there's almost no EM at all. On OMAP3430 it seems this threshold is somewhere in the range 500..600MHz. I'm taking these values as well as my idea that they are independent from the SoC's "peripheral components" from the fact that the datasheet doesn't mention any of those as being relevant for the calculation for estimated lifespan at different clock speeds. Sure exceeding max die temperature will massively increase EM even on lower current densities, but the datasheet clearly says that up to the maximum allowed operating temperature and core voltage and whatnot the CPU still can live for 100.000h@500MHz and only for 23.000h@600MHz. I think if there'd been a way to safely extend that period (or increase clock speed while keeping the period) then TI for sure would've mentioned that, as they are interested in selling chips with the best possible specs.
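
As a cross-check, the two datasheet figures quoted in this post (100.000h@500MHz and 23.000h@600MHz) can be turned into an implied lifespan factor per 10% clock step. This Python sketch assumes the lifespan falls by a constant factor per 10% step between those two points, which is an assumption of the rule of thumb, not something the datasheet is claimed to state; the function name is made up.

```python
import math


def implied_factor_per_step(mhz_a, hours_a, mhz_b, hours_b, step=0.10):
    """Factor by which lifespan shrinks per `step` clock increase, fitted to two points."""
    steps = math.log(mhz_b / mhz_a) / math.log(1.0 + step)
    return (hours_a / hours_b) ** (1.0 / steps)


if __name__ == "__main__":
    factor = implied_factor_per_step(500, 100_000, 600, 23_000)
    print(f"implied lifespan factor per 10% step: {factor:.2f}")
```

The result lands a little above 2, so the "each 10% halves the lifespan" rule of thumb is at least consistent with those two quoted figures.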
I still fail to see any competence in the N900 community that would allow it to invalidate what TI came up with about maximum ratings of their chips, and speculating about what will happen if we deliberately go beyond those specs is just that: speculation. I feel my speculations are based on sound facts and reasonable extrapolations, and they make me think an OMAP3430 will last no longer than 300h *running* at 1GHz clock speed, no matter what you do to it, as long as it works at all at that clock speed and doesn't simply degrade to a random generator due to too low core voltage. After that timespan of the CPU running at that clock speed I'd expect ~5..10 of 100 devices to start showing random failures which make the device useless.

/j

Last edited by joerg_rw; 2011-10-19 at 14:12.
 

The Following 7 Users Say Thank You to joerg_rw For This Useful Post:
Posts: 105 | Thanked: 4 times | Joined on Sep 2011
#28
i learned so much information from this conversation..
 
Estel's Avatar
Posts: 5,028 | Thanked: 8,613 times | Joined on Mar 2011
#29
While both freemangordon and joerg_rw have their points, I would like to mention my totally uneducated observation, based on ~10 years of practice overclocking desktop (later also some notebook) components. If I recall correctly, "from the beginning of (overclockable hardware) times", every manufacturer (like TI) tends to intentionally "underestimate" lifetime, especially when it comes to frequency increases. I don't know how well this applies to such miniaturized things as our SoC, but datasheets have been "scaring" us with xyz-times degraded lifespan for every 10% of frequency overclock since the times of early Durons.

Practice seems to prove that it's a lie. The explanation is simple - it's more fun to sell the same chip factory-overclocked as a new one (or to sell something of *very* similar architecture, with almost-cosmetic changes in design).

joerg_rw's assumption that TI wants to sell things with the best specifications possible would hold in a "perfect world", but in our reality it may be just false - IMO, TI (and other manufacturers) have quite a $$$ interest in lowering the declared lifespan of their products. *Especially* when overclocked beyond the point that is becoming the base frequency of their (planned) brand-new, 5x more expensive (at least at the moment of introduction) "new" higher clocked product, which is - in reality - almost identical to the old one.

The same thing happens in the medical industry, mind You. Many "revolutionary" next-gen medicines for serious illnesses are augmented old ones, with little effect on extending the lifespan of patients (other than the placebo effect...).

Sorry for the quite harsh patient-CPU metaphor, and keep in mind that this post was sober speculation, totally uncertified by any means other than personal experience.

Ps.

I still have a working machine with a 600 MHz Duron running @ 1000 MHz (working as a router/media server/whatnot combo, 24/7, for so many years that I can't count them...), and an Athlon XP 1700+, 1400 MHz, running @ 2400 MHz, both still able to pass all CPU stability tests without a single glitch. Yeah, I know about working/sleep time (however, AIUI, such mechanisms were much less reliable in the times of Durons), but it seems to me that according to manufacturer specs, my oldie CPUs are the equivalent of "this" 100-year-old carbon-filament light bulb in the fire station.

What makes me doubt such luck, though, is that it happens with every device I overclock - always doing hundreds of hours (in total) of tests, monitoring and tweaking, never leaving it with even the slightest instability. So, it seems to me that overclocking is not much different from other "life" activities - if You're doing it with brain and some knowledge, You'll be fine. Which I can't say about some people here, stating in their signatures that they run their devices locked @ 1000 or 1150 (no offense to anyone, it's a free world)

/Estel
__________________
N900's aluminum backcover / body replacement
-
N900's HDMI-Out
-
Camera cover MOD
-
Measure battery's real capacity on-device
-
TrueCrypt 7.1 | ereswap | bnf
-
Hardware's mods research is costly. To support my work, please consider donating. Thank You!

Last edited by Estel; 2011-10-20 at 00:26.
 

The Following User Says Thank You to Estel For This Useful Post:
Posts: 1,033 | Thanked: 1,013 times | Joined on Jan 2010
#30
@Estel

As I've mentioned, TI was selling a 3440 with an A8 clocked at 800MHz along with the 3430. It's hard to believe they would sell a chip with a dramatically lower lifetime or that it's a completely new design which would raise r&d and production costs. Also, going by what they said, 45nm A8 is best at 800MHz and they sell a 1.2GHz version. Do you think it has a 10h lifespan at such a frequency? Scorpion core must have 2h lifespan at such high frequencies since it's a similar design, just pushed higher out of factory for competitive reasons. They have to somehow stop overclocking and force you to buy a new device with their "faster and new" chip.

EDIT: OMAP 2420: N95/8GB, N82, E90 = 332MHz
N800/810 = 400MHz

Last edited by patlak; 2011-10-20 at 08:40.
 

The Following User Says Thank You to patlak For This Useful Post: