The trade-off is between running at a slower but acceptable speed on more devices and running faster on fewer devices, and the latter strategy is much more device dependent. If you're the device manufacturer, you can control that, but Google must rely on a number of manufacturers using its OS and apps. Plus, don't forget Moore's law: as CPUs get faster, the speed advantage of the faster approach matters less. I'm old enough to remember writing in assembly language. If you want the absolute fastest program, it's still the way to go, but very few people think it's worth the trouble.