View Full Version : Megahertz delusions


jage
07-16-2003, 01:13 PM
Don't let the MHz fool you. Pocket PCs (PPCs) are all very unbalanced designs. It would be relatively straightforward to design a 100MHz system that beats any PPC in every possible category, if you had gates and energy to burn in the chips. All XScale and StrongARM (SA) designs are badly starved for memory bandwidth. They also have other severe weaknesses: a small cache, no floating point unit and no division instruction.

None of this is tested; it's just my feel for where the PPC stands at the moment. To repeat, it's not verified, but I'd be surprised if it's far from the truth.

XScale 400MHz
The following integer/FP figures ignore the bandwidth problems:
Integer operations except division: Pentium 200-300MHz, depending on register use (the PXA has more registers, which helps it) and 'pairability' (which helps the Pentium).
Division: Pentium 50MHz (if such a thing existed...)
General single precision floating point: hypothetical 16MHz Pentium
General double precision floating point: hypothetical 10MHz Pentium
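
To illustrate the division line: the ARMv5TE core in the PXA255 has no integer divide instruction, so a compiler has to call a software routine for every `/`. The sketch below is a minimal, assumed version of what such a routine does (plain shift-and-subtract), plus the common trick of replacing division by a known constant with a multiply by a precomputed reciprocal. The names `soft_udiv` and `div10` are made up for this example and are not taken from any real runtime library.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-subtract (restoring) division: one loop trip per result bit,
 * so about 32 iterations where a CPU with a hardware divider needs a
 * single instruction. Hypothetical helper, for illustration only. */
uint32_t soft_udiv(uint32_t n, uint32_t d)
{
    assert(d != 0);                 /* division by zero is undefined */
    uint32_t q = 0, r = 0;
    for (int bit = 31; bit >= 0; --bit) {
        r = (r << 1) | ((n >> bit) & 1u);   /* bring down the next bit */
        if (r >= d) {
            r -= d;
            q |= 1u << bit;
        }
    }
    return q;
}

/* Division by the constant 10 as a 32x32->64 multiply and a shift.
 * 0xCCCCCCCD is ceil(2^35 / 10); the result is exact for every
 * 32-bit unsigned input. */
uint32_t div10(uint32_t n)
{
    return (uint32_t)(((uint64_t)n * 0xCCCCCCCDu) >> 35);
}
```

When the divisor is a compile-time constant, most compilers do the `div10` style rewrite on their own; the slow path mainly shows up when the divisor is only known at run time.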

*floating point note*: The PXA255 has no FPU, so all floating point math is emulated in software. You can usually find faster compromises for special cases if you're not scared of low-level coding and, in the worst case, assembler. An FPU uses too much energy and chip real estate. I'd like to see an FPU you can turn on and off... but I doubt we'll see such a thing soon. Just FADD, FSUB, FMUL, FSTORE and FCMP would make me happy, even in single precision only; that's pretty much all you need most of the time for floating point math.
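
One common example of such a "faster compromise" is fixed-point arithmetic, where fractions are stored in ordinary integers and every operation maps to a few integer instructions. The following is only a sketch, assuming a 16.16 format (16 integer bits, 16 fraction bits); the type `fx16` and the `fx_*` helpers are hypothetical names invented here, not part of any SDK.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t fx16;                 /* 16.16 fixed point (assumed format) */
#define FX_SHIFT 16
#define FX_ONE   ((fx16)1 << FX_SHIFT)

/* Convert a small integer to fixed point; the range check keeps the
 * result inside 32 bits. */
static inline fx16 fx_from_int(int v)
{
    assert(v >= -32768 && v <= 32767);
    return (fx16)(v * FX_ONE);
}

/* Addition and subtraction are plain integer operations. */
static inline fx16 fx_add(fx16 a, fx16 b) { return a + b; }
static inline fx16 fx_sub(fx16 a, fx16 b) { return a - b; }

/* Multiply in 64 bits, then drop the extra 16 fraction bits.
 * Relies on arithmetic right shift of negative values, which is
 * implementation-defined in C but what common compilers do. */
static inline fx16 fx_mul(fx16 a, fx16 b)
{
    return (fx16)(((int64_t)a * b) >> FX_SHIFT);
}

/* Three-way compare: -1, 0 or 1. */
static inline int fx_cmp(fx16 a, fx16 b) { return (a > b) - (a < b); }

/* Back to an integer, rounding toward negative infinity. */
static inline int fx_to_int(fx16 a) { return (int)(a >> FX_SHIFT); }
```

The trade-off is range and precision: 16.16 only covers roughly -32768 to 32767 in steps of 1/65536, and nothing here guards against overflow in `fx_add` or `fx_mul`.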

RAM bandwidth:
Typical early Pentium class, latency somewhat better.

CPU cache size: typical low end 486 system. Cache size is one of the biggest performance problems. It can be worked around to some extent by smart coding and by using the prefetch functionality. That is way beyond the ability of an average programmer, though, as it requires close knowledge of the system architecture and assembler programming. The PXA255 has a 32kB data cache and a 32kB code cache. An energy-burning unified 256kB cache would help a lot, at the expense of energy consumption and die size... Maybe they could make a cache with two modes? A low power 32-64kB mode and a 256kB performance mode...
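
As a rough idea of what "using prefetch" looks like without dropping to assembler, here is a minimal sketch that sums a buffer while asking for the next cache line ahead of time. It assumes a GCC-compatible compiler (`__builtin_prefetch` is a GCC builtin, and it is only a hint) and a 32-byte cache line; `sum_with_prefetch` and `ELEMS_PER_LINE` are names made up for this example, and the right prefetch distance would have to be measured on the actual device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed: 32-byte cache lines, i.e. 16 uint16_t values per line. */
#define ELEMS_PER_LINE 16

/* Sum a buffer of 16-bit values, hinting the next cache line while
 * the current one is being processed. */
uint32_t sum_with_prefetch(const uint16_t *px, size_t n)
{
    assert(px != NULL || n == 0);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Once per line, request the following line for reading
         * (rw = 0) with low temporal locality (locality = 0), since
         * each element is touched only once. */
        if ((i % ELEMS_PER_LINE) == 0 && i + ELEMS_PER_LINE < n)
            __builtin_prefetch(&px[i + ELEMS_PER_LINE], 0, 0);
        sum += px[i];
    }
    return sum;
}
```

The result is identical with or without the hint; only the memory stalls change, which is why this kind of tuning needs profiling on real hardware rather than guesswork.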

CPU cache speed: Comparable to Pentium 150

"Disk" IO, flash cards and such:
Bandwidth similar to high-end 386 systems; latency is at the level of a Pentium-class system.

"Disk" capacity:
Typical 486-era system.

Network IO:

Comparable to Ethernet on 386-class systems.

Graphics subsystem:

Simple operations (like filling rectangles, etc): 386/early 486 with VLB
Complex operations (like drawing filled polygons or clipping, etc): Pentium 100-150

Graphics bandwidth:
Comparable to a high-end 386 system or a VLB 486 system.

Comments?

Jason Dunn
07-16-2003, 05:54 PM
Comments?

Umm...it's a PDA. What do you expect? :wink:

Don't get me wrong, I really want to see performance improve, but you have to consider that mobile CPUs are completely different beasts than desktop CPUs. It's not an Apples to Apples comparison - hell, it's not even the same CPU language...(x86 vs. ARM).

jage
07-16-2003, 08:56 PM
Comments?

Umm...it's a PDA. What do you expect? :wink:

Don't get me wrong, I really want to see performance improve, but you have to consider that mobile CPUs are completely different beasts than desktop CPUs. It's not an Apples to Apples comparison - hell, it's not even the same CPU language...(x86 vs. ARM).

Oh, I don't expect much. :) I just want to put things into a more familiar perspective. Many people just look at the MHz and think the performance is roughly that of an Intel processor at the equivalent MHz. I just wanted to point out that in some cases that's an order of magnitude away from reality.

Low power consumption doesn't come for free. I'm perfectly aware that the cache configuration they use is a deliberately low power design. A switchable cache wouldn't cost power, but it would cost a considerable amount of die area, area which would probably be useless for the vast majority of users. What I don't get is why CPU designers want to throw in a complete IEEE-compliant FPU implementation *or* nothing, when a simple design with FADD (addition), FSUB (subtraction), FMUL (multiplication), FSTORE (integer conversion, writing, etc.) and FCMP (comparison) would do just fine.

Jason Dunn
07-16-2003, 09:05 PM
Many people just look at the MHz and think the performance is somewhat related to equivalent MHz Intel processor. I just wanted to point out in some cases it's order of magnitude from the reality.

That's very true - 400 MHz on an XScale CPU isn't quite where 400 MHz on a Pentium II would be. :wink:

JSY
07-16-2003, 09:11 PM
Yes, and a Game Boy Advance SP "only" has the power of a Super Nintendo Entertainment System, which came out about a decade or so ago. I mean, that's perspective. It's okay for people to look at the MHz when comparing one Pocket PC to another, as that comparison is relative. So really, unless you're comparing a Pocket PC with a desktop, or with something else that isn't analogous in terms of performance, does it really matter if people put an emphasis on the clock speed? Just my thoughts.

jage
07-16-2003, 09:33 PM
Yes, and a Game Boy Advance SP "only" has the power of a Super Nintendo Entertainment System that came out about a decade or so ago.
That's (probably) false. I don't know much about Nintendo products, but I bet a 16MHz or so ARM is an order of magnitude faster than whatever they used in the SNES.

Ah, I just read about SNES cartridges themselves having coprocessors to accelerate games... yeah, those cartridge coprocessors could put the SNES on par with an ARM @ 16MHz or so, and even past it.

Anyway, Nintendo machines rely on external graphics chips to offload heavy graphics processing from the CPU: sprite engine, rotation, etc.