From: peter@abbnm.com
Sent: Thursday, January 10, 2002 12:23 AM
To: Info-VAX@Mvb.Saic.Com
Subject: Re: Compaq still tries to spin Alphacide both ways

Let's try this again. I had an attack of the stupids in the last version
of this article, and mixed up horizontal and vertical microcode in
several places in it. Pardon the confusion (and if I still have them
wrong, feel free to send me nasty email). I also took the opportunity to
expand a little near the end.

In article , Felger Carbon wrote:
> Well, it's been a while since the RISC-CISC wars were fought in this NG.
> What I'm asking is, "Has there _ever_ been a true CISC x86 processor?

They all are. CISC doesn't describe the implementation, it describes the
instruction set.

A RISC instruction set is very close to what used to be known as
"vertical microcode", where each micro-op performed one simple operation
very quickly. The alternative was "horizontal microcode", where each
microcode word had many fields to operate on every part of the processor
simultaneously. For small processors, and they were all small in the
'80s by our standards, horizontal microcode was a significant win, since
each microinstruction took about the same amount of time whether it was
a vertical micro-op or a horizontal one. Once pipelining was developed,
and even more with superscalar designs, vertical microcode became the
thing to do, because the fewer side effects a micro-op had, the more
easily it could be run in parallel with other micro-ops.

Part of the background for RISC was the realization that a vertical
microcode micro-op wasn't that far removed from the individual
operations of simpler microprocessors... and people didn't have a lot of
trouble writing code for them, so why not just use them directly? So you
had a lot of fiddling with the design, with different kinds of RISC
designs being floated, from the extremely raw MIPS designs (they didn't
even have hardware interlocks on the pipeline, so you had to put delays
in after branches to let the pipeline refill: the branch didn't actually
occur until a couple of instructions after the branch opcode) to the
complexities of the SPARC. Things have tended towards the MIPS end of
the spectrum, but they've had to add interlocks... what happens when the
pipeline's longer in the second generation?

Meanwhile, some people decided to try and see if horizontal microcode
could also be used directly. This is where VLIW and EPIC and the IA64
come from. The problem is, writing horizontal microcode is tough. And it
still makes superscalar implementations hard: Intel has switched to a
vertical microcode (what they call their RISC core) in the x86 to get
performance up.

So why did they think they could win with an explicit horizontal design?
Well, the idea was that you could get the compiler to handle all the
scheduling decisions and dump a stream of pre-decoded horizontal
micro-ops that could just be fed into the CPU. You wouldn't need a
superscalar design, you'd just make the instruction word wide enough
that all the potential parallelism was already handled. Sort of like
feeding an athlete on pure predigested proteins and energy drinks
instead of beef and beans. There are two main issues with this:

1. You need a heck of a compiler.

2. What happens when the second generation unit comes out?
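To make the vertical/horizontal distinction concrete before going on,
here's a minimal sketch in C. The field names, widths, and functional
units are all made up for illustration; they're not taken from any real
machine:

    #include <stdint.h>
    #include <stdio.h>

    /* A vertical micro-op: one narrow word, one simple operation.
       This is the style a RISC instruction set resembles. */
    typedef struct {
        uint8_t opcode;     /* the single operation to perform */
        uint8_t dst;        /* destination register            */
        uint8_t src1;       /* first source register           */
        uint8_t src2;       /* second source register          */
    } vertical_uop;

    /* A horizontal microcode word: one wide word with a field for
       every functional unit, all acting in the same cycle. */
    typedef struct {
        unsigned alu_op     : 4;  /* what the ALU does this cycle       */
        unsigned alu_src_a  : 3;  /* register port feeding ALU input A  */
        unsigned alu_src_b  : 3;  /* register port feeding ALU input B  */
        unsigned shift_op   : 2;  /* what the shifter does this cycle   */
        unsigned mem_read   : 1;  /* drive a memory read?               */
        unsigned mem_write  : 1;  /* drive a memory write?              */
        unsigned reg_write  : 1;  /* latch the result back?             */
        unsigned next_uaddr : 8;  /* where in the microstore to go next */
    } horizontal_word;

    int main(void) {
        printf("vertical: %u bytes, horizontal: %u bytes\n",
               (unsigned)sizeof(vertical_uop),
               (unsigned)sizeof(horizontal_word));
        return 0;
    }

Every field in the horizontal word is wired to a specific piece of one
particular implementation, which is exactly why the second-generation
problem bites: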
With horizontal microcode, there's no reason two versions of a CPU have
to have the same fundamental units, and if you change the
implementation, changing the rules on what operations happen when, how
long they take, even how many subunits there are, well, only the
microcode programmers see it. You can't DO that when the horizontal word
is the instruction set: it's like the MIPS situation, except now you
have to add interlocks all over the place, run code in an emulator, or
have a hardware mode that does some kind of interpretation on the old
instructions so they can feed the new hardware.

So what do you do? You compromise. You abstract the underlying hardware
somewhat and put a little bit of interpretation in. There are still
timing issues, so old code may run slower than a dog with no legs, but
at least the instruction set doesn't change every time you tweak the
design. You probably want to recompile everything for every new CPU, but
you don't *have* to. That makes upgrading to a new machine possible
without cross-compiling. Always good.

But you still need a hell of a compiler, and you've also thrown away
most of the potential advantages of the VLIW model, since you still have
to have an instruction processor. And looking at the IA64 they've
compromised an awful lot: the EPIC instruction now looks like a bunch of
RISC instructions crammed asymmetrically into a single word. They think
they can get enough of a win from compiler technology that it'll end up
much faster than a traditional RISC design. But it's telling that people
are seriously talking about the EV8 team putting an Alpha-style
RISC/vertical-microcode engine under the hood of the great-grandson of
Itanium.

> If so, why are current x86 processors not considered to be CISCs,

Current x86 processors *are* CISCs. They use a microcode that's closer
to the kinds of microcodes RISC instruction sets were derived from now,
but that's an implementation detail. What goes on under the instruction
set has only ever been part of how you classify the instruction set in
marketing literature.

However, it still means that RISC design (that is, something like the
internal microcode your Pentium IV's 'RISC CORE' runs) has proven itself
the best design to build a high performance CPU. Everything between the
Pentium IV's 'RISC CORE' and the compiler has two effects:

1. It adds hardware that has to be designed, taped out, tested,
   verified, and so on. It adds hardware that reduces yield, increases
   costs, and takes resources away from hardware that actually makes
   the secret high performance instruction set run fast.

2. It makes it harder for the compiler to schedule instructions,
   because the code it generates doesn't actually get executed by the
   processor. About all it can do is pick instructions that run fast,
   and let the dynamic instruction scheduler do the work.

In addition:

3. The 'RISC CORE' has to be better at scheduling micro-ops than
   regular RISC processors, which means that it's got to be more
   complex, and thus slower, than one that can use a simpler scheduler.
   So to get equivalent performance you have to spend more resources on
   design and fabrication.

> aside from the fact that there are a lot of former Taliban - excuse
> me, former RISC supporters - out there who don't like the fact that
> they wound up on the wrong side, as their worthless stock options
> conclusively prove?"

How do you figure it's the wrong side?

> This is comp.arch. Isn't it true that a micro's instruction set
> architecture, and not its implementation, defines the micro?

Absolutely.
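To put that split in concrete terms, here's a toy sketch of the kind of
translation the 'RISC CORE' discussion above is describing: one CISC
instruction cracked into simple micro-ops. The micro-op names and
encodings are invented for the example; they're not Intel's:

    #include <stdio.h>

    /* Hypothetical micro-ops an x86-style decoder might emit. */
    typedef enum { UOP_LOAD, UOP_ADD, UOP_STORE } uop_kind;

    typedef struct {
        uop_kind kind;
        int dst, src1, src2;   /* register numbers (or a temp) */
    } uop;

    /* Crack "add [mem], reg" -- one CISC instruction that reads
       memory, adds, and writes memory -- into three simple
       micro-ops through an internal temp register. */
    static int crack_add_mem_reg(int addr_reg, int src_reg, uop out[3]) {
        const int tmp = 100;                             /* internal temp */
        out[0] = (uop){ UOP_LOAD,  tmp, addr_reg, 0 };   /* tmp <- [addr] */
        out[1] = (uop){ UOP_ADD,   tmp, tmp, src_reg };  /* tmp += reg    */
        out[2] = (uop){ UOP_STORE, 0, addr_reg, tmp };   /* [addr] <- tmp */
        return 3;
    }

    int main(void) {
        uop u[3];
        int n = crack_add_mem_reg(5, 2, u);
        for (int i = 0; i < n; i++)
            printf("uop %d: kind=%d dst=%d src1=%d src2=%d\n",
                   i, u[i].kind, u[i].dst, u[i].src1, u[i].src2);
        return 0;
    }

The instruction set the programmer sees stays CISC no matter what the
decoder turns it into, which is the whole point: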
The x86 is a CISC, and its performance is due to some truly brilliant
implementations and amazing heroics, and the fact that Intel has been
able to throw orders of magnitude more engineering into making it run
fast than everyone else put together. It's telling that Digital/Compaq,
using a fraction of Intel's resources, has been able to stay ahead of
them... usually significantly so... in CPU performance for almost all
of the last decade.

I would love to see what Intel could do with a good design in their
pocket. If Intel had, say, adopted something like the Alpha in 1995
instead of striking out on their traditional path of trying to build a
complex design that'll knock everyone's socks off (iAPX432), at least
once the compiler technology comes together (i860, IA64), we'd already
be using them for all our high-performance systems.

-- 
 `-_-'   In hoc signo hack, Peter da Silva.
  'U`    "A well-rounded geek should be able to geek about anything."
             -- nicolai@esperi.org
Disclaimer: WWFD?