From: Iain McClatchie
Newsgroups: comp.arch
Subject: Re: 4M pages are a bad idea (was Re: AMD 64bit Hammer CPU and VM)
Date: Sun, 31 Dec 2000 15:47:30 -0800
Organization: MindSpring Enterprises
Message-ID: <3A4FC592.3211661F@mcclatchie.com>

Stephen> I'm not sure, in practice, which analysis is more meaningful.
Stephen> I suspect it depends on the error characteristics of the disk
Stephen> system.  If errors are really very few, then the probability
Stephen> of more than N (where N is the number of error bursts tolerable
Stephen> in a short record) occurring within the same (larger) record is
Stephen> small and my statement of the problem is more applicable.  If
Stephen> the error rate is higher, to the point where the probability of
Stephen> more than N bursts occurring in the longer records is
Stephen> significant, your analysis is more applicable.
Stephen>
Stephen> Since the point of the subgroup was to evaluate longer records
Stephen> in order to decrease the overhead of ECC, they (the coding
Stephen> experts, of whom I was most certainly NOT one) must have thought
Stephen> they could use fewer ECC bits per data bit in the longer
Stephen> records.  That was what led to my statements.

Right.  But it turns out the error characteristics of the disk system
are a dependent variable, not a parameter.  The head, media, and
amplifiers combine to deliver some amount of analog signal energy and
noise to the digital decoding logic for a given amount of media
traversed.  Disk designers get to choose how many bits they will encode
in this signal.  More bits means a lower signal/noise ratio per bit.
So if you use more parity bits for a given number of data bits over a
given amount of media, you get a lower signal/noise ratio per bit, but
you can correct more errors.

For disk-block-size Reed-Solomon codes, you generally want about one
parity bit for every few data bits, and you generally end up about a
factor of two away from Shannon's limit on the amount of data storable
on that medium.  You can't get closer than about a factor of two with
R/S codes, no matter how much you scale up the block size.  And that
was the state of the art in coding in the early 1990s.
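To put a few numbers on that trade-off, here is a toy model (the SNR
budget and bit counts below are made-up illustration values, not
measurements): hold the analog signal/noise budget of a chunk of media
fixed, slice it into more and more channel bits, and watch each bit's
SNR fall while the Shannon-limited payload saturates.  The last column
is just the "factor of two" R/S gap described above.

/* Toy model of the density/SNR trade-off.  Assumed model: a fixed
 * chunk of track delivers a fixed total SNR budget to the read
 * channel, so writing n channel bits in it gives each bit roughly
 * snr_total/n.  Capacity per real sample is 0.5*log2(1+snr), and a
 * binary channel bit can't carry more than 1 bit, so cap at 1.
 * Build with:  cc -O2 density.c -lm
 */
#include <stdio.h>
#include <math.h>

int main(void)
{
    const double snr_total = 1000.0;    /* assumed analog budget per chunk */
    int n;

    for (n = 100; n <= 6400; n *= 2) {
        double snr_per_bit = snr_total / n;
        double cap_per_bit = 0.5 * log2(1.0 + snr_per_bit);
        double shannon_bits, rs_bits;

        if (cap_per_bit > 1.0)
            cap_per_bit = 1.0;          /* binary signalling bound */
        shannon_bits = n * cap_per_bit;
        rs_bits      = shannon_bits / 2.0;   /* ~factor-of-two R/S gap */

        printf("n=%5d  snr/bit=%7.3f  shannon=%6.0f  ~R/S=%6.0f\n",
               n, snr_per_bit, shannon_bits, rs_bits);
    }
    return 0;
}

Past a point, writing more channel bits into the same chunk barely
raises the Shannon ceiling; the remaining headroom has to come from a
better code, not a denser slicing.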
Then turbo codes came along, or more to the point, turbo decoding.
The tremendously cool thing about turbo decoding is that you get
closer to Shannon's limit as you increase the block size.  With, say,
64KB blocks, you might not be able to get all the way there, but you
could pick up something like a 50% capacity increase.

Yes, there are downsides.  You need big blocks.  Turbo decoding
requires a great deal of computation.  But you could fix the turbo
decode to, say, 8 or 16 iterations over the entire block, so that you
had a deterministic decode time.  (Modern disk drives don't have
deterministic decode times anyway.  If they can't read a sector the
first time, they retry at least once before they give up.)

If Broadcom's BCM5400 can do 1/4 of a TERAOP just to equalize a cheesy
Cat-5 cable enough to send 1 Gb/s, then it's only a (short) matter of
time before disk drives do something similar to squeeze another 50%
out of their physical subsystems (back-of-envelope appended below).
The question is, how do we get to the big block sizes?

-Iain McClatchie                                 650-364-0520 voice
http://www.10xinc.com                            650-364-0530 FAX
iain@10xinc.com                                  650-703-2095 cell
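The back-of-envelope, with assumed inputs: 40 MB/s of sustained media
rate and ~100 operations per bit per turbo iteration are guesses for
illustration; only the 1/4 TERAOP figure and the 8-16 iteration count
appear in the text above.

/* Rough decode-budget arithmetic for turbo decoding at media rate.
 * media_rate and ops_per_bit are assumed illustration values; the
 * iteration count and the 1/4 TERAOP BCM5400 figure are quoted above.
 * Build with:  cc -O2 budget.c
 */
#include <stdio.h>

int main(void)
{
    const double media_rate_bits = 40e6 * 8;   /* assumed: 40 MB/s sustained  */
    const double iterations      = 16;         /* fixed iteration count       */
    const double ops_per_bit     = 100;        /* assumed per-iteration work  */
    const double bcm5400_ops     = 0.25e12;    /* 1/4 TERAOP                  */

    double needed = media_rate_bits * iterations * ops_per_bit;

    printf("decode budget: %.2e ops/s (%.1fx a BCM5400)\n",
           needed, needed / bcm5400_ops);
    /* A 64KB block at 40 MB/s passes under the head in about 1.6 ms,
       so the decoder needs very roughly 8e8 operations per block. */
    return 0;
}

Even with generous per-bit work, the budget lands within a small factor
of what a single Ethernet PHY already spends on equalization.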