From: Bill Todd [billtodd@foo.mv.com]
Sent: Wednesday, July 14, 1999 1:21 AM
To: Info-VAX@Mvb.Saic.Com
Subject: Re: Whither VMS?

Rob Young <young_r@eisner.decus.org> wrote in message
news:1999Jul13.013336.1@eisner...
> In article <910612C07BCAD1119AF40000F86AF0D802CCDDE2@kaoexc4.kao.dec.com>,
> "Main, Kerry" <Kerry.Main@Compaq.com> writes:
> >
>
> And Bill Todd wrote ... but hard to keep this in synch as some players
> trim attribution for whatever reasons.  Legal?  Must be.

As I said earlier, I tend to trim attribution when the material I'm
including came from the message I'm responding to.  I've since come to
suspect that some people see these posts through a list server (which
doesn't preserve the attribution inherited from earlier messages in the
thread) rather than a news reader, so am earnestly trying to mend my
ways.

> It is true the Coupling Facility has been expanded to hold
> 16 Gigabytes of cache and it is a wonderfully tuned locking
> mechanism, superior to VMS DLM, yadda, yadda.  Point is though
> MVS is trapped in a 31-bit (32, 33 depends on who you run across)
> world.  The Alpha systems of the near future will contain several
> hundred processors and 1+ Terabyte of memory.  MVS loses as it won't be
> able to address much past its hacked up 16 Gigabytes and while
> that CF is a wonderfully tuned caching/locking beast, it too
> is a bottleneck (a mainframe bottleneck, the CF is a mainframe).

Leaving aside my suspicion that the IBM hardware types could find a way
to add more address bits to the architecture if they felt it necessary
(and that, alternatively, they could address arbitrarily large amounts
of local memory much as they do solid-state disk, creating as large a
local cache as you might wish and possibly mapping it dynamically into
their more limited physical address space if the difference in access
time made that worthwhile), the shared CF memory allows dirty data to be
shared without funneling it through the physical disk.  That alone could
make existing Sysplex facilities faster than Alpha clusters with
infinite memory for some workloads:  shared cache isn't useful just for
reads.
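
To make that concrete, here's a rough sketch in C of why a shared cache
helps writers as well as readers (the two "nodes" are modeled as
functions in one process and the CF as a plain struct; all the names
here are mine, not IBM's):

    #include <stdio.h>
    #include <string.h>

    /* Toy model of a coupling-facility-style shared cache:  one slot,
       one page - just enough to show the data path. */
    struct shared_cache {
        char page[64];
        int  dirty;                 /* modified since last disk write */
        int  valid;
    };

    static struct shared_cache cf;  /* the "coupling facility" */
    static char disk_page[64];      /* the page's home on disk */
    static int  disk_writes;        /* forced disk I/Os so far */

    /* Node A updates the page.  With a shared cache it simply leaves
       the dirty copy in the CF; no disk write is needed yet. */
    static void node_a_update(const char *data)
    {
        strncpy(cf.page, data, sizeof cf.page - 1);
        cf.dirty = cf.valid = 1;
    }

    /* Node B reads the page.  On a hit the dirty data is served
       directly, dodging the write-to-disk/read-from-disk round trip
       a disk-funneled cluster would have to pay. */
    static void node_b_read(char *buf, size_t len)
    {
        if (cf.valid) {
            strncpy(buf, cf.page, len - 1);
            buf[len - 1] = '\0';
            return;
        }
        memcpy(buf, disk_page, len);  /* miss: the disk-only path */
    }

    int main(void)
    {
        char buf[64];
        node_a_update("dirty row, version 2");
        node_b_read(buf, sizeof buf);
        printf("B sees \"%s\" after %d disk writes\n", buf, disk_writes);
        return 0;
    }

Without the shared cache, B's read would have shown up only after A's
dirty page had been forced to disk - exactly the funneling I mean.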

> As the VMS DLM moves into shared memory and is distributed
> (surely must be , my conjecture) across several nodes (see Galaxy
> Locks thread) the CF suddenly isn't so hot after all.

If you look at the relative performance of CF vs. locks in Galaxy shared
memory (including the mechanisms required to keep one crashed system from
bringing down the entire lock database), they're likely about equal.  But
the CF still has the advantage that it supports shared caching as well.
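
For the curious, here's a minimal sketch (C again; the structure and
names are entirely my own invention, not the real DLM or Galaxy code)
of the bookkeeping that crash isolation forces on a shared-memory lock
database:  every lock carries its owner's node id, so the survivors can
scavenge a dead node's locks instead of abandoning the whole database:

    #include <stdio.h>

    #define NLOCKS 8

    /* One entry in a (toy) shared-memory lock table.  Tagging each
       lock with its owner is what lets surviving nodes clean up after
       a crashed one. */
    struct lock_entry {
        int      held;
        int      owner;     /* node id of the current holder */
        unsigned seq;       /* bumped on every grant/release so a
                               stale holder can be detected */
    };

    static struct lock_entry locks[NLOCKS];

    static int lock_acquire(int id, int node)
    {
        if (locks[id].held)
            return -1;      /* a real DLM would queue the request */
        locks[id].held  = 1;
        locks[id].owner = node;
        locks[id].seq++;
        return 0;
    }

    /* Recovery:  a survivor reclaims everything the dead node held.
       The resources those locks protected must then be treated as
       suspect, since the dead node may have left them half-updated. */
    static void scavenge(int dead_node)
    {
        int i;
        for (i = 0; i < NLOCKS; i++)
            if (locks[i].held && locks[i].owner == dead_node) {
                locks[i].held = 0;
                locks[i].seq++;
                printf("lock %d reclaimed from node %d\n", i, dead_node);
            }
    }

    int main(void)
    {
        lock_acquire(3, 1);         /* node 1 takes lock 3 */
        lock_acquire(5, 2);         /* node 2 takes lock 5 */
        scavenge(1);                /* node 1 crashes      */
        return 0;
    }

It's precisely this validation overhead that eats into the raw
shared-memory speed advantage and, I suspect, leaves the two approaches
roughly even.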

> So when you talk about "going up against".. (hee-hee-hee) the
> IBM mainframe folks we'll see Alpha systems with Galaxy in the
> next several years that will be truly monstrous in comparison to
> large Sysplexes.

As I suggested above, maybe, and maybe not.  Not to mention RS/6000
clusters, which aren't likely to stand still in the interim (and, of course,
are already a full 64-bit architecture):  they are not an unreasonable
alternative to VMS today for many (not all:  Unix still doesn't treat
in-process asynchrony very well) applications, and the Monterey initiative
is going to make them significantly more attractive in terms of providing a
compatible application architecture from the x86 on up.  Give them a
top-notch cluster file system and the (appropriately modified) Sysplex
version of DB2 (or just run Oracle, since they can already run Oracle
Parallel Server on [truly] shared disks the same way VMS can) and they may
well be somewhat superior to VMS for the majority of applications - they
already support distributed shared memory and a high-performance
interconnect which can doubtless be improved if seriously challenged by the
Galaxy shared-memory speed.

> > The underlying VMS cluster facilities are up to the task, but
> > the file system falls a bit short in areas of performance
>
> Think we've beat on that one a while before but worth mentioning
> again.  If you have a Terabyte of memory isn't the filesystem mostly
> for writes?  And if I have VCC_WRITEDELAY and VCC_WRITEBACK
> enabled, I'll race you, okay?  :-)

I'd be more than happy to race (metaphorically), as long as you let me pull
the power switch on both systems in the middle so we can see how their
robustness compares.  What's that?  You didn't write back your file system
meta-data and didn't have it logged?  Too bad...  When it comes to
performance and availability, I prefer to have my cake and eat it too -
especially if I can get better performance on a given amount of hardware
simply by bringing my software up to contemporary designs.
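
Since the argument turns on exactly this point, here's a sketch of the
discipline I mean (plain POSIX C; the file names are made up, and this
is no particular file system's on-disk format):  log the meta-data
update and force it to stable storage *before* updating in place, so
that pulling the power cord costs you at most a log replay rather than
your file system:

    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Write-ahead logging in miniature:  the meta-data change reaches
       the log, and the log reaches the platter, before the in-place
       update.  After a power cut, replaying the log at mount time
       reproduces any update that was cut short. */
    static int log_then_apply(const char *record)
    {
        int log_fd  = open("meta.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
        int meta_fd = open("meta.dat", O_WRONLY | O_CREAT, 0644);
        if (log_fd < 0 || meta_fd < 0)
            return -1;

        /* 1. Log the intended change and force it to stable storage. */
        if (write(log_fd, record, strlen(record)) < 0 ||
            write(log_fd, "\n", 1) < 0 ||
            fsync(log_fd) < 0)        /* the crucial (and costly) step */
            return -1;

        /* 2. Only now update the meta-data in place; a crash anywhere
              past this point is recoverable from meta.log. */
        if (write(meta_fd, record, strlen(record)) < 0 ||
            fsync(meta_fd) < 0)
            return -1;

        close(log_fd);
        close(meta_fd);
        return 0;
    }

    int main(void)
    {
        if (log_then_apply("alloc block 4711 to file 17") != 0)
            perror("log_then_apply");
        return 0;
    }

Delay and batch the data writes all you like - just don't acknowledge
an operation whose log record hasn't hit stable storage.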

Then again, this isn't the first time that people have prophesied that
increasing amounts of memory would make file systems effectively
write-only - e.g., this was a large part of the rationale behind
log-structured file systems.  Recent papers seem to be backing away from
this position, after having evaluated the LSFSs that are available for
inspection.

And, of course, there are people who would just as soon make do with a
lot less memory, as long as they could get equivalent performance out of
it:  in the amounts you're postulating, memory is by far the dominant
element of the system hardware cost (either that, or you've got so much
disk out there that the memory won't give you much better cache
performance than current systems do).

To digress a bit at the end here:

Monterey looks like about the only potentially serious general-purpose
contender to MS in the OS arena these days, and may well gather
popularity simply because of that.  Aside from being a
low-end-to-high-end standard, however, it's a very respectable system in
its own right.  If it is truly based on AIX, then its file system is
(currently) JFS:  a good single-node log-backed file system in which
each node exports its own (fail-over-able) portion of the file system to
the rest of the cluster.  That's certainly not ideal, and not quite as
desirable as Tru64's Cluster File System - if I've correctly interpreted
the few bits of data I've been able to come up with on Tru64 CFS, it
runs AdvFS on locally-owned portions of the file system, but instead of
exporting them like a file server it may export the meta-data that lets
other nodes access the data directly on shared disks:  again not ideal,
but somewhat more scalable than JFS-style exporting.  Still, JFS is
nothing to be sneezed at.
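
To spell out the distinction I'm drawing (and this is purely my reading
of sparse public information, with invented names throughout), a
JFS-style server ships the data itself through the owning node, while a
CFS-style scheme ships only the block map, after which the client reads
the shared disk directly - hence the better scaling:

    #include <stdio.h>

    #define BLOCK_SIZE 8192

    /* The shared "disk" is readable by every node; the block map for
       this file lives on its owning node.  The counters record what
       has to pass through the owner under each scheme. */
    static int  block_map[2] = { 5, 2 };  /* file block -> disk block  */
    static long owner_data_bytes;         /* payload moved via owner   */
    static long owner_rpcs;               /* requests the owner serves */

    /* JFS style ("function shipping"):  the owner resolves the
       mapping, does the disk I/O, and ships the whole block back. */
    static void read_via_owner(int blkno)
    {
        int lbn = block_map[blkno];       /* owner-side lookup    */
        (void)lbn;                        /* owner-side disk read */
        owner_rpcs++;
        owner_data_bytes += BLOCK_SIZE;   /* block transits owner */
    }

    /* CFS style ("meta-data shipping"):  the owner answers a small
       mapping request; the client reads the shared disk itself. */
    static void read_direct(int blkno)
    {
        int lbn = block_map[blkno];       /* tiny RPC reply        */
        (void)lbn;                        /* client-side disk read */
        owner_rpcs++;                     /* no payload via owner  */
    }

    int main(void)
    {
        int i;
        for (i = 0; i < 1000; i++)
            read_via_owner(i % 2);
        printf("function shipping:  %ld bytes via owner, %ld RPCs\n",
               owner_data_bytes, owner_rpcs);

        owner_data_bytes = owner_rpcs = 0;
        for (i = 0; i < 1000; i++)
            read_direct(i % 2);
        printf("meta-data shipping: %ld bytes via owner, %ld RPCs\n",
               owner_data_bytes, owner_rpcs);
        return 0;
    }

Either way the owner fields the same number of requests, but in the
second case it's out of the data path - which is where the scalability
comes from.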

Given the emergence of a *real* Unix standard (which the Unix community
has so long been clamoring for), why would customers be likely to choose
Tru64 - especially when Compaq itself will be selling Monterey x86 systems
at the low end, and, at least in the short term, x86 Linux systems which are
binary-compatible at the application level with the SCO UnixWare x86
Monterey systems?  Unless Tru64 jumps onto the Monterey band-wagon, can it
have a serious long-term presence - especially when, unlike HP and Sun
Unixes, one could question the significance of its current presence?

So if I were Compaq, I'd be asking questions like "Just how much better than
Monterey does Tru64 have to be to be viable in the coming market?" and "Is
it possible that we'd sell more Alpha systems if they ran Monterey on them,
given that Alpha seems likely to enjoy a performance advantage over
competing hardware for at least the immediate future?" and "If we do sell
Monterey on Alpha, how much additional revenue will Tru64 bring in?" and "If
we had a foot in the Monterey camp, what unique software (all right, you
know I'm talking about file systems, but Compaq likely doesn't know that
this specific opportunity - and it's likely not the only one - exists) could
we add transparently that would make our Monterey systems better, but still
'standard'?"

Customers want standardization and consolidation in the industry -
preferably while maintaining multiple sources (as Monterey will have),
though they put up with MS for lack of any real choice in the matter.  Given
a platform that offers standardization and multiple sources, they will
choose it over a platform lacking such perceived advantages unless the
competing platform is significantly superior in price and/or performance
and/or necessary features.

That tends to suggest industry consolidation into Windows on the desktop
(and perhaps higher), Monterey in the server space, and, if Monterey can't
take over the high end, S/390 and VMS (ignoring true niche markets and those
customers who will continue using whatever they're using today because
changes are too painful:  they produce revenue, but of the diminishing
variety).

Being pushed by improving Monterey facilities on one side and the IBM
behemoth on the other, can VMS really afford to pass up opportunities for
significant improvements?  If VMS does pass them up, can Compaq convince
customers that it's really backing VMS for the long term?