new sys cluster update

David Becker (becker@wakko.cs.washington.edu)
Fri, 8 Jan 1999 14:09:15 -0800

Status report on the sys cluster
--------------------------------

The sys cluster now includes 40 PCs, all connected by Fast Ethernet
hubs. The hubs, the router, and master.sys are connected to a Fast
Ethernet switch.

PC allocations so far are:
Services
master.sys - server for dhcp, rconsole, bind, nfs, x10, cyclades ttys
linux1.sys - linux cycle server
module-3.sys - a special linux box for M3 compiles (uses libc5)
Research
vegas[1-2].sys - for TCP/vegas kernel work
nimi1.sys - coming this afternoon

The rest are named test1.sys to test40.sys, with some holes in that
namespace. test1 to test19 are the new Dells without serial lines and
have been used by Yaz for porcupine experiments. As machines are
allocated, their "test" names will disappear.
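
For anyone scripting against these names, here is a minimal sketch of
expanding the bracket shorthand used above (vegas[1-2].sys) into
individual host names. The function name expand_hosts is made up for
illustration; it is not an existing cluster tool, and it only handles a
single numeric range per name.

```python
import re

# Matches one numeric range in brackets, e.g. "vegas[1-2].sys".
_RANGE = re.compile(r"^(.*)\[(\d+)-(\d+)\](.*)$")


def expand_hosts(pattern: str) -> list[str]:
    """Expand a name like 'vegas[1-2].sys' into a list of host names.

    Names without a bracketed range are returned unchanged as a
    one-element list. (Hypothetical helper, for illustration only.)
    """
    m = _RANGE.match(pattern)
    if m is None:
        return [pattern]
    prefix, suffix = m.group(1), m.group(4)
    lo, hi = int(m.group(2)), int(m.group(3))
    if lo > hi:
        raise ValueError(f"empty range in {pattern!r}")
    return [f"{prefix}{n}{suffix}" for n in range(lo, hi + 1)]


if __name__ == "__main__":
    for name in expand_hosts("vegas[1-2].sys"):
        print(name)
```

Note that the test names have holes, so expanding test[1-40].sys gives
the candidate names, not the list of machines that actually exist.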

Service changes
---------------

krconsole
The serial lines are now accessed with krconsole. In the old
rconsole system we all had to remember a well-known rconsole password.
With Kerberos, you run kinit with your own password to get a ticket,
and krconsole then doesn't ask for a password.

krconsole also encrypts the data on the wire so we can't be
snooped like we were last summer.

serial lines
We are ordering more serial ports so everything can have a
serial line.

yp/NIS
The cluster will no longer use yp. It's slow and buggy.
passwd files will be distributed by rdist without passwords.
The password itself will be your AFS password. At some point
this can become the dept kerberos password.
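
As a rough sketch of what "without passwords" could look like: the
snippet below blanks the password field of each passwd entry before the
file is handed to rdist. This assumes the standard seven-field passwd
format and a "*" placeholder; the function name and the sample account
are made up, and the real distribution scripts may do this differently.

```python
def strip_password(line: str, placeholder: str = "*") -> str:
    """Return a passwd entry with its password field replaced.

    Assumes the usual seven colon-separated fields:
    name:password:uid:gid:gecos:home:shell
    (Hypothetical helper, for illustration only.)
    """
    fields = line.rstrip("\n").split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {line!r}")
    fields[1] = placeholder
    return ":".join(fields)


if __name__ == "__main__":
    # "alice" is a made-up account, not a real cluster user.
    sample = "alice:Xq3kz9AbCdEfG:1001:100:Alice Example:/home/alice:/bin/sh"
    print(strip_password(sample))
```

With the field blanked, the distributed file carries only account
information, and authentication is left to AFS (or later Kerberos).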

Boot PROMs
All the old PCs had boot PROMs installed. All the sys cluster
machines now begin booting by first fetching a boot program from
master.sys. In the current setup, most hosts are told to boot
from their hard drive like normal PCs. Some do a diskless linux boot.

It is now possible to network boot linux kernels, which is very
useful for kernel hacking.

Not Ready for Prime Time
------------------------
FreeBSD
Machines that were running FreeBSD need their IP addresses changed
on disk before they'll boot. And they'll need entries in DNS,
DHCP and rconsoles.

ssh
ssh is available on most of the linux boxes, but the host public
keys haven't been collected and put into the ssh_known_hosts
files everywhere. This means your client may complain about
unknown hosts.
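
Once the host public keys have been collected, assembling the
ssh_known_hosts file is mostly string formatting. A minimal sketch,
assuming each entry is one line of comma-separated host names followed
by the key text. The key is treated as opaque text copied from the
host's public key file, and the function name and sample key below are
placeholders, not real cluster data.

```python
def known_hosts_lines(host_keys: list[tuple[list[str], str]]) -> list[str]:
    """Build ssh_known_hosts lines from (names, key_text) pairs.

    names    - every name a client might use for the host
    key_text - the host's public key, copied verbatim (opaque here)
    (Hypothetical helper, for illustration only.)
    """
    lines = []
    for names, key_text in host_keys:
        if not names:
            raise ValueError("each key needs at least one host name")
        key_text = key_text.strip()
        # A public key is at least two whitespace-separated fields.
        if len(key_text.split()) < 2:
            raise ValueError(f"key for {names[0]} looks truncated")
        lines.append(",".join(names) + " " + key_text)
    return lines


if __name__ == "__main__":
    # The key below is a placeholder, not a real host key.
    collected = [(["linux1", "linux1.sys"], "1024 35 12345")]
    print("\n".join(known_hosts_lines(collected)))
```

Listing both the short and the full name on one line keeps the client
from complaining when people type either form.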

ksu
You should be able to kinit everywhere, but ksu won't work on
most hosts because I haven't made new host keys or figured out how to
do key distribution and key backup. That goes for the ssh
private keys too.

password
Now that we're not using YP, we'll have to exercise the system
for getting new accounts into the cluster correctly. In theory
it will work fine :-)

alpha console lines
The alphas are still in the old racks. The only change is they
no longer have serial lines. The plan is to buy more racks and
then move them into 328C where we'll have serial ports available
on the new serial controller.

X10/power
The power scripts have not been updated yet, so there is no easy
way to remotely power cycle a machine. We also discovered the limits of X10.
The signals do not travel well through the power strip fuses
that are under load. This means the service is not reliable for
all hosts. More X10 controllers, or fancy power strips that you
can telnet to, will help solve this.