.TL
The \f(BIHesiod\fR\s-1*\s0\fB Name Server
.FS
\s-1*\s0 n.  8th century B.C. Greek poet.  The names of the Gods and
the myths surrounding them are recorded in his poetry.
.FE
.AU
Stephen P. Dyer
.AI
Project Athena
Massachusetts Institute of Technology
Cambridge, MA 02139
dyer@ATHENA.MIT.EDU
.AB
\fIHesiod\fR, the Athena name server, provides naming for services and
data objects in a distributed network environment.  More specifically,
it replaces 
databases that heretofore have had to be duplicated on each workstation
and timesharing machine (e.g., remote file system information,
\fI/etc/printcap\fR, \fI/etc/services\fR, \fI/etc/passwd\fR, \fI/etc/group\fR)
and provides a flexible mechanism to supply new information 
as the need arises.
.AE
.2C
.NH
Introduction and Purpose
.PP
The computing environment at Project Athena has recently changed
from a group of timesharing machines to a collection of file servers
and many hundreds (and potentially many thousands) of publicly accessible
workstations.  The origins of
.UX
as a time-sharing system become acutely
obvious when confronted with the need to
manage information for hundreds of machines that may be
used by many different individuals.  The method used by
.UX
to maintain information for its users and programs has been ASCII
database files stored on each machine which are authoritative
for all users of that machine.
However, this breaks down when the numbers of machines and potential
users are multiplied
by two or three orders of magnitude.  The system management effort
to keep each machine's information current grows in direct proportion
to the number of
machines; this quickly becomes unworkable with more than a few
dozen machines.  We wanted a solution that could easily accommodate
Athena's expected growth for the next five years.
.PP
Rather than having information duplicated on each machine, the
concept of retrieving information via a network service, a \fIname server\fR,
has proved workable and reliable.  Xerox's \fIClearinghouse\fR,
.[
XEROX
.]
Sun's \fIYellow Pages\fR
.[
SunYP
.]
and the Internet Domain Name Server
.[
Mock1
.]
.[
Mock2
.]
are examples of
name services in current use.  We chose to base our name service,
\fIHesiod\fR, on the Berkeley Internet Domain Name Server, BIND,
.[
BIND
.]
for several reasons.  First, the design had proved itself
through its use in the Internet over the past several
years, and it had a number of features that made it an attractive
base for \fIHesiod\fR: its hierarchical name space, the ability
to delegate authority to subsidiary name servers, and the ability
to take advantage of local caching of data to improve performance.
Second, the BIND source code was readily available and
provided a firm foundation for a more general name service;
we did not have to spend time building low-level support facilities
which it already provided.
Finally, BIND source code is non-proprietary, which would facilitate our
distribution of \fIHesiod\fR to other interested sites.
.PP
\fIHesiod\fR provides a name service for use by workstations
and timesharing systems.  It does not address the problems of
centralized management and
distribution of such information, which are handled by another
service, the Athena Service Management System, or SMS.
.[
SMSUsenix
.]  
SMS maintains and distributes information managed by Athena Operations
to each of the Athena \fIHesiod\fR name servers.
\fIHesiod\fR may be used without SMS; neither is
dependent on the other.  However, without an information
management system front end, the \fIHesiod\fR databases are simply
ASCII files in BIND-compatible resource records format that
must then be managed with a text editor.  Large sites may appreciate
the convenience SMS provides, while smaller sites
may opt for the simplicity of using \fIHesiod\fR without SMS.
.PP
\fIHesiod\fR provides a
content-addressable memory where certain strings can be mapped to
others depending on the query.  \fIHesiod\fR has no
knowledge about the data it stores; queries and responses are
simple key/content interactions.
It is designed to be used in situations where a small amount
of data that changes infrequently needs to be retrieved quickly, with little
overhead.  It is not intended to serve as a general-purpose database
system supporting arbitrary queries, or as a repository for information
that changes frequently.  The current implementation
provides no facility for an arbitrary
application to update the \fIHesiod\fR database, which is
refreshed several times
a day by the Athena SMS.  Because of the limitation imposed by the
underlying implementation of \fIHesiod\fR, based as it is on the Internet
domain naming scheme, there is a maximum length of 512 bytes of
data that can be exchanged between the client and the name servers
using UDP datagrams.  This limits
both the maximum size of an individual data record and
the number of records that can be returned in a single packet in the
case of multiple matches.  \fIHesiod\fR was designed to provide applications
with a rapid, low-overhead naming service in which a query would return
no more than a few matches of limited size.  Applications that
require more complicated queries or ones that return voluminous
data should consider interfacing to SMS.
.NH
\f(BIHesiod\fB Queries
.PP
A \fIHesiod\fR query consists of two parts, a \fIHesiodName\fR, which is the
name of an object in the network, and a \fIHesiodNameType\fR, an
application-specific
qualifier that identifies the application space in which that
object is named.  
.PP
We do not use standard Internet Domain Name notation to
refer to \fIHesiodNames\fR for several reasons:
First, we wish to have
objects with names containing the '.' character.\(dg
.FS
\(dg As just one example, MIT course names, such as \fI6.001\fR, contain periods.
.FE
In Internet domain notation, a name that contains a '.' is considered fully
qualified.  Second, early BIND implementations
had no provision for deciding the proper domain suffix to use when
resolving a relative name.
.PP
A name given to the \fIHesiod\fR name server for resolution looks like:
.nf
HESIODNAME => LHS
HESIODNAME => LHS@RHS
LHS	   => [Any ASCII character, except NUL and '@']* 
		{ 0 or more characters from this alphabet }
RHS	   => [Any ASCII character, except NUL and '@']+
		{ 1 or more characters from this alphabet }
.fi
.LP
In other words, a \fIHesiodName\fR consists of [LHS][@RHS]
where either [LHS] or [@RHS] need not be present.
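.PP
As a minimal sketch of the grammar above, the following C function splits a
\fIHesiodName\fR at the '@' separator and rejects names that violate the
productions.  The function name and its return convention are hypothetical;
they are not part of the \fIHesiod\fR library.
.nf
```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the Hesiod library.
 * Splits a HesiodName into LHS and RHS at the '@' separator.
 * On success returns 0, stores the LHS length in *lhs_len and
 * points *rhs at the text after '@' (NULL when there is no RHS).
 * Returns -1 if the name breaks the grammar: an '@' followed by
 * an empty RHS, or a second '@'. */
int split_hesiod_name(const char *name, size_t *lhs_len,
                      const char **rhs)
{
	const char *at = strchr(name, '@');

	if (at == NULL) {
		/* HESIODNAME => LHS */
		*lhs_len = strlen(name);
		*rhs = NULL;
		return 0;
	}
	/* HESIODNAME => LHS@RHS; RHS needs 1+ characters, none '@' */
	if (strlen(at + 1) == 0 || strchr(at + 1, '@') != NULL)
		return -1;
	*lhs_len = (size_t)(at - name);  /* 0 for names like "@heracles" */
	*rhs = at + 1;
	return 0;
}
```
.fi
.LP
A caller would pass the address of a length and of a pointer, then treat a
NULL RHS as a request to use the default from the configuration file.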
.PP
The LHS of a \fIHesiodName\fR is \fIuninterpreted\fR; although it may be
modified according to the rules described by the information in
\fI/etc/hesiod.conf\fR (see below), it is not itself a domain name.
.PP
We define a set of routines known as the \fIHesiod\fR library that take two
strings, a \fIHesiodName\fR and a user-supplied key, a \fIHesiodNameType\fR,
convert them to a
fully-qualified domain name, call the BIND library, and return the
results to the original caller.
The \fIHesiodNameType\fR is
a well-known string that is provided by an application that
uses the \fIHesiod\fR library.  It is used directly in the expansion of a \fIHesiod\fR
name to a BIND name (see below) without further indirection or translation.
A new \fIHesiodNameType\fR comes into existence simply by being used
by an application; no libraries or configuration files need to be modified.
Naturally, there has to be appropriate data stored by the name server
which is associated with that \fIHesiodNameType\fR.
.PP
To provide an example, one of the routines in the \fIHesiod\fR library takes a
\fIHesiodName\fR and returns a fully-qualified name to be handed to BIND:
.nf
.in +2
\fCchar *
hes_to_bind(HesiodName, HesiodNameType)
char *HesiodName, *HesiodNameType;\fR
.fi
.in -2
.PP
The \fIHesiodNameType\fR
identifies the query to make to BIND and the proper expansion
rules to use with the LHS and RHS of the name.  This would be chosen by
the application, and could be application-specific.
.PP
Thus, the following are valid \fIHesiodNames\fR:
.nf
.in +2
14.21
default-printer
default-printer@SIPB
@heracles
@heracles.MIT.EDU
kerberos@Berkeley.MIT.EDU
.in -2
.fi
.LP
The configuration file \fI/etc/hesiod.conf\fR contains two tables specifying the treatment of
LHS and RHS components of a \fIHesiodName\fR.  In the translation of a
\fIHesiodName\fR to a valid BIND name, the LHS is expanded by concatenating
the LHS of the \fIHesiodName\fR, the separator '.', the \fIHesiodNameType\fR,
and the LHS entry found in the configuration file.
If the RHS is null, the RHS entry in the configuration file is used.
If the RHS is already a fully qualified domain name, it is used directly.
Otherwise, if a RHS is present,
it is used as a \fIHesiodName\fR for further
resolution against the \fIHesiodNameType\fR "rhs-extension".
If this query succeeds, the first reply is used as the RHS;
otherwise an error is
returned.
The fully-expanded LHS and RHS are then
concatenated together, separated by a '.', and this value is passed to BIND
for resolution.
.PP
The following is a typical copy of \fI/etc/hesiod.conf\fR:
.nf
.in +2
.sp
#file /etc/hesiod.conf
#comment lines begin with a '#' in column 1
#LHS table
lhs = .ns
#RHS table
rhs = .Athena.MIT.EDU
.sp
.fi
.in -2
.PP
With this definition,
a call to \fBhes_to_bind\fR("e40", "printer")
would
produce a LHS of "e40.printer.ns" and a RHS of ".Athena.MIT.EDU", and the
resulting BIND name, "e40.printer.ns.Athena.MIT.EDU".
.PP
In C pseudo-code, we would have the following productions:
.nf
  \fBhes_to_bind\fR("14.21", "filsys") =>
	"14.21.filsys.ns.Athena.MIT.EDU"
  \fBhes_to_bind\fR("e40", "printer") =>
	"e40.printer.ns.Athena.MIT.EDU"
  \fBhes_to_bind\fR("SIPB", "rhs-extension") =>
	"SIPB.rhs-extension.ns.Athena.MIT.EDU"
  \fBhes_to_bind\fR("default@SIPB", "printer") =>
	"default.printer.ns.SIPB.MIT.EDU"
	(this assumes that the previous production
	resolved to "SIPB.MIT.EDU")
  \fBhes_to_bind\fR("kerberos@Berkeley.EDU", "sloc") =>
	"kerberos.sloc.ns.Berkeley.EDU"
.fi
.LP
These productions are then passed to the BIND name server for resolution.
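.PP
The expansion rules can be sketched in C as follows.  This is an
illustration under stated assumptions, not the library's \fBhes_to_bind\fR:
the function name, the callback type, and the stub lookup are hypothetical,
a RHS is assumed to be fully qualified whenever it contains a '.', and a
\fIHesiodName\fR with an empty LHS is not given any special treatment.
.nf
```c
#include <stdio.h>
#include <string.h>

/* Hypothetical callback standing in for the "rhs-extension"
 * query; returns the expanded RHS, or NULL if the query fails. */
typedef const char *(*rhs_lookup_fn)(const char *rhs);

/* Stub lookup for illustration only: knows a single name. */
const char *demo_lookup(const char *rhs)
{
	return strcmp(rhs, "SIPB") == 0 ? "SIPB.MIT.EDU" : NULL;
}

/* Sketch of the expansion rules.  lhs_conf and rhs_conf are the
 * "lhs" and "rhs" values from /etc/hesiod.conf, each including
 * its leading '.'.  Returns 0 on success, -1 if the lookup fails
 * or the output buffer is too small. */
int sketch_to_bind(const char *name, const char *type,
                   const char *lhs_conf, const char *rhs_conf,
                   rhs_lookup_fn lookup, char *out, size_t outlen)
{
	const char *at = strchr(name, '@');
	int lhs_len = at ? (int)(at - name) : (int)strlen(name);
	const char *dot = ".";
	const char *rhs;
	int n;

	if (at == NULL) {
		rhs = rhs_conf;   /* null RHS: use the configured entry */
		dot = "";         /* which already begins with '.' */
	} else if (strchr(at + 1, '.') != NULL) {
		rhs = at + 1;     /* assumed fully qualified: use as is */
	} else {
		rhs = lookup ? lookup(at + 1) : NULL;
		if (rhs == NULL)
			return -1;    /* rhs-extension query failed */
	}
	n = snprintf(out, outlen, "%.*s.%s%s%s%s",
	             lhs_len, name, type, lhs_conf, dot, rhs);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```
.fi
.LP
Called with the configuration values shown earlier (".ns" and
".Athena.MIT.EDU"), the sketch yields the same strings as the productions
listed above.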
.NH
Data Types
.PP
\fIHesiod\fR data are stored as Internet domain resource records.  A new class,
HS, signifying a \fIHesiod\fR query or datum, has been reserved, along with
a new type, TXT, that allows the storage of arbitrary ASCII strings.
Paul Mockapetris, the Internet Domain System designer, has recently
specified the HS class and the TXT type in RFCs 1034 and 1035.
.NH
BIND Requirements
.PP
A version of BIND that supports the HS
query class and TXT query type is required to
support the \fIHesiod\fR name service.  The latest release of BIND as
of 12/31/1987, version 4.7.3, has been modified at Athena
to support this, and we will be forwarding these changes to Berkeley
for future releases of BIND.
.NH
Athena Client Applications of \f(BIHesiod
.PP
Many applications and subroutines have been modified to take advantage of
the \fIHesiod\fR service.  See Appendix A for an enumeration of
some of the \fIHesiodNameTypes\fR in common use within Project Athena.
.PP
The \fIattach\fR program queries \fIHesiod\fR for the filesystem
with the given name, retrieves the data, and mounts the appropriate RVD
or NFS
.[
SUNNFS
.]
filesystems, while also authenticating the user
to the file server using \fIKerberos\fR.
.[
KTech
.]
.PP
\fILogin\fR uses the user's login name as a
\fIHesiodName\fR to retrieve the user's \fI/etc/passwd\fR entry and
group membership information.
The actual password field is not used; rather the \fIKerberos\fR
service authenticates the user.  \fILogin\fR queries \fIHesiod\fR
to determine which \fIKerberos\fR server to invoke.
By convention, the username is also the name of the user's default
filesystem.  
The \fIlogin\fR program runs the \fIattach\fR program (\fIq.v.\fR)
with the user's login name as an argument to mount the user's
home directory.
.PP
Athena users receive their mail on POP
.[
POP
.]
(post-office protocol) servers.
We have modified the MH
programs \fIinc\fR and \fImsgchk\fR to
query \fIHesiod\fR for the location of the user's POP server. 
.PP
The \fIlpr\fR program is compiled with a special version of the
\fI/etc/printcap\fR access library that queries \fIHesiod\fR if
the printer name cannot be found in the local \fI/etc/printcap\fR.
.PP
There are optional implementations of \fBgetpwnam()\fR, \fBgetgrnam()\fR, and
their inverse counterparts that query \fIHesiod\fR for name-to-UID,
name-to-GID translation, and vice-versa.  The same library includes
an implementation of \fBgetservent()\fR that queries \fIHesiod\fR
in preference to lookup in the file \fI/etc/services\fR.
.NH
\f(BIHesiod\fB Resource Records and Data Files
.PP
Appendix B lists samples of the resource records that we store
on behalf of \fIHesiod\fR client applications.  The format of the
ASCII strings returned by \fIHesiod\fR is application-specific.
In the case of queries that have an inverse operation, such as
queries with the \fIHesiodNameTypes\fR \fBpasswd\fR and \fBuid\fR,
the \fBuid\fR resource records are \fICNAMEs\fR for the corresponding
\fBpasswd\fR records.
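.PP
Because the record format is left to the application, each client parses
the returned string itself.  As a minimal sketch, the following C fragment
splits a \fBpobox\fR record of the form shown in Appendix B
("POP E40-PO.MIT.EDU dyer") into its three fields.  The structure, the
function name, and the field sizes are assumptions made for illustration.
.nf
```c
#include <stdio.h>

/* Hypothetical container for one pobox record; the field sizes
 * are arbitrary limits chosen for this sketch. */
struct pobox {
	char type[16];   /* post-office type, e.g. "POP" */
	char host[64];   /* server host name */
	char box[32];    /* mailbox name */
};

/* Parses "type host mailbox" separated by white space.
 * Returns 0 if all three fields were found, -1 otherwise.
 * The widths in the format keep each field within its buffer. */
int parse_pobox(const char *txt, struct pobox *p)
{
	int n = sscanf(txt, "%15s %63s %31s",
	               p->type, p->host, p->box);
	return n == 3 ? 0 : -1;
}
```
.fi
.LP
Records for other \fIHesiodNameTypes\fR would need their own parsers, since
nothing in \fIHesiod\fR itself describes their layout.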
.PP
The BIND boot file on each workstation, \fI/etc/named.boot\fR,
refers to an auxiliary
cache file, \fI/etc/named.hes\fR, that specifies the authoritative
name servers for \fIHesiod\fR queries.
.NH
Programming with the \f(BIHesiod\fB Library
.PP
There are only two subroutines, \fBhes_resolve()\fR and \fBhes_error()\fR,
that are usually invoked by the
applications programmer when using \fIHesiod\fR.  The subroutine
\fBhes_resolve()\fR
is the primary interface into the \fIHesiod\fR
name server.  It takes two string arguments, the
name to be resolved, the \fIHesiodName\fR, and a type
indicating the type of service associated with this name,
the \fIHesiodNameType\fR.
\fBhes_resolve()\fR
returns a pointer to an array of strings,
much like \fBargv[]\fR,
containing all the data that matched the query, one match per
array slot.  The array is NULL-terminated.
A second call to
\fBhes_resolve()\fR
will overwrite any previously-returned data, so applications that
require data to be maintained across multiple calls to 
\fBhes_resolve()\fR
should copy the returned values into data areas they maintain.  
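.PP
One way to keep results across calls is to make a deep copy of the returned
vector.  The following is a minimal sketch; the names \fBcopy_vector\fR and
\fBfree_vector\fR are hypothetical helpers and are not provided by the
\fIHesiod\fR library.
.nf
```c
#include <stdlib.h>
#include <string.h>

/* Frees a vector produced by copy_vector().  Accepts NULL. */
void free_vector(char **v)
{
	size_t i;

	if (v == NULL)
		return;
	for (i = 0; v[i] != NULL; i++)
		free(v[i]);
	free(v);
}

/* Returns a newly allocated, NULL-terminated copy of the
 * NULL-terminated string vector v, or NULL if memory runs out. */
char **copy_vector(char **v)
{
	size_t i, n = 0;
	char **copy;

	while (v[n] != NULL)
		n++;
	copy = malloc((n + 1) * sizeof *copy);
	if (copy == NULL)
		return NULL;
	for (i = 0; i < n; i++) {
		size_t len = strlen(v[i]) + 1;

		copy[i] = malloc(len);
		if (copy[i] == NULL) {
			/* copy[i] is NULL, so it ends the partial vector */
			free_vector(copy);
			return NULL;
		}
		memcpy(copy[i], v[i], len);
	}
	copy[n] = NULL;
	return copy;
}
```
.fi
.LP
An application would call \fBcopy_vector()\fR on the value returned by
\fBhes_resolve()\fR before making its next query, and release the copy with
\fBfree_vector()\fR when finished.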
.PP
Note that a call to
\fBhes_resolve()\fR
may return more than one match.  The semantics of
using or choosing between multiple matches
depend on the particular application.  In general, however,
multiple matches are considered "equivalent", and any of them
could be used equally well.  This is exploited, for example, by the
\fIattach\fR command that attaches a remote file system to the
workstation.  In the case of system libraries, multiple copies of which
are considered equivalent, the
attach command iterates through all matches, stopping after the first
successful attach.  Because \fIHesiod\fR is based on the Internet
Domain Naming scheme, no interpretation can or should be given to the order
in which matches are returned.
.PP
If
\fBhes_resolve()\fR
returns NULL, then no data could be found, either
because the name server had no matching records or an error occurred.
The function \fBhes_error()\fR takes no arguments and
returns a small integer indicating the
type of error, if any, encountered in the last call to
\fBhes_resolve()\fR.
.PP
It is important to emphasize that \fIHesiod\fR knows nothing about the data
it stores; any meaning given to the \fIHesiodName\fR, the \fIHesiodNameType\fR
and the
data returned by \fIHesiod\fR is completely imposed by the application.
The format of the data stored by \fIHesiod\fR is application-specific,
and would be defined by the application programmer.
.KS
.nf
.sp
\fC#include <hesiod.h>

char *HesiodName, *HesiodNameType;
char **hp;
int err;

hp = hes_resolve(HesiodName, HesiodNameType);
if (hp == NULL) {
	err = hes_error();
	switch(err) {
	.
	.
	.
	}
} else {
	/* do your thing with hp */
	while(*hp != NULL) process(*hp++);
}\fR
.fi
.KE
.PP
The value returned by \fBhes_error()\fR is one of the following:
.in +2
.nf
.sp
\fC#define HES_ER_UNINIT      -1
#define HES_ER_OK           0
#define HES_ER_NOTFOUND     1
#define HES_ER_CONFIG       2
#define HES_ER_NET          3\fR
.in -2
.sp
.fi
.LP
The most common values returned by \fBhes_error()\fR are HES_ER_OK,
meaning no error, and HES_ER_NOTFOUND, meaning that the desired
name was not found in the \fIHesiod\fR database.
HES_ER_CONFIG indicates a problem
with the optional per-machine \fIHesiod\fR configuration file, \fI/etc/hesiod.conf\fR.
HES_ER_UNINIT is returned only if \fBhes_error()\fR
is called before the first call to \fBhes_resolve()\fR.
HES_ER_NET indicates that the request never received a response from
the \fIHesiod\fR name server.  This can be due to a variety of network
problems: for example, the host making the request might be disconnected
from the network, an intervening gateway might be down, or
no \fIHesiod\fR name servers responded.  No further information about
the state of the network is available because the
domain system on which \fIHesiod\fR is based uses datagrams with retries
as the communications interface.
.PP
HES_ER_NOTFOUND is a negative acknowledgement indicating that the desired
name/\fIHesiodNameType\fR pair was not found in the \fIHesiod\fR
database.  An application receiving this error message can
consider this an authoritative response.  Of course, this may be
due to an omission in the database, or simply reflect a delay
between the time a request was made to add data to the
database and the actual \fIHesiod\fR updates, which occur several times
each day.
.PP
In the case of a \fIHesiod\fR error of HES_ER_NET, it may be prudent
for an application to assume that this situation is temporary, and
that a later call to \fBhes_resolve()\fR will either return the desired data
or a definitive reply of HES_ER_NOTFOUND.
HES_ER_CONFIG indicates a problem with the \fIHesiod\fR configuration
file, a situation that requires intervention by a wizard and
will not resolve itself spontaneously.  Because no query to the
\fIHesiod\fR name server is actually made, no conclusion can
be drawn about the validity of the name to be resolved.  The standard Athena
distribution of the \fIHesiod\fR library does not require a configuration
file; its built-in defaults suffice, so this situation should not be
encountered frequently.
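.PP
The distinctions drawn above can be captured in two small helpers, sketched
here under the assumption that the error values are exactly those listed
earlier.  The constants are repeated only so that the fragment compiles
without \fIhesiod.h\fR; the function names and message strings are
hypothetical.
.nf
```c
/* Values repeated from the list given earlier so that this
 * sketch compiles on its own. */
#define HES_ER_UNINIT   -1
#define HES_ER_OK        0
#define HES_ER_NOTFOUND  1
#define HES_ER_CONFIG    2
#define HES_ER_NET       3

/* Hypothetical helper: short description of an error value. */
const char *hes_error_text(int err)
{
	switch (err) {
	case HES_ER_UNINIT:   return "hes_resolve() not yet called";
	case HES_ER_OK:       return "no error";
	case HES_ER_NOTFOUND: return "name not found";
	case HES_ER_CONFIG:   return "problem with /etc/hesiod.conf";
	case HES_ER_NET:      return "no response from a name server";
	default:              return "unknown Hesiod error";
	}
}

/* Hypothetical helper: nonzero if a later retry might succeed.
 * Only a network failure is treated as temporary; a not-found
 * reply is authoritative and a configuration problem needs
 * manual repair. */
int hes_should_retry(int err)
{
	return err == HES_ER_NET;
}
```
.fi
.LP
An application might log the text and schedule another attempt only when
\fBhes_should_retry()\fR returns nonzero.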
.PP
A general design strategy for applications using \fIHesiod\fR is to have
a contingency plan in place in case \fIHesiod\fR does not respond,
is configured incorrectly, or does not know the name.  This may be
built into the application, as in new versions of \fIlpr\fR that
revert to using the old printcap libraries if
\fIHesiod\fR printer information is not available.  Another popular scheme, exploited
by the \fIMH\fR application \fIinc\fR and the EMACS tool, \fImovemail\fR, is to
allow the value of an environment variable, in this case, \fBMAILHOST\fR, to
override the call to \fIHesiod\fR to retrieve a person's mailhost, using his username
as the key.  Thus, a user can temporarily "hard-wire" appropriate
values to allow applications to proceed.  Not every application can
be programmed in such a fashion, but it is prudent to try to design
applications with this in mind.
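.PP
The environment-variable override can be sketched as follows.  The function
name, the callback type, and the final fallback argument are hypothetical;
only the \fBMAILHOST\fR variable comes from the applications described
above.
.nf
```c
#include <stdlib.h>

/* Hypothetical callback that performs the Hesiod query for a
 * user's mailhost; returns NULL if nothing could be found. */
typedef const char *(*mailhost_fn)(const char *user);

/* Chooses a mailhost for user.  A non-empty MAILHOST environment
 * variable overrides the name service; otherwise the resolver is
 * consulted, and fallback is used if it returns nothing. */
const char *choose_mailhost(const char *user, mailhost_fn resolve,
                            const char *fallback)
{
	const char *env = getenv("MAILHOST");
	const char *host;

	if (env != NULL && env[0] != 0)
		return env;       /* user has hard-wired a value */
	host = resolve ? resolve(user) : NULL;
	return host ? host : fallback;
}
```
.fi
.LP
Checking the environment first means the override works even when the name
server is unreachable, which is the case it is meant to cover.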
.NH
Database Size and Performance
.PP
A measure of how successful \fIHesiod\fR has been in its
deployment over the past six months is how infrequently
problems have appeared.  For the most part, applications make
\fIHesiod\fR queries and receive answers with millisecond delays.
Today, the \fIHesiod\fR database for Project Athena
contains almost three megabytes of data: roughly 9500 \fI/etc/passwd\fR
entries, 10000 \fI/etc/group\fR entries, 6500 file system entries and
8600 post office records.  There are three primary \fIHesiod\fR
nameservers distributed across the campus network.
.PP
BIND has proven itself remarkably robust in
accommodating such a large, monolithic database.  One problem has been
noticed: the time to load the
primary nameservers (which are updated from the Athena SMS every six hours) has
increased markedly as the size of our data has grown.
At this point, it takes approximately 20 minutes
to reload a primary nameserver running on a VAX 750
and each primary nameserver's working set is
approximately 10 megabytes.
Because we stagger the reload times of the primary \fIHesiod\fR servers,
this has not proved to be
a large operational problem.  However, it does point out an area that should
be examined for improved performance.
Because the \fIHesiodNameType\fR component in
the domain name passed to BIND identifies a potentially
separate start of authority, the
\fIHesiod\fR database could be split across two or more primary nameservers,
each authoritative for a subset of the full database.  This
would reduce the time to load each nameserver and the size of its working set.
.NH
Acknowledgements
.PP
Jerry Saltzer, Technical Director of Project Athena,
provided a great deal of assistance and guidance
in the design of the \fIHesiod\fR name server.  Thanks, too, go to Dan Geer
and Jeff Schiller for their assistance during the design and deployment
stages.  Clifford Neuman, presently at the University
of Washington, and Felix Hsu of Digital Equipment Corporation
participated in the early design of the system.
.br
.1C
.SH
Appendix A: \f(BIHesiodNameTypes\fB in Use at Athena
.PP
Here is a list of some of the presently-defined \fIHesiodNameTypes\fR, the type of
information provided as a \fIHesiodName\fR, and the application programs that
use such queries.

.TS
box;
c c c c
----
l l l l.
\fIHesiodName\fR	\fIHesiodNameType\fR	Used By	Info Returned

workstation name	"cluster"	\fIgetcluster\fR	workstation cluster information
filesystem name	"filsys"	\fIattach/detach\fR	RVD and NFS file system info
username	"pobox"	MH \fIinc/movemail\fR	location and type of mailbox
username	"passwd"	\fItoehold/login\fR	Athena-wide /etc/passwd entry
		\fBgetpwent()\fR, \fIet al.\fR
uid (ASCII)	"uid"	\fBgetpwent()\fR, \fIet al.\fR	Athena-wide UID to username mapping
group name	"group"	\fBgetgrent()\fR, \fIet al.\fR	Athena-wide /etc/group entry
			(no membership list)
group name	"grplist"	\fBgetgrent()\fR, \fIet al.\fR	Athena-wide group membership mapping
gid (ASCII)	"gid"	\fBgetgrent()\fR, \fIet al.\fR	Athena-wide GID to group name mapping
printer name	"pcap"	\fBpgetent()\fR	Athena-wide /etc/printcap entry
service name	"service"	\fBgetservent()\fR	Athena-wide /etc/services entry
service name	"sloc"	\fIOn-Line Consulting (OLC)\fR	Host name to contact for this service
		\fIKerberos\fR	(for those services that do not reside on every host)
.TE
.bp
.SH
Appendix B: Sample Resource Records from Current \f(BIHesiod\fB Database Files
.LP
.nf
\fC# filsys.db
# format of data is 
#	filesystem-type name-on-server server-hostname mount-mode mount-point
dyer.filsys     HS      TXT "NFS /mit/dyer eurydice w /mit/dyer"
dyfeigen.filsys HS      TXT "NFS /mit/lockers/dyfeigen zeus w /mit/dyfeigen"
dyim.filsys     HS      TXT "NFS /mit/lockers/dyim zeus w /mit/dyim"
bldg1-rtsys.filsys      HS      TXT "RVD rtsys oath r /srvd"
bldg1-rtsys.filsys      HS      TXT "RVD rtsys persephone r /srvd"

# gid.db
# format of data is
#	canonical name with this group id
481.gid HS      CNAME   10.01.group
483.gid HS      CNAME   10.01a.group
484.gid HS      CNAME   10.01b.group
639.gid HS      CNAME   10.01sa.group
640.gid HS      CNAME   10.01sb.group
638.gid HS      CNAME   10.01t.group

# group.db
# format of data is
#	/etc/group entry
10.01.group     HS      TXT 10.01:*:481:
10.01a.group    HS      TXT 10.01a:*:483:
10.01b.group    HS      TXT 10.01b:*:484:
10.01sa.group   HS      TXT 10.01sa:*:639:
10.01sb.group   HS      TXT 10.01sb:*:640:
10.01t.group    HS      TXT 10.01t:*:638:

# grplist.db
# format of data is
#	groupname1:gid1:groupname2:gid2:...
10.01.grplist   HS      TXT "10.01:481:10.01t:638"
10.01ta.grplist HS      TXT "10.01t:638"

# passwd.db
# format of data is
#	/etc/passwd entry
dyer.passwd     HS      TXT "dyer:*:17287:101:Steve Dyer,,,,:/mit/dyer:/bin/csh"

# pobox.db
# format of data is
#	post-office-type server-host-name mailbox-name
dyer.pobox      HS      TXT "POP E40-PO.MIT.EDU dyer"

# printcap.db
# format of data is
#	/etc/printcap entry
nil.pcap    HS      TXT "nil|LPS-40:rp=nil:rm=castor.mit.edu:sd=/usr/spool/printer/nil:"
.bp
# service.db
# format of data is
#	service-name protocol port-number
discard.service HS      TXT "discard tcp 9"
discard.service HS      TXT "discard udp 9"
nntp.service    HS      TXT "nntp tcp 119"

# sloc.db
# format of data is
#	host name where this service is offered
zephyr.sloc     HS      TXT ARILINN.MIT.EDU
zephyr.sloc     HS      TXT NESKAYA.MIT.EDU
zephyr.sloc     HS      TXT ORPHEUS.MIT.EDU
zephyr.sloc     HS      TXT PRIAM.MIT.EDU

# uid.db
# format of data is
#	canonical name with this user id
17287.uid       HS      CNAME   dyer.passwd\fR
.fi
.sp
.br
.[
$LIST$
.]
