This text was published in the May 2002 issue
of the DaemonNews magazine.
Open Source Hackers' Guide Through The Galaxy
A Tour through the NetBSD Source Tree
Part III - Kernel
Hubert Feyrer, January 2002
This is the third part of our tour through the NetBSD source
tree. Having covered the various components that make up the
userland, we now concentrate on the kernel source. It is
located in /usr/src/sys, with the /sys symlink being a well-known
shortcut for reaching the system's kernel source.
Let's remember what happens when building a kernel: after editing the
kernel config file located in /sys/arch/<arch>/conf and running
config(8) on it, a number of files are created in
/sys/arch/<arch>/compile/KERNELNAME. The header files contain
data about what and how many devices to include, as well as other data
for the system's configuration. In addition, a Makefile is created
that is used to build the kernel from source. The interesting point to
note here is that there is only one Makefile that will locate and
compile all the needed sources and place the object files in the
.../compile/KERNELNAME directory. In NetBSD, there is no recursive
tree-walk of the whole source tree utilizing several Makefiles to
build the various sub-trees of the kernel source. This allows building
kernels for several configurations and platforms from the same source,
without different builds tripping across each other.
Still, the various parts of the NetBSD kernel are placed in various
subdirectories that we will have a closer look at now. Under
/usr/src/sys, there are:
We have now described all the important directories that are available
in the NetBSD source tree. To get used to the directory structure, it
is recommended that you browse the directories and have a look at the
various files to fully explore things.
- adosfs, coda, filecorefs, isofs, msdosfs, nfs, ntfs:
- These are various filesystems used directly by NetBSD to access
data. The primary goal of some of them is to help
exchange data with the machine's native operating system
(AmigaOS's adosfs, Acorn Computers RISC OS's filecorefs, ...),
while others implement filesystems
that can be found on many systems (isofs, nfs, ...).
- The Unix (User) File System is the base of the native filesystem
used in NetBSD. Ancient (AT&T) Unix filesystems limited filenames
to 14 characters, had no symlinks, etc. These
problems were solved by the Berkeley computer scientists
implementing BSD Unix. Their filesystem implementation serves as
the base for several filesystems these days, which use
various ways of laying out data on the disk.
The filesystems are stored in the "ufs" subdirectory; the filesystems
contained there include:
- ext2fs: Linux' ext2fs
- lfs: Log structured filesystem
- mfs: Memory filesystem, for things like in-core /tmp
- ufs: The native NetBSD filesystem
- ffs: General routines of the Berkeley Fast File System,
utilized by the other UFS-based filesystems, including
things like softdeps.
- This directory contains further filesystems that aren't directly
related to physical storage. Instead they implement various
layered filesystems for services like data translation
or routines for implementing kernel features. Using the virtual
filesystem operations table, it is easy to change the behaviour of an
operation under certain conditions, e.g. mapping operations to
deadfs on a file whose file descriptors were revoke(2)'d.
The filesystems included here are:
- deadfs: Implements operations that don't modify any data and
instead return indications of invalid IO. Used to revoke(2)
- fdesc: Maps a process' file descriptors into filesystem
space, depending on the accessing process. Can be mounted on
/dev/fd using mount_fdesc(8).
- fifofs: Implements FIFOs using Unix domain sockets internally
- genfs: Generic filesystem functions that mostly return
errors of some kind (bad file descriptor, bad operation) or
simply do nothing at all. Used for implementing deadfs
- kernfs: This filesystem is usually mounted under /kern and
provides various information about the running system, like
kernel version, system time etc.
- nullfs: Used to "mirror" one directory tree onto another
directory, providing the same tree on both mount
points. Also known as loopback mount - see mount_null(8) for
more information.
- overlay: The operation of this filesystem is similar to the
null filesystem. The implementation, however, allows using this
filesystem as a base for further layered filesystems,
as all VFS operations are defined. See mount_overlay(8) for
more information.
- portal: The portal filesystem provides a service that
allows descriptors such as sockets to be made available in
the filesystem namespace following conversion rules given in
a config file. See the mount_portal(8) manpage for further
details.
- procfs: Similar to kernfs, this filesystem is usually
mounted on /proc and allows accessing various data about
processes. It is used by ps(1) and other utilities. See
mount_proc(8) for more information.
- specfs: Implements routines to access special devices. The
filesystem provides a filesystem interface, and calls the
device-specific routines depending on the device's type,
major and minor number.
- syncfs: Operations used to implement the ioflush kernel
thread that writes out modified pages to disk.
- umapfs: A filesystem for re-mapping UIDs/GIDs, useful
e.g. when mounting an NFS volume from a server that has a
different set of UIDs/GIDs than the local machine.
- union: This layered filesystem allows merging two
filesystems, providing a view as if they were mounted on the
same mountpoint. Modifications go either to the "upper" or
to the "lower" layer, which allows mounting a CDROM
(read-only :), and mounting an empty but writable directory
over it, making it e.g. possible to do a compile on a source
expanded on the CDROM. See mount_union(8) for further
details.
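The idea of changing a file's behaviour by swapping its operations table, as deadfs does after a revoke(2), can be pictured with a small user-space C sketch. It is not NetBSD's actual code: every toy_* and toyfs_* name is hypothetical, the table has only two entries, and returning EIO from the "dead" operations is an assumption made to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel's vnode types. */
struct toy_vnode;

struct toy_vnodeops {
	int (*vop_read)(struct toy_vnode *vp, char *buf, size_t len);
	int (*vop_write)(struct toy_vnode *vp, const char *buf, size_t len);
};

struct toy_vnode {
	const struct toy_vnodeops *v_op;	/* current operations table */
	size_t v_size;				/* bytes "stored" in this toy file */
};

/* Operations of an ordinary toy filesystem. */
static int
toyfs_read(struct toy_vnode *vp, char *buf, size_t len)
{
	(void)vp; (void)buf; (void)len;
	return 0;			/* nothing to read in this sketch */
}

static int
toyfs_write(struct toy_vnode *vp, const char *buf, size_t len)
{
	(void)buf;
	vp->v_size += len;		/* pretend the data was stored */
	return 0;
}

/* deadfs-like operations: touch no data, report an error (assumed: EIO). */
static int
toydead_read(struct toy_vnode *vp, char *buf, size_t len)
{
	(void)vp; (void)buf; (void)len;
	return EIO;
}

static int
toydead_write(struct toy_vnode *vp, const char *buf, size_t len)
{
	(void)vp; (void)buf; (void)len;
	return EIO;
}

const struct toy_vnodeops toyfs_vnodeops = { toyfs_read, toyfs_write };
const struct toy_vnodeops toydead_vnodeops = { toydead_read, toydead_write };

/* Callers never call a filesystem directly; they go through the table. */
int
toy_vop_write(struct toy_vnode *vp, const char *buf, size_t len)
{
	assert(vp != NULL && vp->v_op != NULL);
	return vp->v_op->vop_write(vp, buf, len);
}

/* Toy counterpart of revoking a file: only the table pointer changes. */
void
toy_revoke(struct toy_vnode *vp)
{
	assert(vp != NULL);
	vp->v_op = &toydead_vnodeops;
}
```

A caller that keeps using toy_vop_write() after toy_revoke() needs no special case: the same call now lands in the "dead" routine and fails without modifying the file.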
- This directory contains code for emulating binary compatibility
with various non-NetBSD operating systems as well as with old
NetBSD binaries. It includes:
- aout: This subsystem is used to run native NetBSD a.out
binaries on systems that made the transition to the ELF
executable format. As for most emulations, the shared
library loader ld.so, shared libs etc. are looked for in
- common: Various common routines used by all emulations like
system call table translation routines; also contains compat
code for prior NetBSD releases, see the COMPAT_* kernel
options in options(4).
- freebsd: Mostly a few glue routines for running FreeBSD/i386
a.out and ELF binaries. See the compat_freebsd(8) manpage
for details on setting things up.
- hpux: To run native HP/UX programs on the Motorola based
hp300/hp400 machines. Adjusts a fair number of calls,
including terminal IO, signals, IO, etc.
- ibcs2: This code implements the Intel Binary Compatibility
Suite version 2, used for running SCO programs on i386, but
also for general compatibility with AT&T System V.3,
which is used on the VAX port. Maybe it should have been
named COMPAT_SVR3 - the compat_ibcs2(8) manpage contains
- linux: Code to run a.out and ELF Linux binaries for a
number of hardware platforms, including alpha, arm32, i386,
powerpc, mips, m68k, sparc and sparc64. One peculiarity
of the Linux emulation is that Linux uses a different
system call table on each port, which makes maintaining
things a bit more interesting. The code is separated into a
"common" directory that applies to all platforms, and
various architecture-specific directories for different
CPUs. The compat_linux(8) manpage contains
more information on using the system, and there are also
several packages in pkgsrc that help in setting up the
necessary shared libraries etc. to run Linux binaries like
Netscape or Acrobat Reader.
- m68k4k: Some of the m68k ports used to use a pagesize of 4k
instead of the 8k common today. This code helps in
maintaining binary compatibility with old binaries that
still use 4k.
- netbsd32: Used by 64bit systems like sparc64 to run native 32bit
binaries, mapping the programs' 32bit args to the 64bit args
used by LP64 systems' kernels.
- osf1: The compat_osf1(8) system allows running OSF/1 (AKA
Digital Unix AKA Tru64) binaries on the Alpha platform.
- ossaudio: This software layer provides Open Sound System
compatible ioctl calls that are then mapped to the native
NetBSD audio model by this code. Enabled when compiling in
support for Linux and/or FreeBSD binary compatibility.
- pecoff: This subsystem allows running programs that are in
the PEcoff executable format, which is found on the
Microsoft Windows platform. Of course, mapping system calls
is a real challenge here, as the API to present to the upper
layer is nowhere near
the API used on all the Unix-like compat systems, and as
such there's no easy mapping of the calls to NetBSD
functions. Much of the work is done by userspace
libraries instead, which then talk to the X server etc. See
the compat_pecoff(8) manpage for further details.
- sunos: If users still have SPARC or m68k applications built
for SunOS 4.x, this emulation layer will help run them. See
compat_sunos(8) for more information.
- svr4: The System V compat system allows binary compatibility
for several systems, e.g. Solaris (SunOS 5.x) on i386, sparc and
sparc64, Amix on m68k and SCO/Xenix on i386. The compat_svr4(8)
manpage contains further information.
- ultrix: For pmax and other MIPS based systems as well as VAX
systems, to run Ultrix binaries. See compat_ultrix(8).
- vax1k: Allows running VAX binaries that still use 1k
pagesizes. No idea where these originate -
probably very historic. :)
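The "system call table translation" that the emulations rely on can be sketched in a few lines of C: each emulation brings its own table, so the same system call number may reach a different kernel routine, or none at all. This is a user-space illustration under stated assumptions, not the kernel's real data structures; the toy_* names, the syscall numbers and the returned values are all made up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified system call entry: one handler, one argument. */
typedef long (*toy_sy_call_t)(long arg);

struct toy_sysent {
	toy_sy_call_t sy_call;		/* NULL means "not implemented" */
};

/* One emulation = a name plus its own system call table. */
struct toy_emul {
	const char *e_name;
	const struct toy_sysent *e_sysent;
	size_t e_nsysent;
};

/* Two toy "kernel" routines with fixed results. */
static long toy_getpid(long arg) { (void)arg; return 42; }
static long toy_getuid(long arg) { (void)arg; return 1000; }

/* Native numbering (made up): 0 = getpid, 1 = getuid. */
static const struct toy_sysent toy_native_sysent[] = {
	{ toy_getpid },
	{ toy_getuid },
};

/* Foreign numbering (made up): 0 = getuid, 1 = unimplemented, 2 = getpid. */
static const struct toy_sysent toy_foreign_sysent[] = {
	{ toy_getuid },
	{ NULL },
	{ toy_getpid },
};

const struct toy_emul emul_toy_native = {
	"native", toy_native_sysent,
	sizeof(toy_native_sysent) / sizeof(toy_native_sysent[0])
};

const struct toy_emul emul_toy_foreign = {
	"foreign", toy_foreign_sysent,
	sizeof(toy_foreign_sysent) / sizeof(toy_foreign_sysent[0])
};

/* Dispatch a system call number through the process' emulation table. */
long
toy_emul_syscall(const struct toy_emul *e, size_t code, long arg)
{
	assert(e != NULL);
	if (code >= e->e_nsysent || e->e_sysent[code].sy_call == NULL)
		return -ENOSYS;		/* unknown or unimplemented call */
	return e->e_sysent[code].sy_call(arg);
}
```

With one table per emulation, and for Linux one per port, the dispatch routine itself stays the same; only the tables have to be maintained.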
- The /sys/conf directory contains the main list of files to
include into kernel builds as well as scripts and files
used to update the OS version and compile it into the kernel.
The operating system's version is stored in the "osrelease.sh"
script, which is used from a number of places to determine the OS
- This directory contains code for various data encryption
standards (arc4, blowfish, DES, Rijndael etc.) that is subject to
crypto export regulations. The code is used by the IPsec kernel
- The DDB kernel debugger that can be used to do post
mortem debugging is found here. The debugger is used on all
- This directory contains device drivers that use the machine
independent bus_dma(9) and bus_space(9) interfaces and that work
on all platforms that support the necessary bus glue
routines. There are several subdirectories grouping drivers by
The directory structure is mostly oriented towards the bus system
that a hardware device attaches to, not towards the functionality it
provides. There are no special categories for things like audio,
network etc. - these are in their bus-specific directories like
pci, isa etc. containing (only) the bus-specific attachment
- bus interface: cardbus, eisa, ieee1394, isa, isapnp, mca,
pci, pcmcia, sbus, tc, usb, vme, qbus, xmi
- functionality: ata, i2c, i2o, mii, ofw, pckbc, raidframe,
rasops, rcons, scsipi, sysmon, wscons, wsfont
- general interfaces that are backed by bus-specific drivers:
audio, midi, rnd
If a chip implements some functionality like audio, network or
SCSI, it is often used on several cards that all have the
same chip, but different bus interfaces - ISA, PCI, etc. To
avoid maintaining several drivers with identical core
functionality, NetBSD drivers are separated into bus-glue code,
kept in the bus-specific directories mentioned above, and the
core functionality of the integrated circuit. Naming conventions
help in identifying e.g. network cards (if_*), but unfortunately
aren't applied consistently.
The drivers for the core functionality are stored in the "ic"
subdirectory, with the file names indicating the IC's chip
% ls /sys/dev/ic
CVS cac.c isp_target.c pckbc.c
Makefile cacreg.h isp_target.h pckbcvar.h
README.ncr5380sbc cacvar.h isp_tpublic.h pdq.c
ac97.c cd1190reg.h ispmbox.h pdq_ifsubr.c
ac97reg.h cd1400reg.h ispreg.h pdqreg.h
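The split between bus glue and chip core described for these drivers can be pictured with a small C sketch. It is a user-space toy, not code from /sys/dev: the "toychip" device, its register layout and both bus attachments are invented, and plain function pointers stand in for what bus_space(9) provides in the real kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chip: register 0 is a control register, bit 0 enables it. */
#define TOYCHIP_REG_CTRL	0
#define TOYCHIP_CTRL_ENABLE	0x01

/* State shared by the core and the bus glue; the glue fills in the accessors. */
struct toychip_softc {
	uint8_t (*sc_read)(void *cookie, unsigned reg);
	void (*sc_write)(void *cookie, unsigned reg, uint8_t val);
	void *sc_cookie;		/* bus-specific handle */
};

/* Core ("ic") part: knows the chip, knows nothing about the bus. */
int
toychip_enable(struct toychip_softc *sc)
{
	assert(sc != NULL && sc->sc_read != NULL && sc->sc_write != NULL);
	sc->sc_write(sc->sc_cookie, TOYCHIP_REG_CTRL, TOYCHIP_CTRL_ENABLE);
	return sc->sc_read(sc->sc_cookie, TOYCHIP_REG_CTRL) ==
	    TOYCHIP_CTRL_ENABLE ? 0 : -1;
}

/* Glue for an invented ISA-like bus: registers are consecutive bytes. */
struct toy_isa_regs { uint8_t port[4]; };

static uint8_t
toy_isa_read(void *cookie, unsigned reg)
{
	return ((struct toy_isa_regs *)cookie)->port[reg];
}

static void
toy_isa_write(void *cookie, unsigned reg, uint8_t val)
{
	((struct toy_isa_regs *)cookie)->port[reg] = val;
}

void
toychip_isa_attach(struct toychip_softc *sc, struct toy_isa_regs *regs)
{
	sc->sc_read = toy_isa_read;
	sc->sc_write = toy_isa_write;
	sc->sc_cookie = regs;
}

/* Glue for an invented PCI-like bus: registers are spaced 4 bytes apart. */
struct toy_pci_regs { uint8_t window[16]; };

static uint8_t
toy_pci_read(void *cookie, unsigned reg)
{
	return ((struct toy_pci_regs *)cookie)->window[reg * 4];
}

static void
toy_pci_write(void *cookie, unsigned reg, uint8_t val)
{
	((struct toy_pci_regs *)cookie)->window[reg * 4] = val;
}

void
toychip_pci_attach(struct toychip_softc *sc, struct toy_pci_regs *regs)
{
	sc->sc_read = toy_pci_read;
	sc->sc_write = toy_pci_write;
	sc->sc_cookie = regs;
}
```

Only the two small attach routines differ per bus; toychip_enable() is written once and serves both cards, which is the point of keeping the chip code in one place.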
- An IP-based debugger interface to a remote machine. This is another
way to debug NetBSD besides the DDB kernel debugger and gdb,
which can be used for debugging both userland and kernel.
- This directory contains the core kernel code including a number
- loaders for executables in various formats (a.out, ELF,
COFF, scripts ...)
- process and (kernel) thread management
- signal delivery and handling
- terminal IO subsystem
- sockets and other interprocess communication primitives
- virtual filesystem layer, providing the framework used by
the filesystems in /sys/miscfs.
- many auxiliary routines used from all places
- Throughout the NetBSD kernel, many routines are needed in
many places; they are collected in a few libraries that
are used only in the kernel:
- libkern: This is basically what libc is for the
userland, with functions used for providing various
arithmetic operations that can't be inlined by gcc as well
as string/memory copy/comparison operations.
- libsa: The StandAlone library provides functions used for
loading the kernel, when there's no operating system running
yet and thus many of the services provided by the NetBSD
operating system are not available. The library includes
code for netbooting (rarp, RPC, NFS),
locating/loading the kernel from UFS, LFS, ISO 9660 or
tar-structured media, memory management and others.
- libz: In-kernel decompression library for loading gzip
- This directory contains source for several standalone programs
that aren't used by NetBSD currently.
- NetBSD supports loadable kernel modules, and the sources are in
this directory. LKMs include a floppy driver for mac68k, various
binary emulations, IPfilter logging and several filesystems.
- NetBSD's networking framework contains many routines that are
independent of any particular protocol, and that are used by several
networking protocols/stacks. These components are kept in this
directory. They include packet filtering (BPF), access
routines for all hardware cards (arcnet, ATM, ethernet, fddi,
IEEE 802.11, PPP, token ring etc.) that hand device access to
drivers in the /sys/dev directory, routing code etc.
- The code in this directory implements the kernel part of the
AppleTalk protocol stack. The userland part is not included in
NetBSD; it can be installed from pkgsrc/net/netatalk(-sun).
- netccitt, netiso:
- Though not in widespread use these days, an ISO/OSI protocol
stack comes with NetBSD; it is located in these directories.
- Internet stuff - the NetBSD TCP/IP (v4) stack. Documentation on
this is available in section 9 of the NetBSD manual pages as well
as in Richard Stevens' "TCP/IP Illustrated" books.
- Internet, next generation - this directory contains the KAME IPv6
stack that ships with NetBSD. See http://www.kame.net/ for
more information.
- Key management for IPsec - see the ipsec(4) manpage for more
information.
- The code in this directory implements native mode ATM to
transport other protocols like IP.
- NetBSD has support for the Xerox network service protocol, which
can be found in this directory. No longer in widespread use,
the protocol is described in the first edition of Richard
Stevens' "UNIX Network Programming" book.
- This directory contains only header files that get installed
- The code in this directory implements NetBSD's new virtual memory
system, which replaced the old Mach-based VM system some time
ago. See the uvm(4) manpage for more information.
- This directory has only the header files of the old Mach based
virtual memory system left, for use with various programs. The VM
system itself is not used any longer.
- Code specific to one hardware platform is collected under this
directory. Directories are present for each port as well as for
CPU-specific functions that are shared by several ports that use
the same CPU, avoiding redundancy.
Port-specific directories contain several subdirectories, with
the following ones being present for all ports:
- conf: contains kernel config files, a list of files specific
to the port and a template for the Makefile used to build a
- compile: This directory is initially empty; it gets
populated by config(8) with directories that contain a
Makefile and header files to build a kernel.
- <port>: Port-specific functions, CPU/MMU/CPU initialisation
code, etc. - all the machine specific code that cannot be
shared across various hardware architectures.
- include: machine specific include files that describe the
CPU and MMU layout, data formats used by the FPU, limits,
- stand: This directory contains sources for loading the
kernel into the system - usually it contains code for
bootblocks, secondary stage bootloaders, netboot miniroots
and other facilities used to boot the system.
Further directories may exist in the arch-specific directories.
They contain bus-specific, machine-dependent device drivers
that don't fit into /sys/dev as they work on one port
only. Ideally, a port only uses machine-independent drivers, of
(c) Copyright 20020110 Hubert Feyrer
$Id: tour-de-source-3kernel.html,v 1.2 2002/01/29 00:59:29 feyrer Exp $