hubertf's NetBSD Blog
Send interesting links to hubert at feyrer dot de!
[20090927] Looking at the new kernel modules in NetBSD-current
In contrast to the current and previous NetBSD releases, NetBSD-current and the next major release (6.0) use a new system for kernel modules. Unlike the "old" loadable kernel modules (LKMs), the new module framework supports dependencies between modules, and loading of kernel modules on demand.

Today, I've found time to install NetBSD-current/i386, and configure things that I use here - /kern, /proc, and some NFS, in addition to a local disk. Now, looking at the list of loaded kernel modules reveals:

% modstat 
compat          misc    builtin 0       -       -
coredump        misc    filesys 1       3067    -
exec_elf32      misc    filesys 0       7225    coredump
exec_script     misc    filesys 0       1187    -
ffs             vfs     boot    0       166292  -
kernfs          vfs     filesys 0       11131   -
nfs             vfs     filesys 0       145345  -
procfs          vfs     filesys 0       28068   -
ptyfs           vfs     filesys 0       8975    - 
Interesting points here are that nfs, kernfs and procfs are just listed in /etc/fstab, and the related filesystem modules are loaded automatically, without a need to worry if they are needed or not. In fact I just assumed NFS is in the GENERIC kernel. Seems it's loaded as a module! ;)

Another interesting module is "coredump", which is loaded by exec_elf32, the module to execute 32-bit ELF programs. This is an example of module dependencies, and again no manual intervention was needed.
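To make the dependency visible, here is a minimal Python sketch that parses the modstat listing above and checks that the reference count of "coredump" matches the number of modules requiring it. The column meanings (name, class, source, refs, size, requires) and the comma-separated form of the last column are my assumptions, not taken from the modstat documentation; the function names are made up for illustration.

```python
# Sample copied from the modstat output above. The column meanings
# (name, class, source, refs, size, requires) are an assumption.
MODSTAT_OUTPUT = """\
compat          misc    builtin 0       -       -
coredump        misc    filesys 1       3067    -
exec_elf32      misc    filesys 0       7225    coredump
exec_script     misc    filesys 0       1187    -
ffs             vfs     boot    0       166292  -
kernfs          vfs     filesys 0       11131   -
nfs             vfs     filesys 0       145345  -
procfs          vfs     filesys 0       28068   -
ptyfs           vfs     filesys 0       8975    -
"""


def parse_modstat(text):
    """Return {name: {"refs": int, "requires": [names]}} for each line."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or unexpected lines
        name, _class, _source, refs, _size, requires = fields
        # "-" means "no requirements"; a comma-separated list is assumed
        # for modules that require more than one other module.
        deps = [] if requires == "-" else requires.split(",")
        modules[name] = {"refs": int(refs), "requires": deps}
    return modules


def count_dependents(modules):
    """Count, for every module, how many listed modules require it."""
    counts = {name: 0 for name in modules}
    for info in modules.values():
        for dep in info["requires"]:
            if dep in counts:
                counts[dep] += 1
    return counts


modules = parse_modstat(MODSTAT_OUTPUT)
dependents = count_dependents(modules)
for name, info in modules.items():
    if info["requires"]:
        print(name, "requires", ", ".join(info["requires"]))
```

In this listing the only module with a non-zero count is coredump, which is exactly the one that exec_elf32 pulled in.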

So what modules are there? First, let's remember that kernel modules are object code that implements facilities for the running kernel and interfaces closely with it. As such, they ideally need to match the kernel version. When one of the kernel's APIs or ABIs changes, it's best to rebuild all modules. For NetBSD, the kernel's version is bumped e.g. from 5.99.15 to 5.99.16 for such an interface change, which helps track those changes.

Back to the question of what modules are there. Now that we know kernel modules are closely tied to the version of the kernel (which still is in the file /netbsd, btw), associated modules -- for the example of NetBSD/i386 5.99.15 -- can be found in /stand/i386/5.99.15/modules:

% cd /stand/i386/5.99.15/modules
% ls -F
accf_dataready/     drm/                lfs/                ptyfs/
accf_httpready/     efs/                mfs/                puffs/
adosfs/             exec_aout/          miniroot/           putter/
aio/                exec_elf32/         mqueue/             radeondrm/
azalia/             exec_script/        msdos/              smbfs/
cd9660/             ext2fs/             nfs/                sysvbfs/
coda/               fdesc/              nfsserver/          tmpfs/
coda5/              ffs/                nilfs/              tprof/
compat/             filecore/           ntfs/               tprof_pmi/
compat_freebsd/     fss/                null/               udf/
compat_ibcs2/       hfs/                overlay/            umap/
compat_linux/       i915drm/            portal/             union/
compat_ossaudio/    kernfs/             ppp_bsdcomp/        vnd/
compat_svr4/        ksem/               ppp_deflate/
coredump/           layerfs/            procfs/

% ls */*.kmod
accf_dataready/accf_dataready.kmod      layerfs/layerfs.kmod
accf_httpready/accf_httpready.kmod      lfs/lfs.kmod
adosfs/adosfs.kmod                      mfs/mfs.kmod
aio/aio.kmod                            miniroot/miniroot.kmod
azalia/azalia.kmod                      mqueue/mqueue.kmod
cd9660/cd9660.kmod                      msdos/msdos.kmod
coda/coda.kmod                          nfs/nfs.kmod
coda5/coda5.kmod                        nfsserver/nfsserver.kmod
compat/compat.kmod                      nilfs/nilfs.kmod
compat_freebsd/compat_freebsd.kmod      ntfs/ntfs.kmod
compat_ibcs2/compat_ibcs2.kmod          null/null.kmod
compat_linux/compat_linux.kmod          overlay/overlay.kmod
compat_ossaudio/compat_ossaudio.kmod    portal/portal.kmod
compat_svr4/compat_svr4.kmod            ppp_bsdcomp/ppp_bsdcomp.kmod
coredump/coredump.kmod                  ppp_deflate/ppp_deflate.kmod
drm/drm.kmod                            procfs/procfs.kmod
efs/efs.kmod                            ptyfs/ptyfs.kmod
exec_aout/exec_aout.kmod                puffs/puffs.kmod
exec_elf32/exec_elf32.kmod              putter/putter.kmod
exec_script/exec_script.kmod            radeondrm/radeondrm.kmod
ext2fs/ext2fs.kmod                      smbfs/smbfs.kmod
fdesc/fdesc.kmod                        sysvbfs/sysvbfs.kmod
ffs/ffs.kmod                            tmpfs/tmpfs.kmod
filecore/filecore.kmod                  tprof/tprof.kmod
fss/fss.kmod                            tprof_pmi/tprof_pmi.kmod
hfs/hfs.kmod                            udf/udf.kmod
i915drm/i915drm.kmod                    umap/umap.kmod
kernfs/kernfs.kmod                      union/union.kmod
ksem/ksem.kmod                          vnd/vnd.kmod

% find . -type f -print | wc -l
There are directories with major kernel subsystems in the named directory, each one containing various files with the ".kmod" extension, for kernel modules. Subsystems include kernel accept filters, various file systems, compatibility modules, execution modules for various binary formats, and many others. Currently there are 58 kernel modules, and I guess we can expect more in the future.
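The directory layout described above can be sketched in a few lines of Python. This only illustrates the naming scheme seen in the listing (root, architecture, kernel version, "modules", then one directory per module); the function name and its parameters are made up for this example, and the kernel itself does not use this code.

```python
import posixpath


def module_path(arch, version, name, root="/stand"):
    """Build <root>/<arch>/<version>/modules/<name>/<name>.kmod."""
    return posixpath.join(root, arch, version, "modules", name, name + ".kmod")


# A kernel version bump moves the whole module tree to a new directory,
# so modules built for 5.99.15 are not picked up by a 5.99.16 kernel.
print(module_path("i386", "5.99.15", "nfs"))
print(module_path("i386", "5.99.16", "nfs"))
```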

P.S.: I've seen some confusion about systems that use kernel modules to whatever extent, as modules shrink the size of the actual kernel binary: Even with kernel modules, an operating system still has a monolithic kernel. The modules are tied closely into the system once loaded, resulting in a monolithic system. In contrast, a "microkernel" is something very different, and it doesn't have anything to do with kernel modules. :-)


[20080713] Another source-changes catch-up (late May until second week of July 2008)
The following list gives changes to NetBSD-current between the end of May and the second week of July. Note that NetBSD is currently in a feature-freeze to prepare the 5.0 release, so there are more stability improvements going in than new features being added:
  • Work on the wrstuden-revivesa branch is ongoing. The old Scheduler Activations (SA) based threading code that was removed from NetBSD after 4.0 is being adapted for NetBSD-current, so any applications that depend on SAs can continue to run. This is important for binary compatibility.
  • More changes towards the new kernel modules (kmod) framework:
    • file systems' sysctl init code is now run in a fashion so that the modules can either be linked statically into the kernel, or loaded as a module at runtime, without recompiling the code. (This used to be done via some #defines previously, which either expanded to code for the LKM, or to code for static inclusion.)
    • the uaudio driver can now be compiled as kmod. More work is needed to actually attach audio to newly found devices, though.
  • Wasabi's journaling filesystem support was added on the simonb-wapbl branch. There are still a number of issues to be resolved before this is ready for use under real-life conditions.
  • Support for LVM as part of this year's Google Summer of Code was added on the haad-dm branch. Currently it is possible to create a logical volume with the lvcreate utility from the Linux lvm2tools, and then newfs and mount it - the NetBSD driver is API-compatible with Linux.
  • After TNF has changed its copyright from 4-clause to 2-clause, other holders of material in NetBSD's code base have made similar changes.
  • The yamt-pf42 branch was merged, which merges in a newer PF packet filter from OpenBSD 4.2.
  • Management of processor sets and thread affinity was added, see the cpuset(3), affinity(3), pthread_setaffinity_np(3) and pthread_getaffinity_np(3) manpages as well as the cpuctl(8) and psrset(8) commands.
  • The Red-Black-Tree code was optimized more, and moved to a place so that the same code can be used both from userland (libc) and kernel code.
  • ifconfig(8) was changed to allow easy adding/removal of features such as address families (inet, inet6, iso, atalk) and protocols (802.11, 802.3ad, CARP) via the Makefile.
  • SSH was extended with the HPN-SSH patch, which aims at improving performance of SCP and the underlying SSH2 protocol by dynamically allocating buffers. See the HPN-SSH homepage for more information.


[20080612] More kernel works: audio, benchmarks, modules
In the past few weeks, Andrew Doran has made another bunch of changes to NetBSD's kernel area, including interrupts in NetBSD's audio framework, benchmarks of the system, and the handling of kernel modules.

SMP & audio: One area that hasn't been changed for moving towards fine-grained kernel locking was NetBSD's audio subsystem. As audio recording and playback is mostly done via interrupts, and as latency in those is critical, the audio subsystem was moved to the new interrupt handling system. The work can be found on the ad-audiomp branch, more information is available in Andrew's posting about the MP safe audio framework and drivers.

Benchmarking: Changing a system from inside out is a huge technical task. On the way, performance measurements and tuning are needed to make sure that the previous performance is still achieved while getting better performance in the desired development area. As a result, benchmark results from Sun's libmicro benchmark suite were posted, which allow comparison not only against Linux and FreeBSD, but also between NetBSD-current and NetBSD 4.0, in order to identify if any bad effects were added. All performance tests were made on a machine with 8 CPUs, and the areas tested cover "small" (micro) areas like various system calls. Of course this doesn't lead to a 1:1 statement on how the systems will perform in a real-life scenario like e.g. in a database performance test, but it still helps identify problems and gives better hints where tuning can be done.

Another benchmark that was also made in that regard comes from Gregory McGarry, who has published performance measurements previously. This time, Gregory has run the lmbench 3.0 benchmark on recent NetBSD and FreeBSD systems as well as a number of previous NetBSD releases - useful for identifying performance degradation, too!

Another benchmark, on dispatch latency, was run by Andrew Doran: on a machine that was (CPU-wise) loaded by some compile jobs, he started two threads on a CPU that wasn't distracted by device interrupts, and measured how fast the scheduler reacted when one thread woke up the other one. The resulting graph shows that the scheduler handles the majority of requests in less than 10us - good enough for some realtime applications?
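The idea behind such a wake-up measurement can be sketched in userland. The following Python snippet is not Andrew's benchmark: it does not pin threads to a CPU, it includes interpreter overhead, and all names in it are made up for illustration. It merely shows the principle of timestamping a wake-up in one thread and measuring when the other thread actually runs.

```python
import statistics
import threading
import time


def measure_wakeup_latency(rounds=200):
    """Return wake-up latencies in microseconds, one per round."""
    wake = threading.Event()   # waker -> sleeper: "go"
    done = threading.Event()   # sleeper -> waker: "sample recorded"
    stamps = {}
    latencies = []

    def sleeper():
        for _ in range(rounds):
            wake.wait()
            woke_at = time.perf_counter()
            wake.clear()
            latencies.append((woke_at - stamps["sent"]) * 1e6)
            done.set()

    thread = threading.Thread(target=sleeper)
    thread.start()
    for _ in range(rounds):
        stamps["sent"] = time.perf_counter()  # timestamp just before waking
        wake.set()
        done.wait()
        done.clear()
    thread.join()
    return latencies


samples = measure_wakeup_latency()
print("median wake-up latency: %.1f us" % statistics.median(samples))
print("worst wake-up latency:  %.1f us" % max(samples))
```

The numbers from such a sketch say more about the Python runtime than about the kernel scheduler, so they are not comparable with the graph mentioned above.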

Kernel modules are another area that's under heavy change right now, and after recent changes to load modules from the bootloader and the kernel, the kernel build process has now been changed so that pre-built kernel modules can be linked into a new kernel binary, resulting in a non-modular kernel. Eventually, this could mean that src/sys is built into separate modules, and that the (many) existing kernels that are present for each individual platform -- GENERIC, ALL, etc. etc. (INSTALL is already gone) -- can be simply linked from pre-compiled modules, without recompiling things over again for each kernel. Of course the overall goal here is to speed up the system (and kernel!) build time, while maintaining maximum flexibility between modular and non-modular kernels.

With the progress in kernel modules, it is only a question of time until the new kernel module handling supersedes the existing loadable kernel modules to such an extent that the latter will be completely removed from the system -- at least that removal was already proposed, but I'd prefer to see some documentation of the new system first. We'll see what comes first! (Documentation writers are always welcome! :-)


[20080527] The great source-changes catch-up for late March, April, and May 2008
Ok, after more weeks of slacking, here are some gems that I've found noteworthy, i.e. that have some "enduser" effect (and I include developers and programmers in that group), not purely cosmetic/internal changes. "Fun stuff", i.e., not the hard labor that's still needed, and much appreciated! Here we go:

Changes related to SMP:

  • Yamamoto Takashi has started the yamt-nfs-mp branch to make the NFS client MP-safe
  • After merge of the yamt-lazymbuf branch, the send(2) and recv(2) system calls are MP-safe
  • Other system calls that have been made MP-safe are for NTP, PMC, reboot, sysarch and time. With the exception of the Darwin and Irix emulations, all system calls are now MP-safe!
  • Progress on the wrstuden-revivesa branch to get back support for Scheduler Activations. Much of the code that was removed when Andrew's 1:1 threading was added is put back in a way that both threading mechanisms can co-exist. Affected areas are the interface to the generic scheduler and locking.
Changes related to networking:
  • In the networking code, stats for ICMP, ICMP6, UDP, TCP, IP and IPv6 were changed from a structure to an array of uint64_t values by Jason Thorpe. This removes a few structs from the kernel header files. The change is ABI compatible with the old structures, so tools like netstat(1) will continue to work.
  • Also, while moving towards a multi-threaded network stack, stats for protocols like UDP6, IP, PIM6, ARP, IGMP, IPSEC, IPSEC_FAST, PF_KEY, Appletalk DDP, and CARP are accounted on a per-CPU basis, and routines were added to support collating per-CPU-gathered network statistics.
  • ifconfig(8) got a major overhaul towards improved modularity and extensibility. The internal parser is cleaner, and it should be easier to add new commands.
  • In the search for replacing the ISC DHCP client dhclient(8) with something smaller, Roy Marples' DHCP Client Daemon dhcpcd(8) was imported. It is 1/6 of the size, yet has about all the features plus adds support for more modern RFCs like IPv4LL (RFC 3927), Classless Static Routes (RFC 3442) and Node-specific Client Identifiers (RFC 4361).
  • Kernel support was added for adding/removing link-layer (i.e. MAC/ethernet) addresses using SIOCALIFADDR and SIOCDLIFADDR, respectively. Corresponding ifconfig(8) changes were announced to come soon.
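The per-CPU statistics idea from the list above can be sketched as follows. This is a Python model of the concept only, not the kernel code: the counter names, the class and its methods are made up for this example. Each CPU increments its own array of 64-bit counters without touching the others, and a reader sums the arrays column by column when it wants totals.

```python
from enum import IntEnum

MASK64 = (1 << 64) - 1  # emulate uint64_t wrap-around


class UdpStat(IntEnum):
    """Hypothetical counter indices; the real kernel names differ."""
    IPACKETS = 0
    BADSUM = 1
    OPACKETS = 2
    NSTATS = 3


class PerCpuStats:
    """One array of counters per CPU, collated only when read."""

    def __init__(self, ncpu, nstats):
        self._counters = [[0] * nstats for _ in range(ncpu)]

    def inc(self, cpu, index, amount=1):
        # Only the owning CPU touches its row, so no shared lock is needed
        # on the hot path in this model.
        row = self._counters[cpu]
        row[index] = (row[index] + amount) & MASK64

    def collate(self):
        # Sum each counter across all CPUs, as a netstat-like reader would.
        return [sum(column) & MASK64 for column in zip(*self._counters)]


stats = PerCpuStats(ncpu=4, nstats=UdpStat.NSTATS)
stats.inc(0, UdpStat.IPACKETS, 10)
stats.inc(1, UdpStat.IPACKETS, 5)
stats.inc(3, UdpStat.OPACKETS, 7)
stats.inc(2, UdpStat.BADSUM)
totals = stats.collate()
print("UDP packets in: ", totals[UdpStat.IPACKETS])
print("UDP packets out:", totals[UdpStat.OPACKETS])
```

The design choice is to trade a slightly more expensive read for an update path that needs no synchronisation between CPUs.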
Many other changes:
  • Progress on the mjf-devfs2 branch: adding wedge support, devfsd is started by init(8) before going multiuser
  • Thor Lancelot Simons has extended the crypto(4) interface to handle asynchronous operations. Ioctl calls to create, submit/retrieve and destroy multiple sessions were added, which should make it easier to write new crypto applications overall. The code for this was contributed to TNF by Coyote Point Systems, Inc.
  • Clauses 3 and 4 were removed from TNF licenses. There's an outstanding press release on this one, but interested parties are OK to remove those clauses from existing older code, following UCB's prior example.
  • Kernel-option MULTIPROCESSOR is now mandatory on i386
  • i386 and amd64 now have a default boot.cfg file, allowing to boot either single user or multi user (= normally), either with or without ACPI and/or SMP.
  • For new kernel modules, the suffix will be .kmod (over the old .o, see your /etc/rc.lkm.conf), and the designated place is /kernel/modules for now. A number of drivers that aren't in GENERIC are now built as modules to allow easier testing or loading when needed, and the x86 boot loader can now load new style modules and pass them to the kernel. Also, the miniroot (ramdisk) for kernels can be loaded as a module as well - this will mean one kernel for both installation and running - no more separate INSTALL kernels, yai!

    Curious parties can try the new scheme by setting MKMODULAR=yes in their /etc/mk.conf file and by adding option MODULAR to the kernel config, see src/share/mk/bsd.README for more information.

  • The UDF DVD/etc. file system now supports writing. Use the new mmcformat(8) tool to format rewritable CD/DVD discs and newfs_udf(8) to create the filesystem.
  • Support for the LC_MESSAGES, LC_MONETARY and LC_NUMERIC locale categories was added.
  • src/external was made as new directory where "external" sources from 3rd party projects will live in the future. This includes the sources that are in src/gnu/dist, src/dist etc. right now. src/external has subdirectories named by license, and src/gnu/dist will become src/external/gpl (or so) in the future. OpenLDAP was the first package to be imported here, it lives in src/external/bsd/openldap.
  • Numerous drivers were added:
    • isv(4) for the IDEC Supervision/16 ISA image capture board
    • finsio(4) for the Hardware Monitor in the Fintek LPC Super I/O chips
    • amdtemp(4) for AMD CPU Temperature Sensors
    • hpqlb(4) for the HP Quick Launch buttons on the HP Pavilion notebooks
    • acpidalb(4) for PNP0C32 Hotkeys AKA "direct application launch buttons"
    • cpi(4) for the Creative Systems Inc. Hurdler CPI parallel printer card
    • siisata(4) for the Silicon Image SteelVine SATA-II controllers
    • lii(4) for the Atheros/Attansic L2 Fast-Ethernet chip found e.g. on the Asus EeePC
    • uberry(4) to charge a RIM BlackBerry on a USB port
  • Various imports and updates of in-tree 3rd party software:
    • IPfilter was updated to 4.1.29
    • libevent was updated to 1.4.4-stable
    • OpenSSL was updated to a snapshot from 20080509
    • OpenSSH 5.0 was imported
    • ATF 0.5 was imported
    • libarchive 2.5.4b was imported
    • nvi 1.81.6 was imported
    • OpenLDAP 2.4.9 was imported in the new src/external directory
    • nawk was imported from the 20070501 Bell Labs sources
    • + probably others that I've missed


[20080508] More kernel works: preemption and realtime, devfs, modules, testing
The following kernel-related projects were raised in the past few weeks:
  • Kernel Preemption: Andrew Doran has continued his work towards fine-grained locking, and he has proposed a patch to implement kernel preemption, i.e. that in a realtime environment, high-priority processes can interrupt system calls running inside the kernel.

    Handling the Floating Point Unit (FPU) was added later on -- the FPU needs special attention as saving and restoring is expensive, and doesn't need to be done in many cases. But if a program uses it, care must be taken to handle the case. The exact handling is explained by Christoph Egger.

    While there, Christoph also outlined the roadmap for getting realtime support in NetBSD - there are still a number of bits missing, but being able to preempt the kernel is a good first step!

  • Fine-grained socket locking: In order to allow fine-grained locking (instead of blocking all other processes from entering the kernel, as is done in the "biglock" SMP approach), many kernel subsystems need to be changed. The socket system is the core part of interprocess communication, and Andrew Doran has changed it to use fine-grained locking now.

    In that context, the question came up of what code still runs with the biglock held, and Andrew gave an overview of where more work is needed: some file systems (lfs, ext2fs, nfs), most of the drivers, protocols like TCP/IP, Veriexec, and some machine-dependent parts.

    Veriexec-Hacker Brett Lymn added details on the status of Veriexec with respect to its transition towards fine-grained locking.

  • Kernel modules and ramdisk: A change in kernel modules was proposed some time ago, and Andrew Doran has used this scheme now to unify the way many ports handle the install media: There, the kernel loaded contains a ramdisk (miniroot) image inside the kernel, which is then used as root-filesystem for the kernel, containing the install tools.

    In order to split things and eventually use a stock GENERIC kernel for both running and installing, Andy has changed the x86 boot process to load the miniroot as a kernel module.

    When booting it may be useful to select one of several ramdisks: one for installing, and one for rescuing the system. For this, the recently introduced boot.cfg file was extended to handle kernel modules in the boot menu.

    Izumi Tsutsui has made an ISO with all changes for testing available.

  • Device File System (devfs): Another area of the kernel where a lot of work is currently being done by Matt Fleming is NetBSD's device driver infrastructure, esp. under aspects of dynamic attaching, detaching, and suspending (power management!). To talk to the various drivers, device nodes in the /dev directory are kept right now, but those are static and need to be updated when a new driver is added. Matt is working on a Device Filesystem (devfs) that dynamically creates /dev from the list of devices inside the kernel. The filesystem will also handle dynamic creation and deletion of nodes, and as an important case it will also keep permissions across reboots, if someone changes permissions manually.

    The work is at a very mature point right now and needs some testing - see Matt's mail to the tech-kern list for more information!

  • Testing driver attachment: While talking about testing of device drivers, David Young has reminded driver developers to test individual drivers' detachment and re-attachment, suspension and resumption after changes. He has also posted a how-to for those tests, using drvctl(8). (The manpage needs some updating, sorry -- UTSL :-)


[20080428] Recent development related to puffs, ReFUSE, rump, and more (Updated)
NetBSD's kernel is under very active development these days, and while many changes are related to improving SMP, it's not the only area. An area where very interesting and unique work is being done is the filesystem interfaces that Antti Kantee is working on. Things started out as a past year's Google "userfs" SoC project to implement an interface for running filesystem code in userland. The project was imported into NetBSD some time ago. On top of that, a library that mimics the Linux interface for filesystems in userland was added. Following the Linux name FUSE, the re-implementation is called ReFUSE (pun intended :). See the webpage about puffs, refuse, FUSE on the NetBSD website for more information.

Another project that was started by Antti after his work to run filesystem code in userland is "rump". The project makes it possible to take "ordinary" filesystems that usually run inside the kernel, mimic an environment similar to what's available inside the kernel, and move the whole filesystem into userland - verbatim, with no code changes! This makes it possible to develop filesystem code in userland, and later on move it inside the kernel with no further changes - a big step forward for filesystem development!

This all sounds rather easy, but as filesystems need to move data between storage and memory, a big issue in filesystems is interfacing with the virtual memory subsystem, and adding interfaces like puffs and ReFUSE also needs to consider VM for efficient transfers and caching.

Work in this area is still ongoing, and I've asked Antti about his recent achievements in this area[1]. While the only user-visible change is caching and performance improvements in the Secure Shell filesystem's handler "mount_psshfs", most of the changes are on the inside. Antti wrote me: ``The interesting ones from a programmer's perspective are probably:

  • Splitting userspace transport out of puffs in the kernel (putter)
  • Using putter to implement support for userspace block/char device drivers (pud). pud does still not have a userspace library similar to libpuffs. libpuffs needs to become libputter and lib{puffs,pud}.

  • Removing special case handling for the puffs user/kernel protocol transport. This means that file system requests can now be read/written like any other protocol. This is covered in the AsiaBSDCon 2008 paper "Send and Receive of File System Protocols: Userspace Approach With puffs"

    With some minor work in libpuffs, it is possible to e.g. do an ffs mount from a remote site with the help of rump.

Finally, while not really useful for anything except puffs development, I think the following is cool from the perspective of completeness:

  • Add support to rump to be able to run the puffs kernel module in userspace. This means that *any* puffs file system (incl. rump ones) can be mounted so that requests pass once through the puffs kernel module running in the kernel and once through the puffs kernel module running in userspace before being delivered to the file system driver. Example:
      sys/rump/fs/bin/syspuffs> ./syspuffs mount_psshfs server.address /path 

With puffs and rump, there are two very interesting and active projects doing research in filesystems on NetBSD, which may lead to changes in the way filesystems are understood in the Unix world. While there, a third project that may be worth watching in this regard is this year's Google Summer of Code project "hurdt" by Marek Dopiera, which aims at implementing Hurd translators for NetBSD.

Update: Antti dropped me a note that another project related to filesystems is this year's "fs-utils" SoC project. The goal is to create a userland tool to manipulate filesystem images, and the idea is to reuse kernel code with the ukfs library. That way, no redundancy between kernel sources and userland sources is created, and both areas benefit from mutual testing and code maturity.


[20080118] Another stab at kernel modules
Currently, NetBSD supports loadable kernel modules via the LKM interface. The interface supports a few types of kernel modules, e.g. for file systems, system calls and executable file formats. Support for loadable device drivers is currently limited, and source code for LKMs needs to be adjusted to the interface, so the same code cannot be used inside the kernel and as module. Andrew Doran has done some work on improved support for kernel modules. His improvements include
  • an in-kernel loader/linker, so there's no need to rely on running ld(1) or on having a working userland
  • module dependencies so that one module can request to load other modules automatically
  • support to load modules from the boot loader and provide them to the kernel
  • use the same code for kernel modules and in-kernel use, so things that are currently used inside the kernel can be moved to a module easily, without changes in code.
The current state of the work is that this is a first version of the code that needs quite some more work. For more information, see Andrew's posting, which also includes examples for testing, and future directions for the work needed to replace LKMs. More thoughts on what else needs to be done are outlined in Andrew's second mail on the subject.
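The "module dependencies" item in the list above boils down to loading required modules first. Here is a small Python sketch of that ordering, assuming dependencies are given as a simple name-to-list mapping; it is a model of the idea and not the in-kernel loader, and the function name and data layout are made up for this example.

```python
def load_order(name, requires, loaded=()):
    """Return the modules to load for `name`, dependencies first.

    `requires` maps a module name to the list of modules it needs;
    `loaded` names modules that are already in the kernel.
    """
    order = []
    done = set(loaded)
    in_progress = set()

    def visit(mod):
        if mod in done:
            return  # already loaded, nothing to do
        if mod in in_progress:
            raise ValueError("dependency cycle at " + mod)
        in_progress.add(mod)
        for dep in requires.get(mod, []):
            visit(dep)  # load what this module needs before the module itself
        in_progress.discard(mod)
        done.add(mod)
        order.append(mod)

    visit(name)
    return order


# Hypothetical dependency table with a single entry for illustration.
requires = {"exec_elf32": ["coredump"]}
print(load_order("exec_elf32", requires))
```

If a required module is already present, it is skipped and only the requested module is loaded.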


[20070618] Kernel-tuning without recompiling
NetBSD's i386 GENERIC kernel has ACPI enabled nowadays. Given that there's more than enough (i386ish) hardware out there that plainly doesn't work with ACPI, it's sort of inconvenient to have the default kernel not work. Possible workarounds for this situation are offering a kernel that has ACPI disabled (like the GENERIC_NOACPI kernel that I've added just in time for NetBSD 4.0), or using userconf to disable ACPI[1]. The drawback is that you either need a special kernel, or that it's not permanent.

A possible solution is available in OpenBSD's config(8) command: By running "config -e /kernel", userconf commands can be "saved" into the kernel binary, preventing the need to re-run userconf on every boot.

Jared McNeill has proposed another approach for NetBSD now: Instead of modifying the kernel binary, have the bootloader read a list of (userconf) commands, and have the kernel execute them automatically.

Instead of introducing yet another config file format, Jared has opted for (re)using the proplib API functions to load the config file from disk and pass it on to the kernel. Those crying "YEEK, XML!" now can rest assured: there's a policy in NetBSD that XML is not used for config files that the user needs to edit, and the idea is to use userconf as usual, then dump the settings to the config file and use that on the next boot, see Jared's second patch for the most recent code version.

With this scheme, there's a common file where boot-time information can be stored, and the eventual idea is not only to have all ports' bootloaders read that file, but also store further information into the file to make settings other than those available via userconf today: Jared's ideas include storing bootloader settings (timeout, serial console speed, ...) and kernel tuneables like PCI_*_FIXUPs in there. I guess we can stay tuned to see what will happen on this front!

[1] What is userconf? Make sure you have "options USERCONF" in your kernel, then interrupt the bootloader and type "boot -c". You can then type "disable acpi" to, well, disable ACPI. It works for other drivers as well, but it won't be persistent and has to be done on every boot.


[20070208] Merging newlock2: consequences on in-kernel locking, SMP and threading
Andrew Doran has made substantial progress on the newlock2 branch, and he is now ready to merge the branch into NetBSD-current. Some of the changes this will bring are (citing from Andrew's mail, mostly):
  • A new set of synchronization primitives in the kernel designed to make programming for multiprocessor systems easier and more efficient: mutexes, reader / writer locks, condition variables, sleep queues and MI memory barrier operations.
  • A number of underlying kernel facilities have been made 'multiprocessor safe' including the scheduler, ktrace and the general purpose method of kernel synchronisation: sleep & wakeup
  • Some application facilities have been made MP safe and can now run without the "big lock" on multiprocessor systems, including signalling, SysV messaging, and system calls that inspect process state, for example: wait().
  • The number of system calls that will run without the big lock went from 1 up to 56, with more in the pipeline. For workloads that are fork intensive and make heavy use of signals this will show a small yet quantifiable benefit on multi-way systems.
  • The branch introduces a new 1:1 threading model that allows multithreaded applications to take advantage of all available CPUs in a multi-way system. The scheduler activations implementation used from NetBSD 2.0 through NetBSD 4.0 provides excellent performance on single CPU systems, but restricts any instance of a threaded application to a single CPU in the system. Given that multicore and multi-CPU systems are increasingly commonplace and that single threaded CPUs are rapidly disappearing from the market, we made the decision to move to a new threading model, on the basis that providing increased concurrency is now the most important factor in ensuring good performance for threaded workloads.
  • Those following source-changes already know what that new 1:1 threading model means for the scheduler-activations based m:n threading model: it's gone.
Read Andrew's mail for all the details, and esp. on how to update your system after the merge if you run -current!
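For readers who haven't used mutexes and condition variables before, here is a userland analogy in Python of the pattern those primitives support: one thread sleeps until another signals that a condition holds. It uses Python's threading module, not the kernel primitives from the branch, and the class and names are made up for this example.

```python
import threading
from collections import deque


class WorkQueue:
    """A queue protected by a mutex, with a condition variable for waiters."""

    def __init__(self):
        # The condition variable carries its own lock (the "mutex").
        self._nonempty = threading.Condition()
        self._items = deque()

    def put(self, item):
        with self._nonempty:          # take the mutex
            self._items.append(item)
            self._nonempty.notify()   # wake one waiter, like a wakeup

    def get(self):
        with self._nonempty:
            # Re-check the condition after every wake-up; waiting drops
            # the mutex while asleep and re-takes it before returning.
            while not self._items:
                self._nonempty.wait()
            return self._items.popleft()


queue = WorkQueue()
received = []
consumer = threading.Thread(
    target=lambda: received.extend(queue.get() for _ in range(3)))
consumer.start()
for job in ("a", "b", "c"):
    queue.put(job)
consumer.join()
print(received)
```

The point of tying the wait to an explicit lock and condition is that the check and the sleep happen atomically with respect to the waker, which is what makes the pattern safe on multiprocessor systems.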


[20061208] Driver development hints
There is a news item about OpenBSD driver development hints over at the OpenBSD Journal. I guess much of this applies to NetBSD as well, and it's nice to start with. More data is available in the NetBSD Internals Guide, Jochen Kunz's Writing Device Drivers and of course all section 9 manpages.

(If someone wants to include Jochen's text into the NetBSD Internals Guide, that'd be great... just like any other work in that area. Any takers? Send your patches to netbsd-docs@, feel free to CC: me!)


Previous 10 entries



Disclaimer: All opinions expressed here are purely my own. No responsibility is taken for anything.

Copyright (c) Hubert Feyrer