hubertf's NetBSD Blog
Send interesting links to hubert at feyrer dot de!
 
[20100212] Musing about git's object store efficiency
I'm currently looking at git to see what it can and cannot do, and one thing I've looked today is how effective the backing store mechanism is. To recall: CVS stores a list of patches between versions in a single file, and git stores each new revision in full in a separate file in the so-called object store. Is that an issue for NetBSD? Let's see;

One of the more frequently updated files is the i386 port's GENERIC kernel config file, which is at revision 1.963 right now. This means that since it's import into CVS, 963 different revisions have been made. In CVS, all those files are kept in a single GENERIC,v file. In git, this puts 963 files on the file system. A bit of a difference.

Looking at the space requirements for storing the repository data itself, the GENERIC,v file is 883,233 bytes[1]. Extracting all 963 versions from revision 1.1 to revision 963 results in disk space usage of 32,805,828 bytes[2,3]. And that's not counting the overhead of 962 inodes and the related directory bookkeeping.

In other words, the git model requires about 37 times the space that CVS does.

Sure the example file is not exactly one with an average number of revisions, and I know that git offers some more efficient storage methods via "pack" files, but investigating those is left as an exercise to the reader. :-)


[1] Obtained via rsync from cvs.netbsd.org:
% ls -la GENERIC,v 
-r--r--r--  1 feyrer  wheel  883233 Feb 12 16:57 GENERIC,v 

[2]

% mkdir extracted
% chdir extracted
% sh -c 'for i in `jot 964`; do echo $i ; co -p -r1.$i ../GENERIC >GENERIC-`printf %04d $i` ; done'

[3]

% cat extracted/* | wc -c
 32805828 


[Tags: , ]


[20080225] Mondo catch-up on source-changes (~Aug '07 'till Feb '08)
In the context of Mark Kirby stopping his NetBSD CVS Digest, I've felt an urge to catch up on source-changes, and put up some of the items here that I haven't found mentioned or announced elsewhere (or that I've plainly missed) after digging through some 7,000 mails. All those changes are available in NetBSD-current today and that will be in NetBSD 5.0:

  • Support C99 complex arithmetic was added by importing the "cephes" math library
  • POSIX Message queues were added
  • bozohttpd was added as httpd.
  • the x86 bootloader now reads /boot.cfg to configure banner text, console device, timeout etc. - see boot.cfg(5)
  • ifconfig(8) now has a "list scan" command to scan for access points
  • SMP (multiprocessor) support is now enabled in i386 and amd64 GENERIC kernels
  • Processor-sets, affinity and POSIX real-time extensions were added, along with the schedctl(8) program to control scheduling of processes and threads.
  • systrace was removed, due to security concerns
  • the refuse-based Internet Access Node file system was committed, which provides a filesystem interface to FTP and HTTP, similar to the old alex file system, see http://mail-index.netbsd.org/source-changes/2007/08/28/0081.html
  • LKMs don't care for options MULTIPROCESSOR and LOCKDEBUG, i.e. it's easier to reuse LKMs between debugging/SMP and non-debugging/SMP kernels now.
  • PCC, the Portable C Compiler that originates in the very beginnings of Unix, was added to NetBSD. The idea is that it is used as alternative to the GNU C Compiler in the long run.
  • In addition to the iSCSI target (server) code that is already in NetBSD 4.0, there'a also a refuse-based iSCSI initiator (client) now, see http://mail-index.netbsd.org/source-changes/2007/11/08/0038.html
Plus:
  • Many driver updates and new drivers, see your nearest GENERIC kernel config file
  • Many security updates, see list of security advisories
  • Many 3rd software packages that NetBSD ships with were updated: ipsec-tools (racoon), GCC 4.1, Automated Testing Framework 0.4, OpenSSH 4.7, wpa_supplicant and hostapd 0.6.2, OpenPAM Hydrangea
The above list is a mixed list of items. There are a number of areas where there is very active development going on in NetBSD. Andrew Doran is further working on SMP, fine-grained locking inside the kernel and interrupt priority handling. Antti Kantee has has done more work on his filesystems work (rump, puffs, refuse/fuse), and Jared McNeill and Jörg Sonnenberger have continued their work on NetBSD's power management framework. Those changes are large and far-reaching, and I've yet to look at them before I can report more here.

So much on this subject for now. If someone's willing to help out with continuing Mark Kirby's NetBSD CVS Digest either using his software-setup or by simply reading the list and writing a monthly/weekly digest of the "interesting" changes, I'd appreciate this very much. Put me on CC: for your postings! :)

[Tags: , , , , , , , , , , , , , ]


[20060808] CVS and stickiness
For the past few weeks, I've tried to build NetBSD-current on my slow old PC, and it always bombed out in src/distrib/i386/cdroms, complaining that my bootxx_cd9660 is busted:
/home/cvs/src-current/obj.i386/tooldir/bin/nbinstallboot  -t raw  \
	-mi386  bootxx /home/cvs/src-current/obj.i386/destdir/usr/\
	mdec/bootxx_cd9660
nbinstallboot: Invalid magic in stage1 bootstrap 0 != 7886b6d1
nbinstallboot: Set bootstrap operation failed 
This worked fine a few weeks ago, and the only major change that happened in NetBSD since then was the switch from gcc3 to gcc4. Suspecting some breakage there, I started building everying without any optimisation today ("nbinstallboot" needs HOST_CFLAGS="", also "bootxx" and "bootxx_cd9660"), but that didn't change anything.

I've verified that daily releng builds work, so this was probably a problem on my side, but where? I didn't want to blindly rebuild the whole toolchain on this slow PC, so tried investigating. Comparing /usr/mdec/bootxx_cd990 from my own and the releng build showed that there *was* some difference, so I continued looking in src/sys/arch/i386/stand/bootxx/bootxx_cd9660 to see what the matter was. Using hexdump -C showed that there was a difference between my bootxx_cd9660 and the releng one, and after getting the intermediate files of the build (bootxx_cd9660.tmp, cdboot.o) from a helpful being on #NetBSD, nm(1) showed that my version of cdboot.o lacked several symbols, e.g. a "start1".

As the cdboot.o file is made directly from a cdboot.S file, there's probably not much chance for the compiler to break things, and I didn't really believe that the assembler would add symbols on its own. Asking other people, they confirmed that they had "start1" in their cdboot.S files, while my copy of the same file lacked such a symbol. From there it was just a quick look at src/sys/arch/i386/stand/cdboot/CVS/Entries to fine the problem:

miyu% cat CVS/Entries
/Makefile/1.6/Wed Jun 28 20:23:05 2006//
/cdboot.S/1.2/Mon Aug  7 23:24:18 2006//T1.2
D 
Apparently I used "cvs update -r1.2 cdboot.S" some time ago to get that specific version, and forgot to tell CVS to remove that sticky tag to get the latest version on later 'cvs update' runs. Also, 'cvs update' doesn't tell that a file is sticky and so this was never detected, until it exploded. Now if the CVS update would print something for sticky files as it does for modified files, that would have saved me some time this evening. Doh!

Next thing to do: cd src ; cvs up -A, just to be on the safe side.

[Tags: , , ]


[20050928] Comparison of Version Control Systems
Subversion over CVS or not but what then else is an ongoing debate. Personally SVN may be nice from a user PoV, but when I have to setup an Apache and WebDAV and whatnot, then I prefer staying to CVS, thanks. To get some ideas of what other VCSs there out there and how they compare among each other, there is a nice Version Control System Comparison I found on Bluephod today. From a quick glance, I think I should look at Monotone a bit more...

[Tags: , ]


[20050320] Switching CVS servers easily
After copying a few old trees checked out from CVS today, and updating them afterwards, I got the dreaded list of conflicts in files that I've never touched:
 ...
 P distrib/sets/lists/xserver4/md.i386
 P distrib/sets/lists/xserver4/mi
 cvs update: move away distrib/utils/sysinst/Makefile; it is in the way
 C distrib/utils/sysinst/Makefile
 cvs update: move away distrib/utils/sysinst/Makefile.inc; it is in the way
 C distrib/utils/sysinst/Makefile.inc
 cvs update: move away distrib/utils/sysinst/SPELLING.en; it is in the way
 C distrib/utils/sysinst/SPELLING.en
 ...  
Remembering that this is usually caused by having different values in the CVS/Root files, this was the case for me too - I had a generic address of a CVS server in most of the files, but the files under src/distrib had a IPv4-only address in there, which caused the problem.

To end the problem for all times, I remembered of a nice trick I saw recently (I think mentioned by someone from the NetBSD admins team) to use one file for all CVS/Root files, and hardlink that into all "CVS"-directories:

 % cd .../src
 % cat CVS/Root >r
 % find . -name Root | grep CVS/Root \
 ? | sh -c 'while read r ; do echo rm $r ; echo ln r $r ; done' \
 ? | sh -v
 % rm r 
This command first copies the contents of .../src/CVS/Root to a temporary file "r" - make sure it contains the value to be used anywhere! After that, all CVS/Root files are searched, removed and (hard)linked to the temporary file, which is then removed.

Now whenever the CVS repository needs to be switched, updating a single file is enough, due to all the .../CVS/Root files being hardlinks to the same file (watch the link count of 6494):

 % ls -la CVS/Root
 -rw-r--r--  6494 feyrer  wheel  32 Mar 20 20:39 CVS/Root
 % cat bin/ls/CVS/Root 
 anoncvs@anoncvs.netbsd.org:/cvsroot
 % echo anoncvs@anoncvs.fr.netbsd.org:/cvsroot >CVS/Root
 % cat bin/ls/CVS/Root 
 anoncvs@anoncvs.fr.netbsd.org:/cvsroot 



[Tags: ]


Tags: , 2bsd, 34c3, 3com, 501c3, 64bit, acl, acls, acm, acorn, acpi, acpitz, adobe, adsense, Advocacy, advocacy, advogato, aes, afs, aiglx, aio, airport, alereon, alex, alix, alpha, altq, am64t, amazon, amd64, anatomy, ansible, apache, apm, apple, arkeia, arla, arm, art, Article, Articles, ascii, asiabsdcon, aslr, asterisk, asus, atf, ath, atheros, atmel, audio, audiocodes, autoconf, avocent, avr32, aws, axigen, azure, backup, balloon, banners, basename, bash, bc, beaglebone, benchmark, bigip, bind, blackmouse, bldgblog, blog, blogs, blosxom, bluetooth, board, bonjour, books, boot, boot-z, bootprops, bozohttpd, bs2000, bsd, bsdca, bsdcan, bsdcertification, bsdcg, bsdforen, bsdfreak, bsdmac, bsdmagazine, bsdnexus, bsdnow, bsdstats, bsdtalk, bsdtracker, bug, build.sh, busybox, buttons, bzip, c-jump, c99, cafepress, calendar, callweaver, camera, can, candy, capabilities, card, carp, cars, cauldron, ccc, ccd, cd, cddl, cdrom, cdrtools, cebit, centrino, cephes, cert, certification, cfs, cgd, cgf, checkpointing, china, christos, cisco, cloud, clt, cobalt, coccinelle, codian, colossus, common-criteria, community, compat, compiz, compsci, concept04, config, console, contest, copyright, core, cortina, coverity, cpu, cradlepoint, cray, crosscompile, crunchgen, cryptography, csh, cu, cuneiform, curses, curtain, cuwin, cvs, cvs-digest, cvsup, cygwin, daemon, daemonforums, daimer, danger, darwin, data, date, dd, debian, debugging, dell, desktop, devd, devfs, devotionalia, df, dfd_keeper, dhcp, dhcpcd, dhcpd, dhs, diezeit, digest, digests, dilbert, dirhash, disklabel, distcc, dmesg, Docs, Documentation, donations, draco, dracopkg, dragonflybsd, dreamcast, dri, driver, drivers, drm, dsl, dst, dtrace, dvb, ec2, eclipse, eeepc, eeepca, ehci, ehsm, eifel, elf, em64t, embedded, Embedded, emips, emulate, encoding, envsys, eol, espresso, etcupdate, etherip, euca2ools, eucalyptus, eurobsdcon, eurosys, Events, exascale, ext3, f5, facebook, falken, fan, faq, fatbinary, features, fefe, ffs, filesystem, fileysstem, firefox, firewire, fireworks, flag, flash, flashsucks, flickr, flyer, fmslabs, force10, fortunes, fosdem, fpga, freebsd, freedarwin, freescale, freex, freshbsd, friendlyAam, friendlyarm, fritzbox, froscamp, fsck, fss, fstat, ftp, ftpd, fujitsu, fun, fundraising, funds, funny, fuse, fusion, g4u, g5, galaxy, games, gcc, gdb, gentoo, geode, getty, gimstix, git, gnome, google, google-soc, googlecomputeengine, gpio, gpl, gprs, gracetech, gre, groff, groupwise, growfs, grub, gumstix, guug, gzip, hackathon, hackbench, hal, hanoi, happabsd, hardware, Hardware, haze, hdaudio, heat, heimdal, hf6to4, hfblog, hfs, history, hosting, hotplug, hp, hp700, hpcarm, hpcsh, hpux, html, httpd, hubertf, hurd, i18n, i386, i386pkg, ia64, ian, ibm, ids, ieee, ifwatchd, igd, iij, image, images, imx233, imx7, information, init, initrd, install, intel, interix, internet2, interview, interviews, io, ioccc, iostat, ipbt, ipfilter, ipmi, ipplug, ipsec, ipv6, irbsd, irc, irix, iscsi, isdn, iso, isp, itojun, jail, jails, japanese, java, javascript, jetson, jibbed, jihbed, jobs, jokes, journaling, kame, kauth, kde, kerberos, kergis, kernel, keyboardcolemak, kirkwood, kitt, kmod, kolab, kvm, kylin, l10n, landisk, laptop, laptops, law, ld.so, ldap, lehmanns, lenovo, lfs, libc, license, licensing, linkedin, links, linksys, linux, linuxtag, live-cd, lkm, localtime, locate.updatedb, logfile, logging, logo, logos, lom, lte, lvm, m68k, macmini, macppc, macromedia, magicmouse, mahesha, mail, makefs, malo, mame, manpages, marvell, matlab, maus, max3232, mbr95, mbuf, mca, mdns, mediant, mediapack, meetbsd, mercedesbenz, mercurial, mesh, meshcube, mfs, mhonarc, microkernel, microsoft, midi, mini2440, miniroot, minix, mips, mirbsd, missile, mit, mixer, mobile-ip, modula3, modules, money, mouse, mp3, mpls, mprotect, mtftp, mult, multics, multilib, multimedia, music, mysql, named, nas, nasa, nat, ncode, ncq, ndis, nec, nemo, neo1973, netbook, netboot, netbsd, netbsd.se, nethack, nethence, netksb, netstat, netwalker, networking, neutrino, nforce, nfs, nis, npf, npwr, nroff, nslu2, nspluginwrapper, ntfs-3f, ntp, nullfs, numa, nvi, nvidia, nycbsdcon, office, ofppc, ohloh, olimex, olinuxino, olpc, onetbsd, openat, openbgpd, openblocks, openbsd, opencrypto, opendarwin, opengrok, openmoko, openoffice, openpam, openrisk, opensolaris, openssl, or1k, oracle, oreilly, oscon, osf1, osjb, paas, packages, pad, pae, pam, pan, panasonic, parallels, pascal, patch, patents, pax, paypal, pc532, pc98, pcc, pci, pdf, pegasos, penguin, performance, pexpect, pf, pfsync, pgx32, php, pie, pike, pinderkent, pkg_install, pkg_select, pkgin, pkglint, pkgmanager, pkgsrc, pkgsrc.se, pkgsrcCon, pkgsrccon, Platforms, plathome, pleiades, pocketsan, podcast, pofacs, politics, polls, polybsd, portability, posix, postinstall, power3, powernow, powerpc, powerpf, pppoe, precedence, preemption, prep, presentations, prezi, products, Products, proplib, protectdrive, proxy, ps, ps3, psp, psrset, pthread, ptp, ptyfs, Publications, puffs, puredarwin, pxe, qemu, qnx, qos, qt, quality-management, quine, quote, quotes, r-project, ra5370, radio, radiotap, raid, raidframe, rants, raptor, raq, raspberrypi, rc.d, readahead, realtime, record, refuse, reiserfs, Release, releases, Releases, releng, reports, resize, restore, ricoh, rijndael, rip, riscos, rng, roadmap, robopkg, robot, robots, roff, rootserver, rotfl, rox, rs323, rs6k, rss, ruby, rump, rzip, sa, safenet, san, sata, savin, sbsd, scampi, scheduler, scheduling, schmonz, sco, screen, script, sdf, sdtemp, secmodel, security, Security, sed, segvguard, seil, sendmail, serial, serveraptor, sfu, sge, sgi, sgimips, sh, sha2, shark, sharp, shisa, shutdown, sidekick, size, slackware, slashdot, slides, slit, smbus, smp, sockstat, soekris, softdep, softlayer, software, solaris, sony, sound, source, source-changes, spanish, sparc, sparc64, spider, spreadshirt, spz, squid, ssh, sshfs, ssp, statistics, stereostream, stickers, storage, stty, studybsd, subfile, sudbury, sudo, summit, sun, sun2, sun3, sunfire, sunpci, support, sus, suse, sushi, susv3, svn, swcrypto, symlinks, sysbench, sysctl, sysinst, sysjail, syslog, syspkg, systat, systrace, sysupdate, t-shirt, tabs, talks, tanenbaum, tape, tcp, tcp/ip, tcpdrop, tcpmux, tcsh, teamasa, tegra, teredo, termcap, terminfo, testdrive, testing, tetris, tex, TeXlive, thecus, theopengroup, thin-client, thinkgeek, thorpej, threads, time, time_t, timecounters, tip, tk1, tme, tmp, tmpfs, tnf, toaster, todo, toolchain, top, torvalds, toshiba, touchpanel, training, translation, tso, tty, ttyrec, tulip, tun, tuning, uboot, ucom, udf, ufs, ukfs, ums, unetbootin, unicos, unix, updating, upnp, uptime, usb, usenix, useradd, userconf, userfriendly, usermode, usl, utc, utf8, uucp, uvc, uvm, valgrind, vax, vcfe, vcr, veriexec, vesa, video, videos, virtex, virtualization, vm, vmware, vnd, vobb, voip, voltalinux, vpn, vpnc, vulab, w-zero3, wallpaper, wapbl, wargames, wasabi, webcam, webfwlog, wedges, wgt624v3, wiki, willcom, wimax, window, windows, winmodem, wireless, wizd, wlan, wordle, wpa, wscons, wstablet, X, x.org, x11, x2apic, xbox, xcast, Xen, xen, xfree, xfs, xgalaxy, xilinx, xkcd, xlockmore, xmms, xmp, xorg, xscale, youos, youtube, zaurus, zdump, zfs, zlib

'nuff. Grab the RSS-feed, index, or go back to my regular NetBSD page

Disclaimer: All opinion expressed here is purely my own. No responsibility is taken for anything.

Access count: 35746415
Copyright (c) Hubert Feyrer