hubertf's NetBSD Blog
Send interesting links to hubert at feyrer dot de!
 
[20161109] Looking at the scheduler issue again (Updated)
I've encountered a funny scheduler behaviour the other day in a Xen enviroment. The behaviour was that CPU load was not distributed evenly on all CPUs, i.e. in my case on a 2-CPU-system, two CPU-bound processes fought over the same CPU, leaving the other one idle.

I had another look at this today, and was able to reproduce the behaviour using VMWare Fusion with two CPU cores on both NetBSD 7.0_STABLE as well as -current, both with sources as of today, 2016-11-08. I've also made a screenshot available that shows the issue on both systems. I have also filed a problem report to document the issue.

The one hint that I got so far was from Michael van Elst that there may be a rounding error in sched_balance(). Looking at the code, there is not much room for a rounding error. But I am not familiar enough (at all) with the code, so I cannot judge if crucial bits are dropped here, or how that function fits in the whole puzzled.

Update: Pondering on the "rounding error", I've setup both VMs with 4 CPUs, and the behaviour shown there is that load is distributed to about 3 and a half CPU - three CPUs under full load, and one not reaching 100%. There's definitely something fishy in there. See screenshot.

Splitting up the four CPUs on different processor sets with one process assigned to each set (using psrset(8)) leads to an even load distribution here, too. This leads me to thinking that the NetBSD scheduling works well between different processor sets, but is busted within one set.

[Tags: , , ]



[20161105] NetBSD 7.0/xen scheduling mystery, and how to fix it with processor sets
Today I had a need to do some number crunching using a home-brewn C program. In order to do some manual load balancing, I was firing up some Amazon AWS instances (which is Xen) with NetBSD 7.0. In this case, the system was assigned two CPUs, from dmesg:
    # dmesg | grep cpu
    vcpu0 at hypervisor0: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, id 0x306e4
    vcpu1 at hypervisor0: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, id 0x306e4
I started two instances of my program, with the intent to have each one use one CPU. Which is not what happened! Here is what I observed, and how I fixed things for now.

I was looking at top(1) to see that everything was running fine, and noticed funny WCPU and CPU values:

      PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
      2791 root      25    0  8816K  964K RUN/0     16:10 54.20% 54.20% myprog
      2845 root      26    0  8816K  964K RUN/0     17:10 47.90% 47.90% myprog
I expected something like WCPU and CPU being around 100%, assuming that each process was bound to its own CPU. The values I actually saw (and listed above) suggested that both programs were fighting for the same CPU. Huh?!

top's CPU state shows:

    load averages:  2.15,  2.07,  1.82;               up 0+00:45:19        18:00:55
    27 processes: 2 runnable, 23 sleeping, 2 on CPU
    CPU states: 50.0% user,  0.0% nice,  0.0% system,  0.0% interrupt, 50.0% idle
    Memory: 119M Act, 7940K Exec, 101M File, 3546M Free
Which is not too useful. Typing "1" in top(1) lists the actual per-CPU usage instead:
    load averages:  2.14,  2.08,  1.83;               up 0+00:45:56        18:01:32
    27 processes: 4 runnable, 21 sleeping, 2 on CPU
    CPU0 states:  100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle
    CPU1 states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
    Memory: 119M Act, 7940K Exec, 101M File, 3546M Free
This confirmed my suspicion that both processes were bound to one CPU, and that the other one was idling. Bad! But how to fix?

One option is to kick your operating system out of the window, but I still like NetBSD, so here's another solution: NetBSD allows to create "processor sets", assign CPU(s) to them and then assign processes to the processor sets. Let's have a look!

Processor sets are manipulated using the psrset(8) utility. By default all CPUs are in the same (system) processor set:

    # psrset
    system processor set 0: processor(s) 0 1
First step is to create a new processor set:
    # psrset -c
    1
    # psrset
    system processor set 0: processor(s) 0 1
    user processor set 1: empty
Next, assign one CPU to the new set:
    # psrset -a 1 1
    # psrset
    system processor set 0: processor(s) 0
    user processor set 1: processor(s) 1
Last, find out what the process IDs of my two (running) processes are, and assign them to the two processor sets:
    # ps -u 
    USER  PID %CPU %MEM   VSZ  RSS TTY     STAT STARTED     TIME COMMAND
    root 2791 52.0  0.0  8816  964 pts/4   R+    5:28PM 22:57.80 myprog
    root 2845 50.0  0.0  8816  964 pts/2   R+    5:26PM 23:33.97 myprog
    #
    # psrset -b 0 2791
    # psrset -b 1 2845
Note that this was done with the two processes running, there is no need to stop and restart them! The effect of the commands is imediate, as can be seen in top(1):
    load averages:  2.02,  2.05,  1.94;               up 0+00:59:32        18:15:08
    27 processes: 1 runnable, 24 sleeping, 2 on CPU
    CPU0 states:  100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle
    CPU1 states:  100% user,  0.0% nice,  0.0% system,  0.0% interrupt,  0.0% idle
    Memory: 119M Act, 7940K Exec, 101M File, 3546M Free
    Swap:

      PID USERNAME PRI NICE   SIZE   RES STATE      TIME   WCPU    CPU COMMAND
     2845 root      25    0  8816K  964K CPU/1     26:14   100%   100% myprog
     2791 root      25    0  8816K  964K RUN/0     25:40   100%   100% myprog
Things are as expected now, with each program being bound to its own CPU.

Now why this didn't happen by default is left as an exercise to the reader. Hints that may help:

    # uname -a
    NetBSD foo.eu-west-1.compute.internal 7.0 NetBSD 7.0 (XEN3_DOMU.201509250726Z) amd64
    # dmesg
    ...
    hypervisor0 at mainbus0: Xen version 4.2.amazon
    VIRQ_DEBUG interrupt using event channel 3
    vcpu0 at hypervisor0: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, id 0x306e4
    vcpu1 at hypervisor0: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz, id 0x306e4 
AWS Instance type: c3.large
AMI ID: NetBSD-x86_64-7.0-201511211930Z-20151121-1142 (ami-ac983ddf)

[Tags: , , , , , ]


[20161030] NetBSD 7.0.2 released
Why 7.0.2? Following NetBSD's release scheme, there are major releases (e.g. 7.0) with subsequent updates (e.g. 7.1). Those "major" release and their updates include both new features as well as bug fixes - the latter one again with and without security relevance. New code, new risks - as a result for getting updates, existing interfaces may change and lead to incompatibiltites. This may affect either binary compatibility between programs and their required shared libraries, as well - though rare - incompatible chances on the source code level.

NetBSD takes quite some effort to keep such incompatibilites low, yet they happen. The only real solutions is: no updates. "Never change a running system" is nice for availability, but it poses security risks. The time when a big server uptime was considered a sign of good system administration are gone. Today, a long update means the system (probably) runs outdated and as such vulnerable code.

So to solve the problem a compromise is needed: little updates, but crucial security updates do get done. Which is where NetBSD's "minor" release like NetBSD 7.0.2 come into play. With its set of changes, a number of external software packages got security-related updates (e.g. OpenSSL, NTP, BIND, X), and a smaller number of security related changes were also added, e.g. a race condition in mail.local(8), crashes in the Networking File System (NFS) and the native Fast File System (FFS) plus some platform-specific crashes on MIPS, PowerPC and SPARC64.

For more information on downloading and installation see the release announcement as well as the platform-specific install documentation, e.g. for NetBSD 7.0.2/arm64's INSTALL.html file.

[Tags: , , , , , , ]



[20161007] Interview with spz@ on BSDnow
There is an interview of Petra "spz@" at BSDnow. She talks about how she got into Unix and NetBSD, and talks about all the different hats she has in the NetBSD Project and The NetBSD Foundation, TNF. The interview starts at Minute 26 - have a look!

[Tags: , ]


[20160521] Catching up: audio-mixing, arm, x86 and amd64 platform improvements and security
A few noteworthy things have happened in NetBSD land, and being lazy I will collect them in one blog posting. Here we go:
  • In-kernel audio mixing: So far, NetBSD's audio device can only be opened once. If more than one application wants to play sound, the first one wins. This is suboptimal if you want to (say) play some MP3s but also get some occasional noise from your webbrowser.

    Now, Nathanial Sloss has made a stab at this, providing several implementation choices. Challenges in the task are that sounds with different quality (sampling rate, mono/stereo etc.) need to be brought to one common quality before mixing and passing on to the actual audio hardware. Further fun is added by the delay this process adds. See the discussion on tech-kern for all the gory details!

  • Freescale i.MX7 support: Ryo Shimizu has committed support for the Freescale i.MX7 processor and the Atmark Techno Armadillo-IoT G3 board. according to his posting to port-arm (dmesg included), UART, Ethernet, USB, SDHC, RTC, GPIO, WDOG and MULTIPROCESSOR work. Interesting thing of the platform is that is has two Cortex-A7 cores and one Cortex-M4 core, the latter without MMU. Ideas on how to use the latter are welcome! :)

  • PIE binaries with PaX, ASLR+MPROTECT are now the default for i386. ASLR and MPROTECT can be turned off either globally or per-binary if any problems should arise. Be sure to document those exceptions in your risk management! :-)

    More information: PaX, PIE, ASLR, MPROTECT.

  • Platform improvements for i386 and amd64. For amd64, Maxime Villard writes:
     - I cleaned up the asm code and fixed several comments, which makes the
       boot process much easier to understand.
     - I fixed the alignment for the text segment, so that it can be covered by
       more large pages [1] - thereby reducing TLB contention.
     - I fixed a bug in the way the secondary CPUs are launched [2], which
       caused them to crash if they tried to access an X-less page.
     - I took rodata out of the text+rodata chunk, and put it in the data+bss+
       PRELOADED_MODULES+BOOTSTRAP_TABLES chunk [3]. rodata was no longer large
       page optimized, and had RWX permissions.
     - I retook rodata out of the rodata+data+bss+PRELOADED_MODULES+
       BOOTSTRAP_TABLES chunk, and made the kernel map it independently without
       the W permision [4].
     - I made the kernel map rodata without the X permission, by using the NOX
       bit on its pages [5] (now that the secondary CPUs could handle that
       properly).
     - I took the data+bss chunk out of the data+bss+PRELOADED_MODULES+
       BOOTSTRAP_TABLES chunk, and made the kernel map it independently without
       X permission [6].
     - I made the kernel remap rodata and data+bss with large pages and proper
       permissions [7] - which reduces once again TLB contention.
    
    See Maxime's posting to tech-kern for all the footnotes. Likewise, Maxime also tackled i386, and besides the changes from amd64, here is the list of changes from his email:
     - on non-PAE i386, NOX does not exist. Therefore the mappings all have an
       additional X permission. To benefit from X-less mappings, your CPU must
       support PAE, and your kernel must be GENERIC_PAE.
     - the segments are not large-page-aligned, which means that probably some
       parts of the segments are still mapped with normal pages. It is still more
       optimized than it used to be, but not as much as amd64 is.
    


[Tags: , , , , , , , , ]


[20160501] Bootstrap pkgsrc under 'bash on Windows'
Much bruha was made about Windows running Linux userland recently. Leaving out the fact that emulating other operating systems is something that NetBSD does for ages, there is one real challenge that every Linux user faces when he has set up his operating system: getting software installed easily. And of course there is only one truely portable answer to that question: use pkgsrc, of course!

The process is pretty much straight forward, and Ryo ONODERA has verified the prerequired Windows versions and Linux packages, and has sent instructions on how to bootstrap pkgsrc on Windows 10. Now who's the first one to post a screenshot with output of pkgsrc/misc/cowsay running "cowsay hello pkgsrc"? :-)

[Tags: , , , ]



[20160430] OpenHUB's NetBSD Project Statistics
This flew by on Twitter (thanks ajcc @6LR61!), and I think it's neat so I point to it here: BlackDuck's OpenHUB has a number of NetBSD project statistics, generated automatically. Statis include activity and vulnerability reports, languages, lines-of-code statistics (with comment and blank lines), 30 day and 12 month activity reports with commit and contributor numbers, number of contributers per month since 1993 and more. In a nutshell, NetBSD consists of 5902 years of effort. Have a look!

[Tags: ]


[20160424] NetBSD and Google's Summer of Code 2016: Projects announced
This year, NetBSD is part of Google's Summer of Code again, and the students that will work on NetBSD projects and what their project proposals this year are have been announced: Have a look at the links to learn more about the students and the projects. To all the students - welcome to NetBSD! :-)

[Tags: ]


[20160422] Two more NetBSD Security Advisories: compatibility layers, Bozohttpd
Two more security advisories have been released:

[Tags: , , ]


[20160416] NetBSD Security Advisories: ntp, libXfont, calendar
NetBSD has released a number of security advisories:
  • 2016-001: Multiple vulnerabilities in ntp daemon
  • 2016-002: BDF file parsing issues in libXfont
  • 2016-003: Privilege escalation in calendar(1)
See the advisories for more information on NetBSD releases that are and are not affected, the severity of the vulnerability as well as the date by which which NetBSD release branch was fixed.

The advisories also contain an abstract of the problem as well as in-depth technicals with solutions and workarounds. Go and have a look!

[Tags: , , ]



More recent 10 entriesPrevious 10 entries
Disclaimer: All opinion expressed here is purely my own. No responsibility is taken for anything.

Access count: 24352598
Copyright (c) Hubert Feyrer