March 12th, 2008

(no subject)

A couple bits of news; We tracked down our problem with the performance drop above 30 threads on Nick Piggin's mysql benchmark to conservative settings for the pthread adaptive spinning. We see a big gain relative to where we were before. Frankly at this point we're splitting hairs with linux and I don't really care where we stand. We had a tremendous problem and we resolved it. Time to move on..

I removed kernel support for our M:N threading library last night. 8.0 will only support 1:1. This will open the way to a lot of optimizations in the signal and sleeping paths. Hopefully reducing the total number of locks required in the sleepq path to a minimum.

There are some 'interesting' pipe benchmarks floating around. You can read about it on the lkml and the author's website:

http://213.148.29.37/PipeBench/
http://lkml.org/lkml/2008/3/5/61

I say 'interesting' of course because FreeBSD is doing way better than linux. ;) pipes, the next battleground? I don't know but it's worth a read anyway.

I also have a patch to implement cpu affinity for our callout mechanism. This is for time based callbacks. The legacy callouts may have order dependencies or may not tolerate concurrency. So by default they are all scheduled on the first callout thread. There is one callout thread per-cpu and they have a kind of 'medium' affinity for that cpu, however, if they are overloaded by some interrupt work another cpu can complete the callouts. This removes the need to do any kind of load balancing across callout handlers because the scheduler can do a better job anyway. New callouts can specify any cpu when setting a timer and then they have an affinity for that thread until a different cpu is requested. All migration is explicit.

Hopefully having callout affinity will benefit our tcp stack where Robert Watson is experimenting with different kinds of affinity for tcp sessions. It will also discourage migration of threads who are sleeping on time based events like select and nanosleep().

(no subject)

I have an opteron with older slower memory that I reproduced the pipe tests on to see if it was any different on a 64bit system. I'm not going to paste the full results but here's a couple of data points:

linux-2.6.24
64[writer]: 97.235 wall (2.031 usr, 68.674 sys), 10.531 Mb/sec
1024[writer]: 13.300 wall (0.145 usr, 9.039 sys), 76.991 Mb/sec
65536[writer]: 3.068 wall (0.001 usr, 1.718 sys), 333.766 Mb/sec

FreeBSD 8.0-CURRENT undermydesk (no cpu switch patches though)
64[writer]: 53.163 wall (1.057 usr, 42.083 sys), 19.261 Mb/sec
1024[writer]: 5.325 wall (0.118 usr, 4.146 sys), 192.284 Mb/sec
65536[writer]: 0.567 wall (0.000 usr, 0.130 sys), 1805.509 Mb/sec

So on this machine we start of 2x as fast and end up 5.5x as fast. The numbers pretty much follow a curve through those points. This verifies the data taken from the old 32bit HTT machine they tested on. I don't intend to post configs and so on as the original lkml thread is plenty rigorous enough.

I forgot to mention earlier. The FreeBSD Alan Cox has committed super-pages! We're seeing some great gains from that. This allows the kernel to automatically use large TLBs for conforming regions of memory. It has a component that ensures that large, contiguous, chunks of physical memory will be available to support this. There is also a defragmenting/compacting piece. There's some great work going into FreeBSD 8.0 already!