jeffr_tech ([info]jeffr_tech) wrote,
@ 2008-03-12 19:22:00
I have an Opteron with older, slower memory, and I reproduced the pipe tests on it to see whether the results were any different on a 64-bit system. I'm not going to paste the full results, but here are a couple of data points:

linux-2.6.24
64[writer]: 97.235 wall (2.031 usr, 68.674 sys), 10.531 Mb/sec
1024[writer]: 13.300 wall (0.145 usr, 9.039 sys), 76.991 Mb/sec
65536[writer]: 3.068 wall (0.001 usr, 1.718 sys), 333.766 Mb/sec

FreeBSD 8.0-CURRENT undermydesk (no cpu switch patches though)
64[writer]: 53.163 wall (1.057 usr, 42.083 sys), 19.261 Mb/sec
1024[writer]: 5.325 wall (0.118 usr, 4.146 sys), 192.284 Mb/sec
65536[writer]: 0.567 wall (0.000 usr, 0.130 sys), 1805.509 Mb/sec

So on this machine we start off nearly 2x as fast and end up about 5.4x as fast. The numbers pretty much follow a curve through those points. This verifies the data taken from the old 32-bit HTT machine they tested on. I don't intend to post configs and so on, as the original lkml thread is plenty rigorous.
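The ratios quoted above can be checked directly from the throughput column. A minimal sketch in Python, using only the figures posted in this entry (the variable and function names are illustrative):

```python
# Throughput in Mb/sec as reported above, keyed by write size in bytes.
linux = {64: 10.531, 1024: 76.991, 65536: 333.766}
freebsd = {64: 19.261, 1024: 192.284, 65536: 1805.509}


def speedups(base, other):
    """Return the other/base throughput ratio for each write size."""
    return {size: other[size] / base[size] for size in sorted(base)}


ratios = speedups(linux, freebsd)
for size, ratio in ratios.items():
    print(f"{size:>6} bytes: FreeBSD is {ratio:.2f}x Linux")
```

The three ratios come out at roughly 1.8x, 2.5x and 5.4x, so the gap widens as the write size grows.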

I forgot to mention earlier: the FreeBSD Alan Cox has committed superpages! We're seeing some great gains from that. This allows the kernel to automatically use large TLBs for conforming regions of memory. It has a component that ensures that large, contiguous, chunks of physical memory will be available to support this. There is also a defragmenting/compacting piece. There's some great work going into FreeBSD 8.0 already!



(12 comments) - (Post a new comment)


[info]sas_spidey01
2008-03-13 06:56 am UTC (link)
TLB == Translation Lookaside Buffer?



[info]nathan
2008-03-13 08:03 am UTC (link)
yeah, and from what i remember the (intel) x86/x64 have two tlbs, one for large pages (2mb/4mb depending on pae) and another for regular pages (4kb)

i did similar work for our kernel but i wasn't far enough along prior to shipping, and i haven't kept it in sync with current vm codebase.



[info]nathan
2008-03-13 08:08 am UTC (link)
similar, aside from defragmenting/compacting. the physical page allocator minimized/removed the need for this :)

if i wasn't so caught up in other stuff i'd spend some time trying to get it merged back in to the current kernel. right now allocating a large page is a crapshoot after the box has been up for a while due to fragmentation from 4k pages.



[info]jeffr_tech
2008-03-13 08:29 am UTC (link)
On newer processors there are even 1GB TLBs and some intermediate size like 128m or 256m. The larger sizes are useful in the kernel so you always have a valid virtual address for every physical address and you can access them very cheaply.



[info]nathan
2008-03-13 08:31 am UTC (link)
i knew they have this in ia64, so it has carried over to current x64 (intel or amd?) procs?



[info]jeffr_tech
2008-03-13 08:23 am UTC (link)
Yes, that's the TLB. It caches virtual-to-physical mappings for the processor. Most modern processors have multiple sizes of entries, but the pages behind a large entry obviously have to be physically and virtually contiguous. There are very few TLB entries relative to the size of memory, so it's a highly contested resource. A TLB miss can also turn into several cache misses as the page tables are walked to discover the real physical address.
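To put a rough number on why the TLB is so contested, one can compute its "reach": the entry count times the page size. The sketch below assumes a hypothetical 64-entry TLB and 16 GiB of RAM. Neither figure comes from this thread, and real processors vary and often keep separate TLBs per page size. The page sizes are the ones mentioned in the comments here.

```python
KIB, MIB, GIB = 1024, 1024**2, 1024**3


def tlb_reach(entries, page_size):
    """Bytes of memory that can be mapped at once by the given TLB entries."""
    return entries * page_size


ENTRIES = 64      # assumption: hypothetical TLB size, not from the post
RAM = 16 * GIB    # assumption: hypothetical installed memory

for label, page in (("4 KiB", 4 * KIB), ("2 MiB", 2 * MIB), ("1 GiB", GIB)):
    reach = tlb_reach(ENTRIES, page)
    share = 100 * reach / RAM
    print(f"{label} pages: reach {reach // KIB} KiB ({share:.4f}% of RAM)")
```

Under these assumptions, 4 KiB pages cover only 256 KiB at a time, while 2 MiB pages cover 128 MiB with the same number of entries, which is the basic motivation for superpages.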



(Anonymous)
2008-03-13 08:23 am UTC (link)
>This allows the kernel to automatically use large TLBs for conforming regions of memory. It has a component that ensures that large, contiguous, chunks of physical memory will be available to support this.

Sorry for the question again, but will this be an option for MFC?



[info]jeffr_tech
2008-03-13 08:24 am UTC (link)
I'm not involved in that project; however, I doubt it. These kinds of big features often involve ABI-breaking commits that we don't allow in a stable branch. It also gives people motivation to try the new releases. ;)


few points
(Anonymous)
2008-03-13 10:13 am UTC (link)
The pipe benchmark is interesting. I don't know why Linux is so much slower, but I would be interested if somebody works it out (or: why FreeBSD is much faster :)). However, the pipe benchmark is a bit useless because it doesn't even touch the data before it is sent or after it is read... this makes the results basically meaningless as a performance indicator (still technically interesting if you are working on actual implementations). Don't go down a false optimization path with this one.
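For readers who want to see the distinction being drawn here, the following is a minimal sketch of a pipe throughput test with an optional flag that makes the reader touch every byte. It is not the benchmark from the lkml thread: it is an assumption-laden stand-in that uses a reader thread in one process rather than two processes, and in Python the interpreter overhead will dominate small writes. The function name `pipe_throughput` and the `touch` flag are hypothetical.

```python
import os
import threading
import time


def pipe_throughput(block_size, total_bytes, touch=False):
    """Push total_bytes through a pipe using block_size writes.

    Returns (bytes_received, elapsed_seconds). With touch=True the
    reader also sums every byte, so the payload is actually read.
    """
    rfd, wfd = os.pipe()
    received = [0]

    def reader():
        while True:
            chunk = os.read(rfd, block_size)
            if not chunk:              # EOF: the writer closed its end
                break
            if touch:
                sum(chunk)             # force a pass over the payload
            received[0] += len(chunk)
        os.close(rfd)

    thread = threading.Thread(target=reader)
    block = b"x" * block_size
    start = time.perf_counter()
    thread.start()
    sent = 0
    while sent < total_bytes:
        # Trim the final write so exactly total_bytes are sent.
        sent += os.write(wfd, block[: total_bytes - sent])
    os.close(wfd)
    thread.join()
    return received[0], time.perf_counter() - start


for size in (64, 1024, 65536):
    got, secs = pipe_throughput(size, 1 << 20)
    print(f"{size:>6}-byte writes: {got / secs / 1e6:.1f} MB/sec")
```

Running it once with `touch=False` and once with `touch=True` gives a feel for how much of the measured time is pipe overhead versus work on the data itself.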

Regarding superpages... you mean it allows the *userspace* to automatically use large TLBs? I would hope the kernel is already using them?


Re: few points
[info]jeffr_tech
2008-03-13 10:44 am UTC (link)
Well I don't intend to optimize our pipe implementation. It's really the domain of alc these days and I think it's doing just fine. ;) The claim that touching the data will break FreeBSD's perf is false anyway. We don't do page flipping.

Yeah, we use large pages in the kernel. Superpages does it automatically for applications.

http://www.cs.rice.edu/~ssiyer/r/superpages/

The paper is well worth reading. The original implementation was done on Alpha in the early 2000s and is just now getting into FreeBSD! I think SGI also did something like this way back when.


Re: few points
[info]dga
2008-03-13 07:09 pm UTC (link)
Ah, sweet. Thanks for pointing that out -- that was quite possibly my favorite paper at OSDI 2002, and I was wondering if we'd see it emerge in practice.


super pages are super cool
[info]zdzichu.openid.pl
2008-03-17 09:55 am UTC (link)
Solaris also utilises big pages (http://blogs.sun.com/deniss/entry/ultrasparc_t1_low_power_and). Linked blog claims 10% performance gain on Oracle load just from avoiding TLB misses.


