Status of superpages support

Superpages are mostly done in -CURRENT, and everyone who wants to test them is welcome. There are a few caveats; interested users should read this post by the author of the superpages implementation. The post is quoted and annotated in the second part of this article.

Superpages are not enabled by default but support for them is a standard
part of the amd64 and i386 kernel on HEAD. You can enable superpages on
either of those architectures by simply setting the tunable
vm.pmap.pg_ps_enabled to a non-zero value at boot time.

E.g. add vm.pmap.pg_ps_enabled=1 to /boot/loader.conf to enable them.
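
As a quick sanity check before rebooting, the loader.conf setting can be verified with a few lines of Python. This is only a sketch: the helper name `superpages_enabled` is made up for illustration, and the parser assumes the simple `name=value` / `name="value"` form with `#` comments rather than the full loader.conf syntax. On a running system, the value should also be visible via `sysctl vm.pmap.pg_ps_enabled`.

```python
def superpages_enabled(loader_conf_text):
    """Return True if the text sets vm.pmap.pg_ps_enabled to a non-zero value.

    Hypothetical helper: handles only plain name=value lines and '#' comments.
    If the tunable appears more than once, the last assignment wins.
    """
    enabled = False
    for raw in loader_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() != "vm.pmap.pg_ps_enabled":
            continue
        value = value.strip().strip('"')
        try:
            enabled = int(value) != 0
        except ValueError:
            enabled = False  # a non-numeric value is treated as "not enabled"
    return enabled


if __name__ == "__main__":
    with open("/boot/loader.conf") as f:
        print("superpages enabled at next boot:", superpages_enabled(f.read()))
```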

In general, I would strongly encourage people to enable superpages if their
system uses Pentium 4, Core 2, or AMD tri/quad-core ("k10") processors. On
these processors, superpages may or may not help your application's
performance, but they are very unlikely to ever hurt it. Single/dual-core
Opterons and Athlon 64s ("k8") work fine. In fact, I did most of the
development on Opterons. However, it's "50-50" whether or not a given
application will speed up or slow down on a k8-family processor because of
their small TLB for large page mappings.

This is quite clear. The greatest benefit from superpages will be seen on a Barcelona-class Opteron or Phenom CPU and on recent Xeons (e.g. Harpertown 54xx). For a detailed explanation why, read the paper describing TLB behaviour in scientific benchmarks.

It's worth mentioning that there are other secondary benefits to
superpages. They reduce the memory consumed by pv entries ("reverse
physical to virtual mappings"), and they reduce the cost of forking an
address space.

This situation is often encountered in web servers and database servers that use the "prefork" (or similar) model of serving clients (e.g. Apache, PostgreSQL). A frequent manifestation of this problem is the kernel message "approaching the limit of PV entries..." seen on heavily loaded servers. Superpages will help there.
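
To get a feeling for the size of the pv entry savings, here is a rough back-of-the-envelope calculation. It is a sketch under simplifying assumptions: one pv entry per mapped page per process, every mapping fully promoted to a superpage (the best case, which real workloads will not reach), and 4 KB base pages with 2 MB superpages as on amd64. The workload numbers are invented for illustration.

```python
BASE_PAGE = 4 * 1024            # 4 KB base page
SUPERPAGE = 2 * 1024 * 1024     # 2 MB superpage (amd64)


def pv_entries(mapped_bytes, processes, page_size):
    """Estimate pv entries: one per mapped page, per process (assumption)."""
    pages = -(-mapped_bytes // page_size)  # ceiling division
    return pages * processes


if __name__ == "__main__":
    # Hypothetical prefork server: 100 processes each mapping the same 1 GB.
    mapped = 1024 * 1024 * 1024
    procs = 100
    print("4 KB pages: ", pv_entries(mapped, procs, BASE_PAGE))
    print("superpages: ", pv_entries(mapped, procs, SUPERPAGE))
```

In this idealised case the number of pv entries drops by a factor of 512, which is simply the ratio of the two page sizes.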

The performance benefits are a function of two parameters: The first being
the underlying machine, mainly, the structure of its TLB. The second has to
do with the application's memory access patterns. Roughly speaking,
applications that use a large amount of memory will benefit, but there are
other factors.
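
The dependence on TLB structure can be illustrated with the usual "TLB reach" arithmetic: the amount of memory a TLB can cover is the number of entries times the page size. The entry counts below are hypothetical and do not describe any particular processor; they only show why a CPU with few large-page TLB entries may gain little, while one with many gains a lot.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries, page_size):
    """Memory (in bytes) covered by a TLB with the given number of entries."""
    return entries * page_size


if __name__ == "__main__":
    # Hypothetical TLB configurations, for illustration only.
    configs = [
        ("512 entries x 4 KB pages", 512, 4 * KIB),
        ("  8 entries x 2 MB pages", 8, 2 * MIB),
        ("128 entries x 2 MB pages", 128, 2 * MIB),
    ]
    for label, entries, page_size in configs:
        print(label, "->", tlb_reach(entries, page_size) // MIB, "MB")
```

An application whose working set fits within the reach of the large-page TLB will see far fewer misses; one that jumps between more superpages than the TLB has entries for may not.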

An interesting question is how applications that do a lot of IPC (e.g. web servers) respond to superpages. I intend to find out :)

#1 Superpages in 7.x

Added on 2009-05-20T01:14 by Ivan Voras

MFCed into 7.2.
