Huge pages

When upgrading to version 11.70, it’s worth reviewing the list of new features you can take advantage of. IBM has helpfully produced a technical white paper for this purpose. One of these features is support for huge (or large) pages on Linux, which is of benefit if your system has a large amount of memory allocated to Informix. The primary advantage is a reduction in the size of the (separate) page tables used by processes and the system to map to physical memory.

In Linux, huge pages were first supported by the 2.6 kernel and the feature was later back-ported to 2.4.21, although the implementation is not the same. This blog post mostly concerns itself with how huge pages work in x86_64 Linux 2.6 kernels, although I’ll try to point out where other implementations differ.

You can see how much space is allocated to page tables on your system by looking at the PageTables parameter in /proc/meminfo. For example, I have a 40 GB Informix instance running on a server with 128 GB of memory, and page table entries totalling 1004 MB (nearly 1 GB) are needed to support the system.
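As a rough illustration of reading that value programmatically, here is a minimal Python sketch. The function name page_tables_kb is my own, and the sample text is illustrative only: the PageTables line is back-calculated from the 1004 MB figure above and the MemFree line is a made-up placeholder, not real output from my server.

```python
def page_tables_kb(meminfo_text):
    """Return the PageTables value in kB from /proc/meminfo-style text, or None."""
    for line in meminfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "PageTables":
            # The value looks like "  1028096 kB"; take the number only.
            return int(value.split()[0])
    return None


# Illustrative input only (see the note above); not captured from a real system.
sample = "MemFree:       1234567 kB\nPageTables:    1028096 kB\n"

kb = page_tables_kb(sample)
print(f"PageTables: {kb / 1024:.0f} MB")
```

On a live Linux system you would pass in the contents of /proc/meminfo instead of the sample string, for example open("/proc/meminfo").read().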

Standard pages are 4 KB and point to a block of physical memory. In fact each page has a separate entry in a process page table, which then maps to a separate system page table, which in turn maps to physical memory. These page tables can contain a mix of standard and huge pages. By using huge pages, the block size increases to 2 MB on x86_64 (16 MB on Power PC, 256 MB on Itanium and 1 MB on System z). My 40 GB instance would need 10,485,760 page table entries to support it using standard pages but just 20,480 entries using huge pages. A page table entry can be up to 64 bytes.
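The entry counts above are simple division, which the following small Python sketch reproduces. The helper name page_table_entries is mine, and it assumes the x86_64 page sizes quoted above.

```python
GIB = 1024 ** 3


def page_table_entries(mapped_bytes, page_bytes):
    """Number of pages (and so page table entries) needed to map mapped_bytes."""
    # Ceiling division: a partly used page still needs a whole entry.
    return -(-mapped_bytes // page_bytes)


instance_bytes = 40 * GIB
standard = page_table_entries(instance_bytes, 4 * 1024)        # 4 KB pages
huge = page_table_entries(instance_bytes, 2 * 1024 * 1024)     # 2 MB pages

print(standard, huge, standard // huge)
```

Each 2 MB huge page replaces 512 standard pages, which is where the reduction from 10,485,760 to 20,480 entries comes from.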

In fact the gains are even better than this because modern CPUs use a Translation Lookaside Buffer (TLB) to cache the page tables and these are of a fixed size, typically able to hold a few thousand entries. There is a good Wiki article that explains this in more detail. Page tables containing lots of standard pages therefore lead to more TLB misses where the operating system has to fetch other parts of the page table from system memory.

Huge pages are not used system-wide. Your system administrator must allocate an area of memory to huge pages as follows:

sysctl -w vm.nr_hugepages=<no. of huge pages>

Note that <no. of huge pages> is a count of pages, not a size in MB. On 2.4 kernels the parameter is vm.hugetlb_pool.

It may be necessary to reboot your server so that Linux can allocate the memory.

For an Informix instance or set of instances running on a server, a sensible size would be the total memory footprint of all the instances. This can be easily obtained by running onstat -. You might also want to allocate some space for dynamic memory allocations, although these can use standard pages if no huge pages are available.
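To turn a memory footprint into a value for vm.nr_hugepages, something like the Python sketch below could be used. This is only my own illustration: the function name huge_pages_needed is hypothetical, the segment sizes are the two from the onstat -g seg output later in this post, and the 1 GB of headroom is an assumption standing in for one extra virtual segment.

```python
HUGE_PAGE_BYTES = 2 * 1024 * 1024  # 2 MB huge pages, as on x86_64


def huge_pages_needed(segment_sizes, headroom_bytes=0, page_bytes=HUGE_PAGE_BYTES):
    """Huge pages needed to hold every segment plus optional headroom."""
    def pages(nbytes):
        # Ceiling division: round each allocation up to whole huge pages.
        return -(-nbytes // page_bytes)

    # Each segment is rounded up separately because it is mapped separately.
    return sum(pages(size) for size in segment_sizes) + pages(headroom_bytes)


# Resident and virtual segment sizes in bytes, taken from later in this post.
segments = [39520829440, 42949672960]

base = huge_pages_needed(segments)
# Assumed headroom: room for one additional 1 GB virtual segment.
with_headroom = huge_pages_needed(segments, headroom_bytes=1024 ** 3)

print(base, with_headroom)
```

The result is a page count, which is the unit that vm.nr_hugepages expects.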

On Linux (and Solaris) Informix will automatically use huge pages if enough of them have been allocated and the RESIDENT parameter in onconfig is set to -1 or to a value high enough to cover your segments. You can control this with the IFX_LARGE_PAGES environment variable.

It’s important to understand that huge pages can only be used by processes that support them, and that they cannot be swapped out, so you need to leave enough normal pages for the operating system and any other processes.

On start-up, the server will put a message in the online log to show that huge pages are being used:

10:00:00 IBM Informix Dynamic Server Started.
10:00:00 Shared memory segment will use huge pages.
10:00:00 Segment locked: addr=0x44000000, size=39520829440
10:00:00 Shared memory segment will use huge pages.
10:00:00 Segment locked: addr=0x977a00000, size=42949672960

You might expect that onstat -g seg would then show you that huge pages are in use for a given segment but this is not the case.

What happens if the server needs to allocate an extra virtual segment? As usual, the value of SHMADD will determine the size of the segment, and Informix will check whether there are sufficient huge pages available for it. If not, it will use normal pages and a message like the one below will appear in the online log:

10:30:00 Warning: Server is unable to lock huge pages in memory.
Switching to normal pages.
10:30:00 Dynamically allocated new virtual shared memory segment (size 1048576KB)
10:30:00 Memory sizes:resident:38594560 KB, virtual:43013440 KB, no SHMTOTAL limit
10:30:00 Segment locked: addr=0x1378f50000, size=1073741824

You can monitor huge page use, again using /proc/meminfo:

> cat /proc/meminfo | grep HugePages
HugePages_Total: 40960
HugePages_Free:   2518
HugePages_Rsvd:    883

Comparing this with the output from onstat -g seg I have:

Segment Summary:
id         key        addr             size             ovhd     class blkused  blkfree 
38404103   52bb4801   44000000         39520829440      463568504 R*    9648205  435     
38436872   52bb4802   977a00000        42949672960      503318520 V*    8793768  1691992 
Total:     -          -                82470502400      -        -     18441973 1692427

No obvious relationship? I know from the online log that both the virtual and resident segments are using huge pages. If we take the total huge pages, subtract the free pages and add the reserved pages, we get (40960 - 2518 + 883) = 39325 pages. Converting that into bytes, (39325 * 2048 * 1024), gives 82470502400, which is the total size of the two segments.
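The same check can be scripted. Below is a minimal Python sketch that parses the HugePages counters shown above and compares the result with the segment total from onstat -g seg. The function names are my own, and the 2 MB huge page size is assumed rather than read from the Hugepagesize line of /proc/meminfo.

```python
def parse_hugepage_counters(meminfo_text):
    """Collect the HugePages_* counters from /proc/meminfo-style text."""
    counters = {}
    for line in meminfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().startswith("HugePages_"):
            counters[name.strip()] = int(value.split()[0])
    return counters


def committed_huge_page_bytes(counters, huge_page_bytes=2048 * 1024):
    """Bytes of huge pages in use or reserved: (total - free + reserved) pages."""
    pages = (counters["HugePages_Total"]
             - counters["HugePages_Free"]
             + counters["HugePages_Rsvd"])
    return pages * huge_page_bytes


# The /proc/meminfo lines shown earlier in this post.
sample = """HugePages_Total: 40960
HugePages_Free:   2518
HugePages_Rsvd:    883
"""

counters = parse_hugepage_counters(sample)
committed = committed_huge_page_bytes(counters)

# Resident and virtual segment sizes from the onstat -g seg output.
segment_total = 39520829440 + 42949672960

print(committed, committed == segment_total)
```

Reserved pages are added back because they are promised to a segment but still counted as free until they are actually touched.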

3 Comments on “Huge pages”

  1. Ben Thompson says:

    One extra thing to watch for with huge pages is the situation where old shared memory segments are still in memory after an engine crash. When aborting, Informix sends a request to the operating system to clear out memory for security reasons. This clear-out can take a little while if your instance has a large memory footprint.

    In the meantime it is possible to attempt to restart the engine. If there are insufficient huge pages, Informix may attempt to start your instance in normal pages instead, and you may not have enough normal pages to support your instance if you dedicated most of the server memory to huge pages. This can lead to swapping and very poor performance.

    An automatic start-up script can check there are sufficient huge pages before attempting to start Informix, and when starting manually you can check the online log to see whether huge pages were used.

  2. Ben Thompson says:

    One Linux-specific defect to be aware of around huge pages:
    IC86077: LINUX: WHEN USING HUGE MEMORY AND RESIDENT SET TO -1, SERVER IS UNABLE TO ADD V SEGMENTS AFTER THE INITIAL V ALLOCATIONS
    http://www-01.ibm.com/support/docview.wss?crawler=1&uid=swg1IC86077

