Huge pages
Posted: 8 March, 2013
When upgrading to version 11.70, it’s good to review the list of new features you can take advantage of. IBM has helpfully produced a technical white paper for this purpose. One of these features is huge (or large) pages on Linux, which is of benefit if your system has a large amount of memory allocated to Informix. The primary advantage is a reduction in the size of the page tables used by processes and the system to map virtual addresses to physical memory.
In Linux, huge pages were first supported by the 2.6 kernel and the feature was later back-ported to 2.4.21, although the implementation is not the same. This blog post mostly concerns itself with how huge pages work in x86_64 Linux 2.6 kernels, although I’ll try to point out where any differences may lie in other implementations.
You can have a look at how much space is allocated to page tables in your system by looking at the PageTables parameter in /proc/meminfo. For example, I have a 40 Gb Informix instance running on a server with 128 Gb of memory, and page table entries totalling 1004 Mb (nearly 1 Gb) are needed to support the system.
Standard pages are 4 kb and point to a block of physical memory. In fact each page has an entry in a process page table, which maps to a separate system page table, which in turn maps to physical memory. These page tables can contain a mix of standard and huge pages. By using huge pages, the block size increases to 2 Mb on x86_64 (16 Mb on PowerPC, 256 Mb on Itanium and 1 Mb on System z). My 40 Gb instance would need 10,485,760 page table entries to support it using standard pages but just 20,480 entries using huge pages. A page table entry can be up to 64 bytes.
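As a quick sanity check on those figures, here is a sketch of the arithmetic in shell (the 40 Gb segment size and the x86_64 page sizes are the ones used in this example):

```shell
# Page table entries needed to map a 40 Gb shared memory segment
segment_bytes=$((40 * 1024 * 1024 * 1024))

# Standard 4 kb pages vs 2 Mb huge pages on x86_64
std_entries=$((segment_bytes / 4096))
huge_entries=$((segment_bytes / (2 * 1024 * 1024)))

echo "Standard pages: ${std_entries} entries"   # 10485760
echo "Huge pages:     ${huge_entries} entries"  # 20480
```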
In fact the gains are even better than this because modern CPUs use a Translation Lookaside Buffer (TLB) to cache page table entries, and these buffers are of a fixed size, typically able to hold a few thousand entries. There is a good Wikipedia article that explains this in more detail. Page tables containing lots of standard pages therefore lead to more TLB misses, where the operating system has to fetch other parts of the page table from main memory.
Huge pages are not used system-wide. Your system administrator must allocate an area of memory to huge pages as follows:
sysctl -w vm.nr_hugepages=<no. of huge pages>
Note that <no. of huge pages> is a number of pages, not a size in Mb. On 2.4 kernels the parameter is vm.hugetlb_pool.
It may be necessary to reboot your server so that Linux can allocate the memory: huge pages must be backed by contiguous physical memory, which may not be available on a system that has been running for some time.
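Since sysctl -w does not survive a reboot, you will normally want the setting to be reapplied at boot time as well. The usual approach (an assumption on my part; check your distribution’s conventions) is to add it to /etc/sysctl.conf:

```
# /etc/sysctl.conf -- reserve 40960 huge pages (40960 x 2 Mb = 80 Gb)
vm.nr_hugepages = 40960
```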
For an Informix instance or set of instances running on a server, a sensible size would be the total memory footprint of all the instances. This can easily be obtained by running onstat -. You might also want to allocate some space for dynamic memory allocations too, although these can use standard pages if no huge pages are available.
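To turn a memory footprint into a value for vm.nr_hugepages, divide by the 2 Mb huge page size and round up. A sketch, where the 42 Gb footprint is purely illustrative:

```shell
# Convert an instance's memory footprint (in kb, as reported by onstat)
# into the number of 2 Mb huge pages needed, rounding up
footprint_kb=44040192            # illustrative: a 42 Gb instance
huge_page_kb=2048                # 2 Mb huge pages on x86_64
nr_hugepages=$(( (footprint_kb + huge_page_kb - 1) / huge_page_kb ))
echo "vm.nr_hugepages = ${nr_hugepages}"   # 21504
```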
On Linux (and Solaris) Informix will automatically use huge pages if enough huge pages have been allocated and where the RESIDENT flag is set in onconfig to -1 or to a high enough value to cover your segments. You can control this with the IFX_LARGE_PAGES environment variable.
It’s important to understand that huge pages can only be used by processes that support them and cannot be swapped out, so you need to leave enough normal pages for the operating system and any other processes.
On start-up, the server will put a message in the online log to show that huge pages are being used:
10:00:00 IBM Informix Dynamic Server Started.
10:00:00 Shared memory segment will use huge pages.
10:00:00 Segment locked: addr=0x44000000, size=39520829440
10:00:00 Shared memory segment will use huge pages.
10:00:00 Segment locked: addr=0x977a00000, size=42949672960
You might expect that onstat -g seg would then show you that huge pages are in use for a given segment but this is not the case.
What happens if the server needs to allocate an extra virtual segment? As usual the value of SHMADD will determine the size of the segment and Informix will check to see if there are sufficient huge pages available for it. If not, it will use normal pages and a message like the one below will appear in the online log:
10:30:00 Warning: Server is unable to lock huge pages in memory.
Switching to normal pages.
10:30:00 Dynamically allocated new virtual shared memory segment (size 1048576KB)
10:30:00 Memory sizes:resident:38594560 KB, virtual:43013440 KB, no SHMTOTAL limit
10:30:00 Segment locked: addr=0x1378f50000, size=1073741824
You can monitor huge page use, again using /proc/meminfo:
> cat /proc/meminfo | grep HugePages
HugePages_Total:   40960
HugePages_Free:     2518
HugePages_Rsvd:      883
Comparing this with the output from onstat -g seg I have:
Segment Summary:
id       key      addr      size        ovhd      class blkused  blkfree
38404103 52bb4801 44000000  39520829440 463568504 R*    9648205  435
38436872 52bb4802 977a00000 42949672960 503318520 V*    8793768  1691992
Total:   -        -         82470502400 -         -     18441973 1692427
No obvious relationship? I know from the online log that both the resident and virtual segments are using huge pages. If we take the total huge pages, subtract the free pages and add the reserved ones, we get (40960 - 2518 + 883) = 39325 pages. Converting that into bytes at 2 Mb per page, (39325 * 2048 * 1024) = 82470502400, which is exactly the total size of the two segments.
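That accounting can be scripted as a quick check. A sketch using the figures from this example (on a live system you would read the HugePages_* values from /proc/meminfo instead of hard-coding them):

```shell
# Huge page accounting: pages in use = total - free + reserved
total=40960
free=2518
rsvd=883

pages_used=$((total - free + rsvd))
bytes_used=$((pages_used * 2048 * 1024))   # 2 Mb per huge page

echo "${pages_used} pages in use, ${bytes_used} bytes"   # 39325 pages, 82470502400 bytes
```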