The Java heap, where every Java object is allocated, is the area of memory you're most intimately connected with when writing Java applications. The JVM was designed to insulate us from the host machine's peculiarities, so it's natural to think about the heap when you think about memory. You've no doubt encountered a Java heap OutOfMemoryError — caused by an object leak or by not making the heap big enough to store all your data — and have probably learned a few tricks to debug these scenarios. But as your Java applications handle more data and more concurrent load, you may start to experience OutOfMemoryErrors that can't be fixed using your normal bag of tricks — scenarios in which the errors are thrown even though the Java heap isn't full. When this happens, you need to understand what is going on inside your Java Runtime Environment (JRE).

Java applications run in the virtualized environment of the Java runtime, but the runtime itself is a native program written in a language (such as C) that consumes native resources, including native memory. Native memory is the memory available to the runtime process, as distinguished from the Java heap memory that a Java application uses. Every virtualized resource — including the Java heap and Java threads — must be stored in native memory, along with the data used by the virtual machine as it runs. This means that the limitations on native memory imposed by the host machine's hardware and operating system (OS) affect what you can do with your Java application.
This article is one of two covering the same topic on different platforms. In both, you'll learn what native memory is, how the Java runtime uses it, what running out of it looks like, and how to debug a native OutOfMemoryError. This article covers AIX and focuses on the IBM® Developer Kit for Java. The other article covers Windows and Linux and does not focus on any particular Java runtime.

A recap of native memory
I'll start by explaining the limitations on native memory imposed by the OS and the underlying hardware. If you're familiar with managing dynamic memory in a language such as C, then you may want to skip to the next section.
Hardware limitations
Many of the restrictions that a native process experiences are imposed by the hardware, not the OS. Every computer has a processor and some random-access memory (RAM), also known as physical memory. A processor interprets a stream of data as instructions to execute; it has one or more processing units that perform integer and floating-point arithmetic as well as more advanced computations. A processor has a number of registers — very fast memory elements that are used as working storage for the calculations that are performed; the register size determines the largest number that a single calculation can use.
The processor is connected to physical memory by the memory bus. The size of the physical address (the address used by the processor to index physical RAM) limits the amount of memory that can be addressed. For example, a 16-bit physical address can address from 0x0000 to 0xFFFF, which gives 2^16 = 65536 unique memory locations. If each address references a byte of storage, a 16-bit physical address would allow a processor to address 64KB of memory.
Processors are described as being a certain number of bits. This normally refers to the size of the registers, although there are exceptions — such as 390 31-bit — where it refers to the physical address size. For desktop and server platforms, this number is 31, 32, or 64; for embedded devices and microprocessors, it can be as low as 4. The physical address size can be the same as the register width but could be larger or smaller. Most 64-bit processors can run 32-bit programs when running a suitable OS.
Table 1 lists some popular architectures with their register and physical address sizes:
Table 1. Register and physical address size of some popular processor architectures
Architecture | Register width (bits) | Physical address size (bits) |
---|---|---|
(Modern) Intel x86 | 32 | 32 (36 with Physical Address Extension, Pentium Pro and above) |
x86-64 | 64 | Currently 48 bits (scope to increase later) |
PPC64 | 64 | 50 bits at POWER 5 |
390 31-bit | 32 | 31 |
390 64-bit | 64 | 64 |
Operating systems and virtual memory
If you were writing applications to run directly on the processor without an OS, you could use all memory that the processor can address (assuming enough physical RAM is connected). But to enjoy features such as multitasking and hardware abstraction, nearly everybody uses an OS of some kind to run their programs.
In multitasking OSs, including AIX, more than one program uses system resources, including memory. Each program needs to be allocated regions of physical memory to work in. It's possible to design an OS such that every program works directly with physical memory and is trusted to use only the memory it has been given. Some embedded OSs work like this, but it's not practical in an environment consisting of many programs that are not tested together because any program could corrupt the memory of other programs or the OS itself.
Virtual memory allows multiple processes to share physical memory without being able to corrupt one another's data. In an OS with virtual memory (such as AIX and many others), each program has its own virtual address space — a logical region of addresses whose size is dictated by the address size on that system (so 31, 32, or 64 bits for desktop and server platforms). Regions in a process's virtual address space can be mapped to physical memory, to a file, or to any other addressable storage. The OS can move data held in physical memory to and from a swap area when it isn't being used, to make the best use of physical memory. When a program tries to access memory using a virtual address, the OS in combination with on-chip hardware maps that virtual address to the physical location. That location could be physical RAM, a file, or the swap partition. If a region of memory has been moved to swap space, then it's loaded back into physical memory before being used. Figure 1 shows how virtual memory works by mapping regions of process address space to shared resources:
Figure 1. Virtual memory mapping process address spaces to physical resources
Each instance of a native program runs as a process. On AIX a process is a collection of information about OS-controlled resources (such as file and socket information), a virtual address space, and at least one thread of execution.
Although a 32-bit address can reference 4GB of data, a program is not given the entire 4GB address space for its own use. As with other OSs (such as Windows and Linux) the address space is divided up into sections, only some of which are available for a program to use; the OS uses the rest. Compared to Windows and Linux, the AIX memory model is more complicated and can be tuned more precisely.
The AIX 32-bit memory model divides the address space into sixteen 256MB segments. Figure 2 shows the layout of the default 32-bit AIX memory model:
Figure 2. The default AIX memory model
The uses of the different segments are:
- Segment 0: AIX kernel data (not directly accessible by a user program)
- Segment 1: Application text (executable code)
- Segment 2: Thread stacks and native heap (the area controlled with malloc/free)
- Segments 3-C and E: Memory mapped regions (including files) and shared memory
- Segment D: Shared library text (executable code)
- Segment F: Shared library data
The large memory model allows a programmer or a user to annex some of the shared/mapped segments for use as native heap, either by supplying a linker option when the executable is built or by setting the LDR_CNTRL environment variable before the program is started. To enable the large memory model at run time, set LDR_CNTRL=MAXDATA=0xN0000000, where N is between 1 and 8. Any value outside this range causes the default memory model to be used. In the large memory model, the native heap starts at segment 3; segment 2 is used only for the primordial (initial) thread stack.

When you use the large memory model, the segment allocation is static; that is, if you request four data segments (for 1GB of native heap) but then allocate only one segment (256MB) of native heap, the other three data segments are unavailable for memory mapping.
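For example, a 32-bit program could be started under the large memory model with four data segments (1GB of native heap) like this (the program name is hypothetical):

LDR_CNTRL=MAXDATA=0x40000000 ./myprogram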
If you want a native heap larger than 2GB and you are running AIX 5.1 or later, you can use the AIX very large memory model. Like the large memory model, it can be enabled for an executable at compile time with a linker option or at run time using the LDR_CNTRL environment variable. To enable the very large memory model at run time, set LDR_CNTRL=MAXDATA=0xN0000000@DSA, where N is between 0 and D on AIX 5.2 or later, or between 1 and A on AIX 5.1. The value of N specifies the number of segments that can be used for native heap but, unlike in the large memory model, these segments can also be used for memory mapping if necessary.

The IBM Java runtime uses the very large memory model unless it's overridden with the LDR_CNTRL environment variable.

Setting N between 1 and A uses segments 3 through C for native storage, as you would expect. From AIX 5.2, setting N to B or higher changes the memory layout: segments D and F are no longer reserved for shared libraries and can be used for native storage or memory mapping. Setting N to D gives the maximum 13 segments (3.25GB) of native heap. Setting N to 0 allows segments 3 through F to be used for memory mapping; the native heap is held in segment 2. Figure 3 shows the different address-space layouts used with the different AIX memory models:

Figure 3. AIX memory models for various values of MAXDATA
A native memory leak or excessive native memory use will cause different problems depending on whether you exhaust the address space or run out of physical memory. Exhausting the address space typically only happens with 32-bit processes — because the maximum 4GB of address space is easy to allocate. A 64-bit process has a user space of hundreds or thousands of gigabytes and is hard to fill up even if you try. If you do exhaust the address space of a Java process, then the Java runtime can start to show the odd symptoms I'll describe later in the article. When running on a system with more process address space than physical memory, a memory leak or excessive use of native memory will force the OS to swap out some of the virtual address space. Accessing a memory address that has been swapped is a lot slower than reading a resident (in physical memory) address because it must be loaded from the hard drive.
If you are simultaneously trying to use so much RAM-backed virtual memory that your data cannot be held in physical memory, the system will thrash — that is, spend most of its time copying memory back and forth from swap space. When this happens, the performance of the computer and the individual applications will become so poor the user can't fail to notice there's a problem. When a JVM's Java heap is swapped out, the garbage collector's performance becomes extremely poor, to the extent that the application may appear to hang. If multiple Java runtimes are in use on a single machine at the same time, the physical memory must be sufficient to fit all of the Java heaps.
How the Java runtime uses native memory

The Java runtime is an OS process that is subject to the hardware and OS constraints I outlined in the preceding section. Runtime environments provide capabilities that are driven by some unknown user code; that makes it impossible to predict which resources the runtime environment will require in every situation. Every action a Java application takes inside the managed Java environment can potentially affect the resource requirements of the runtime that provides that environment. This section describes how and why Java applications consume native memory.
The Java heap and garbage collection
The Java heap is the area of memory where objects are allocated. The IBM Developer Kits for Java Standard Edition have one physical heap, although some specialist Java runtimes such as IBM WebSphere Real Time have multiple heaps. The heap can be split up into sections such as the IBM gencon policy's nursery and tenured areas. Most Java heaps are implemented as contiguous slabs of native memory.
The heap's size is controlled from the Java command line using the -Xmx and -Xms options (mx is the maximum size of the heap, ms is the initial size). Although the logical heap (the area of memory that is actively used) grows and shrinks according to the number of objects on the heap and the amount of time spent in garbage collection (GC), the amount of native memory used remains constant and is dictated by the -Xmx value: the maximum heap size. The memory manager relies on the heap being a contiguous slab of memory, so it's impossible to allocate more native memory when the heap needs to expand; all heap memory must be reserved up front.

Reserving native memory is not the same as allocating it. When native memory is reserved, it is not backed with physical memory or other storage. Although reserving chunks of the address space will not exhaust physical resources, it does prevent that memory from being used for other purposes. A leak caused by reserving memory that is never used is just as serious as leaking allocated memory.
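For example, the following hypothetical launch reserves 512MB of address space for the Java heap up front, even though only 64MB is requested initially:

java -Xms64m -Xmx512m MyApp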
The IBM garbage collector on AIX minimises the use of physical memory by decommitting (releasing the backing storage for) sections of the heap as the used area of heap shrinks.
For most Java applications, the Java heap is the largest user of process address space, so the Java launcher uses the Java heap size to decide how to configure the address space. Table 2 lists the default memory-model configuration for different ranges of heap size. You can override the memory model by setting the LDR_CNTRL environment variable yourself before starting the Java launcher. If you are embedding the Java runtime or writing your own launcher, you will need to configure the memory model yourself, either by specifying the appropriate linker flag or by setting LDR_CNTRL before starting your launcher.

Table 2. Default LDR_CNTRL settings for different heap sizes

Heap range | LDR_CNTRL setting | Maximum native heap size | Maximum mapped space (without occupying native heap) |
---|---|---|---|
-Xmx0M to -Xmx2304M | MAXDATA=0xA0000000@DSA | 2.5GB | 256MB |
-Xmx2304M to -Xmx3072M | MAXDATA=0xB0000000@DSA | 2.75GB | 512MB |
> -Xmx3072M | MAXDATA=0x0@DSA | 256MB | 3.25GB |
The JIT compiler

The JIT compiler compiles Java bytecode to optimised native binary code at run time. This vastly improves the run-time speed of Java runtimes and allows Java applications to run at speeds comparable to native code.

Compiling bytecode uses native memory (in the same way that a static compiler such as gcc requires memory to run), but the output from the JIT (the executable code) must also be stored in native memory. Java applications that contain many JIT-compiled methods use more native memory than smaller applications.

Classes and classloaders
Java applications are composed of classes that define object structure and method logic. They also use classes from the Java runtime class libraries (such as java.lang.String) and may use third-party libraries. These classes need to be stored in memory for as long as they are being used.

The IBM implementation from Java 5 onward allocates slabs of native memory for each classloader to store class data in. The shared-classes technology in Java 5 and above maps an area of shared memory into the address space where read-only (and therefore shareable) class data is stored. This reduces the amount of physical memory required to store class data when multiple JVMs run on the same machine, and it also improves JVM start-up time.
The shared-classes system maps a fixed-size area of shared memory into the address space. The shared class cache might not be completely occupied or might contain classes that you are not currently using (that have been loaded by other JVMs), so it's quite likely that using shared classes will occupy more address space (although less physical memory) than running without shared classes. It's important to note that shared classes doesn't prevent classloaders being unloaded — but it does cause a subset of the class data to remain in the class cache. See Resources for more information about shared classes.
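As a sketch, two JVMs started against the same named cache will share read-only class data (the -Xshareclasses option is provided by the IBM runtime; the cache name here is arbitrary):

java -Xshareclasses:name=mycache MyApp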
Loading more classes uses more native memory. Each classloader also has a native-memory overhead — so having many classloaders each loading one class uses more native memory than having one classloader that loads many classes. Remember that it's not only your application classes that need to fit in memory; frameworks, application servers, third-party libraries, and Java runtimes contain classes that are loaded on demand and occupy space.
The Java runtime can unload classes to reclaim space, but only under strict conditions. It's impossible to unload a single class; classloaders are unloaded instead, taking all the classes they loaded with them. A classloader can be unloaded only if:
- The Java heap contains no references to the java.lang.ClassLoader object that represents that classloader.
- The Java heap contains no references to any of the java.lang.Class objects that represent classes loaded by that classloader.
- No objects of any class loaded by that classloader are alive (referenced) on the Java heap.

Note that the default classloaders the Java runtime creates (bootstrap, extension, and application) can never meet these criteria, so system classes (such as java.lang.String) or any application classes loaded through the application classloader can't be released.

Even when a classloader is eligible for collection, the runtime collects classloaders only as part of a GC cycle. The IBM gencon GC policy (enabled with the -Xgcpolicy:gencon command-line argument) unloads classloaders only on major (tenured) collections. If an application running the gencon policy creates and releases many classloaders, you can find that large amounts of native memory are held by collectable classloaders in the period between tenured collections. See Resources to find out more about the different IBM GC policies.

It's also possible for classes to be generated at run time, without you necessarily realising it. Many JEE applications use JavaServer Pages (JSP) technology to produce Web pages. Using JSP generates a class for each .jsp page executed, and those classes last the lifetime of the classloader that loaded them — typically the lifetime of the Web application.
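The following minimal sketch illustrates these conditions (the JAR path and class name are hypothetical). The loader, and the native memory backing its class data, becomes collectable only once all three kinds of reference are cleared:

import java.net.URL;
import java.net.URLClassLoader;

public class LoaderLifetime {
    public static void main(String[] args) throws Exception {
        URLClassLoader loader = new URLClassLoader(
                new URL[] { new URL("file:plugin.jar") }); // hypothetical JAR
        Class<?> clazz = loader.loadClass("com.example.Plugin"); // hypothetical class
        Object instance = clazz.newInstance();

        // While any of these references is live, the classloader cannot be unloaded.
        instance = null; // no live objects of its classes
        clazz = null;    // no references to its java.lang.Class objects
        loader = null;   // no references to the java.lang.ClassLoader object

        System.gc(); // a hint only; unloading happens as part of a GC cycle
    }
}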
Another common way to generate classes is by using Java reflection. When using the java.lang.reflect API, the Java runtime must connect the methods of a reflecting object (such as java.lang.reflect.Field) to the object or class being reflected on. This "accessor" can use the Java Native Interface (JNI), which requires very little setup but is slow to run, or it can build a class dynamically at run time for each object type you want to reflect on. The latter method is slower to set up but faster to run, making it ideal for applications that reflect on a particular class often.

The Java runtime uses the JNI method the first few times a class is reflected on, but after a number of uses, the accessor is inflated into a bytecode accessor, which involves building a class and loading it through a new classloader. Doing lots of reflection can cause many accessor classes and classloaders to be created. Holding references to the reflecting objects causes these classes to stay alive and continue occupying space. Because creating the bytecode accessors is quite slow, the Java runtime can cache these accessors for later use. Some applications and frameworks also cache reflection objects, thereby increasing their native footprint.
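A minimal sketch of the pattern that creates bytecode accessors (the threshold of around 15 is the IBM default described below; the class and field are invented for illustration):

import java.lang.reflect.Field;

public class InflationDemo {
    public String greeting = "hello";

    public static void main(String[] args) throws Exception {
        Field field = InflationDemo.class.getField("greeting");
        InflationDemo target = new InflationDemo();
        // The first few calls go through a slow JNI accessor; after the
        // inflation threshold, the runtime may generate a bytecode accessor
        // class, loaded through a new classloader that occupies native memory.
        for (int i = 0; i < 100; i++) {
            field.get(target);
        }
    }
}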
You can control the reflection accessor behaviour using system properties. The default inflation threshold (the number of times a JNI accessor is used before being inflated into a bytecode accessor) for the IBM Developer Kit for Java 5.0 is 15. You can modify it by setting the sun.reflect.inflationThreshold system property on the Java command line with -Dsun.reflect.inflationThreshold=N. If you set the inflation threshold to 0 or less, the accessors will never be inflated. This can be useful if you find that your application is creating many sun.reflect.DelegatingClassLoaders (the classloaders used to load the bytecode accessors).

Another (much misunderstood) setting also affects the reflection accessors: -Dsun.reflect.noInflation=true disables inflation entirely but, counterintuitively, causes bytecode accessors to be used for everything. Using -Dsun.reflect.noInflation=true increases the amount of address space consumed by reflection classloaders because many more of them are created.

You can measure how much memory is being used for classes and JIT code at Java 5 and above by taking a javacore dump. A javacore is a plain-text file containing a summary of the Java runtime's internal state when the dump was taken, including information about allocated native-memory segments. Newer versions of the IBM Developer Kit for Java 5 and 6 summarise the memory use in the javacore; for older versions (prior to Java 5 SR10 and Java 6 SR3), this article's sample code package includes a Perl script you can use to collate and present the data (see Downloads). To run it you need the Perl interpreter, which is available for AIX and other platforms. See Resources for more details.
Javacores are produced when OutOfMemoryErrors are thrown (which will probably occur if you run out of address space). You can also trigger one by sending SIGQUIT to the Java process (kill -3 <pid>). To summarise the memory-segment usage, run:

perl get_memory_use.pl javacore.<date>.<time>.<pid>.txt

The output from the script looks like this:

perl get_memory_use.pl javacore.20080111.081905.1311.txt
JNI
JNI allows native code (applications written in native languages such as C and C++) to call Java methods and vice versa. The Java runtime itself relies heavily on JNI code to implement class-library functions such as file and network I/O. A JNI application can increase the native footprint of a Java runtime in three ways:
- The native code for a JNI application is compiled into a shared library or executable that's loaded into the process address space. Large native applications can occupy a significant chunk of the process address space simply by being loaded.
- The native code must share the address space with the Java runtime. Any native-memory allocations or memory mappings performed by the native code take memory away from the Java runtime.
- Certain JNI functions can use native memory as part of their normal operation. The Get<Type>ArrayElements and Get<Type>ArrayRegion functions can copy Java heap data into native-memory buffers for the native code to work with. Whether a copy is made depends on the runtime implementation; the IBM Developer Kit for Java 5.0 and higher does make a native copy. The change was made to avoid pinning objects on the heap (having to fix them in memory because code outside the JVM holds a reference to them); this means the Java heap cannot become fragmented (as it could at 1.4.2), but it has increased the runtime's native footprint. Accessing large amounts of Java heap data with a copying implementation can use a correspondingly large amount of native heap.
NIO

The new I/O (NIO) classes added in Java 1.4 introduced a new way of performing I/O based on channels and buffers. As well as I/O buffers backed by memory on the Java heap, NIO added support for direct ByteBuffers (allocated using the java.nio.ByteBuffer.allocateDirect() method) that are backed by native memory rather than the Java heap. Direct ByteBuffers can be passed directly to native OS library functions for performing I/O — making them significantly faster in some scenarios because they can avoid copying between the Java heap and the native heap.

It's easy to become confused about where direct ByteBuffers are being stored. The application still uses an object on the Java heap to orchestrate I/O operations, but the buffer that holds the data is held in native memory — the Java heap object only contains a reference to the native-heap buffer. A non-direct ByteBuffer holds its data in a byte[] array on the Java heap. Figure 4 shows the difference between direct and non-direct ByteBuffer objects:

Figure 4. Memory topology for direct and non-direct java.nio.ByteBuffers
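A minimal sketch of the two allocation styles (the 1MB size is arbitrary):

import java.nio.ByteBuffer;

public class BufferDemo {
    public static void main(String[] args) {
        // Backed by a byte[] on the Java heap, so counted against -Xmx.
        ByteBuffer heapBuffer = ByteBuffer.allocate(1024 * 1024);

        // Backed by native memory; only the small orchestrating object
        // lives on the Java heap.
        ByteBuffer directBuffer = ByteBuffer.allocateDirect(1024 * 1024);

        System.out.println(heapBuffer.isDirect());   // false
        System.out.println(directBuffer.isDirect()); // true
    }
}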
Direct ByteBuffer objects clean up their native buffers automatically, but can do so only as part of Java heap GC — so they do not automatically respond to pressure on the native heap. GC occurs only when the Java heap becomes so full that it can't service a heap-allocation request, or when the Java application explicitly requests it (not recommended because it can cause performance problems).

The pathological case would be that the native heap becomes full and one or more direct ByteBuffers are eligible for GC (and could be freed to make some space on the native heap), but the Java heap is mostly empty, so GC doesn't occur.

Threads
Every thread in an application requires memory to hold its stack (the area of memory used to hold local variables and maintain state when calling functions). Depending on implementation, a Java thread can have separate native and Java stacks. In addition to stack space, each thread requires some native memory for thread-local storage and internal data structures.
The stack size varies by Java implementation and by architecture. Some implementations allow you to specify the stack size for Java threads. Values between 256KB and 756KB are typical.
Although the amount of memory used per thread is quite small, for an application with several hundred threads, the total memory use for thread stacks can be large. Running an application with many more threads than available processors to run them is usually inefficient and can result in poor performance as well as increased memory usage.
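As a sketch, the four-argument java.lang.Thread constructor lets you suggest a per-thread stack size (the JVM is free to treat the value as a hint). One hundred threads at 256KB is roughly 25MB of native memory for stacks alone:

public class ThreadStackDemo {
    public static void main(String[] args) {
        Runnable sleeper = new Runnable() {
            public void run() {
                try { Thread.sleep(60000); } catch (InterruptedException e) { }
            }
        };
        for (int i = 0; i < 100; i++) {
            // The final argument is the suggested stack size in bytes; every
            // started thread consumes native memory for its stack regardless.
            new Thread(null, sleeper, "worker-" + i, 256 * 1024).start();
        }
    }
}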
A Java runtime copes quite differently with running out of Java heap compared to running out of native heap, although both conditions can present with similar symptoms. A Java application finds it extremely difficult to function when the Java heap is exhausted, because it's difficult for a Java application to do anything without allocating objects. The poor GC performance and OutOfMemoryErrors that signify a full Java heap appear as soon as the Java heap fills up.

In contrast, once a Java runtime has started up and the application is in steady state, it can continue to function with complete native-heap exhaustion. It doesn't necessarily show any odd behaviour, because actions that require a native-memory allocation are much rarer than actions that require Java-heap allocations. Although actions that require native memory vary by JVM implementation, some popular examples are: starting a thread, loading a class, and performing certain kinds of network and file I/O.
Native out-of-memory behaviour is also less consistent than Java heap out-of-memory behaviour, because there's no single point of control for native-heap allocations. Whereas all Java heap allocations are under the control of the Java memory-management system, any native code — whether it's inside the JVM, the Java class libraries, or application code — can perform a native-memory allocation and have it fail. The code that attempts the allocation can then handle the failure however its designer wants: it could throw an OutOfMemoryError through the JNI interface, print a message on the screen, fail silently and try again later, or do something else.

The lack of predictable behaviour means there's no single simple way to identify native-memory exhaustion. Instead, you need to use data from the OS and the Java runtime to confirm the diagnosis.
To help you see how native memory exhaustion affects the Java runtime, this article's sample code (see Downloads) contains some Java programs that trigger native-heap exhaustion in different ways. The examples use a native library written in C to consume all of the native process space and then try to perform some action that uses native memory. The examples are supplied already built, although instructions on compiling them are provided in the README.html file in the sample package's top-level directory.
The com.ibm.jtc.demos.NativeMemoryGlutton class provides the gobbleMemory() method, which calls malloc in a loop until nearly all native memory is exhausted. When it has completed its task, it prints the number of bytes allocated to standard error like this:

Allocated 1953546736 bytes of native memory before running out
The output for each demo has been captured for an IBM Java runtime running on 32-bit AIX. Binaries for the sample programs are provided in the samples pack (see Downloads).
The version of the IBM Java runtime used was:
java version "1.5.0"
Trying to start a thread when out of native memory
The com.ibm.jtc.demos.StartingAThreadUnderNativeStarvation class tries to start a thread when the process address space is exhausted. This is a common way to discover that your Java process is out of memory, because many applications start threads throughout their lifetime.

The output from StartingAThreadUnderNativeStarvation is:

$ ./run_thread_demo_linux_aix_32.sh
Calling java.lang.Thread.start() tries to allocate memory for a new OS thread. The attempt fails and causes an OutOfMemoryError to be thrown. The JVMDUMP lines notify the user that the Java runtime has produced its standard OutOfMemoryError debugging data.

Trying to handle the first OutOfMemoryError caused a second: OutOfMemoryError, ENOMEM error in ZipFile.open. Multiple OutOfMemoryErrors are common when the native process memory is exhausted, because some of the default OutOfMemoryError-handling routines may need to allocate native memory. This may sound unhelpful, but most OutOfMemoryErrors thrown by Java applications are caused by a lack of Java heap memory, which wouldn't prevent the runtime from allocating native storage. The only thing that distinguishes the OutOfMemoryErrors thrown in this scenario from those thrown because of Java-heap exhaustion is the message.

Trying to allocate a direct ByteBuffer when out of native memory

The com.ibm.jtc.demos.DirectByteBufferUnderNativeStarvation class tries to allocate a direct (that is, natively backed) java.nio.ByteBuffer object when the address space is exhausted. It produces the following output:

$ ./run_directbytebuffer_demo_aix_32.sh
In this scenario, you can see many JVMDUMP information messages caused by the OutOfMemoryError being thrown. The several UTE error messages produced by the Java trace engine report that it can't allocate a native buffer. These UTE error messages are common symptoms of a native out-of-memory condition, because the trace engine is enabled and active by default. Finally, two OutOfMemoryErrors have been printed: a secondary failure to allocate in the zip library and the original error from java.nio.DirectByteBuffer.

The first thing to do when faced with a java.lang.OutOfMemoryError or an error message about lack of memory is to determine which kind of memory has been exhausted. The easiest way to do this is to first check whether the Java heap is full. If the Java heap did not cause the OutOfMemory condition, then you should analyse the native-heap usage.

Checking the Java heap
To check the Java heap utilization, you can either look in the javacore file produced when the OutOfMemoryError was thrown, or use verbose GC data. The javacore file is usually produced in the working directory of the Java process and has a name of the form javacore.<date>.<time>.<pid>.txt. If you open the file in a text editor, you can find a section that looks like this:

0SECTION MEMINFO subcomponent dump routine

This section shows how much Java heap was free when the javacore was produced. Note that the values are in hexadecimal format. If the OutOfMemoryError was thrown because a heap allocation could not be satisfied, then the GC trace section will show this:

1STGCHTYPE GC History
J9AllocateObject() returning NULL! means that the object allocation routine completed unsuccessfully and an OutOfMemoryError will be thrown.

It's also possible for an OutOfMemoryError to be thrown because the garbage collector is running too frequently (a sign that the heap is full and the Java application will be making little or no progress). In this case, you would expect the Heap Space Free value to be very small, and the GC history will show one of these messages:

1STGCHTYPE GC History

1STGCHTYPE GC History
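If you can modify the application, a quick cross-check is to log heap occupancy directly using the standard java.lang.Runtime methods (a minimal sketch):

public class HeapCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory(); // bytes currently occupied
        long max = rt.maxMemory();                      // the -Xmx ceiling
        System.out.println("Java heap used: " + used + " of " + max + " bytes");
        // If 'used' stays far below 'max' while OutOfMemoryErrors are thrown,
        // suspect native memory rather than the Java heap.
    }
}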
The -verbose:gc command-line option produces trace data containing GC statistics, including the heap occupancy. This information can be plotted with the IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer (GCMV) tool to show whether the Java heap is growing. See Resources for links to articles describing how to collect and plot verbose:gc data.

Measuring native heap usage
If you have determined that your out-of-memory condition was not caused by Java heap exhaustion, the next stage is to profile your native-memory usage.
If you are familiar with AIX process tuning, you may monitor the native process size using your favourite toolchain. One option is to use the IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer tool (GCMV).
GCMV was originally written to plot verbose GC logs, allowing users to view changes in Java heap usage and performance when tuning the garbage collector. GCMV was later extended to plot other data sources, including Linux and AIX native-memory logs. GCMV is shipped as a plug-in for the IBM Support Assistant (ISA). See Resources for a link to an article describing how to download and install ISA and GCMV, as well as how to use GCMV for debugging GC performance problems.
To plot an AIX native-memory profile with GCMV, you must first collect native-memory data using a script. GCMV's AIX native-memory parser reads output from the AIX svmon command. A script that collects data in the correct form is provided in the GCMV help documentation. To find the script:

- Download and install ISA Version 4 (or above) and install the GCMV tool plug-in (see Resources for details).
- Start ISA.
- Bring up the ISA help menu by clicking Help >> Help Contents from the menu bar.
- Find the AIX native-memory instructions in the left-hand pane under Tool:IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer >> Using the Garbage Collection and Memory Visualizer >> Supported Data Types >> Native memory >> AIX native memory.
Figure 5. Location of the GCMV AIX memory monitoring script in the ISA help dialog
To use the script, move it onto your AIX machine and start the Java process to be monitored. Use ps to get the process identifier (PID) of the Java process, then start the monitoring script (where pid is the ID of the process to be monitored, and output_file is the file to store the memory log in — the file that GCMV will plot):

sh aix_memory_monitor.sh pid > output_file
To plot the memory log:
- In ISA, select Analyze Problem from the Launch Activity drop-down menu.
- Select the Tools tab near the top of the Analyze Problem panel.
- Select IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer.
- Click the Launch button near the bottom of the tools panel.
- Click the Browse button and locate the log file. Click OK to launch GCMV.
Figure 6 shows a plot of the native memory footprint of a Java stress test. The grey highlight shows the warm-up phase, where the native memory footprint increases and then flattens off as the process reaches steady state.
Figure 6. Plot of AIX native memory showing warm-up phase
It's also possible to have a native footprint that correlates with workload. If your application creates more threads to handle incoming workload, or allocates native-backed storage such as direct ByteBuffers in proportion to how much load is being applied to your system, it's possible that you'll run out of native memory under high load.

Running out of native memory because of JVM warm-up-phase native-memory growth, or growth proportional to load, are examples of trying to do too much within the available space. In these scenarios your options are:
- Reduce your native memory use. Reducing your Java heap size is a good place to start.
- Restrict your native memory use. If you have native memory growth that changes with load, find a way to cap the load or the resources that are allocated because of it.
- Increase the amount of address space available to you. You can set the LDR_CNTRL environment variable to specify a different memory-model configuration, or consider moving to 64-bit.
When faced with a leak, your options are limited. It may be possible to increase the amount of address space with the LDR_CNTRL environment variable (so there is more room to leak into), but that will only buy you time before you eventually run out of memory. If you have enough physical memory and address space, you can allow the leak to continue on the basis that you will restart your application before the process address space is exhausted.

What's using my native memory?
Once you have determined that you are running out of native memory, the next logical question is: What's using that memory? By default, AIX does not store information about which code path allocated a particular chunk of memory, so this information is not easy to get.
Your first step when trying to understand where your native memory has gone is to work out roughly how much native memory will be used based on your Java settings. You can estimate a rough lower bound based on the following guidelines:
- The Java heap occupies the -Xmx value.
- Each Java thread has a native stack and a Java stack. On AIX, this uses at least 256KB per thread.
- Direct ByteBuffers occupy at least the values supplied to the allocate() routine.
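For example, with entirely hypothetical numbers: an application started with -Xmx1024M that runs 300 threads and allocates 256MB of direct ByteBuffers needs at least

1024MB (Java heap) + 300 × 256KB (thread stacks) + 256MB (direct buffers) ≈ 1355MB

of native memory, before counting class data, JIT-compiled code, and the runtime's own working storage.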
The many memory debuggers available on AIX typically fall into one of the following categories:
- Preprocessor level. These require a header to be compiled in with the source under test. It's possible to recompile your own JNI libraries with one of these tools to track a native memory leak in your code. Dmalloc is an example of this kind of tool (see Resources).
- Linker level. These require the binaries under test to be relinked with the library under test. This is feasible for individual JNI libraries but not recommended for entire Java runtimes because it's unlikely that you would be supported running with modified binaries. Ccmalloc is an example of this kind of tool (see Resources).
- Runtime-linker level. These use the LD_PRELOAD environment variable to preload a library that replaces the standard memory routines with instrumented versions. They do not require recompilation or relinking of source code, but many of them do not work well with Java runtimes. (Tools like NJAMD, available on other operating systems such as Linux, don't support AIX well.)
- OS level. AIX provides the MALLOCDEBUG tool to debug native memory leaks.
The developerWorks article "Isolate and resolve memory leaks using MALLOCDEBUG on AIX Version 5.3" (see Resources) describes how to use MALLOCDEBUG to diagnose memory leaks. Here, I'll focus on the output from a leaking Java application. You'll work through an example of using MALLOCDEBUG to debug a JNI application with a native memory leak.

The samples pack for this article (see Downloads) contains a Java application called LeakyJNIApp; it runs in a loop calling a JNI method that leaks native memory. By default, it runs until native memory is exhausted; to make it finish, pass a run time in seconds as a command-line argument.

Configure the environment for malloc debugging by setting the MALLOCDEBUG and MALLOCTYPE environment variables:
export MALLOCTYPE=debug
export MALLOCDEBUG=report_allocations,stack_depth:3
You add the stack_depth:3 parameter to limit the stack trace collected when a malloc is called. The JVM has a unique thread-stack structure that can confuse stack-walking applications and cause crashes; restricting the stack depth to three levels should avoid unexpected behaviour.

With the environment configured, run the LeakyJNIApp for 10 seconds and capture the stderr output that contains the malloc log:

./run_leaky_jni_app_aix_32.sh 10 2>memory_log.txt
The memory_log.txt file now contains details of leaked memory blocks:
Allocation #1175: 0x328B0C00
You may be able to spot the culprit by inspecting the memory log file. Alternatively, you can summarise the memory log using the format_mallocdebug_op.sh script provided with "Isolate and resolve memory leaks using MALLOCDEBUG on AIX Version 5.3."
Running the summary script on the memory_log.txt file produces this output:
$ ./format_mallocdebug_op.sh memory_log.txt
This shows a leak coming from LeakyJNIApp.nativeMethod().

Several proprietary debugging applications also provide similar function. More tools (both open source and proprietary) are being developed all the time, and it's worth researching the current state of the art.
OS and third-party tools can make debugging easier, but they don't remove the need for sound debugging techniques. Some suggested steps are:
- Extract a test case. Produce a stand-alone environment that you can reproduce the native leak with. It will make debugging much simpler.
- Narrow the test case as far as possible. Try stubbing out functions to identify which code paths are causing the native leak. If you have your own JNI libraries, try stubbing them out entirely one at a time to determine if they are causing the leak.
- Reduce the Java heap size. The Java heap is likely to be the largest consumer of virtual address space in the process. By reducing the Java heap, you make more space available for other users of native memory. When you have a native memory leak, it buys time to allow the program to run for longer.
- Correlate the native process size. Once you have a plot of native-memory use over time, you can compare it to application workload and GC data. If the leak rate is proportional to the level of load, it suggests that the leak is caused by something on the path of each transaction or operation. If the native process size drops significantly when a GC happens, it suggests that you are not seeing a leak — you are seeing a buildup of objects with a native backing (such as direct ByteBuffers). You can reduce the amount of memory held by native-backed objects by reducing the Java heap size (thereby forcing collections to occur more frequently) or by managing them yourself in an object cache rather than relying on the garbage collector to clean up for you.
It's easy to hit native out-of-memory conditions with 32-bit Java runtimes because the address space is relatively small. The 2 to 4GB of user space that 32-bit OSs provide is often less than the amount of physical memory attached to the system, and modern data-intensive applications can easily scale to fill the available space.
If your application cannot be made to fit in a 32-bit address space, you can gain a lot more user space by moving to a 64-bit Java runtime. If you can run a 64-bit Java runtime on AIX, it will open the door to huge Java heaps and fewer address-space-related headaches, thanks to a 448-petabyte address space.
Moving to 64-bit is not a universal solution to all native-memory woes, however; you still need sufficient physical memory to hold all of your data. If your Java runtime won't fit in physical memory, then performance will be intolerably poor because the OS is forced to thrash Java runtime data back and forth from swap space. For the same reason, moving to 64-bit is no permanent solution to a memory leak — you are just providing more space to leak into, which will only buy time between forced restarts.
It's not possible to use 32-bit native code with a 64-bit runtime; any native code (JNI libraries, JVM Tool Interface [JVMTI], JVM Profiling Interface [JVMPI], and JVM Debug Interface [JVMDI] agents) must be recompiled for 64-bit. A 64-bit runtime's performance can also be slower than the corresponding 32-bit runtime on the same hardware. A 64-bit runtime uses 64-bit pointers (native address references), so the same Java object on 64-bit takes up more space than an object containing the same data on 32-bit. Larger objects mean a bigger heap to hold the same amount of data while maintaining similar GC performance, which makes the OS and hardware memory system slower. Surprisingly, a larger Java heap does not necessarily mean longer GC pause times, because the pause time is largely dictated by the amount of live data on the heap — which might not have increased — and some GC algorithms are more effective with larger heaps.
Although historically the performance of 64-bit runtimes has been lower than that of the corresponding 32-bit runtimes, the situation is much improved in the IBM Developer Kit for Java 6.0. The addition of compressed references technology (enabled with the -Xcompressedrefs command-line argument) allows you to use large Java heaps (up to between 20 and 30GB at Service Refresh 2) while using 32-bit object addressing. This removes the "object bloat" that caused much of the slowdown in previous 64-bit runtimes.

A comparative study of Java runtime performance is beyond this article's scope, but if you are considering a move to 64-bit, it's worth testing your application early on 64-bit and, where possible, using the IBM Developer Kit for Java 6 to take advantage of compressed references.
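For example, a 64-bit launch using compressed references might look like this (the heap size and class name are hypothetical):

java -Xcompressedrefs -Xmx4g MyApp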
An understanding of native memory is essential when you design and run large Java applications, but it's often neglected because it's associated with the grubby machine and OS details that Java was designed to save us from. The JRE is a native process that must work in the environment defined by these grubby details. To get the best performance from your Java application, you must understand how the application affects the Java runtime's native-memory use.
Running out of native memory can look similar to running out of Java heap, but it requires a different set of tools to debug and solve. The key to fixing native-memory issues is to understand the limits imposed by the hardware and OS that your Java application is running on, and to combine this with knowledge of the OS tools for monitoring native-memory use. By following this approach, you'll be equipped to solve some of the toughest problems your Java application can throw at you.
Downloads

Description | Name | Size | Download method |
---|---|---|---|
Native memory example code | j-nativememory-aix.zip | 33KB | HTTP |
Javacore memory analysis script | j-nativememory-aix2.zip | 3KB | HTTP |
Resources
Learn
- "Garbage collection with the IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer" (Holly Cummins, developerWorks, October 2007): Find out how to download and install GCMV and use it to analyse verbose garbage-collection data.
- "Java technology, IBM style: Class sharing" (Ben Corrie, developerWorks, May 2006): Read about the IBM Shared Classes feature included in the IBM Developer Kit for Java 5.0.
- "Java technology, IBM style: Garbage collection policies, Part 1 (Mattias Persson, developerWorks, May 2006): Gain an understanding of the different GC policies available in the IBM Developer Kit for Java 5.0.
- "Isolate and resolve memory leaks using MALLOCDEBUG on AIX Version 5.3" (Katiyar Manish and Vaarun Vijairaghavan, developerWorks, November 2006): Learn how to take advantage of
MALLOCDEBUG
, themalloc
subsystem monitoring tool shipped with AIX Version 5.3. - "Troubleshooting Java on AIX: Collecting data for memory issues" (Roger Leuckie, Dawn Patterson, and Rajeev Palanki, developerWorks, April 2004): Get instructions for collecting information for analyzing memory-related issues associated with Java applications running on AIX.
- "The Support Authority: Introducing the IBM Guided Activity Assistant" (Dave Draeger et al., developerWorks, May 2007): The IBM Guided Activity Assistant provides workflows to help you debug common problems — including Java out-of-memory conditions.
- Guided Debugging for Java: The IBM SDK for Java includes guided walkthroughs to help you solve common Java programming problems.
- Dmalloc: Download the Debug Malloc library.
- ccmalloc: Download the ccmalloc memory-debugger library.
- IBM Support Assistant (ISA): This free support framework contains tools such as Garbage Collection and Memory Visualizer and the IBM Guided Activity Assistant, which can walk you through debugging a native out-of-memory condition.
- IBM AIX Toolbox download information: Obtain open source binaries (including Perl) for AIX.
- IBM Monitoring and Diagnostic Tools for Java: Visit the IBM Java tooling page.