Tuesday, June 19, 2012

[TechNote] Too Many Open Files error message


Related content: http://wassupport.blogspot.kr/2012/05/technote-guidelines-for-setting-ulimits.html

Problem(Abstract)

This technote explains how to debug the "Too many open files" error message on Microsoft Windows, AIX, Linux and Solaris operating systems.

Cause

System configuration limitation.
When the "Too Many Open Files" error message is written to the logs, it indicates that all available file handles for the process have been used. In a majority of cases, this is the result of file handles being leaked by some part of the application. This technote explains how to collect output that identifies what file handles are in use at the time of the error condition.

Resolving the problem


Windows
By default, Windows does not ship with a tool to debug this type of problem. However, Microsoft provides a downloadable tool called Process Explorer. This tool identifies the open handles associated with the Java™ process and helps you determine which handles are being opened but never closed; these leaked handles result in the "Too many open files" error message.

It is important that you change the refresh rate: select View > Update Speed and change it to 5 seconds.

There is also a Microsoft utility called Handle that you can download from the following URL:
http://www.microsoft.com/technet/sysinternals/ProcessesAndThreads/Handle.mspx

This tool is a command line version of Process Explorer. The URL above contains the usage instructions.
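
A minimal sketch of how Handle can be used for this purpose (the PID 1234 is hypothetical): capture the open handles of the Java process twice, some time apart, and compare the snapshots.

    handle.exe -p 1234 > handles_1.txt
    (wait a while, then capture again)
    handle.exe -p 1234 > handles_2.txt
    (compare the two snapshots to see which handles keep accumulating)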


AIX
To determine if the number of open files is growing over a period of time, issue lsof to report the open files against a PID on a periodic basis. For example:
    lsof -p (PID of process) -r (interval in seconds, 1800 for 30 minutes) > lsof.out

This output does not give the actual file names to which the handles are open. It provides only the name of the file system (directory) in which they are contained. The lsof command indicates if the open file is associated with an open socket or a file. When it references a file, it identifies the file system and the inode, not the file name.
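
An entry for such a file might look roughly like the following (all values are hypothetical); the DEVICE and NODE columns give the file system device and inode used in the next step:

    COMMAND   PID    USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
    java    12345 wasuser  87u   VREG   10,5     2048 6492 /usr (/dev/hd2)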

Run the following command to determine the file name:

# df -kP filesystem_from_lsof | awk '{print $6}' | tail -1
(note the mount point it prints; this is the filesystem_name used below)

# find filesystem_name -inum inode_from_lsof -print
(prints the actual file name)
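
A worked example, assuming lsof reported the device /dev/hd2 and inode 6492 (the device, inode, and resulting path are all hypothetical):

    # df -kP /dev/hd2 | awk '{print $6}' | tail -1
    /usr
    # find /usr -inum 6492 -print
    /usr/WebSphere/AppServer/logs/trace.log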

To increase the limit, change or add the nofiles=XXXXX parameter in the /etc/security/limits file, or use the command: chuser nofiles=XXXXX user_id
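
For example, to raise the limit to 8192 for a hypothetical user ID wasadmin (a change made through /etc/security/limits typically takes effect at the user's next login):

    # chuser nofiles=8192 wasadmin

    or the equivalent stanza in /etc/security/limits:

    wasadmin:
            nofiles = 8192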

You can also use svmon:

# svmon -P java_pid -m | grep pers
(this lists the open files in the format: filesystem_device:inode)

Use the same procedure as above for finding the actual file name.
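
To check whether the number of these persistent file entries is growing, repeat the command periodically and count the matching lines (a simple sketch; the PID is hypothetical):

    # svmon -P 12345 -m | grep -c pers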


Linux 2.1.72 and above on Intel-based systems
To determine if the number of open files is growing over a period of time, issue lsof to report the open files against a PID on a periodic basis. For example:

lsof -p (PID of process) -r (interval in seconds, 1800 for 30 minutes) > lsof.out

The output provides all of the open files for the specified process ID, so you can determine which files are open and which files are growing over time.
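
A supplementary sketch for Linux (not part of the original technote; the PID 12345 is hypothetical): counting the entries under /proc/<PID>/fd at intervals also shows whether the number of descriptors keeps climbing.

    while true; do
        date
        ls /proc/12345/fd | wc -l    # current number of open descriptors
        sleep 1800                   # repeat every 30 minutes
    done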


Solaris
Run the following commands to monitor open file (socket) descriptors on Solaris:
  1. ulimit -a > ulimit.out
  2. /usr/proc/bin/pfiles [PID of process that has too many open files] > pfiles.out
  3. lsof -p [PID of process that has too many open files] > lsof.out
  4. To determine if the number of open files is growing over a period of time, issue lsof to report the open files against a PID on a periodic basis. For example:

    lsof -p (PID of process) -r (interval in seconds, 1800 for 30 minutes) > lsof.out

    Sample output of lsof.out:
    COMMAND PID NODE NAME
    java 6116 2640007 /WebSphere/AppServer/eAAS/debug.log
    java 6116 2640007 /WebSphere/AppServer/eAAS/debug.log
    java 6116 2640007 /WebSphere/AppServer/eAAS/debug.log
    java 6116 2640007 /WebSphere/AppServer/eAAS/debug.log
    java 6116 2640007 /WebSphere/AppServer/eAAS/debug.log
    java 6116 2640007 /WebSphere/AppServer/eAAS/debug.log
    java 6116 2640007 /WebSphere/AppServer/eAAS/debug.log
    java 6116 2640007 /WebSphere/AppServer/eAAS/debug.log
    java 6116 2640007 /WebSphere/AppServer/eAAS/debug.log
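
In output like the sample above, the same file name repeats many times. A quick way to rank the most frequently listed paths in the captured lsof.out (a sketch using standard shell tools; it assumes path names contain no spaces) is:

    awk '{print $NF}' lsof.out | sort | uniq -c | sort -rn | head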

After analyzing the above collected data, if you need to increase the limit for the number of open file descriptors, run the following command:
ulimit -n nnnn (where nnnn is the desired number of open files)
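
For example (8192 is only an illustrative value), check the current limits and raise the soft limit in the shell that launches the process, for instance in ksh or bash:

    ulimit -n          (current soft limit on open files)
    ulimit -Hn         (hard limit on open files)
    ulimit -n 8192     (raise the soft limit for this shell and the processes it starts)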

Other lsof information
If lsof is not installed on your system, refer to the following technote to download this utility: lsof: A Free Tool Offered by Purdue University - Useful in DCE/DFS Debugging
lsof is available for the following operating systems:
  • AIX 5.1, 5.2 and 5.3
  • HP-UX 11.00, 11.11 and 11.23
  • Linux 2.1.72 and above for Intel-based systems
  • Solaris 2.6, 7, 8, 9 and 10
