Monday, November 12, 2012

[Tech] MustGather: Performance, hang, or high CPU issues on Windows



If you are experiencing performance degradation, hang, no response, hung threads, CPU starvation, high CPU utilization, network delays, or deadlocks, this MustGather will assist you in collecting the critical data that is needed to troubleshoot your issue.
To improve the accuracy of complete data collection, IBM recommends you use the automated data collectors within IBM Support Assistant. Not only will the automated collector gather the equivalent of the manual process, it will also provide a secure file transfer of the collection to IBM.



The verboseGC data is critical for analyzing a performance problem. If you have not already done so, enable verboseGC and restart the server.

Important note: Step 2 below involves installing the CPU data collection tool you choose. Read that step and install the tool on the problem server before the problem occurs.

At the time of the problem:
  1. Capture the output of the netstat command to get information about TCP/IP sockets:

    netstat -an > netstat_before.out
  2. If you are seeing high CPU usage, start collecting the CPU data. In most cases, the TPROF For Windows tool gives complete and granular CPU data, so it is our preferred tool. Follow the steps given in TPROF For Windows tool to start collecting the CPU data.

    However, if it is not possible to use the preceding tool, then here are the other tools to collect CPU data:
    Perfmon (Windows XP / Windows 2003)
    Perfmon (Windows 2008 / Windows 7)
    Pslist

  3. Download the file windows_hang.py and copy it to your <PROFILE_ROOT>\bin directory. If it is instead copied to <WAS_HOME>\bin, the default server, which may be the deployment manager (dmgr), will be accessed when wsadmin.bat is launched.


    NOTE: This script only works for WebSphere Application Server 6.1 and higher.


    If you are looking for the older windows_hang.bat that works with older releases of WebSphere Application Server, see the FAQ section.

    To launch the script to produce 3 javacores spaced 2 minutes apart, run this command:

    wsadmin -lang jython -f windows_hang.py -j -s SERVER_NAME

    Replace SERVER_NAME with your server's name.



    This script cannot be used while the application server is starting up (i.e. before the "e-business" message is seen in the SystemOut.log). This is due to the requirement that an active SOAP connection has to be established through wsadmin.
    Alternative steps include collecting raw core dumps using userdump.exe, or (on Windows Vista/2008 or later) opening the Task Manager, right-clicking the java process, and selecting Create Dump File from the context menu. See the manual steps (and FAQ) in the Crash MustGather to properly configure full core dumps and to learn how to process any raw core dumps.



    All the arguments below are added after the -f windows_hang.py option. Any arguments added before -f are reserved for wsadmin.bat (such as -lang, -host, and/or -port).

    Arguments:

    --serverName, -s
        Description: The problematic application server name. This is not the same as the profile name or the host name of the physical machine. Case-sensitive.
        Required: Yes

    --nodeName, -n
        Description: The problematic application server's node. This is not the same as the profile name or the host name of the physical machine. Case-sensitive.
        Required: Optional; use if multiple nodes are defined or running the script against the dmgr.

    --javacore, -j
        Default value: disabled
        Description: Enables the generation of multiple javacores.
        Required: Yes, if you want to capture javacores.

    --interval, -i
        Default value: 120 (seconds)
        Description: The interval of time (in seconds) to wait in between javacore generation.
        Required: No

    --iterations, -r
        Default value: 3
        Description: The number of javacores (and heapdumps) to produce.
        Required: No

    --heapdump, -d
        Default value: disabled
        Description: Enables the generation of a single heapdump.
        Required: No

    --multiple, -m
        Default value: disabled
        Description: Enables the generation of multiple heapdumps.
        Required: No

    --help
        Description: Displays a help page. Note the two dashes.
        Required: No
  4. Follow the steps given in TPROF For Windows tool (or the other tool you chose in step 2) to stop collecting the CPU data.
  5. Capture the final output of the netstat command to get information about TCP/IP sockets:

    netstat -an > netstat_after.out
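
The argument ordering described in step 3 (wsadmin's own options before -f, the script's options after it) can be illustrated with a small helper that assembles the command line. This is only a sketch: the function name build_hang_command and the placeholder names server1 and node01 are made up for illustration, and it assumes that -i and -r take their value as the next argument, which the text above does not spell out. Check the script's --help output before relying on it.

```python
# Sketch: assemble the wsadmin command line for windows_hang.py.
# Options for wsadmin itself (-lang, -host, -port) go before -f;
# options for the script (-j, -s, -n, -i, -r, -d, -m) go after it.
# Assumption: -i and -r take their value as the following argument.


def build_hang_command(server_name, node_name=None, javacore=True,
                       interval=None, iterations=None,
                       heapdump=False, multiple=False,
                       host=None, port=None):
    """Return the command as a list of argument strings."""
    if not server_name:
        raise ValueError("server_name is required (-s)")

    # Consumed by wsadmin: must precede -f.
    cmd = ["wsadmin", "-lang", "jython"]
    if host is not None:
        cmd += ["-host", str(host)]
    if port is not None:
        cmd += ["-port", str(port)]

    # Everything after "-f windows_hang.py" is passed to the script.
    cmd += ["-f", "windows_hang.py"]
    if javacore:
        cmd.append("-j")
    if heapdump:
        cmd.append("-d")
    if multiple:
        cmd.append("-m")
    if interval is not None:
        cmd += ["-i", str(interval)]
    if iterations is not None:
        cmd += ["-r", str(iterations)]
    if node_name:
        cmd += ["-n", node_name]
    cmd += ["-s", server_name]
    return cmd


if __name__ == "__main__":
    # "server1" and "node01" are placeholder names.
    print(" ".join(build_hang_command("server1", node_name="node01")))
```

Called with only a server name, the helper reproduces the command shown in step 3 with SERVER_NAME filled in.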

Submitting required data:
Zip all the output and log files:
Send the results to IBM Support.
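
Bundling the output files before sending them can be scripted. The following is a minimal sketch using Python's standard zipfile module; the function name zip_outputs and the archive name mustgather_hang.zip are made up for illustration. Only the two netstat file names come from the steps above, so add the javacores, server logs, and CPU tool output from your own environment to the list.

```python
import os
import zipfile


def zip_outputs(paths, archive_path):
    """Add each existing file in paths to a zip archive.

    Returns the list of file names that were added. Paths that do not
    exist are skipped, so the return value shows what is missing.
    """
    added = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in paths:
            if os.path.isfile(path):
                # Store only the file name, not the full directory path.
                name = os.path.basename(path)
                zf.write(path, arcname=name)
                added.append(name)
    return added


if __name__ == "__main__":
    # netstat files from steps 1 and 5; extend with your own
    # javacores, logs, and CPU data files.
    files = ["netstat_before.out", "netstat_after.out"]
    print(zip_outputs(files, "mustgather_hang.zip"))
```

Because only the base file name is stored, two input files with the same name in different directories would collide inside the archive; rename one of them first if that applies.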


Frequently Asked Questions (FAQs):



If asked to do so:
The preceding data is used to troubleshoot most of these issues; however, in certain situations Support may need additional data. Only collect the following data if asked to do so by IBM Support.

Userdumps
Follow the instructions in MustGather: Getting user.dmp when hangs/performance degradation prevents generating a javacore to produce a set of three user.dmp files taken at 2-minute intervals.

For a listing of all technotes, downloads, and educational materials specific to a hang or performance degradation, search the WebSphere Application Server support site.

Related information
How to enable verbosegc for WebSphere
IBM Thread and Monitor Dump Analyzer
Steps to getting support for WebSphere Application Server
Submitting information to IBM support
MustGather: Read first for WebSphere Application Server
Troubleshooting guide for WebSphere Application Server
Not getting javacores? Instructions to get user.dmp.
