If you are experiencing performance degradation, hang, no response, hung threads, CPU starvation, high CPU utilization, network delays, or deadlocks, this MustGather will assist you in collecting the critical data that is needed to troubleshoot your issue.
To improve the accuracy of complete data collection, IBM recommends you use the automated data collectors within IBM Support Assistant. Not only will the automated collector gather the equivalent of the manual process, it will also provide a secure file transfer of the collection to IBM.
VerboseGC data is critical for analyzing a performance problem. If you have not already done so, enable verboseGC and restart the server.
Important note: Step 2 below involves installing the chosen CPU data collection tool. Read that step and install the tool on the problem server before the problem occurs.
At the time of the problem:
- Take the output of netstat command to get information about TCP/IP sockets:
netstat -an > netstat_before.out
- If you are seeing high CPU usage: Start collecting the CPU data. In most cases, the TPROF For Windows tool gives complete and granular CPU data, so it is our preferred tool. Follow the steps given in TPROF For Windows tool to start collecting the CPU data.
However, if it is not possible to use the preceding tool, then here are the other tools to collect CPU data:
Perfmon (Windows XP / Windows 2003)
Perfmon (Windows 2008 / Windows 7)
Pslist
- Download the file windows_hang.py and copy the file to your <PROFILE_ROOT>\bin directory. If it is instead copied to <WAS_HOME>\bin, the default server, which may be the deployment manager (dmgr), will be accessed when wsadmin.bat is launched.
NOTE: This script only works for WebSphere Application Server 6.1 and higher.
If you are looking for the older windows_hang.bat that works with older releases of WebSphere Application Server, see the FAQ section.
To launch the script to produce 3 javacores spaced 2 minutes apart, run this command:
wsadmin -lang jython -f windows_hang.py -j -s SERVER_NAME
Replace SERVER_NAME with your server's name.
This script cannot be used while the application server is starting up (i.e. before the "e-business" message is seen in the SystemOut.log). This is due to the requirement that an active SOAP connection has to be established through wsadmin.
Alternative steps include collecting raw core dumps using userdump.exe or, on Windows Vista/2008 or later, opening Task Manager, right-clicking the java process, and selecting Create Dump File from the context menu. See the manual steps (and FAQ) in the Crash MustGather to properly configure full core dumps and to learn how to process any raw core dumps.
All the arguments below are added after the -f windows_hang.py option. Any arguments added before -f are reserved for wsadmin.bat (such as -lang, -host, and/or -port).
Arguments | Default Value | Description | Required
--serverName, -s | | The problematic application server name. This is not the same as the profile name or the host name of the physical machine. Case-sensitive. | YES
--nodeName, -n | | The problematic application server's node. This is not the same as the profile name or the host name of the physical machine. Case-sensitive. | Optional; use if multiple nodes are defined or running the script against the dmgr.
--javacore, -j | disabled | Enables the generation of multiple javacores. | YES, if you want to capture javacores.
--interval, -i | 120 (seconds) | The interval of time (in seconds) to wait between javacore generation. | No
--iterations, -r | 3 | The number of javacores (and heapdumps) to produce. | No
--heapdump, -d | disabled | Enables the generation of a single heapdump. | No
--multiple, -m | disabled | Enables the generation of multiple heapdumps. | No
--help | | Displays a help page. Note the two dashes. | No
- Follow the steps given in TPROF For Windows tool (or the other tool you chose in step 2) to stop collecting the CPU data.
- Take the final output of netstat command to get information about TCP/IP sockets:
netstat -an > netstat_after.out
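Steps 1, 3, and 5 above can be chained from a single script. The following is a minimal Python sketch and not part of the MustGather itself. The function names (build_hang_command, run_shell, collect) are hypothetical, and the sketch assumes it is run from <PROFILE_ROOT>\bin so that wsadmin and windows_hang.py resolve. It uses only the flags listed in the arguments table. CPU data collection (steps 2 and 4) is left out because it depends on the tool you chose.

```python
import subprocess


def build_hang_command(server_name, node_name=None, interval=None, iterations=None):
    """Build the wsadmin command line for windows_hang.py.

    Script arguments are placed after "-f windows_hang.py"; anything
    before -f is reserved for wsadmin itself.
    """
    cmd = ["wsadmin", "-lang", "jython", "-f", "windows_hang.py",
           "-j", "-s", server_name]
    if node_name:
        cmd += ["-n", node_name]          # only needed for multi-node cells or the dmgr
    if interval is not None:
        cmd += ["-i", str(interval)]      # seconds between javacores
    if iterations is not None:
        cmd += ["-r", str(iterations)]    # number of javacores
    return cmd


def run_shell(command_line):
    """Default runner (assumption): hand the command line to the Windows shell."""
    return subprocess.call(command_line, shell=True)


def collect(server_name, runner=run_shell, **options):
    """Run netstat, then the javacore script, then netstat again."""
    steps = [
        "netstat -an > netstat_before.out",
        " ".join(build_hang_command(server_name, **options)),
        "netstat -an > netstat_after.out",
    ]
    for step in steps:
        runner(step)
    return steps
```

Calling collect("server1") would run the three commands in order with the script defaults (3 javacores, 120 seconds apart). The runner parameter exists only so the sequence can be checked without touching a real server.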
Submitting required data:
Zip all the output and log files:
- netstat output (per #1 and #5 above)
- CPU data (per #2 and #4 above)
- All the generated javacores (per #3 above)
- Server logs from the server having problems (<PROFILE_ROOT>\logs\<MY_SERVER>\)
Send the results to IBM Support.
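The zip step above can be scripted. This is a minimal Python sketch under stated assumptions: the function name zip_mustgather and the archive name mustgather_hang.zip are hypothetical, and the caller supplies the actual paths (netstat output, CPU data, javacores, and the server log directory). Paths that do not exist are skipped.

```python
import os
import zipfile


def zip_mustgather(paths, archive_name="mustgather_hang.zip"):
    """Add each existing file, and every file under each existing directory, to one zip."""
    added = []
    with zipfile.ZipFile(archive_name, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            if os.path.isdir(path):
                # Walk the directory (for example the server log folder) recursively.
                for root, _dirs, files in os.walk(path):
                    for name in files:
                        full = os.path.join(root, name)
                        archive.write(full)
                        added.append(full)
            elif os.path.isfile(path):
                archive.write(path)
                added.append(path)
    return added
```

Keep the archive itself outside any directory passed in, so the walk does not pick up the zip file while it is being written.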
Frequently Asked Questions (FAQs):
- What is the impact of enabling verboseGC?
VerboseGC data is critical to diagnosing these issues. This can be enabled on production systems because it has a negligible impact on performance (< 2%).
- What are 'javacores' and where do I find them?
Javacores are snapshots of the Java™ Virtual Machine activity and are essential to troubleshooting these issues. These files are usually found in the profile_root; otherwise, search the entire system for "*javacore*".
- How do I check the SOAP port of the server?
Check the value of SOAP_CONNECTOR_ADDRESS in the serverindex.xml file under <PROFILE_ROOT>\config\cells\cell_name\nodes\node_name
- If either script fails, can I still collect javacores manually via wsadmin?
Follow these manual steps to collect the javacores:
- From the command prompt, enter the command to get a wsadmin command prompt:
<WAS_HOME>\bin\wsadmin.bat
If security is enabled or the default SOAP ports have been changed, you will need to pass additional parameters to the batch file in order to get a wsadmin prompt. For example:
wsadmin.bat [-host host_name] [-port port_number] [-user username [-password password]]
Note: You can connect wsadmin to any server JVM in the cell. After you run the wsadmin command, it displays the server process to which it has attached. The process it attaches to determines which JVMs you can get thread dumps for. If wsadmin is connected to the deployment manager, you can get thread dumps for any JVM in that cell. If it is attached to a node agent, you can get thread dumps for any JVM in that node. If it is attached to a server, you can get thread dumps only for that server.
- Get a handle to the problem application server.
Note: The contents in brackets "[.....]", along with the brackets, are not optional. They must be entered to set the jvm object. Also, note that there is a space between the words "completeObjectName" and "type":
wsadmin> set jvm [$AdminControl completeObjectName type=JVM,process=problemServerName,*]
Where problemServerName is the name of the application server that does not respond (or is hung). If wsadmin is connected to a Deployment Manager and the server names in the cell are not unique, you can qualify the JVM with the node attribute in addition to process:
wsadmin> set jvm [$AdminControl completeObjectName type=JVM,process=problemServerName,node=nodeName,*]
- Generate multiple javacores by issuing the following command every 2 minutes for 3 iterations:
wsadmin> $AdminControl invoke $jvm dumpThreads
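The manual steps above use the Jacl syntax. For readers who run wsadmin with -lang jython, the same sequence can be sketched as follows. This is an illustrative sketch, not the windows_hang.py script: the function name dump_threads and its parameters are hypothetical, and AdminControl is passed in as an argument so the logic can be exercised outside wsadmin. Inside a wsadmin Jython session, the AdminControl object is already available.

```python
import time


def dump_threads(admin_control, server_name, node_name=None,
                 iterations=3, interval_seconds=120, sleep=time.sleep):
    """Invoke dumpThreads on the server's JVM MBean several times."""
    # Same query as the Jacl example: type=JVM,process=<server>[,node=<node>],*
    query = "type=JVM,process=%s," % server_name
    if node_name:
        query += "node=%s," % node_name
    query += "*"

    jvm = admin_control.completeObjectName(query)
    if not jvm:
        # Nothing matched: the server is not running or is not visible
        # from the process that wsadmin is connected to.
        raise ValueError("No running JVM MBean matched " + query)

    for i in range(iterations):
        admin_control.invoke(jvm, "dumpThreads")
        if i < iterations - 1:
            sleep(interval_seconds)   # wait between javacores, not after the last one
    return jvm
```

In a wsadmin session this would be called as dump_threads(AdminControl, "problemServerName"), which requests 3 javacores 2 minutes apart.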
- Is there another way to gather the required data?
- How do I analyze the Java thread dumps?
Download the IBM Thread and Monitor Dump Analyzer for Java Technology.
ThreadAnalyzer is a technology preview tool that can analyze thread dumps from WebSphere Application Server. It is useful for identifying deadlocks, contention, and bottlenecks, and for summarizing the state of threads within WebSphere Application Server.
- Where is the old windows_hang.bat?
The old script is located here, although it has limitations: you must run it against the individual application server. Running it with wsadmin connected through the dmgr might cause it to fail.
Download the attached script (windows_hang.bat) to the <PROFILE_ROOT>\bin folder.
This script automatically generates 3 javacores at 2-minute intervals. Before running the script, check the following:
- Name of the problematic server(s)
- If admin security is enabled, get the username/password.
- Check which SOAP port is in use, as you will be required to enter it interactively when running the script.
For each of the problematic server(s) open a command prompt and go to profile_root\bin. Enter the following command to start the script:
windows_hang.bat [problem servername]
The script will prompt for the admin security credentials and the SOAP port. It will then generate 3 javacores, 2 minutes apart. Once done, you should see the following message and 3 javacores in the <profile_root> directory:
"MustGather>> Last javacore generation Successful. Script will now exit"
- How do I change the default time interval for javacore generation in the older windows_hang.bat script?
Edit the TIME_SLEEP variable in the batch file. This variable accepts the time in milliseconds.
- What if I am using WebSphere Application Server 5.1? Where are the server logs?
For WebSphere Application Server 5.1 the server logs will be here:
install_root\logs\server_name\*
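For the SOAP port question in the FAQ above, the lookup in serverindex.xml can be scripted. This is a minimal Python sketch. It assumes the usual layout in which each serverEntries element carries a serverName attribute and contains specialEndpoints elements whose endPoint child holds the port; verify this against your own file. The function name soap_ports is hypothetical.

```python
import xml.etree.ElementTree as ET


def soap_ports(serverindex_xml):
    """Return {serverName: SOAP port} parsed from the content of serverindex.xml."""
    ports = {}
    root = ET.fromstring(serverindex_xml)
    for entry in root.iter("serverEntries"):
        for special in entry.iter("specialEndpoints"):
            if special.get("endPointName") != "SOAP_CONNECTOR_ADDRESS":
                continue
            end_point = special.find("endPoint")
            if end_point is not None:
                ports[entry.get("serverName")] = int(end_point.get("port"))
    return ports
```

The function takes the file content rather than a path; reading the file in binary mode, for example open(path, "rb").read(), lets the parser honor the encoding declared in the XML header.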
If asked to do so:
The preceding data is used to troubleshoot most of these issues; however, in certain situations Support may need additional data. Only collect the following data if asked to do so by IBM Support.
Userdumps
Follow instructions in MustGather: Getting user.dmp when hangs/performance degradation prevents generating a javacore to produce a set of three user.dmp files taken at 2 minute intervals.
For a listing of all technotes, downloads, and educational materials specific to a hang or performance degradation, search the WebSphere Application Server support site.
Related information
How to enable verbosegc for WebSphere
IBM Thread and Monitor Dump Analyzer
Steps to getting support for WebSphere Application Server
Submitting information to IBM support
MustGather: Read first for WebSphere Application Server
Troubleshooting guide for WebSphere Application Server
Not getting javacores? Instructions to get user.dmp.