Wednesday, November 16, 2011

[WebSphere] WAS servers stop when the DNS server fails

I previously posted a similar note on configuring IPv4, but the problem recurred at a customer site, so I looked up the related material and am posting it here.

This problem can occur with IBM JDK 1.4 or later in an ND (Network Deployment) environment.
When the DNS server fails, the nodeagent and every application server on the node either restart or hang.

WebSphere operates on host names. If the OS cannot resolve a host name or domain, the nodeagent concludes that the application servers under its node are not running normally, and it either starts an automatic restart or fails to provide normal service.

The workaround described here is to configure the OS to consult its hosts file first. For this to work, the hosts file must contain correct entries for every server that is referenced.
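Since the workaround depends on the hosts file actually listing every referenced server, a quick check can help. The following is only a minimal sketch, not an IBM tool: the class name HostsFileCheck and its methods are hypothetical names chosen for illustration, and it assumes the usual hosts file layout of an IP address followed by a host name and optional aliases.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: reports host names that have no entry in a hosts file.
public class HostsFileCheck {

    /** Returns every host name and alias defined in hosts-file formatted text. */
    static Set<String> definedNames(String hostsText) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : hostsText.split("\\R")) {
            // Drop a trailing comment, then surrounding whitespace.
            int hash = line.indexOf('#');
            String entry = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (entry.isEmpty()) {
                continue;
            }
            String[] fields = entry.split("\\s+");
            // fields[0] is the IP address; the rest are the name and its aliases.
            for (int i = 1; i < fields.length; i++) {
                names.add(fields[i].toLowerCase(Locale.ROOT));
            }
        }
        return names;
    }

    /** Returns the required names that are not defined in the hosts text. */
    static List<String> missing(String hostsText, List<String> required) {
        Set<String> defined = definedNames(hostsText);
        List<String> result = new ArrayList<>();
        for (String name : required) {
            if (!defined.contains(name.toLowerCase(Locale.ROOT))) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Pass the host names of the cell (dmgr, nodes, and so on) as arguments.
        String hostsText = new String(
                Files.readAllBytes(Paths.get("/etc/hosts")), StandardCharsets.UTF_8);
        System.out.println("Missing from /etc/hosts: "
                + missing(hostsText, Arrays.asList(args)));
    }
}
```

Run it with the host names used in the cell as arguments, for example `java HostsFileCheck was01 dmgr01` (both names are placeholders). An empty list means every name has an entry.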

The links below refer to earlier versions, but the same applies to later versions.

Nodeagent restarts all application servers on the node when DNS server is down



Problem(Abstract)

Nodeagent restarts all application servers on the node when DNS server is down if the system is not configured to use IPv6 in /etc/hosts. This problem occurs on IBM WebSphere Application Server V5.1 and V6.0 (all releases) and does not occur on V5.0.

Cause

In Java™ 2 SDK 1.4, the JVM performs both IPv6 and IPv4 queries. If the system is not configured to use IPv6 in /etc/hosts, IPv6 queries will fail when the DNS server is down. Nodeagent checks the status of applications on the node at one-minute intervals by way of the network. If nodeagent cannot connect to an application server, it will assume that the server is hung and try to restart the server. So if nodeagent fails to look up the hostname of its node, it will restart all the application servers in the same node.

Resolving the problem

To solve this problem, you can choose either of the following two options:
HostName lookup causes a JVM hang or slow response
 Technote (troubleshooting)
Problem(Abstract)
Call to the method java.net.InetAddress.getLocalHost takes a long time or the Java™ virtual machine (JVM) hangs.
Cause
If you notice an IBM® WebSphere® Application Server hang during host name lookup, or if the host name lookup fails, the problem could be lookup issues between IPv6 versus IPv4 in releases of the Java 2 SDK 1.4. The problem may be that the JVM performs both IPv6 and IPv4 queries. If the Domain Name System (DNS) server is not set up to handle IPv6 queries, the application may issue an unknown host exception. If the DNS is not set up to handle IPv6 queries properly, the application must wait for the IPv6 query to time out.

You may also notice that getting the wsadmin command prompt takes a long time or sometimes fails because of the preceding problem.
In addition, there is a known problem in Linux®/390 with IPv6 that will lead to a JVM crash. For more details, review the Related information section at the bottom of this technote.
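To see whether a given machine is affected, it can help to time the same call the technote mentions. The following is a minimal sketch under my own assumptions: the class name LookupTimer and the helper timeMillis are hypothetical, and only java.net.InetAddress.getLocalHost comes from the technote.

```java
import java.net.InetAddress;
import java.util.concurrent.Callable;

// Hypothetical helper: measures how long a host name lookup takes.
public class LookupTimer {

    /** Runs the task once and returns the elapsed time in milliseconds. */
    static long timeMillis(Callable<?> task) {
        long start = System.nanoTime();
        try {
            task.call();
        } catch (Exception e) {
            // A failed lookup (for example UnknownHostException) is still timed.
            System.out.println("Lookup failed: " + e);
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        Callable<InetAddress> lookup = InetAddress::getLocalHost;
        long elapsed = timeMillis(lookup);
        System.out.println("InetAddress.getLocalHost() took " + elapsed + " ms");
    }
}
```

A result of several seconds, or a failure after a long wait, would be consistent with the timeout behavior described above, although other causes are possible.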
Resolving the problem
If you take a thread dump with the kill -3 command, the following lines are at the top of the stack:
at java.net.Inet6AddressImpl.getLocalHostName(Native Method)
at java.net.InetAddress.getLocalHost(InetAddress.java:1186)
at org.apache.soap.util.mime.MimeUtils.getUniqueValue(Unknown Source)
at org.apache.soap.rpc.SOAPContext.setRootPart(Unknown Source)

Important: Carefully follow the steps below in the order that they are listed. If step 1 resolves the problem, there is no need to continue with the remaining steps.
  1. Java solution
    Use the following system property setting when starting your Java application:

    -Djava.net.preferIPv4Stack=true

    To solve this problem for WebSphere Application Server, do the following:
    1. Open the administrative console and navigate to:

      Servers > Application Servers > server_name > Process Definition > Java Virtual Machine > Custom Properties(/Environment Entries)
    2. Add the following name and value pair:

      Name: java.net.preferIPv4Stack
      Value: true
    3. Click Apply, then save all changes.
    4. Restart the application server.

      Note: If this does not work, continue to step 2.
  2. AIX® solution
    1. Apply the following APARs:

      For AIX V520: IY47908
      For AIX V510: IY48783

      To see if the APAR is already installed on your system, run the following command:

      instfix -ik IY#####
    2. After these are applied, do one of the following:

      vi /etc/netsvc.conf
      hosts=bind4,local

      or

      export NSORDER=bind4,local
  3. DNS server solution
    Update the DNS server to ignore IPv6 queries.
  4. Network Information Service (NIS) solution
    If you are using NIS for host name resolution, remove NIS from the scenario by updating the host information in /etc/hosts and updating /etc/netsvc.conf not to include "nis".
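After applying the Java solution from the steps above, one way to confirm that the property reached the JVM is to print it and look at the address family the local host name resolves to. This is only a sketch: the class name StackCheck and the method family are hypothetical, and the only setting taken from the technote is java.net.preferIPv4Stack.

```java
import java.net.Inet4Address;
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper: shows the preferIPv4Stack setting and the address family in use.
public class StackCheck {

    /** Resolves the host and reports the address family of the first result. */
    static String family(String host) {
        try {
            InetAddress address = InetAddress.getByName(host);
            if (address instanceof Inet4Address) {
                return "IPv4";
            }
            if (address instanceof Inet6Address) {
                return "IPv6";
            }
            return "unknown";
        } catch (UnknownHostException e) {
            return "unresolved";
        }
    }

    public static void main(String[] args) {
        // Prints null when the property was not passed to this JVM.
        System.out.println("java.net.preferIPv4Stack = "
                + System.getProperty("java.net.preferIPv4Stack"));
        try {
            String localName = InetAddress.getLocalHost().getHostName();
            System.out.println(localName + " resolves as " + family(localName));
        } catch (UnknownHostException e) {
            System.out.println("Local host name could not be resolved: " + e);
        }
    }
}
```

Running it once as `java StackCheck` and once as `java -Djava.net.preferIPv4Stack=true StackCheck` makes the difference visible on a standalone JVM. For an application server, the property still has to be set through the administrative console as described in step 1.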
