Check the Status of the NSX Intelligence Appliance

If the NSX Intelligence appliance is not responding, check the status of the NSX Intelligence services.

Problem

The NSX Intelligence appliance is not responding, or you receive error messages indicating that the appliance is not working as expected.

Cause

One or more of the underlying NSX Intelligence services might be stopped or in an unhealthy state.

Solution

Log in to the NSX Intelligence appliance CLI host using an account that has the Enterprise Administrator role.

Use the get services command to check the status of the NSX Intelligence services.

If all NSX Intelligence services are working properly, the output is similar to the following example.

my_nsx-intel> get services
Service name:                  druid
Service state:                 running
Coordinator health:            good
Broker health:                 good
Historical health:             good
Overlord health:               good
MiddleManager health:          good

Service name:                  http
Service state:                 running
Session timeout:               1800
Connection timeout:            30
Redirect host:                 (not configured)
Client API rate limit:         100 requests/sec
Client API concurrency limit:  40
Global API concurrency limit:  199

Service name:                  kafka
Service state:                 running
Service health:                good

Service name:                  liagent
Service state:                 stopped

Service name:                  mgmt-plane-bus
Service state:                 stopped

Service name:                  node-mgmt
Service state:                 running

Service name:                  nsx-config
Service state:                 running

Service name:                  nsx-message-bus
Service state:                 stopped

Service name:                  nsx-upgrade-agent
Service state:                 running

Service name:                  ntp
Service state:                 running
Start on boot:                 True

Service name:                  pace-server
Service state:                 running

Service name:                  postgres
Service state:                 running
Service health:                good

Service name:                  processing
Service state:                 running

Service name:                  snmp
Service state:                 stopped
Start on boot:                 False

Service name:                  spark
Service state:                 running
Service health:                good

Service name:                  spark-job-scheduler
Service state:                 running

Service name:                  ssh
Service state:                 running
Start on boot:                 True

Service name:                  syslog
Service state:                 running

Service name:                  ui-service
Service state:                 running

Service name:                  zookeeper
Service state:                 running
Service health:                good

my_nsx-intel>

The Service state value can be running or stopped. The Service health value can be good or degraded.
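When the get services output is long, it can help to scan a saved copy of it programmatically. The following is a minimal Python sketch, not part of the product: the function names parse_services and flag_services are hypothetical, and it assumes the output has been captured as text in the "key: value" layout shown in the example above.

```python
def parse_services(output):
    """Split captured `get services` text into one dict per service."""
    services = []
    current = None
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            # Blank lines and the CLI prompt contain no "key: value" pair.
            continue
        key, value = key.strip(), value.strip()
        if key == "Service name":
            # A new "Service name" line starts a new service record.
            current = {"Service name": value}
            services.append(current)
        elif current is not None:
            current[key] = value
    return services


def flag_services(services):
    """Return names of services that are stopped or report degraded health."""
    flagged = []
    for svc in services:
        stopped = svc.get("Service state") == "stopped"
        degraded = any(
            key.lower().endswith("health") and value.lower() == "degraded"
            for key, value in svc.items()
        )
        if stopped or degraded:
            flagged.append(svc["Service name"])
    return flagged
```

Note that in the healthy example above, some services (such as liagent and snmp) are shown as stopped, so a flagged name is a prompt for closer inspection rather than proof of a failure.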

View the syslog file and search for the output of the pace-monitor.sh health check script, which writes the health of the NSX Intelligence services to the syslog file.

If all services are working as expected, running the get log-file syslog | find pace-monitor command produces output similar to the following sample.

my_nsx-intel> get log-file syslog | find pace-monitor
<13>1 2019-08-30T03:19:20.409899+00:00 my_nsx-intel pace-monitor.sh - - -    "_self": {
<13>1 2019-08-30T03:19:20.410253+00:00 my_nsx-intel pace-monitor.sh - - -      "href": "/node/pace/appliance-health",
<13>1 2019-08-30T03:19:20.410623+00:00 my_nsx-intel pace-monitor.sh - - -      "rel": "self"
<13>1 2019-08-30T03:19:20.410908+00:00 my_nsx-intel pace-monitor.sh - - -    },
<13>1 2019-08-30T03:19:20.411162+00:00 my_nsx-intel pace-monitor.sh - - -    "appliance-health": {
<13>1 2019-08-30T03:19:20.411416+00:00 my_nsx-intel pace-monitor.sh - - -      "status": "Following NSX Intelligence first boot services are either PENDING or FAILED - Token-Registration",
<13>1 2019-08-30T03:19:20.411668+00:00 my_nsx-intel pace-monitor.sh - - -      "sub-system-status": {
<13>1 2019-08-30T03:19:20.411923+00:00 my_nsx-intel pace-monitor.sh - - -        "app-services": {
<13>1 2019-08-30T03:19:20.412280+00:00 my_nsx-intel pace-monitor.sh - - -          "services": [],
<13>1 2019-08-30T03:19:20.412528+00:00 my_nsx-intel pace-monitor.sh - - -          "status": ""
<13>1 2019-08-30T03:19:20.412807+00:00 my_nsx-intel pace-monitor.sh - - -        },
<13>1 2019-08-30T03:19:20.413075+00:00 my_nsx-intel pace-monitor.sh - - -        "base-infra-services": {
<13>1 2019-08-30T03:19:20.413303+00:00 my_nsx-intel pace-monitor.sh - - -          "services": [
<13>1 2019-08-30T03:19:20.413613+00:00 my_nsx-intel pace-monitor.sh - - -            {
<13>1 2019-08-30T03:19:20.413848+00:00 my_nsx-intel pace-monitor.sh - - -              "druid-health": {
<13>1 2019-08-30T03:19:20.414146+00:00 my_nsx-intel pace-monitor.sh - - -                "broker": "good",
<13>1 2019-08-30T03:19:20.414473+00:00 my_nsx-intel pace-monitor.sh - - -                "coordinator": "good",
<13>1 2019-08-30T03:19:20.414717+00:00 my_nsx-intel pace-monitor.sh - - -                "historical": "good",
<13>1 2019-08-30T03:19:20.414979+00:00 my_nsx-intel pace-monitor.sh - - -                "middlemanager": "good",
<13>1 2019-08-30T03:19:20.415295+00:00 my_nsx-intel pace-monitor.sh - - -                "overlord": "good"
<13>1 2019-08-30T03:19:20.415533+00:00 my_nsx-intel pace-monitor.sh - - -              },
<13>1 2019-08-30T03:19:20.415762+00:00 my_nsx-intel pace-monitor.sh - - -              "service-name": "druid"
<13>1 2019-08-30T03:19:20.415982+00:00 my_nsx-intel pace-monitor.sh - - -            },
<13>1 2019-08-30T03:19:20.416269+00:00 my_nsx-intel pace-monitor.sh - - -            {
<13>1 2019-08-30T03:19:20.416539+00:00 my_nsx-intel pace-monitor.sh - - -              "health": "good",
<13>1 2019-08-30T03:19:20.416772+00:00 my_nsx-intel pace-monitor.sh - - -              "service-name": "kafka"
<13>1 2019-08-30T03:19:20.416991+00:00 my_nsx-intel pace-monitor.sh - - -            },
<13>1 2019-08-30T03:19:20.417204+00:00 my_nsx-intel pace-monitor.sh - - -            {
<13>1 2019-08-30T03:19:20.417510+00:00 my_nsx-intel pace-monitor.sh - - -              "health": "good",
<13>1 2019-08-30T03:19:20.417745+00:00 my_nsx-intel pace-monitor.sh - - -              "service-name": "postgres"
<13>1 2019-08-30T03:19:20.418133+00:00 my_nsx-intel pace-monitor.sh - - -            },
<13>1 2019-08-30T03:19:20.418389+00:00 my_nsx-intel pace-monitor.sh - - -            {
<13>1 2019-08-30T03:19:20.418626+00:00 my_nsx-intel pace-monitor.sh - - -              "health": "good",
<13>1 2019-08-30T03:19:20.418855+00:00 my_nsx-intel pace-monitor.sh - - -              "service-name": "spark"
<13>1 2019-08-30T03:19:20.419157+00:00 my_nsx-intel pace-monitor.sh - - -            },
<13>1 2019-08-30T03:19:20.419435+00:00 my_nsx-intel pace-monitor.sh - - -            {
<13>1 2019-08-30T03:19:20.419684+00:00 my_nsx-intel pace-monitor.sh - - -              "health": "good",
<13>1 2019-08-30T03:19:20.419928+00:00 my_nsx-intel pace-monitor.sh - - -              "service-name": "zookeeper"
<13>1 2019-08-30T03:19:20.420165+00:00 my_nsx-intel pace-monitor.sh - - -            }
<13>1 2019-08-30T03:19:20.420496+00:00 my_nsx-intel pace-monitor.sh - - -          ],
<13>1 2019-08-30T03:19:20.420786+00:00 my_nsx-intel pace-monitor.sh - - -          "status": ""
<13>1 2019-08-30T03:19:20.421022+00:00 my_nsx-intel pace-monitor.sh - - -        },
<13>1 2019-08-30T03:19:20.421255+00:00 my_nsx-intel pace-monitor.sh - - -        "first-boot-services": {
<13>1 2019-08-30T03:19:20.421539+00:00 my_nsx-intel pace-monitor.sh - - -          "services": [
<13>1 2019-08-30T03:19:20.421777+00:00 my_nsx-intel pace-monitor.sh - - -            {
<13>1 2019-08-30T03:19:20.422010+00:00 my_nsx-intel pace-monitor.sh - - -              "health": "degraded",
<13>1 2019-08-30T03:19:20.422277+00:00 my_nsx-intel pace-monitor.sh - - -              "service-name": "token-registration"
<13>1 2019-08-30T03:19:20.422512+00:00 my_nsx-intel pace-monitor.sh - - -            }
<13>1 2019-08-30T03:19:20.422770+00:00 my_nsx-intel pace-monitor.sh - - -          ],
<13>1 2019-08-30T03:19:20.423012+00:00 my_nsx-intel pace-monitor.sh - - -          "status": "Following NSX Intelligence first boot services are either PENDING or FAILED - Token-Registration"
<13>1 2019-08-30T03:19:20.423354+00:00 my_nsx-intel pace-monitor.sh - - -        }
<13>1 2019-08-30T03:19:20.423601+00:00 my_nsx-intel pace-monitor.sh - - -      }
<13>1 2019-08-30T03:19:20.423882+00:00 my_nsx-intel pace-monitor.sh - - -    }
<13>1 2019-08-30T03:19:20.424339+00:00 my_nsx-intel pace-monitor.sh - - -  }
<13>1 2019-08-30T03:19:20.972629+00:00 my_nsx-intel pace-monitor.sh - - -  NSX Intelligence health OK.
<30>1 2019-08-30T03:19:20.973076+00:00 my_nsx-intel pace-monitor 20804 - -  <13>Aug 30 03:19:19 pace-monitor.sh: NSX Intelligence health OK.
<182>1 2019-08-30T03:23:23.857Z my_nsx-intel NSX 21752 - [nsx@6876 comp="nsx-cli" subcomp="node-mgmt" username="admin" level="INFO"] CMD: get log-file syslog | find pace-monitor

If one of the services has a problem, the following line might appear when you run get log-file syslog | find pace-monitor.

NSX Intelligence health DEGRADED. Return code not HTTP OK.
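The health summary line is easy to miss among the JSON lines that pace-monitor.sh also logs. The sketch below, a hypothetical helper named latest_health, assumes the filtered syslog output has been saved as a list of text lines and that the summary always contains the words "health OK" or "health DEGRADED", as in the samples shown here. It returns the most recent summary it finds.

```python
import re

# Matches the summary text, for example "NSX Intelligence health OK."
# JSON lines such as '"health": "good"' do not match this pattern.
HEALTH_RE = re.compile(r"health (OK|DEGRADED)\b")


def latest_health(lines):
    """Return "OK", "DEGRADED", or None for the last summary line found."""
    result = None
    for line in lines:
        match = HEALTH_RE.search(line)
        if match:
            result = match.group(1)
    return result
```

A result of None means that no summary line was present in the captured text, which is different from a healthy result.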

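If you see either of the following outputs, restart the service using the restart service service-name command.
- After you run the get services command, one of the services shows Service state: stopped or Service health: degraded.
- After you run the get log-file syslog | find pace-monitor command, the output contains a message similar to PACE health DEGRADED. Return code not HTTP OK.
For example, if the postgres service shows a Service state of stopped, or shows a Service state of running but a Service health of degraded, run the following command.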
```
restart service postgres
```
Important: To restart an NSX Intelligence service, you must use the restart service service-name command. If you choose to use the stop service service-name and start service service-name commands instead, you must manually restart each service that depends on service-name. The following list shows the dependency order in which NSX Intelligence services must be restarted.
```
zookeeper > druid > kafka > spark > spark-job-scheduler > nsx-config > processing > pace-server 
```
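For example, if you stop and then start the nsx-config service using the stop|start service service-name commands, you must then restart the processing and pace-server services using the restart service service-name command.
When a service restarts, other services that depend on it might be briefly degraded. If no errors occur, the degraded services return to a stable state.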
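The dependency order can be turned into a small lookup that lists which services to restart after a manual stop and start. This is only an illustrative Python sketch: the names RESTART_ORDER and dependents_to_restart are hypothetical, and it assumes the chain is strictly linear, so that every service later in the list depends on the one that was stopped. That reading is consistent with the nsx-config example in this topic, but it is an assumption.

```python
# Dependency order as listed in this topic, earliest dependency first.
RESTART_ORDER = [
    "zookeeper",
    "druid",
    "kafka",
    "spark",
    "spark-job-scheduler",
    "nsx-config",
    "processing",
    "pace-server",
]


def dependents_to_restart(service_name):
    """Return the services listed after service_name, in restart order."""
    if service_name not in RESTART_ORDER:
        # Services outside the documented chain are not covered by this sketch.
        raise ValueError(f"{service_name} is not in the documented order")
    index = RESTART_ORDER.index(service_name)
    return RESTART_ORDER[index + 1:]


def restart_commands(service_name):
    """Build the CLI command text for each dependent service."""
    return [f"restart service {name}" for name in dependents_to_restart(service_name)]
```

For nsx-config, restart_commands returns the two commands for processing and pace-server, matching the example above. The commands are only generated as text and still have to be run at the appliance CLI.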