Extensive behavior customization and automation are possible through Python-based ControlScripts in the control plane and Lua-based DataScripts in the data plane.

DataScript

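DataScripts are a powerful mechanism for customizing the behavior of NSX Advanced Load Balancer on a per-virtual-service or even per-client basis. DataScripts are lightweight scripts written in Lua. They can be executed in the data plane for each client on events such as establishing a TCP connection or sending an HTTP request or response.

One or more DataScripts can be attached to the rules section of a virtual service.

Scripts can be uploaded or copy-pasted into either the Request Event Script or the Response Event Script section. For example, to restrict access to the secure directory, paste the following text into the Request Event Script section: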

if avi.http.uri == "/secure/" then
   avi.http.send(403)
end

For complete documentation of commands and example DataScripts, see the VMware NSX Advanced Load Balancer DataScript Guide.

ControlScript

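ControlScripts are Python-based scripts that execute on the NSX Advanced Load Balancer Controller. They are initiated by alert actions, which are themselves triggered by events within the system. Rather than only alerting an administrator that a specific event has occurred, a ControlScript can take a specific action, such as altering the NSX Advanced Load Balancer configuration or sending a custom message to an external system (for example, telling VMware vCenter to scale out more servers if the current servers have reached resource capacity and are incurring lowered health scores).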

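To create a ControlScript:

  1. Navigate to Templates > Scripts > ControlScripts.

  2. Click Create. The New ControlScript screen is displayed.

  3. Enter a name for the ControlScript.

    Note:

    By default, the Enter Text option is selected.

  4. Enter the user-defined alert action script in the text box provided, or upload a .py file.

  5. Click Save.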

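ControlScripts are executed with limited privileges within the Linux subsystem of the Controller. File system access is read-only, so you cannot create new files or modify any existing files or directories.

Exception: ControlScripts are allowed read-write access to the /tmp directory, subject to a quota of 10 MB. Any file written to /tmp is temporary and is lost as soon as the ControlScript ends.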
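The /tmp exception can be used for short-lived scratch data within a single run. The following is a minimal sketch, not a prescribed pattern: the function name write_scratch and its parameters are hypothetical, and the size check only guards one payload against the 10 MB figure stated above (it does not account for other files already in the directory).

```python
import json
import os
import tempfile


def write_scratch(data, scratch_dir="/tmp", max_bytes=10 * 1024 * 1024):
    """Serialize data as JSON into a new temporary file and return its path.

    Hypothetical helper: rejects a payload larger than max_bytes so that a
    single write cannot exceed the documented quota on its own.
    """
    payload = json.dumps(data).encode("utf-8")
    if len(payload) > max_bytes:
        raise ValueError(
            "payload of %d bytes exceeds the scratch quota" % len(payload))
    # mkstemp creates the file with a unique name inside scratch_dir.
    fd, path = tempfile.mkstemp(prefix="cs-", suffix=".json", dir=scratch_dir)
    with os.fdopen(fd, "wb") as handle:
        handle.write(payload)
    return path
```

A script might call write_scratch({"step": "collected"}) early on and read the file back later in the same run. Because the file is lost when the ControlScript ends, it cannot be used to carry state from one alert to the next.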

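ControlScripts have no access of any kind to Controller system files or to the Controller system Python libraries. Any script file that needs to be shared into the container for use in ControlScripts must be placed in the /opt/avi/python/lib/avi/scripts/csshared folder. Such a file can then be imported with import avi.scripts.csshared.<module>.

For examples and definitions of Python commands, see the standard Python documentation. Configuration changes to NSX Advanced Load Balancer can be made through API calls from Linux to NSX Advanced Load Balancer (using the standard API mechanism).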

Two sets of variables are passed to a ControlScript:

  • Environment variables

  • Script arguments

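Environment Variables

The following environment variables are available in a ControlScript:

  • USER / API_TOKEN: Contain the username and API token that can be used to authenticate to the NSX Advanced Load Balancer REST API.

  • TENANT / TENANT_UUID: Contain the tenant context (name/UUID) in which the ControlScript is executing.

  • DOCKER_GATEWAY: Contains the local IP address that the ControlScript can use to communicate with the NSX Advanced Load Balancer REST API.

  • VERSION: Contains the version of the Controller on which the ControlScript is executing.

  • EVENT_DESCRIPTION: Contains the description of the event that triggered the alert.

The following example shows the content of these environment variables as passed to a ControlScript on a Controller running version 21.1.4, retrieved in Python using os.environ.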

{
    "USER": "admin",
    "API_TOKEN": "60bc0fc8ece748c4f20f0eabf7a25eb6584af0aa",
    "TENANT": "admin",
    "TENANT_UUID": "admin",
    "DOCKER_GATEWAY": "172.17.0.1",
    "VERSION": "21.1.4",
    "EVENT_DESCRIPTION": "Config vs1 update status is success (performed by user admin)"
}

Note:

You must use the IP address passed in the DOCKER_GATEWAY environment variable to access the NSX Advanced Load Balancer REST API.
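
As an illustration of how these variables fit together, the sketch below gathers them into one settings dictionary and fails early if any is missing. The function name load_api_settings, the dictionary keys, and the choice of which variables to treat as required are assumptions made for this example; the /api base path follows the URL form that appears in the sample scripts on this page.

```python
import os

# Variables this sketch treats as mandatory (an assumption, not a product rule).
REQUIRED_VARS = ("USER", "API_TOKEN", "TENANT", "DOCKER_GATEWAY")


def load_api_settings(environ=None):
    """Collect REST API connection settings from the environment.

    environ defaults to os.environ; passing a plain dict makes the
    function easy to exercise outside the Controller.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise KeyError("missing environment variables: %s" % ", ".join(missing))
    return {
        "controller_ip": environ["DOCKER_GATEWAY"],
        "username": environ["USER"],
        "token": environ["API_TOKEN"],
        "tenant": environ["TENANT"],
        # Always address the API through DOCKER_GATEWAY, as the note requires.
        "base_url": "https://%s/api" % environ["DOCKER_GATEWAY"],
    }
```

The returned values map directly onto the arguments that the sample scripts pass to ApiSession (Controller address, username, token, and tenant).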

Script Arguments

The arguments are supplied to the script as an array in the following format:

['/home/admin/{{ ALERT NAME }}' , '{{ ALERT DETAILS }}']
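
A script can split this array into the alert name and the decoded alert details before doing anything else. The following is a minimal sketch under the format shown above; the function name split_script_args is hypothetical, and returning an empty dictionary for missing or malformed input is a design choice of this example, not documented product behavior.

```python
import json
import os


def split_script_args(argv):
    """Return (alert_name, alert_details) from a ControlScript argv list.

    alert_name is the final path component of argv[0]; alert_details is
    the decoded JSON object from argv[1], or {} if it cannot be decoded.
    """
    if len(argv) != 2:
        return None, {}
    alert_name = os.path.basename(argv[0])
    try:
        details = json.loads(argv[1])
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return alert_name, {}
    if not isinstance(details, dict):
        return alert_name, {}
    return alert_name, details
```

In a ControlScript this would typically be called as split_script_args(sys.argv), after which details.get('events', []) is safe to use even when no alert data arrived.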

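Alert Details

The alert details are supplied as JSON data and can be parsed with the Python json module, for example json.loads(sys.argv[1]). The following is the structured JSON of the data supplied to the ControlScript: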

{
  "name": "System-CC-Alert-cluster-878226b4-ff2c-4e6b-a9a7-66aa58ad23f1-1580412633.0-1580412633-38875205",
  "throttle_count": 0,
  "level": "ALERT_LOW",
  "reason": "threshold_exceeded",
  "obj_name": "AWS",
  "threshold": 1,
  "events": [
    {
      "event_id": "AWS_ACCESS_FAILURE",
      "event_details": {
        "aws_infra_details": {
          "vpc_id": "vpc-0617ccd15673817c0",
          "region": "eu-central-1",
          "error_string": "AuthFailure: AWS was not able to validate the provided access credentials\\n\\tstatus code: 401, request id: f22641cb-140b-4ff2-ab8d-0008a73c6fbf",
          "cc_id": "cloud-8a3e601b-d06d-4998-bc41-ac3004330066"
        }
      },
      "obj_uuid": "cluster-878226b4-ff2c-4e6b-a9a7-66aa58ad23f1",
      "obj_name": "AWS",
      "report_timestamp": 1580412633
    }
  ]
}

Note:

If the JSON data exceeds 128 KB in size, it is discarded.
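
Because the contents of event_details differ from one event type to another, it can help to read the common fields defensively and inspect which detail sections are present. The sketch below assumes only the top-level structure shown above; the function name summarize_events is hypothetical.

```python
def summarize_events(alert):
    """Return one (event_id, obj_name, obj_uuid, detail_keys) tuple per event.

    Uses .get() throughout so that an alert with missing fields produces
    None values or an empty list instead of raising KeyError.
    """
    summary = []
    for event in alert.get("events", []):
        details = event.get("event_details") or {}
        summary.append((
            event.get("event_id"),
            event.get("obj_name"),
            event.get("obj_uuid"),
            # Names of the event-specific sections, e.g. aws_infra_details.
            sorted(details.keys()),
        ))
    return summary
```

For the AWS_ACCESS_FAILURE alert shown above, each tuple would carry the event ID, the object name and UUID, and the single detail section name aws_infra_details.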

Examples

To understand how to write scripts, see the following examples of how these values can be used:

Passing Data to a ControlScript

#!/usr/bin/python
#
# NSX Advanced Load Balancer ControlScript
#
# This sample ControlScript will output the environment values, and alert
# arguments that are passed from the alert that triggered the alert script.
# You can use these values to help construct your python script actions to
# handle the alert.
#
#
import os
import sys
 
if __name__ == "__main__":
    print("Environment Vars: %s \n" % os.environ)
    print("Alert Arguments: %s \n" % sys.argv)

Sticky Pool Groups

#!/usr/bin/python3
import os
import sys
import json
from avi.sdk.avi_api import ApiSession
import urllib3
import requests
  
if hasattr(requests.packages.urllib3, 'disable_warnings'):
    requests.packages.urllib3.disable_warnings()
  
if hasattr(urllib3, 'disable_warnings'):
    urllib3.disable_warnings()
  
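def ParseAviParams(argv):
    # Expect exactly two arguments: the script path and the alert JSON.
    if len(argv) != 2:
        # Return an empty dict so the caller can still call .get() on it.
        return {}
    alert_params = json.loads(argv[1])
    print(str(alert_params))
    return alert_params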
  
def get_api_token():
    return os.environ.get('API_TOKEN')
  
def get_api_user():
    return os.environ.get('USER')
  
def get_api_endpoint():
    return os.environ.get('DOCKER_GATEWAY') or 'localhost'
  
def get_tenant():
    return os.environ.get('TENANT')
  
def failover_pools(session, pool_uuid, pool_name, retries=5):
    if retries <= 0:
        return 'Too many retry attempts - aborting!'
    query = 'refers_to=pool:%s' % pool_uuid
    pg_result = session.get('poolgroup', params=query)
    if pg_result.count() == 0:
        return 'No pool group found referencing pool %s' % pool_name
  
    pg_obj = pg_result.json()['results'][0]
  
    highest_up_pool = None
    highest_down_pool = None
  
    for member in pg_obj['members']:
        priority_label = member['priority_label']
        member_ref = member['pool_ref']
        pool_runtime_url = ('%s/runtime/detail' %
                            member_ref.split('/api/')[1])
        pool_obj = session.get(pool_runtime_url).json()[0]
        if pool_obj['oper_status']['state'] == 'OPER_UP':
            if (not highest_up_pool or
                int(highest_up_pool[1]) < int(priority_label)):
                highest_up_pool = (member, priority_label,
                                   pool_obj['name'])
        elif (not highest_down_pool or
              int(highest_down_pool[1]) < int(priority_label)):
            highest_down_pool = (member, priority_label,
                                 pool_obj['name'])
  
    if not highest_up_pool:
        return ('No action required as all pools in the '
                'pool group are now down.')
    elif not highest_down_pool:
        return ('No action required as all pools in the '
                'pool group are now up.')
  
    if int(highest_down_pool[1]) <= int(highest_up_pool[1]):
        return ('No action required. The highest-priority available '
                'pool (%s) already has a higher priority than the '
                'highest-priority non-available pool (%s)' %
                (highest_up_pool[2], highest_down_pool[2]))
  
    highest_up_pool[0]['priority_label'] = highest_down_pool[1]
    highest_down_pool[0]['priority_label'] = highest_up_pool[1]
  
    p_result = session.put('poolgroup/%s' % pg_obj['uuid'], pg_obj)
    if p_result.status_code < 300:
        return ', '.join(['Pool %s priority changed to %s' % (p[0], p[1])
                           for p in ((highest_up_pool[2], highest_down_pool[1]),
                                     (highest_down_pool[2], highest_up_pool[1]))])
    if p_result.status_code == 412:
        return failover_pools(session, pool_uuid, pool_name, retries - 1)
  
    return 'Error setting pool priority: %s' % p_result.text
  
if __name__ == "__main__":
    alert_params = ParseAviParams(sys.argv)
    events = alert_params.get('events', [])
    if len(events) > 0:
        token = get_api_token()
        user = get_api_user()
        api_endpoint = get_api_endpoint()
        tenant = get_tenant()
  
        pool_uuid = events[0]['obj_uuid']
        pool_name = events[0]['obj_name']
        event_id = events[0]['event_id']
        try:
            with ApiSession(api_endpoint, user,
                            token=token,
                            tenant=tenant) as session:
                result = failover_pools(session, pool_uuid, pool_name)
        except Exception as e:
            result = str(e)
    else:
        result = 'No event data for ControlScript'
  
    print(result)
 
# Use with a ControlScript and Alert(s) to perform 'sticky' failover of pool groups.
#
# Alert should trigger on 'Pool Up' and 'Pool Down' events.
#
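
The decision made inside failover_pools can be read in isolation as a small pure function. The sketch below restates that logic without any API calls so it can be reasoned about or tested offline; the function name plan_priority_swap and the (pool_name, priority, is_up) tuple layout are assumptions introduced for this illustration and are not part of the script above.

```python
def plan_priority_swap(members):
    """Decide whether two pool priorities should be swapped.

    members is a list of (pool_name, priority, is_up) tuples. Returns a
    dict mapping pool name to its new priority, or None when no change
    is needed.
    """
    up = [m for m in members if m[2]]
    down = [m for m in members if not m[2]]
    if not up or not down:
        # All pools are up, or all are down: nothing to fail over to or from.
        return None
    best_up = max(up, key=lambda m: m[1])
    best_down = max(down, key=lambda m: m[1])
    if best_down[1] <= best_up[1]:
        # The best available pool already outranks every unavailable pool.
        return None
    # Swap priorities so the available pool takes over and keeps its new
    # rank after the failed pool recovers (the "sticky" behavior).
    return {best_up[0]: best_down[1], best_down[0]: best_up[1]}
```

For example, with pool a at priority 10 and down, and pool b at priority 5 and up, the plan assigns 10 to b and 5 to a. When a later comes back up it no longer outranks b, so traffic does not move back automatically.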

Adding Routes to GCP SEs

#!/usr/bin/python
 
import sys, os, json, traceback, re, time
from avi.sdk.avi_api import ApiSession
from oauth2client.client import GoogleCredentials
from googleapiclient import discovery
 
'''
This ControlScript is executed on the Controller every time there is a
CC_IP_ATTACHED or a CC_IP_DETACHED event.
 
CC_IP_ATTACHED: Event is triggered when a VIP is attached to a SE
CC_IP_DETACHED: Event is triggered when a VIP is detached from a SE, usually
when a SE goes down or a scale in occurs
 
The goal of this script is to add a route to GCP with the destination as the
VIP and nextHopIp as the GCP instance IP on which the SE is running after a
CC_IP_ATTACHED event. After a CC_IP_DETACHED event, the goal of the script is
to remove the corresponding route.
 
Script assumptions:
 
1) The Controller GCP instance has scope=compute-rw to be able to modify
routes in GCP
2) 'description' field in the Service Engine Group is configured as a
JSON encoded string containing GCP project, zone and network
 
Event details contain the SE UUID and the VIP.
 
1) GET SE object from UUID and extract SE IP address (which is
the same as the GCP instance IP address) and Service Engine Group link
 
2) GET Service Engine Group object. The 'description' field in the
Service Engine Group is a JSON encoded string containing GCP project and
network URL. Extract project and network from the 'description' field
 
3) Extract all routes matching destRange as VIP from GCP
 
4) If event is CC_IP_DETACHED, remove matching route with
destRange as vip and nextHopIp as instance IP in the appropriate network
If event is CC_IP_ATTACHED and no matching route exists already, add a new
route with destRange as vip and nextHopIp as instance IP in the appropriate network
'''
 
def parse_avi_params(argv):
    if len(argv) != 2:
        return {}
    script_parms = json.loads(argv[1])
    return script_parms
 
def create_avi_endpoint():
    token=os.environ.get('API_TOKEN')
    user=os.environ.get('USER')
    # tenant=os.environ.get('TENANT')
    return ApiSession.get_session(os.environ.get('DOCKER_GATEWAY'), user, token=token,
                                  tenant='admin')
 
def google_compute():
    credentials = GoogleCredentials.get_application_default()
    return discovery.build('compute', 'v1', credentials=credentials)
 
def gcp_program_route(gcp, event_id, project, network, inst_ip, vip):
    # List all routes for vip
    result = gcp.routes().list(project=project,
                                   filter='destRange eq %s' % vip).execute()
    if (('items' not in result or len(result['items']) == 0)
        and event_id == 'CC_IP_DETACHED'):
        print(('Project %s destRange %s route not found' %
              (project, vip)))
        return
 
    if event_id == 'CC_IP_DETACHED':
        # Remove route for vip nextHop instance
        for r in result['items']:
            if (r['network'] == network and r['destRange'] == vip and
                r['nextHopIp'] == inst_ip):
                result = gcp.routes().delete(project=project,
                                             route=r['name']).execute()
                print(('Route %s delete result %s' % (r['name'], str(result))))
                # Wait until done or retries exhausted
                if 'name' in result:
                    start = int(time.time())
                    for i in range(0, 20):
                        op_result = gcp.globalOperations().get(project=project,
                                operation=result['name']).execute()
                        print(('op_result %s' % str(op_result)))
                        if op_result['status'] == 'DONE':
                            if 'error' in op_result:
                                print(('WARNING: Route delete had errors '
                                      'result %s' % str(op_result)))
                            else:
                                print(('Route delete done result %s' %
                                      str(op_result)))
                            break
                        if int(time.time()) - start > 20:
                            print(('WARNING: Wait exhausted last op_result %s' %
                                  str(op_result)))
                            break
                        else:
                            time.sleep(1)
                else:
                    print('WARNING: Unable to obtain name of route delete '
                          'operation')
    elif event_id == 'CC_IP_ATTACHED':
        # Add routes to instance
        # Route names can just have - and alphanumeric chars
        rt_name = re.sub('[./]+', '-', 'route-%s-%s' % (inst_ip, vip))
        route = {'name': rt_name,
            'destRange': vip, 'network': network,
            'nextHopIp': inst_ip}
        result = gcp.routes().insert(project=project,
                                         body=route).execute()
        print(('Route VIP %s insert result %s' %
                (vip, str(result))))
         
def handle_cc_alert(session, gcp, script_parms):
    se_name = script_parms['obj_name']
    print(('Event Se %s %s' % (se_name, str(script_parms))))
    if len(script_parms['events']) == 0:
        print ('WARNING: No events in alert')
        return
 
    # GET SE object from Avi for instance IP address and SE Group link
    rsp = session.get('serviceengine?uuid=%s' %
                      script_parms['events'][0]['event_details']['cc_ip_details']['se_vm_uuid'])
    if rsp.status_code in range(200, 300):
        se = json.loads(rsp.text)
        if se['count'] == 0 or len(se['results']) == 0:
            print(('WARNING: SE %s no results' %
                script_parms['events'][0]['event_details']['cc_ip_details']['se_vm_uuid']))
            return
        inst_ip = next((v['ip']['ip_addr']['addr'] for v in
                se['results'][0]['mgmt_vnic']['vnic_networks']
                if v['ip']['mask'] == 32 and v['mode'] != 'VIP'), '')
        if not inst_ip:
            print(('WARNING: Unable to find IP with mask 32 SE %s' % str(se['results'][0])))
            return
 
        # GET SE Group object for GCP project, zones and network
        # https://localhost/api/serviceenginegroup/serviceenginegroup-99f78850-4d1f-4b7b-9027-311ad1f8c60e
        seg_ref_list = se['results'][0]['se_group_ref'].split('/api/')
        seg_rsp = session.get(seg_ref_list[1])
        if seg_rsp.status_code in range(200, 300):
            vip = '%s/32' % script_parms['events'][0]['event_details']['cc_ip_details']['ip']['addr']
            seg = json.loads(seg_rsp.text)
            descr = json.loads(seg.get('description', '{}'))
            project = descr.get('project', '')
            network = descr.get('network', '')
            if not project or not network:
                print(('WARNING: Project, Network is required descr %s' %
                      str(descr)))
                return
            gcp_program_route(gcp, script_parms['events'][0]['event_id'],
                              project, network, inst_ip, vip)
        else:
            print(('WARNING: Unable to retrieve SE Group %s status %d' %
                  (se['results'][0]['se_group_ref'], seg_rsp.status_code)))
            return
    else:
        print(('WARNING: Unable to retrieve SE %s' %
               script_parms['events'][0]['obj_uuid']))
 
 
# Script entry
 
if __name__ == "__main__":
    script_parms = parse_avi_params(sys.argv)
    try:
        admin_session = create_avi_endpoint()
        gcp = google_compute()
        handle_cc_alert(admin_session, gcp, script_parms)
    except Exception:
        print(('WARNING: Exception with Avi/Gcp route %s' %
               traceback.format_exc()))

Handling Denial-of-Service Attacks

#!/usr/bin/python
import sys, os, json
from avi.sdk.avi_api import ApiSession
 
 
'''
This control script will be executed in the Avi Controller when an
alert due to a DOS_ATTACK event is generated.
 
Example params passed to the control script dos-attack.py are as follows
 
params = [u'/home/admin/Dos_Attack-l4vs',
           '{"name": "Dos_Attack-virtualservice-d1093604-e1f0-476a-ad91-01c5224c5641-1461261720.83-1461261716-77911185",
             "throttle_count": 0,
             "level": "ALERT_HIGH",
             "reason": "threshold_exceeded",
             "obj_name": "l4vs",
             "threshold": 1,
             "events":
             [
                 {
                      "event_id": "DOS_ATTACK",
                      "event_details":
                      {
                          "dos_attack_event_details":
                          {
                              "attack_count": 2150.0,
                              "attack": "SYN_FLOOD",
                              "ipgroup_uuids": [
                                  "ipaddrgroup-f6883289-39fa-418f-94c2-3b8f8093cd7a"
                               ],
                               "src_ips": ["10.10.90.67"]
                          }
                      },
                      "obj_uuid": "virtualservice-d1093604-e1f0-476a-ad91-01c5224c5641",
                      "obj_name": "l4vs",
                      "report_timestamp": 1461261716
                 }
             ]
            }'
         ]
The DOS_ATTACK event was generated due to a SYN_FLOOD from client 10.10.90.67
in traffic to the Virtual Service "l4vs".
 
The offending client IP is added with NETWORK_SECURITY_POLICY_ACTION_TYPE_DENY to the
network security policy for the virtual service.
 
'''
 
def ParseAviParams(argv):
    if len(argv) != 2:
        return
    alert_dict = json.loads(argv[1])
    return alert_dict
 
def create_avi_endpoint():
    token=os.environ.get('API_TOKEN')
    user=os.environ.get('USER')
    # tenant=os.environ.get('TENANT')
    return ApiSession.get_session(os.environ.get('DOCKER_GATEWAY'), user, token=token,
                                  tenant='admin')
 
def add_ns_rules_dos(session, dos_params):
    vs_name = dos_params['obj_name']
    vs_uuid = ''
    client_ips = []
    for event in dos_params['events']:
        vs_uuid = event['obj_uuid']
        dos_attack_event_details = event['event_details']['dos_attack_event_details']
        if dos_attack_event_details['attack'] != 'SYN_FLOOD':
            continue
        for ip in dos_attack_event_details['src_ips']:
            client_ips.append(ip)
    if len(client_ips) == 0:
        print ('DOS ATTACK is not SYN_FLOOD. Ignoring')
        return
 
    print('VS name : ' + vs_name + ' VS UUID : ' + vs_uuid + ' Client IPs : ' + str(client_ips))
    ip_list = []
    for ip in client_ips:
        ip_addr_obj = {
            'addr' : ip,
            'type' : 'V4'
        }
        ip_list.append(ip_addr_obj)
 
    match_obj = {
        'match_criteria' : 'IS_IN',
        'addrs' : ip_list
    }
    ns_match_target_obj = {
        'client_ip' : match_obj
    }
    ns_rule_dos_obj = {
        'enable' : True,
        'log'    : True,
        'match'  : ns_match_target_obj,
        'action' : 'NETWORK_SECURITY_POLICY_ACTION_TYPE_DENY'
    }
    ns_policy_dos_obj = {
        'vs_name' : vs_name,
        'vs_uuid' : vs_uuid,
        'rules'   : [
            ns_rule_dos_obj,
        ]
    }
    print('ns_policy_dos_obj : ' + str(ns_policy_dos_obj))
    try :
        session.post(path='networksecuritypolicydos?action=block',
                 data=ns_policy_dos_obj)
    except Exception as e:
        # Report the failure and skip the success message below.
        print(str(e))
        return
    print(('Added Client IPs ' + str(client_ips) + \
           ' in the blocked list for VS : ' + vs_name))
 
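if __name__ == "__main__":
    alert_dict = ParseAviParams(sys.argv)
    if not alert_dict:
        # ParseAviParams returns None when the alert arguments are missing.
        print('No alert data passed to the ControlScript')
        sys.exit(0)
    try :
        admin_session = create_avi_endpoint()
    except Exception as e:
        print('login failed to Avi Controller!' + str(e))
        sys.exit(0)
    add_ns_rules_dos(admin_session, alert_dict)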