The Stream Collector is divided into multiple components that interact with each other to parse text sources and generate raw values. Refer to the figure for a global overview of the architecture.

  • Stream Retriever: These components retrieve the text sources.
  • Dataset Reader: These components divide the text sources into smaller text chunks, so that multiple small chunks can be parsed in parallel.
  • Reader: These components parse the text sources, extract information from them, and enrich the context.
  • Transformer: These components modify or standardize the text sources so that they can be parsed.
  • Context: This component holds the dynamic information required to generate Raw Values.
  • Releaser: This component triggers the generation of Raw Values from the current context.
  • Release Listener: This component receives the generation requests from the Releaser, computes the Raw Values, and sends them to the next component in the collecting chain of the Collector Manager.

Configuring Stream Collector:

The following template shows the Stream Collector configuration and all of its options:

<collector-configuration>
	<source>Source property value</source>
	<collecting-group>Retention group</collecting-group>
	<default-character-encoding>The expected character encoding in the streams ex:UTF-8</default-character-encoding>
	<!-- CHOICE -->
	<properties-refresh-periods>The period between a forced +r</properties-refresh-periods>
	<!-- OR -->
	<auto-detect-properties-refresh />
	<collecting-threads-pool-size>The number of threads allowed for all the collecting chains</collecting-threads-pool-size>
	<!-- One or more -->
	<collecting-configurations name="Configuration name">
		<!-- Zero or more -->
		<include-contexts>Location to an execution contexts file</include-contexts>
		<!-- Zero or more -->
		<execution-contexts name="Context name">
			<!-- One or more -->
			<properties name="Property name">Property value</properties>
		</execution-contexts>
		<data-retrieval-file>File containing the data retrieval chain </data-retrieval-file>
		<!-- One or more -->
		<data-listeners id="The releasing ID to listen to"
			variable-id="Optional properties for the Raw Value variable ID. Default is source device module part name"
			variable-id-separator="Optional separator for the properties for the Raw Value variable ID. Default is nothing"
			normalize-variable-id="Optional flag to normalize the properties for the Raw Value variable ID. Default is true">
			<!-- Optional -->
			<timestamp
				context-key="execution context key where the value will be the timestamp"
				format="optional format when the value isn't numeric" />
			<!-- Zero or more -->
			<values context-key="execution context key where the value will be"
				type="computation type: counter (default), delta, rate or a contextualized value for runtime selection"
				required="true (default) or false">
				<name>Optional metric name property</name>
				<unit>Optional metric unit property</unit>
				<!-- Zero or more property extraction from the value context-key value -->
				<extractions pattern="Regex containing groups">
					<!-- One or more -->
					<value group="regex group">property name</value>
				</extractions>
				<!-- Zero or more -->
				<replace value="value to replace" by="replacement or omit to nullify"
					pattern="true or false (default)" />
				<!-- Zero or more -->
				<properties context-key="execution context key where the property will be"
					property-name="Raw Value property name">
					<!-- Zero or more -->
					<replace value="property to replace" by="replacement or omit to nullify"
						pattern="true or false (default)" />
				</properties>
				<!-- Zero or more -->
				<hardcoded-properties key="property name">Property value
				</hardcoded-properties>
				<!-- Optional -->
				<dynamic-properties
					prefix-char="The character prefixed to the dynamic properties keys in the execution context (Default = '+')" />
			</values>
			<!-- Zero or more -->
			<dynamic-values>
				Same as value but the context-key contains a regex that will be applied
				to every execution context property name in order to extract values
			</dynamic-values>
			<!-- Zero or more -->
			<properties context-key="execution context key where the property will be"
				property-name="Raw Value property name">
				<!-- Zero or more -->
				<replace value="property to replace" by="replacement or omit to nullify"
					pattern="true or false (default)" />
			</properties>
			<!-- Zero or more -->
			<hardcoded-properties key="property name">Property value
			</hardcoded-properties>
			<!-- Optional -->
			<dynamic-properties
				prefix-char="The character prefixed to the dynamic properties keys in the execution context (Default = '+')" />
		</data-listeners>
	</collecting-configurations>
</collector-configuration>
  1. Each stream collector has a collector-configuration with the following definitions:
    1. The source tag is a hardcoded value that identifies the source of the data. It is added as a property to each raw value.

    2. collecting-group is the retention group. It is added as a Meta property to each raw value.

    3. The properties-refresh-periods and auto-detect-properties-refresh tags are alternatives that define when the action of a raw value is forced to refresh. properties-refresh-periods forces the refresh action after the defined time period, whereas auto-detect-properties-refresh detects automatically when to force it. Once the refresh is detected for a raw value, a Meta property named action with the value r is added to it.

    4. default-character-encoding defines the character encoding used by the collector.

    5. collecting-threads-pool-size defines the number of threads for the collecting-configurations defined in this collector.

    6. One or more collecting-configurations tags define how the data is collected, as explained in detail below.

      Example:

      <collector-configuration xmlns="http://www.watch4net.com/Text-Collector-Configuration" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.watch4net.com/Text-Collector-Configuration ../textCollectorConfiguration.xsd ">
          <source>OpenStack-Collector</source>
          <collecting-group>OpenstackGroup</collecting-group>
          <default-character-encoding>UTF-8</default-character-encoding>
          <properties-refresh-periods>10m</properties-refresh-periods>
          <collecting-threads-pool-size>20</collecting-threads-pool-size>
          <collecting-configurations name="openstack-metrics-cache">
          :
          :
       
      </collector-configuration>
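
      The template lists properties-refresh-periods and auto-detect-properties-refresh as alternatives, and the example above uses the first. The following is a minimal sketch of the second choice, reusing the values and file names from this document's examples. The listener body is reduced to a single hardcoded property for brevity, and the exact detection behavior is not described here.

      ```xml
      <collector-configuration xmlns="http://www.watch4net.com/Text-Collector-Configuration" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.watch4net.com/Text-Collector-Configuration ../textCollectorConfiguration.xsd ">
          <source>OpenStack-Collector</source>
          <collecting-group>OpenstackGroup</collecting-group>
          <default-character-encoding>UTF-8</default-character-encoding>
          <!-- Used instead of properties-refresh-periods; the template marks the two as a choice -->
          <auto-detect-properties-refresh/>
          <collecting-threads-pool-size>20</collecting-threads-pool-size>
          <collecting-configurations name="openstack-metrics-cache">
              <include-contexts>conf/context-openstack.xml</include-contexts>
              <data-retrieval-file>conf/openstack-metrics-cache.xml</data-retrieval-file>
              <data-listeners id="OPENSTACK-IMAGE" variable-id="source keystid parttype partid name">
                  <hardcoded-properties key="devtype">CloudService</hardcoded-properties>
              </data-listeners>
          </collecting-configurations>
      </collector-configuration>
      ```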
      
  2. Each collecting-configurations tag defines the following:
    1. include-contexts names a file that holds the context details of the end device from which information is collected.
    2. data-retrieval-file names the file that fetches the stream, transforms it, and converts it into execution context data.
    3. One or more data-listeners convert the execution context data into raw values, as explained in detail below.

    Example:

    <collecting-configurations name="openstack-metrics-cache">
        <include-contexts>conf/context-openstack.xml</include-contexts>
        <data-retrieval-file>conf/openstack-metrics-cache.xml</data-retrieval-file>
        <data-listeners id="OPENSTACK-IMAGE" variable-id="source keystid parttype partid name">
        :
        :
    </collecting-configurations>
    
  3. Each data-listeners tag defines the following:
    1. The variable-id attribute builds a unique identifier for each raw value by combining the listed properties. The variable-id-separator attribute sets the separator placed between those properties. The resulting value is added as a Meta property to each raw value.

      Example:

      <data-listeners id="SMARTS-TOPO-FASTDXA" variable-id="type Name" variable-id-separator="::">

      If type is VirtualMachine and Name is VM-1, the identifier of this raw value is set to VirtualMachine::VM-1.

    2. One or more values tags, each of which defines a metric of the raw value that is read from the context. The value must always be a float. If it is not, use the replace tag to convert it. Each values tag should also provide a unit, which is the unit in which the metric is measured, as shown in the example below.

      Example:

      Here the "disabled" key in the context is mapped to the "Availability" metric in the outgoing raw value:

      <values context-key="disabled">
          <name>Availability</name>
          <unit>%</unit>
      </values>

      Output:

      "metrics" : {
        "Availability" : {
          "properties" : {
            "unit" : "%",
            "name" : "Availability"
          },
          "value" : 100.0
        }
      }
      

      This example is the same as the first one, except that the value of disabled is a Boolean. The replace tags convert it to a float so that it can be used as a metric:

      <values context-key="disabled">
          <name>Availability</name>
          <unit>%</unit>
          <replace value="false" by="100"/>
          <replace value="true" by="0"/>
      </values>
      

      This example is the same as the second one, except that the value of enabled is a String. The replace tags convert it to a float: if the value is True, it is replaced with 100; otherwise, if it matches the regex pattern .+, it is replaced with 0.

      <values context-key="enabled">
          <name>Availability</name>
          <unit>%</unit>
          <replace value="True" by="100"/>
          <replace value=".+" by="0" pattern="true"/>
      </values>
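
      The template also lists the type and required attributes on values, which the examples above leave at their defaults (counter and true). The following sketch shows how they might be set, assuming a hypothetical context key named BytesReceived that holds an ever-increasing counter. The key, metric name, and unit are illustrative only, and how the rate is computed is not described in this document.

      ```xml
      <!-- Hypothetical key: BytesReceived is assumed to exist in the execution context -->
      <values context-key="BytesReceived" type="rate" required="false">
          <name>ReceiveRate</name>
          <unit>B/s</unit>
      </values>
      ```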
      
    3. Zero or more dynamic-values tags, which read the metric values of the raw value from every context key that matches a regex. To extract metric values dynamically, the data retrieval and transformation steps must prefix each context key that should be treated as a metric with a specific marker (for example ~@M). The metric properties are then extracted with an extractions pattern, as shown in the example below.

      Example: The execution context holds two values whose keys carry the dynamic value prefix, ~@MAvailability=100 and ~@MUtilization=100. The following configuration then produces the output shown after it.

      <dynamic-values context-key="~@M.*" required="false">
          <unit>%</unit>
          <extractions pattern="~@M(.+)">
              <value group="1">name</value>
          </extractions>
      </dynamic-values>

      Output:

      "metrics" : {
        "Availability" : {
          "properties" : {
            "unit" : "%",
            "name" : "Availability"
          },
          "value" : 100.0
        },
        "Utilization" : {
          "properties" : {
            "unit" : "%",
            "name" : "Utilization"
          },
          "value" : 100.0
        }
      }
      
    4. Zero or more properties tags, each of which defines a property of the raw value that is read from the context using the given context-key.

      Example:

      Maps "CreationClassName" key to "type" in raw value

      <properties context-key="CreationClassName" property-name="type"/>

      Maps the "pwrstate" key to PWRState and replaces the integer value with a string.

      <properties context-key="pwrstate" property-name="PWRState">
          <replace value="0" by="Unknown"/>
          <replace value="1" by="Running"/>
          <replace value="3" by="Paused"/>
          <replace value="4" by="Shutdown"/>
          <replace value="6" by="Crashed"/>
          <replace value="7" by="Suspended"/>
          <replace value=".+" by="Unknown" pattern="true"/>
      </properties>
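
      The template notes that the by attribute of replace can be omitted to nullify the matched value. The following is a minimal sketch based on the keyname property used elsewhere in this document. What happens to a nullified property further down the chain is not stated here.

      ```xml
      <properties context-key="keyname" property-name="keyname">
          <!-- No "by" attribute: per the template, the matched value "null" is nullified -->
          <replace value="null"/>
      </properties>
      ```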
      
    5. Zero or more hardcoded-properties tags, each of which sets a property of the raw value directly to the hardcoded value.

      Example:

      <hardcoded-properties key="devtype">CloudService</hardcoded-properties>
    6. An optional dynamic-properties tag, which adds every context key prefixed with the configured character as a property. The default prefix-char is "+".

      Example:

      In this example, any context key prefixed with "@" is treated as a property and added to the properties of the raw value.

      <dynamic-properties prefix-char="@"/>

      Example of a complete data-listeners configuration:

      <data-listeners id="OPENSTACK-IMAGE" variable-id="source keystid parttype partid name">
          <values context-key="Status">
              <name>Status</name>
              <unit>code</unit>
              <replace value="active" by="0"/>
              <replace value="queued" by="1"/>
              <replace value="saving" by="2"/>
              <replace value="deleted" by="3"/>
              <replace value="pending_delete" by="4"/>
              <replace value="killed" by="5"/>
          </values>
          <dynamic-values context-key="~@M.*" required="false">
              <unit>bytes</unit>
              <extractions pattern="~@M(.+)">
                  <value group="1">name</value>
              </extractions>
          </dynamic-values>

          <properties context-key="cformat" property-name="cformat"/>
          <properties context-key="isprotec" property-name="isprotec">
              <replace value="false" by="No"/>
              <replace value="true" by="Yes"/>
          </properties>
          <properties context-key="ispublic" property-name="ispublic">
              <replace value="private" by="No"/>
              <replace value="public" by="Yes"/>
          </properties>
          <properties context-key="updated" property-name="updated"/>
          <hardcoded-properties key="datagrp">OPENSTACK-IMAGE</hardcoded-properties>
          <hardcoded-properties key="devtype">CloudService</hardcoded-properties>
          <hardcoded-properties key="parttype">Image</hardcoded-properties>
      </data-listeners>
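
      The template also allows an optional timestamp element inside data-listeners, which the example above does not use. The following sketch assumes a hypothetical context key named collected-at that holds a numeric timestamp, so the format attribute is left out. The expected numeric unit and the syntax of format are not stated in this document.

      ```xml
      <data-listeners id="OPENSTACK-IMAGE" variable-id="source keystid parttype partid name">
          <!-- Hypothetical key; format is omitted because the value is assumed to be numeric -->
          <timestamp context-key="collected-at"/>
          <values context-key="Status">
              <name>Status</name>
              <unit>code</unit>
              <replace value="active" by="0"/>
          </values>
      </data-listeners>
      ```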
      

Example of a Stream Collector configuration

In this example, a collecting configuration is set up with two data-listeners. The collecting configuration is executed against the device context provided in the file conf/context-openstack.xml, which includes all the device credentials and context variables needed to collect information from the device.

All the data retrieval information is defined in the file conf/openstack-metrics-main.xml, which gets the stream, transforms it, and provides it to the data listeners.

Each data listener converts the available context data into raw values and publishes them to the next component defined in collecting.xml.

<?xml version="1.0" encoding="UTF-8"?>
<collector-configuration xmlns="http://www.watch4net.com/Text-Collector-Configuration" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.watch4net.com/Text-Collector-Configuration ../textCollectorConfiguration.xsd ">
    <source>OpenStack-Collector</source>
    <collecting-group>group</collecting-group>
    <default-character-encoding>UTF-8</default-character-encoding>
    <properties-refresh-periods>10m</properties-refresh-periods>
    <collecting-threads-pool-size>20</collecting-threads-pool-size>
    <collecting-configurations name="openstack-metrics-main">
        <!--File location which has context details of end device -->
        <include-contexts>conf/context-openstack.xml</include-contexts>

        <!--File which retrieves and transforms the stream into context data-->
        <data-retrieval-file>conf/openstack-metrics-main.xml</data-retrieval-file>

        <data-listeners id="GET-OPENSTACK-HYPERVISORS" variable-id="type device">
            <values context-key="CurrentWorkload">
                <name>CurrentWorkload</name>
                <unit>nb</unit>
            </values>
            <values context-key="RunningVMs">
                <name>RunningVMs</name>
                <unit>nb</unit>
            </values>
            <values context-key="TotalVCpus">
                <name>TotalVCpus</name>
                <unit>nb</unit>
                <properties context-key="cpu-partmod" property-name="partmod"/>
                <properties context-key="cpu-partvndr" property-name="partvndr"/>
                <hardcoded-properties key="part">System</hardcoded-properties>
                <hardcoded-properties key="parttype">Processor</hardcoded-properties>
            </values>
            <values context-key="UsedVCpus">
                <name>UsedVCpus</name>
                <unit>nb</unit>
                <properties context-key="cpu-partmod" property-name="partmod"/>
                <properties context-key="cpu-partvndr" property-name="partvndr"/>
                <hardcoded-properties key="part">System</hardcoded-properties>
                <hardcoded-properties key="parttype">Processor</hardcoded-properties>
            </values>
            <values context-key="CurrentUtilization">
                <name>CurrentUtilization</name>
                <unit>%</unit>
                <hardcoded-properties key="part">Physical Memory</hardcoded-properties>
                <hardcoded-properties key="parttype">Memory</hardcoded-properties>
            </values>
            <values context-key="FreeMemory">
                <name>FreeMemory</name>
                <unit>MB</unit>
                <hardcoded-properties key="part">Physical Memory</hardcoded-properties>
                <hardcoded-properties key="parttype">Memory</hardcoded-properties>
            </values>
            <values context-key="TotalMemory">
                <name>TotalMemory</name>
                <unit>MB</unit>
                <hardcoded-properties key="part">Physical Memory</hardcoded-properties>
                <hardcoded-properties key="parttype">Memory</hardcoded-properties>
            </values>
            <values context-key="UsedMemory">
                <name>UsedMemory</name>
                <unit>MB</unit>
                <hardcoded-properties key="part">Physical Memory</hardcoded-properties>
                <hardcoded-properties key="parttype">Memory</hardcoded-properties>
            </values>
            <values context-key="TotalDisk">
                <name>TotalDisk</name>
                <unit>GB</unit>
                <hardcoded-properties key="part">Physical Memory</hardcoded-properties>
                <hardcoded-properties key="parttype">Disk</hardcoded-properties>
            </values>
            <values context-key="UsedDisk">
                <name>UsedDisk</name>
                <unit>GB</unit>
                <hardcoded-properties key="part">System</hardcoded-properties>
                <hardcoded-properties key="parttype">Disk</hardcoded-properties>
            </values>
            <properties context-key="fqdn" property-name="device"/>
            <properties context-key="fqdn" property-name="fqdn"/>
            <properties context-key="ip" property-name="ip"/>
            <properties context-key="host" property-name="host"/>
            <properties context-key="KEYSTONE_ID" property-name="keystid"/>
            <properties context-key="TotalVCpus" property-name="nbcpu"/>
            <properties context-key="OPST_HOST" property-name="osendpt"/>
            <hardcoded-properties key="datagrp">OPENSTACK-HYPERVISOR</hardcoded-properties>
            <hardcoded-properties key="devtype">Hypervisor</hardcoded-properties>
            <hardcoded-properties key="type">HypervisorMonitor</hardcoded-properties>
        </data-listeners>

        <data-listeners id="OPENSTACK-VM" variable-id="type devid">
            <values context-key="RootCapacity">
                <name>Capacity</name>
                <unit>GB</unit>
                <properties context-key="root-natvolnm" property-name="natvolnm"/>
                <hardcoded-properties key="part">Root</hardcoded-properties>
                <hardcoded-properties key="partdesc">Root Disk for @{device}</hardcoded-properties>
                <hardcoded-properties key="parttype">Virtual Disk</hardcoded-properties>
                <hardcoded-properties key="voltype">Root</hardcoded-properties>
            </values>
            <properties context-key="avzone" property-name="avzone"/>
            <properties context-key="created" property-name="created"/>
            <properties context-key="device" property-name="device"/>
            <properties context-key="deviceid" property-name="devid"/>
            <properties context-key="deviceid" property-name="deviceid"/>
            <properties context-key="flavname" property-name="flavname"/>
            <properties context-key="flavorid" property-name="flavorid"/>
            <properties context-key="hypervsr" property-name="hypervsr"/>
            <properties context-key="imageid" property-name="imageid"/>
            <properties context-key="imagenm" property-name="imagenm"/>
            <properties context-key="KEYSTONE_ID" property-name="keystid"/>
            <properties context-key="ip" property-name="ip"/>
            <properties context-key="keyname" property-name="keyname">
                <replace value="null" by="N/A"/>
            </properties>
            <properties context-key="TotalVCpus" property-name="nbcpu"/>
            <properties context-key="OPST_HOST" property-name="osendpt"/>
            <properties context-key="projid" property-name="projid"/>
            <properties context-key="pwrstate" property-name="pwrstate">
                <replace value="0" by="Unknown"/>
                <replace value="1" by="Running"/>
                <replace value="3" by="Paused"/>
                <replace value="4" by="Shutdown"/>
                <replace value="6" by="Crashed"/>
                <replace value="7" by="Suspended"/>
                <replace value=".+" by="Unknown" pattern="true"/>
            </properties>
            
            <properties context-key="status" property-name="status"/>
            <properties context-key="updated" property-name="updated"/>
            <properties context-key="userid" property-name="userid"/>
            <hardcoded-properties key="devtype">VirtualMachine</hardcoded-properties>
            <hardcoded-properties key="datagrp">OPENSTACK-VM</hardcoded-properties>
            <hardcoded-properties key="type">VirtualMachine</hardcoded-properties>
        </data-listeners>
    </collecting-configurations>
</collector-configuration>