This topic describes how to manage Data Flow service instances using the Cloud Foundry Command Line Interface (cf CLI). You can also manage Data Flow service instances using Apps Manager.
Note To have read and write access to a Spring Cloud Data Flow for VMware Tanzu service instance, you must have the SpaceDeveloper role in the space where the service instance was created. If you have only the SpaceAuditor role in that space, you have read-only access to the service instance.
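For example, an administrator can grant the SpaceDeveloper role using cf set-space-role. The user, org, and space names below are hypothetical:
$ cf set-space-role user@example.com myorg development SpaceDeveloper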
When creating or updating a Spring Cloud Data Flow service instance, you can configure the service instance using parameters passed to the cf CLI commands. See the following sections for information about the supported parameters.
Each Data Flow service instance can be given the name of a buildpack to use for deploying stream and task apps. You can set the buildpack for the service instance using a buildpack parameter given to cf create-service or cf update-service. To create a service instance that uses a buildpack named custom-java-buildpack to deploy apps, you might run:
$ cf create-service p-dataflow standard data-flow -c '{"buildpack": "custom-java-buildpack"}'
You can configure settings for a service instance's backing Data Flow server and Skipper apps using parameters given to cf create-service or cf update-service.
Parameter | Function |
---|---|
dataflow.disk | The disk used by the Data Flow server. |
dataflow.memory | The memory used by the Data Flow server. |
skipper.disk | The disk used by the Skipper backing app. |
skipper.memory | The memory used by the Skipper backing app. |
For all disk and memory settings, the default unit is mebibytes (MiB). You can use other units by naming the unit in the value string (for example, "1G", "512MB", "2GiB", or "3gb").
To create a service instance with a Skipper backing app that uses 4 GiB of disk space, you might run:
$ cf create-service p-dataflow standard data-flow -c '{"skipper": { "disk": "4GiB" } }'
You can configure the domain used by a service instance's backing Data Flow server and Skipper apps using a domain parameter given to cf create-service or cf update-service. To create a service instance that uses the domain my-dataflow.example.com for its backing Data Flow server and Skipper apps, you might run:
$ cf create-service p-dataflow standard data-flow -c '{"domain": "my-dataflow.example.com"}'
You can configure Spring Cloud Skipper settings for a service instance's Skipper backing app by passing the settings as parameters to cf create-service or cf update-service. You can use this, for example, to configure the deployer health check timeout. To create a service instance that uses a health check timeout of five minutes (300000 milliseconds), you might run:
$ cf create-service p-dataflow standard data-flow -c '{"spring.cloud.skipper.server.strategies.healthcheck.timeout-in-millis": 300000}'
By default, a Data Flow server instance does not cache artifacts downloaded from a Maven repository, because this caching can overwhelm app containers and cause the service instance's Data Flow or Skipper backing apps to crash. If you wish, you can activate caching of Maven artifacts by setting a maven-cache parameter, passed to cf create-service or cf update-service, to true:
$ cf create-service p-dataflow standard data-flow -c '{"maven-cache": true}'
Each Data Flow service instance uses three dependent data services. Defaults for these services can be configured in the tile settings, and these defaults can be overridden for each individual service instance at create or update time.
Note The service offerings with the proxy plan are proxy services used by Spring Cloud Data Flow for VMware Tanzu service instances. The Spring Cloud Data Flow service broker creates and deletes instances of these services automatically along with each Spring Cloud Data Flow service instance. Do not manually create or delete instances of these services.
General parameters used to configure dependent data services for a Data Flow service instance are listed below.
Parameter | Function |
---|---|
relational-data-service.name | The name of the service to use for a relational database that stores Spring Cloud Data Flow metadata and task history. |
relational-data-service.plan | The name of the service plan to use for the relational database service. |
messaging-data-service.name | The name of the service to use for a RabbitMQ or Kafka server that facilitates event messaging. |
messaging-data-service.plan | The name of the service plan to use for the RabbitMQ or Kafka service. |
skipper-relational.name | The name of the service to use for a relational database used by the Skipper application. |
skipper-relational.plan | The name of the service plan to use for a relational database used by the Skipper application. |
To create a Data Flow service instance that uses VMware Tanzu SQL [MySQL] for the Data Flow and Skipper relational databases and uses VMware RabbitMQ for the event messaging service, you might use a command such as the following:
$ cf create-service p-dataflow standard data-flow -c '{ "relational-data-service": { "name": "p.mysql", "plan": "med-db" }, "messaging-data-service": { "name": "p.rabbitmq", "plan": "high-vol" }, "skipper-relational": { "name": "p.mysql", "plan": "sm-db" } }'
To run composed tasks, Spring Cloud Data Flow uses a task app called the Composed Task Runner (CTR). By default, Data Flow downloads this app from the Maven Central repository. A different default URL for this app can be configured in the tile settings, and this default can be overridden for each individual service instance at create or update time. You can specify a different URL to use for downloading the app by using a parameter passed to the cf create-service or cf update-service command.
To create a service instance that downloads the CTR app from https://example.com/ctr.jar, you might run:
$ cf create-service p-dataflow standard data-flow -c '{ "composed-task-runner-uri": "https://example.com/ctr.jar" }'
Each Data Flow service instance can optionally be bound to other service instances. For instance, you can configure a Data Flow service instance to be bound to an existing Spring Cloud Services Config Server service instance. To specify that a Data Flow service instance should be bound to another existing service instance, include that service instance's name in a JSON array called services and pass the array to the cf create-service or cf update-service command.
To create a Data Flow service instance that is bound to an existing Spring Cloud Services Config Server service instance named my-config-server, you might use a command such as the following:
$ cf create-service p-dataflow standard data-flow -c '{"services": ["my-config-server"] }'
When created, the data-flow service instance will be bound to the existing my-config-server service instance.
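The services array can name more than one service instance. As a sketch, the following also binds a second, hypothetical service instance named my-service-registry:
$ cf update-service data-flow -c '{"services": ["my-config-server", "my-service-registry"] }'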
You can use Grafana to view metrics for Spring Cloud Data Flow apps and streams. To activate this, use settings under spring.cloud.dataflow.grafana-info, passed to cf create-service or cf update-service.
To create a service instance that sends metrics to a Grafana dashboard located at https://grafana.example.com:443, you might run:
$ cf create-service p-dataflow standard data-flow -c '{"spring.cloud.dataflow.grafana-info.url": "https://grafana.example.com:443"}'
Note Spring Cloud Data Flow for VMware Tanzu does not provide a Grafana installation. You must provide your own Grafana installation in order to use Spring Cloud Data Flow for VMware Tanzu with Grafana.
Each Data Flow service instance can optionally specify Maven configuration properties. For the complete list of properties that can be specified, see the Maven section in the OSS Spring Cloud Data Flow documentation.
Maven configuration properties can be set for each Data Flow service instance using parameters given to cf create-service or cf update-service. To set the maven.remote-repositories.repo1.url property, you might use a command such as the following:
$ cf create-service p-dataflow standard data-flow -c '{"maven.remote-repositories.repo1.url": "https://repo.spring.io/libs-snapshot"}'
To configure a private Maven repository that requires authentication, you can provide a username and password, as in the following example:
$ cf create-service p-dataflow standard data-flow -c '{"maven.remote-repositories.repo1.url":"https://my.private.maven/repo","maven.remote-repositories.repo1.auth.username":"user","maven.remote-repositories.repo1.auth.password":"password"}'
Spring Cloud Data Flow can integrate with VMware Tanzu Observability by Wavefront to monitor deployed event-streaming and batch applications. Default values for Wavefront settings can be set in the tile configuration, and these default values can be overridden for each individual service instance at create or update time.
To configure Wavefront settings for a Data Flow service instance, pass a wavefront parameter to the cf create-service or cf update-service command. This parameter is a JSON object with the fields listed below.
Parameter | Function |
---|---|
uri | The URI of the Wavefront instance. |
api-token | The user API token to use for Wavefront. |
source | An arbitrary string used to identify the Data Flow service instance. |
To configure these settings for a new Data Flow service instance, you might use a command such as the following:
$ cf create-service p-dataflow standard data-flow -c '{"wavefront": {"uri": "https://wavefront.example.com", "api-token": "EXAMPLE_API_TOKEN", "source": "my-dataflow-si"} }'
All Wavefront settings are optional. If you do not supply a value for any particular setting, the Data Flow service instance will use the default value for that setting (the value specified in the tile settings).
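Because the settings are optional, you might override a single field on an existing service instance and let the others fall back to the tile defaults. A sketch (the instance name and token are hypothetical):
$ cf update-service data-flow -c '{"wavefront": {"api-token": "NEW_EXAMPLE_API_TOKEN"} }'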
Each Data Flow service instance can execute a maximum number of concurrently running tasks (the default limit is 10). You can configure this limit using a concurrent-task-limit parameter given to cf create-service or cf update-service:
$ cf create-service p-dataflow standard data-flow -c '{"concurrent-task-limit": 30}'
When the number of concurrent tasks reaches the specified limit, the Data Flow service instance will no longer launch new tasks until the number of running tasks is again below the limit.
Each Data Flow service instance can be configured to run tasks only (with stream support deactivated). You can configure the service instance to activate only task support using a task-only parameter given to cf create-service or cf update-service:
$ cf create-service p-dataflow standard data-flow -c '{"task-only": true}'
With task-only set to true, the Spring Cloud Skipper backing app (with its associated relational database backing service instance and the messaging backing service instance) will not be deployed for the service instance, and the service instance's dashboard (see Using the Dashboard) will not display the Streams tab.
Each Data Flow service instance can be configured to run streams only (with task support deactivated). You can configure the service instance to activate only stream support using a stream-only parameter given to cf create-service or cf update-service:
$ cf create-service p-dataflow standard data-flow -c '{"stream-only": true}'
With stream-only set to true, the service instance's dashboard (see Using the Dashboard) will not display the Tasks tab.
When creating or updating a Data Flow service instance, you can set the memory allocation for the associated Spring Cloud Skipper server deployed to VMware Tanzu Application Service for VMs (TAS for VMs). The default memory allocation for Skipper is 2 GB.
To configure a value for Skipper's memory allocation, you can pass a skipper parameter (a JSON object with a single memory key) to the cf create-service or cf update-service command:
$ cf create-service p-dataflow standard data-flow -c '{"skipper": { "memory": "8G" }}'
You can use the Scheduler service with Spring Cloud Data Flow for VMware Tanzu to schedule task executions (see the Spring Cloud Data Flow OSS documentation on Scheduling Tasks). If you configure a Data Flow service instance to use Scheduler, the Data Flow broker will create a new Scheduler service instance in the Data Flow service instance's backing space. This Scheduler service instance will be bound to the Data Flow server's backing application.
To configure a Data Flow service instance to use Scheduler, pass a scheduler parameter to the cf create-service or cf update-service command. This parameter is a JSON object with the fields listed below.
Parameter | Function |
---|---|
name | The name of the scheduler service offering to use. Only the Scheduler service, scheduler-for-pcf, is supported at this time. |
plan | The name of the service plan to use. |
instance-name | The name of the service instance to create. Optional. |
To create a Data Flow service instance named mydf that uses a Scheduler service instance with the standard plan, named mysched, you might use a command such as the following:
$ cf create-service p-dataflow standard mydf -c '{"scheduler": {"name": "scheduler-for-pcf", "plan": "standard", "instance-name": "mysched"}}'
Begin by targeting the correct org and space.
$ cf target -o myorg -s development
api endpoint:   https://api.system.example.com
api version:    2.75.0
user:           user
org:            myorg
space:          development
You can view plan details for the Data Flow product using cf marketplace -s.
$ cf marketplace
Getting services from marketplace in org myorg / space development as user...
OK

service               plans      description
p-dataflow            standard   Deploys Spring Cloud Data Flow servers to orchestrate data pipelines
p-dataflow-mysql      proxy      Proxies to the Spring Cloud Data Flow MySQL service instance
p-dataflow-rabbitmq   proxy      Proxies to the Spring Cloud Data Flow RabbitMQ service instance

TIP: Use 'cf marketplace -s SERVICE' to view descriptions of individual plans of a given service.

$ cf marketplace -s p-dataflow
Getting service plan information for service p-dataflow as user...
OK

service plan   description     free or paid
standard       Standard Plan   free
Create the service instance using cf create-service. To create a Data Flow service instance that sets the Maven maven.remote-repositories.repo1.url property to https://repo.spring.io/libs-snapshot, you might run:
$ cf create-service p-dataflow standard data-flow -c '{ "maven.remote-repositories.repo1.url": "https://repo.spring.io/libs-snapshot" }' Creating service instance data-flow in org myorg / space development as user... OK Create in progress. Use 'cf services' or 'cf service data-flow' to check operation status.
As the command output suggests, you can use the cf services or cf service commands to check the status of the service instance. When the service instance is ready, the cf service command will give a status of create succeeded:
$ cf service data-flow

Service instance: data-flow
Service: p-dataflow
Bound apps:
Tags:
Plan: standard
Description: Deploys Spring Cloud Data Flow servers to orchestrate data pipelines
Documentation url: https://cloud.spring.io/spring-cloud-dataflow/
Dashboard: https://p-dataflow.apps.example.com/instances/f09e5c77-e526-4f49-86d6-721c6b8e2fd9/dashboard

Last Operation
Status: create succeeded
Message: Created
Started: 2017-07-20T18:24:14Z
Updated: 2017-07-20T18:26:17Z
You can update settings on a Data Flow service instance using the cf CLI. The cf update-service command can be given a -c flag with a JSON object containing parameters used to configure the service instance.
Note If you upgrade a Data Flow service instance created using Spring Cloud Data Flow for VMware Tanzu v1.3.x to the version included in v1.5.0, the upgrade will delete the metrics app and Redis analytics backing service instance for the Data Flow service instance. The metrics app and Redis service are no longer used in Spring Cloud Data Flow for VMware Tanzu v1.5.0.
Begin by targeting the correct org and space.
$ cf target -o myorg -s development
api endpoint:   https://api.system.example.com
api version:    2.75.0
user:           user
org:            myorg
space:          development
You can view all service instances in the space using cf services.
$ cf services
Getting services in org myorg / space development as user...
OK

name                                            service               plan       bound apps   last operation
data-flow                                       p-dataflow            standard                create succeeded
mysql-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11      p-dataflow-mysql      proxy                   create succeeded
rabbitmq-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11   p-dataflow-rabbitmq   proxy                   create succeeded
Run cf update-service SERVICE_NAME -c '{ "PARAMETER": "VALUE" }', where SERVICE_NAME is the name of the service instance, PARAMETER is a supported parameter (see Available Parameters), and VALUE is the value for the parameter. To upgrade a service instance to the latest version included in the tile, include the upgrade parameter with the value true.
$ cf update-service data-flow -c '{"upgrade": true}'
Updating service instance data-flow as user...
OK

Update in progress. Use 'cf services' or 'cf service data-flow' to check operation status.
As the output from the cf update-service command suggests, you can use the cf services or cf service commands to check the status of the service instance. When the Data Flow service instance has been updated, the cf service command will give a status of update succeeded:
$ cf service data-flow
Showing info of service data-flow in org myorg / space dev as user...

name:            data-flow
service:         p-dataflow
bound apps:
tags:
plan:            standard
description:     Deploys Spring Cloud Data Flow servers to orchestrate data pipelines
documentation:
dashboard:       https://p-dataflow.apps.example.com/instances/1cf8ff5b-4a65-469d-bee7-36e6541ac241/dashboard

Showing status of last operation from service data-flow...

status:    update succeeded
message:   Updated
started:   2018-06-19T19:26:09Z
updated:   2018-06-19T19:29:17Z
Deleting a Data Flow service instance will result in deletion of all of its dependent service instances.
Begin by targeting the correct org and space.
$ cf target -o myorg -s development
api endpoint:   https://api.system.example.com
api version:    2.75.0
user:           user
org:            myorg
space:          development
You can view all service instances in the space using cf services.
$ cf services
Getting services in org myorg / space development as user...
OK

name                                            service               plan       bound apps   last operation
data-flow                                       p-dataflow            standard                create succeeded
mysql-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11      p-dataflow-mysql      proxy                   create succeeded
rabbitmq-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11   p-dataflow-rabbitmq   proxy                   create succeeded
Delete the Data Flow service instance using cf delete-service. When prompted, enter y to confirm the deletion.
$ cf delete-service data-flow

Really delete the service data-flow?> y
Deleting service data-flow in org myorg / space development as user...
OK

Delete in progress. Use 'cf services' or 'cf service data-flow' to check operation status.
The dependent service instances for the Data Flow server service instance are deleted first, and then the Data Flow server service instance itself is deleted.
As the output from the cf delete-service command suggests, you can use the cf services or cf service commands to check the status of the service instance. When the Data Flow service instance and its dependent service instances have been deleted, the cf services command will no longer list the service instance:
$ cf services
Getting services in org myorg / space development as user...
OK

No services found