This topic describes how to manage Data Flow service instances using the Cloud Foundry Command Line Interface (cf CLI). You can also manage Data Flow service instances using Apps Manager.
Note To have read and write access to a Spring Cloud Data Flow for VMware Tanzu service instance, you must have the SpaceDeveloper role in the space where the service instance was created. If you have only the SpaceAuditor role in that space, you have read-only access to the service instance.
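For example, an administrator can grant the SpaceDeveloper role using cf set-space-role. The user, org, and space names below are hypothetical:
$ cf set-space-role user@example.com myorg development SpaceDeveloper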
When creating or updating a Spring Cloud Data Flow service instance, you can configure the service instance using parameters passed to the cf CLI commands. See the following sections for information about the supported parameters.
Each Data Flow service instance can be given the name of a buildpack to use for deploying stream and task apps. You can set the buildpack for the service instance using a buildpack parameter given to cf create-service or cf update-service. To create a service instance that uses a buildpack named custom-java-buildpack to deploy apps, you might run:
$ cf create-service p-dataflow standard data-flow -c '{"buildpack": "custom-java-buildpack"}'
You can configure settings for a service instance's backing Data Flow server and Skipper apps using parameters given to cf create-service or cf update-service.
Parameter | Function |
---|---|
dataflow.disk | The disk used by the Data Flow server. |
dataflow.memory | The memory used by the Data Flow server. |
skipper.disk | The disk used by the Skipper backing app. |
skipper.memory | The memory used by the Skipper backing app. |
For all disk and memory settings, the default unit is mebibytes (MiB). You can use other units by naming the unit in the value string (for example, "1G", "512MB", "2GiB", or "3gb").
To create a service instance with a Skipper backing app that uses 4 GiB of disk space, you might run:
$ cf create-service p-dataflow standard data-flow -c '{"skipper": { "disk": "4GiB" } }'
You can configure the domain used by a service instance's backing Data Flow server and Skipper apps using a domain parameter given to cf create-service or cf update-service. To create a service instance that uses the domain my-dataflow.example.com for its backing Data Flow server and Skipper apps, you might run:
$ cf create-service p-dataflow standard data-flow -c '{"domain": "my-dataflow.example.com"}'
You can configure Spring Cloud Skipper settings for a service instance's Skipper backing app by passing the settings as parameters to cf create-service or cf update-service. You can use this, for example, to configure the deployer health check timeout. To create a service instance that uses a health check timeout of five minutes (300000 milliseconds), you might run:
$ cf create-service p-dataflow standard data-flow -c '{"spring.cloud.skipper.server.strategies.healthcheck.timeout-in-millis": 300000}'
By default, a Data Flow server instance does not cache artifacts downloaded from a Maven repository, because this caching can overwhelm app containers and cause the service instance's Data Flow or Skipper backing apps to crash. If you wish, you can activate caching of Maven artifacts by setting a maven-cache parameter, passed to cf create-service or cf update-service, to true:
$ cf create-service p-dataflow standard data-flow -c '{"maven-cache": true}'
Each Data Flow service instance uses three dependent data services. Defaults for these services can be configured in the tile settings, and these defaults can be overridden for each individual service instance at create or update time.
Note The service offerings with the proxy plan are proxy services used by Spring Cloud Data Flow for VMware Tanzu service instances. The Spring Cloud Data Flow service broker creates and deletes instances of these services automatically along with each Spring Cloud Data Flow service instance. Do not manually create or delete instances of these services.
General parameters used to configure dependent data services for a Data Flow service instance are listed below.
Parameter | Function |
---|---|
relational-data-service.name | The name of the service to use for a relational database that stores Spring Cloud Data Flow metadata and task history. |
relational-data-service.plan | The name of the service plan to use for the relational database service. |
messaging-data-service.name | The name of the service to use for a RabbitMQ or Kafka server that facilitates event messaging. |
messaging-data-service.plan | The name of the service plan to use for the RabbitMQ or Kafka service. |
skipper-relational.name | The name of the service to use for a relational database used by the Skipper application. |
skipper-relational.plan | The name of the service plan to use for a relational database used by the Skipper application. |
To create a Data Flow service instance that uses VMware Tanzu SQL [MySQL] for the Data Flow and Skipper relational databases and uses VMware RabbitMQ for the event messaging service, you might use a command such as the following:
$ cf create-service p-dataflow standard data-flow -c '{ "relational-data-service": { "name": "p.mysql", "plan": "med-db" }, "messaging-data-service": { "name": "p.rabbitmq", "plan": "high-vol" }, "skipper-relational": { "name": "p.mysql", "plan": "sm-db" } }'
To run composed tasks, Spring Cloud Data Flow uses a task app called the Composed Task Runner (CTR). By default, Data Flow downloads this app from the Maven Central repository. A different default URL for this app can be configured in the tile settings, and this default can be overridden for each individual service instance at create or update time. You can specify a different URL to use for downloading the app by using a parameter passed to the cf create-service or cf update-service command.
To create a service instance that downloads the CTR app from https://example.com/ctr.jar, you might run:
$ cf create-service p-dataflow standard data-flow -c '{ "composed-task-runner-uri": "https://example.com/ctr.jar" }'
Each Data Flow service instance can optionally be bound to other service instances. For instance, you can configure a Data Flow service instance to be bound to an existing Spring Cloud Services Config Server service instance. To specify that a Data Flow service instance should be bound to another existing service instance, include that service instance's name in a JSON array called services and pass the array to the cf create-service or cf update-service command.
To create a Data Flow service instance that is bound to an existing Spring Cloud Services Config Server service instance named my-config-server, you might use a command such as the following:
$ cf create-service p-dataflow standard data-flow -c '{"services": ["my-config-server"] }'
When created, the data-flow service instance will be bound to the existing my-config-server service instance.
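The services array can name more than one service instance. As a sketch, the following also binds a second, hypothetical service instance named my-service-registry:
$ cf update-service data-flow -c '{"services": ["my-config-server", "my-service-registry"] }'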
You can use Grafana to view metrics for Spring Cloud Data Flow apps and streams. To activate this, use settings under spring.cloud.dataflow.grafana-info, passed to cf create-service or cf update-service.
To create a service instance that sends metrics to a Grafana dashboard located at https://grafana.example.com:443, you might run:
$ cf create-service p-dataflow standard data-flow -c '{"spring.cloud.dataflow.grafana-info.url": "https://grafana.example.com:443"}'
Note Spring Cloud Data Flow for VMware Tanzu does not provide a Grafana installation. You must provide your own Grafana installation in order to use Spring Cloud Data Flow for VMware Tanzu with Grafana.
Each Data Flow service instance can optionally specify Maven configuration properties. For the complete list of properties that can be specified, see the Maven section in the OSS Spring Cloud Data Flow documentation.
Maven configuration properties can be set for each Data Flow service instance using parameters given to cf create-service or cf update-service. To set the maven.remote-repositories.repo1.url property, you might use a command such as the following:
$ cf create-service p-dataflow standard data-flow -c '{"maven.remote-repositories.repo1.url": "https://repo.spring.io/libs-snapshot"}'
To configure a private Maven repository that requires authentication, you can provide a username and password, as in the following example:
$ cf create-service p-dataflow standard data-flow -c '{"maven.remote-repositories.repo1.url":"https://my.private.maven/repo","maven.remote-repositories.repo1.auth.username":"user","maven.remote-repositories.repo1.auth.password":"password"}'
Spring Cloud Data Flow can integrate with VMware Tanzu Observability by Wavefront to monitor deployed event-streaming and batch applications. Default values for Wavefront settings can be set in the tile configuration, and these default values can be overridden for each individual service instance at create or update time.
To configure Wavefront settings for a Data Flow service instance, pass a wavefront parameter to the cf create-service or cf update-service command. This parameter is a JSON object with the fields listed below.
Parameter | Function |
---|---|
uri | The URI of the Wavefront instance. |
api-token | The user API token to use for Wavefront. |
source | An arbitrary string used to identify the Data Flow service instance. |
To configure these settings for a new Data Flow service instance, you might use a command such as the following:
$ cf create-service p-dataflow standard data-flow -c '{"wavefront": {"uri": "https://wavefront.example.com", "api-token": "EXAMPLE_API_TOKEN", "source": "my-dataflow-si"} }'
All Wavefront settings are optional. If you do not supply a value for any particular setting, the Data Flow service instance will use the default value for that setting (the value specified in the tile settings).
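Because the settings are optional, you might override a single field on an existing service instance and let the others fall back to the tile defaults. A sketch (the instance name and token are hypothetical):
$ cf update-service data-flow -c '{"wavefront": {"api-token": "NEW_EXAMPLE_API_TOKEN"} }'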
Each Data Flow service instance can execute a maximum number of concurrently running tasks (the default limit is 10). You can configure this limit using a concurrent-task-limit parameter given to cf create-service or cf update-service:
$ cf create-service p-dataflow standard data-flow -c '{"concurrent-task-limit": 30}'
When the number of concurrent tasks reaches the specified limit, the Data Flow service instance will no longer launch new tasks until the number of running tasks is again below the limit.
Each Data Flow service instance can be configured to run tasks only (with stream support deactivated). You can configure the service instance to activate only task support using a task-only parameter given to cf create-service or cf update-service:
$ cf create-service p-dataflow standard data-flow -c '{"task-only": true}'
With task-only set to true, the Spring Cloud Skipper backing app (with its associated relational database backing service instance and the messaging backing service instance) will not be deployed for the service instance, and the service instance's dashboard (see Using the Dashboard) will not display the Streams tab.
Each Data Flow service instance can be configured to run streams only (with task support deactivated). You can configure the service instance to activate only stream support using a stream-only parameter given to cf create-service or cf update-service:
$ cf create-service p-dataflow standard data-flow -c '{"stream-only": true}'
With stream-only set to true, the service instance's dashboard (see Using the Dashboard) will not display the Tasks tab.
When creating or updating a Data Flow service instance, you can set the memory allocation for the associated Spring Cloud Skipper server deployed to VMware Tanzu Application Service for VMs (TAS for VMs). The default memory allocation for Skipper is 2 GB.
To configure a value for Skipper's memory allocation, you can pass a skipper parameter (a JSON object with a single memory key) to the cf create-service or cf update-service command:
$ cf create-service p-dataflow standard data-flow -c '{"skipper": { "memory": "8G" }}'
You can use the Scheduler service with Spring Cloud Data Flow for VMware Tanzu to schedule task executions (see the Spring Cloud Data Flow OSS documentation on Scheduling Tasks). If you configure a Data Flow service instance to use Scheduler, the Data Flow broker will create a new Scheduler service instance in the Data Flow service instance's backing space. This Scheduler service instance will be bound to the Data Flow server's backing application.
To configure a Data Flow service instance to use Scheduler, pass a scheduler parameter to the cf create-service or cf update-service command. This parameter is a JSON object with the fields listed below.
Parameter | Function |
---|---|
name | The name of the scheduler service offering to use. Only the Scheduler service, scheduler-for-pcf, is supported at this time. |
plan | The name of the service plan to use. |
instance-name | The name of the service instance to create. Optional. |
To create a Data Flow service instance named mydf that uses a Scheduler service instance with the standard plan, named mysched, you might use a command such as the following:
$ cf create-service p-dataflow standard mydf -c '{"scheduler": {"name": "scheduler-for-pcf", "plan": "standard", "instance-name": "mysched"}}'
Begin by targeting the correct org and space.
$ cf target -o myorg -s development
api endpoint:   https://api.system.example.com
api version:    2.75.0
user:           user
org:            myorg
space:          development
You can view plan details for the Data Flow product using cf marketplace -s.
$ cf marketplace
Getting services from marketplace in org myorg / space development as user...
OK

service               plans      description
p-dataflow            standard   Deploys Spring Cloud Data Flow servers to orchestrate data pipelines
p-dataflow-mysql      proxy      Proxies to the Spring Cloud Data Flow MySQL service instance
p-dataflow-rabbitmq   proxy      Proxies to the Spring Cloud Data Flow RabbitMQ service instance

TIP: Use 'cf marketplace -s SERVICE' to view descriptions of individual plans of a given service.

$ cf marketplace -s p-dataflow
Getting service plan information for service p-dataflow as user...
OK

service plan   description     free or paid
standard       Standard Plan   free
Create the service instance using cf create-service. To create a Data Flow service instance that sets the Maven maven.remote-repositories.repo1.url property to https://repo.spring.io/libs-snapshot, you might run:
$ cf create-service p-dataflow standard data-flow -c '{ "maven.remote-repositories.repo1.url": "https://repo.spring.io/libs-snapshot" }' Creating service instance data-flow in org myorg / space development as user... OK Create in progress. Use 'cf services' or 'cf service data-flow' to check operation status.
As the command output suggests, you can use the cf services or cf service commands to check the status of the service instance. When the service instance is ready, the cf service command will give a status of create succeeded:
$ cf service data-flow

Service instance: data-flow
Service: p-dataflow
Bound apps:
Tags:
Plan: standard
Description: Deploys Spring Cloud Data Flow servers to orchestrate data pipelines
Documentation url: https://cloud.spring.io/spring-cloud-dataflow/
Dashboard: https://p-dataflow.apps.example.com/instances/f09e5c77-e526-4f49-86d6-721c6b8e2fd9/dashboard

Last Operation
Status: create succeeded
Message: Created
Started: 2017-07-20T18:24:14Z
Updated: 2017-07-20T18:26:17Z
You can update settings on a Data Flow service instance using the cf CLI. The cf update-service command can be given a -c flag with a JSON object containing parameters used to configure the service instance.
Note If you upgrade a Data Flow service instance created using Spring Cloud Data Flow for VMware Tanzu v1.3.x to the version included in v1.5.0, the upgrade will delete the metrics app and Redis analytics backing service instance for the Data Flow service instance. The metrics app and Redis service are no longer used in Spring Cloud Data Flow for VMware Tanzu v1.5.0.
Begin by targeting the correct org and space.
$ cf target -o myorg -s development
api endpoint:   https://api.system.example.com
api version:    2.75.0
user:           user
org:            myorg
space:          development
You can view all service instances in the space using cf services.
$ cf services
Getting services in org myorg / space development as user...
OK

name                                            service               plan       bound apps   last operation
data-flow                                       p-dataflow            standard                create succeeded
mysql-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11      p-dataflow-mysql      proxy                   create succeeded
rabbitmq-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11   p-dataflow-rabbitmq   proxy                   create succeeded
Run cf update-service SERVICE_NAME -c '{ "PARAMETER": "VALUE" }', where SERVICE_NAME is the name of the service instance, PARAMETER is a supported parameter (see Available Parameters), and VALUE is the value for the parameter. To upgrade a service instance to the latest version included in the tile, include the upgrade parameter with the value true.
$ cf update-service data-flow -c '{"upgrade": true}'
Updating service instance data-flow as user...
OK

Update in progress. Use 'cf services' or 'cf service data-flow' to check operation status.
As the output from the cf update-service command suggests, you can use the cf services or cf service commands to check the status of the service instance. When the Data Flow service instance has been updated, the cf service command will give a status of update succeeded:
$ cf service data-flow
Showing info of service data-flow in org myorg / space dev as user...

name:            data-flow
service:         p-dataflow
bound apps:
tags:
plan:            standard
description:     Deploys Spring Cloud Data Flow servers to orchestrate data pipelines
documentation:
dashboard:       https://p-dataflow.apps.example.com/instances/1cf8ff5b-4a65-469d-bee7-36e6541ac241/dashboard

Showing status of last operation from service data-flow...

status:    update succeeded
message:   Updated
started:   2018-06-19T19:26:09Z
updated:   2018-06-19T19:29:17Z
Deleting a Data Flow service instance will result in deletion of all of its dependent service instances.
Begin by targeting the correct org and space.
$ cf target -o myorg -s development
api endpoint:   https://api.system.example.com
api version:    2.75.0
user:           user
org:            myorg
space:          development
You can view all service instances in the space using cf services.
$ cf services
Getting services in org myorg / space development as user...
OK

name                                            service               plan       bound apps   last operation
data-flow                                       p-dataflow            standard                create succeeded
mysql-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11      p-dataflow-mysql      proxy                   create succeeded
rabbitmq-b3e76c87-c5ae-47e4-a83c-5fabf2fc4f11   p-dataflow-rabbitmq   proxy                   create succeeded
Delete the Data Flow service instance using cf delete-service. When prompted, enter y to confirm the deletion.
$ cf delete-service data-flow

Really delete the service data-flow?> y
Deleting service data-flow in org myorg / space development as user...
OK

Delete in progress. Use 'cf services' or 'cf service data-flow' to check operation status.
The dependent service instances for the Data Flow server service instance are deleted first, and then the Data Flow server service instance itself is deleted.
As the output from the cf delete-service command suggests, you can use the cf services or cf service commands to check the status of the service instance. When the Data Flow service instance and its dependent service instances have been deleted, the cf services command will no longer list the service instance:
$ cf services
Getting services in org myorg / space development as user...
OK

No services found