Concourse Release Notes

7.11.2+LTS-T

Release Date: Feb 28th, 2024

Security

Update module github.com/opencontainers/runc to v1.1.12.

Breaking

Topgun gc_interval to gc.interval
- See concourse/concourse-bosh-release@8d2cfa0. If you are deploying Concourse with BOSH, replace gc_interval in the spec with gc.interval, if applicable.
cf resource is no longer included in the Concourse binary, as the cf-resource repository on GitHub has moved to the cloudfoundry-community organization and is no longer maintained by the Concourse team.

Features

Added --aws-ssm-shared-path to configure shared secret paths for AWS SSM cred manager similarly to the one for Vault.
Make cc.xml endpoint public, and only list public pipelines
- Public pipelines are now accessible through the cc.xml endpoint while unauthenticated. For more information, see the Concourse documentation.
Emitting "latest_completed_build_status" gauge from prometheus
- Add concourse_builds_latest_completed_build_status metric
  - Gauge = 0 for success
  - Gauge = 1 for failure
  - Gauge = 2 for aborted
  - Gauge = 3 for error
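As a rough illustration of how this gauge might be consumed, the following Prometheus alerting rule fires when the latest completed build is anything other than a success. Only the metric name and the gauge values come from these notes; the group name, alert name, `for` duration, and severity label are assumptions made for this sketch.

```yaml
# Hypothetical Prometheus rule file; only the metric name is taken from these notes.
groups:
  - name: concourse-builds                        # assumed group name
    rules:
      - alert: ConcourseLatestBuildNotSucceeded   # assumed alert name
        # Gauge values: 0 = success, 1 = failure, 2 = aborted, 3 = error.
        expr: concourse_builds_latest_completed_build_status > 0
        for: 10m                                  # assumed grace period
        labels:
          severity: warning                       # assumed label
        annotations:
          summary: "Latest completed build did not succeed"
```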
Update base image of all built-in resource types:
- The following resources now use concourse/resource-types-base-image-static, which is based on paketobuildpacks/run-jammy-static:
  - time
  - bosh-io-release
  - bosh-io-stemcell
  - github-release
  - mock
- The following resources now use paketobuildpacks/run-jammy-base:
  - git
  - docker-image
  - registry-image
  - tracker
  - hg
  - semver
  - s3
  - pool
Support "raw" encoding for volume streaming:
- Add a new compression method raw to CONCOURSE_STREAMING_ARTIFACTS_COMPRESSION. The new method costs more worker network bandwidth but saves CPU time for many workers, and dramatically speeds up volume streaming. Larger volumes have a more dramatic streaming speed improvement.
Add a drift-based number of goroutines to component scheduler:
- Add a new ATC option --num-goroutine-threshold to specify a goroutine count threshold. If set, when an ATC reaches the goroutine count threshold, the ATC is less likely to run workloads than other ATCs with fewer goroutines. This option helps to evenly distribute workloads across ATCs.
Hermetic for task container:
- Add hermetic: bool to the task step configuration. When set to true, the task container runs without external network access. Only the containerd worker runtime supports this feature. Setting a pipeline that contains a task step with hermetic: true produces a warning as a reminder.
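A minimal sketch of a job that uses the new field is shown below. Only `hermetic: true` comes from these notes; the job name, step name, image, and command are placeholders chosen for illustration.

```yaml
# Hypothetical pipeline fragment; requires workers that use the containerd runtime.
jobs:
  - name: build-offline              # assumed job name
    plan:
      - task: compile                # assumed step name
        hermetic: true               # run the task container without external network access
        config:
          platform: linux
          image_resource:
            type: registry-image
            source: {repository: busybox}   # assumed image
          run:
            path: sh
            args: ["-c", "echo running without external network access"]
```

Expect the warning described above when setting a pipeline that contains this step.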
Optimized the database notifications, reducing TPS/QPS on the database side. Add a new ATC option --db-notification-bus-queue-size, which by default is set to 10000. If the UI is slow to load logs for running builds, consider increasing the value of this option.
Add a maximum streaming volume size:
- Add a new ATC option CONCOURSE_STREAMING_SIZE_LIMITATION, which restricts the maximum size (in MB) of volumes that can be streamed between workers. This prevents a rogue pipeline from affecting multiple workers.
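The two streaming options above could be set together on the web (ATC) node. The fragment below is a sketch that assumes a docker-compose style deployment; the service name, image, and the 1024 MB limit are illustrative assumptions, while the two variable names come from these notes.

```yaml
# Sketch of a docker-compose service for the web (ATC) node.
services:
  web:                               # assumed service name
    image: concourse/concourse
    command: web
    environment:
      # Trade worker network bandwidth for lower CPU use while streaming volumes.
      CONCOURSE_STREAMING_ARTIFACTS_COMPRESSION: raw
      # Maximum size, in MB, of a volume that may be streamed between workers (assumed value).
      CONCOURSE_STREAMING_SIZE_LIMITATION: "1024"
```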

Resolved Issues

Resolved Core Functionality Issues

Fix Cloud Foundry connector regression bug introduced in v7.9.1
Fix a fly builds bug that shows pipeline/job as not found when both --team and --pipeline/--job are provided
Bump ifrit to fix ATC graceful termination issue.
- When a UI client viewed a build, the ATC goroutine for processing the transaction could hang. This issue has been fixed.
Unhide the --instance-var option in fly set-pipeline

Bundled resource types and versions

bosh-io-release: v1.2.2
bosh-io-stemcell: v1.2.0
docker-image: v1.8.1
git: v1.15.0
github-release: v1.9.0
hg: v1.3.0
mock: v0.13.0
pool: v1.4.0
registry-image: v1.9.0
s3: v1.3.0
semver: v1.7.0
time: v1.7.0
tracker: v1.1.0

7.9.1+LTS-T

Release Date: Feb 28th, 2023

Known Issues

Authentication through the CF (Cloud Foundry) connector is broken. It will be fixed in the next patch (v7.9.2). Please do not upgrade if this affects you. Further details here.

Breaking

Fix DB out-of-range error caused by build numbers exceeding the integer limit
- The migration requires PostgreSQL v11 or later.
- To upgrade a Concourse k8s deployment that uses the internal PostgreSQL DB deployed by the Concourse chart, refer to Upgrading Concourse with Helm. In Concourse chart v7.9.1+LTS-T, the internal PostgreSQL DB's credentials are now configured by postgresql.auth in values.yml.
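For orientation, a values.yml fragment for the internal database might look like the sketch below. Only the postgresql.auth key is named in these notes; the sub-keys follow the conventions of the Bitnami PostgreSQL chart and, like the values, are assumptions to verify against the chart you deploy.

```yaml
# values.yml fragment (sketch); sub-keys and values are assumptions.
postgresql:
  auth:
    username: concourse        # assumed user name
    password: change-me        # assumed placeholder; supply your own secret
    database: concourse        # assumed database name
```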
Fixed a bug of leaking resource config scope ids
- When global-resources is enabled, the resource_config_scopes table leaked IDs. A side effect of the bug is that unnecessary inserts are performed (see #8618 for details).
- When global-resources is enabled, old resources were not affected. This fix ensures that old resources switch to global scopes.
  - With this change, when switching global-resources from OFF to ON, all resource histories will be lost. This is equivalent to changing the source of a resource, which causes its version history to be lost. Depending on a resource's check behavior, versions may be regenerated.
  - If your deployment turned ON global-resources before the upgrade, or you choose to keep global-resources OFF, this "breaking" change won't impact your deployment.
  - If you upgrade to this version and then turn ON global-resources, version histories will be lost, as described above. You can turn global-resources OFF again, and the old version histories should come back.
  - If your cluster has global-resources turned ON and you plan to turn it OFF, then regardless of version, each resource will have a unique version history after the switch, so the shared version history will be lost. This behaviour comes with global-resources and has nothing to do with this change.
Prefer overlay over btrfs in baggageclaim when using driver: detect
- Previously, when the baggageclaim driver was not specified, Concourse attempted to detect the supported drivers
- The prior driver precedence was: btrfs -> overlay -> naive
- The new driver precedence is: overlay -> btrfs -> naive
Allow team members to archive pipelines
- Users with the member role on a team can now archive pipelines by default. The "archive pipeline" action was previously assigned to the owner role. If you've configured your own RBAC, this change will not affect you.
Do not cache secrets indefinitely when using Vault KV v2
- For those who use Vault KV v2 as their creds manager, this change eliminates the ability to set an infinite cache duration, which may be a bug that some users rely on.
Bump dependencies for worker runtime to support Ubuntu Jammy Jellyfish
- The guardian runtime is still under development to fully support Ubuntu Jammy. In fact, it does not work on any Linux distribution with cgroups v2 enabled.
- If your deployment uses the containerd runtime, this change has no impact.
- If your deployment uses the guardian runtime, we strongly recommend using the Ubuntu Jammy stemcell from the Broadcom Support portal. Otherwise, a Linux distribution with cgroups v1 and kernel version >= 5.15 is required for the worker node to start up.
Remove "check build started" and "check build finished" metrics
- To monitor checks, use "check started" and "check finished" metrics instead.

Features

Automatically pause pipelines
- Adds a new component that will automatically pause pipelines that have not run in more than the configured number of days. The number of days can be configured with CONCOURSE_PAUSE_PIPELINES_AFTER. A value of zero (the default) disables this component. On first run it will retroactively pause pipelines that already fall out of the given day range.
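A sketch of enabling this component on the web node follows, again assuming a docker-compose style deployment. The variable name comes from these notes; the service name, image, and the 30-day value are assumptions.

```yaml
# Sketch of a docker-compose service for the web (ATC) node.
services:
  web:                               # assumed service name
    image: concourse/concourse
    command: web
    environment:
      # Pause pipelines that have not run in more than 30 days.
      # A value of 0 (the default) disables the component.
      CONCOURSE_PAUSE_PIPELINES_AFTER: "30"
```

Because the first run pauses matching pipelines retroactively, it may be worth reviewing idle pipelines before turning this on.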
Add default get/put/task timeout
- Allows Concourse administrators to configure global timeouts for get, put and task steps.
- Fixed a bug where global check timeout didn't work.
Add optional flag --no-input-container-placement-strategy for configuring a container placement strategy used only for get and nested check steps (e.g. checks triggered by a build to verify input versions). Configuring this strategy prevents get and nested check steps from being placed on a busy worker. Defaults to random.
Add optional flag --check-container-placement-strategy for configuring a container placement strategy used for resource and resource type checks triggered by the lidar scanner. Defaults to random.
Expose ATC_EXTERNAL_URL to task env.
Add OIDC get user info flag
- Add CONCOURSE_OIDC_DISABLE_GET_USER_INFO flag. OIDC connector will now fetch additional claims from OpenID UserInfo endpoint. This should fix the problem of configuring Concourse team auth by OIDC user groups due to groups claims missing in some identity providers' auth response.
Add Vault srv lookup flag
- Bump the Vault API package to the latest version and add the --disable-srv-lookup flag to the Vault configuration. If your current Vault URL contains a port number, this change has no impact. If your Vault URL does not contain a port number, SRV lookup stays enabled by default for backward compatibility. In this case, you can use the flag to disable the feature and avoid unnecessary requests from the Vault client.
Enhance Vault API client to auto retry upon rate limit
- Enhanced the Vault credential manager to retry automatically when it hits a Vault rate limit error. Vault has supported rate limits since 1.5. When setting a rate limit on Vault, it is better to enable the rate limit HTTP response headers with vault write sys/quotas/config enable_rate_limit_response_headers=true, so that the Retry-After response header can guide the Vault API client to retry after a reasonable duration.
The load_var step now supports var interpolation for file and format
Support a way to skip implied get after put
- Added no_get option to put step to skip implied get. For example:
```
- put: email
  no_get: true
  params:
    ...
```
Add seccomp profile, hooks dir override for worker containerd cli option
- seccomp-profile to override the seccomp filter
- oci-hooks-dir to pass an OCI hooks directory, e.g. for NVIDIA GPU mapping

Fly commands

Add dry-run mode to fly set-pipeline command
- Adds a dry-run mode to the set-pipeline command in the fly CLI. Its main purpose is to let users check what would change without an interactive prompt or the danger of applying changes by mistake.
Prefer FLY_HOME over HOME (if set) as the directory for storing .flyrc
Add --team to the following fly commands
- fly -t dev check-resource -r some-pipeline/branch:master/myresource --team test
- fly -t dev check-resource-type -r some-pipeline/branch:master/myresource --team test
- fly -t dev resources -p some-pipeline --team test
- fly -t dev resource-versions -r some-pipeline/branch:master/myresource --team test
- fly -t dev archive-pipeline --pipeline some-pipeline --team test
- fly -t dev watch --job some-pipeline/tests --build 52 --team test
Add fly clear-versions command
- Can be used to clear version history of a resource or resource type
- Can only be used by an admin user
- If global-resources is enabled, it can possibly delete version histories of other resources/resource-types in other pipelines so there is a warning message that will show any resources or resource types that are affected.
Add prometheus emitter for jobs scheduled duration.

Core Functionality

Add audit information for job & pipeline pauses
- Add pipeline and job pause meta information - who and when.
Optimize build log collection
- Optimized a SQL statement used to remove build logs. This optimization will especially benefit large deployments that have a lot of pipelines.
Avoid using the DB for periodic check builds
- Lidar-triggered check builds no longer use the database, which should mitigate the performance drop introduced by the major refactoring of resource checks in 7.0.0.
Worker: baggageclaim emits spans
- Workers now emit traces from the baggageclaim server so one can see volumes being created and streamed as part of a build
Allow task/set_pipeline name to include across step var
- Identifiers for task and set_pipeline steps wrapped by the across step can now have their identifier/step name as a var ((.:some-var)) and won't receive a warning about the name being deprecated
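The fragment below sketches a task whose step name is built from an across var. The ((.:some-var)) syntax is taken from the note above; the values, image, and command are hypothetical, and the across step itself may need to be enabled in your deployment.

```yaml
# Sketch of a job plan; values and task body are hypothetical.
plan:
  - across:
      - var: some-var
        values: [linux, darwin]      # assumed values
    # The step name interpolates the across var without a deprecation warning.
    task: build-((.:some-var))
    config:
      platform: linux
      image_resource:
        type: registry-image
        source: {repository: busybox}   # assumed image
      run:
        path: echo
        args: ["building for ((.:some-var))"]
```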
AWS SecretsManager can be used from var_sources
Garbage collect task caches from paused pipelines
- When a pipeline or a job is paused, the task caches used by the pipeline's jobs will be garbage collected. This should help free up worker disk space.
Optimize ATC performance by avoiding unnecessary goroutines for no-op check notifiers
Optimize limit active tasks strategy logic
- Optimize limit-active-tasks strategy to reduce DB load and avoid deadlocking when under heavy load.
Add no-input-strategy for get/check.
Optimize worker selection when global-resources is enabled
- When global-resources is enabled, a check may run on any worker, regardless of tags and team workers.
Optimize work load distribution across ATCs by enhancing locks logic.
Force checks on nested resource types when a build is manually triggered
- When a build is manually triggered, any nested resource types or images skip their checking interval, essentially forcing a check. This will not result in the same resource type getting checked multiple times if it appears multiple times in a build.
Disable connection tracker by default and provide an option to enable
- Disable /debug/connections at ATC start time. It can be enabled at runtime by /debug/connections/on or be disabled by /debug/connections/off again.
Add a drift to component interval
- Enhancement of component scheduling so that workloads are distributed across ATCs more evenly.
Optimized performance of the login authentication process, which will benefit large deployments that have many teams and many UI/fly accesses.
Optimized performance of check-build-events collector.
Fixed a bug where invalidated worker resource caches are not GC-ed.

Web UI

Propagate groups between subpages of a pipeline
- If a user was initially viewing a group in the pipeline page, this will be persisted in the pipeline breadcrumb when navigating between pipeline subpages.
Optimize pipeline svg rendering
- The initial render of the pipeline page should be much faster, particularly on Chrome 92+.
Make Build page spacing consistent and color theme updated for accessibility
Indicate if a pipeline is archived in pipeline view
- When viewing an archived pipeline (or any of its sub-routes) in the UI, the pipeline name now shows "archived" and the breadcrumb background changes to grey to avoid confusion.
Add build event for volume streaming
- Build logs will now contain new events when a volume is being streamed to a worker
Fix default username prompt for local logins
- Ensure the default username prompt for local logins is properly set.
Add tooltip to username if overflow
- When a username overflows, the web UI shows a tooltip with the full name on hover, so that it does not block the buttons below it, e.g. the trigger build button on the build page.
Fix step header key value UI in build page
- Fix line height of step header in build page when there is sub header like instance vars or across.
Fix a bug where a sub-step of an across step showed an incorrect state.

Resolved Issues

Resolved Core Functionality Issues

Sanitize prometheus metric labels
- Ensure Prometheus metric labels are valid. This resolves an issue with bosh release, where web nodes would fail to start, due to a metric label that wasn't valid according to Prometheus.
Validate if a Pipeline contains a cycle
- The API will reject any pipeline that contains a cycle
Make build reaper more robust
- Make build log reaper more robust by not exiting early if it encounters an issue while iterating over pipelines/jobs. Before this change build logs for some pipelines could have accumulated endlessly even with a build retention policy.
on_error should not run the hook when err is retriable
- Fixed a bug when --enable-rerun-when-worker-disappears was enabled and a job/step had an on_error hook. If the step was retried the on_error hook would run when it should not.
GC builds based on chronological order
- Fix a bug where events of a rerun build were reaped immediately if its parent build had already been reaped. Candidate builds for GC are now ordered chronologically.
Run task caches collector when ATC starts
- Previously, when a pipeline was archived, the task caches used by its jobs were not garbage collected, which caused volume leaks on worker disks. Now the component that garbage collects task caches runs when the ATC starts.
Do not send check build events to syslog drainer
- Since v7.0, resource and resource type checks are run as builds. When the syslog drainer is enabled, those check build events are also sent to the external server, which requires storage space (depending on the number of resources and the check interval). These events are now ignored by the syslog drainer.
Fix acrossStep handling for more than 3 vars

Resolved Web UI Issues

Fix a rendering issue with nested across steps.
Render build page correctly for legacy aggregate step
- Show legacy builds with aggregate steps. Configuring pipelines with the aggregate step is still deprecated. This only fixes the UI rendering error.
Show var source error on resource and build page
- Errors caused by variable interpolation are now shown correctly on the resource and build pages.

Resolved Runtime Issues

Delete btrfs volume if it exists when using the overlay driver
- Made worker initialization more stable if you're switching from btrfs to overlay. The worker will remove the btrfs mount if it exists before creating overlay mounts
Ignore cached input from volume-locality's consideration
- When EnableCacheStreamedVolume is enabled and the container placement strategy is volume-locality, a get step may not fetch a resource if the resource is found in a cache. As a result, the containers of subsequent steps could all be placed on the worker where the cached resource was found, overloading that worker even when other workers were available. This issue is now fixed.
Only delete btrfs mounts if *.img exists
- Concourse worker would fail to start if it's on a btrfs filesystem and tries to use the overlay driver
Handling huge volumes transfer in P2P streaming
- Fix a bug where P2P streaming would fail if streaming a volume took longer than 3 minutes. This fix should be applied to both ATCs and workers.
Avoid duplicating parallel volume streams
- Steps that stream volumes will now use a global (per worker) lock to ensure identical volumes are not streamed more times than they need to be
- A new waiting-for-streamed-volume/waiting for volume <name> to be streamed by another step event is included in build step logs where this behavior occurs.
Fix a bug where volumes streamed from a worker were destroyed immediately when that worker was pruned
- If you opt in to EnableCacheStreamedVolumes, worker cache volumes are now kept around while they are still in use.
Inherit env proxy configuration when TLS is enabled
- Fix a bug where proxy settings from env vars were lost when TLS was enabled by --tls-bind-port

Bundled resource types and versions

bosh-io-release: v1.1.3
bosh-io-stemcell: v1.1.2
cf: v1.1.4
docker-image: v1.7.1
git: v1.14.7
github-release: v1.8.0
hg: v1.2.7
mock: v0.12.4
pool: v1.3.2
registry-image: v1.7.1
s3: v1.2.1
semver: v1.5.1
time: v1.6.3
tracker: v1.0.8

7.4.0

Release Date: July 29th, 2021

Known Issues

If you have a very large deployment you'll see an increase in DB and CPU usage with the 7.4.0 series. Also note that the experimental load_var step has a memory leak issue in 7.4.0.
If you are using the syslog drainer feature, expect an increased volume of logs from check builds. Because resource checks now run as builds, every one of them generates build events. Depending on the resource count and check interval, syslog draining may produce log files that take up a large amount of disk space on the external server.

Features

Fly commands

Fly clear-resource-cache command
- Added the fly command clear-resource-cache. Use it in the following format: fly -t ci clear-resource-cache -r pipeline/resource [--version some:version]

Core Functionality

Support soft policy enforcement
- This feature doesn't break the existing OPA policy check. If you have enabled OPA policy check and you don't need "soft" policy enforcement, no configuration change is needed.
- 3 new ATC cli options are added:
  - CONCOURSE_OPA_RESULT_ALLOWED_KEY: specifies a key of allow flag in OPA returned result
  - CONCOURSE_OPA_RESULT_SHOULD_BLOCK_KEY: specifies a key of should-block flag in OPA returned result
  - CONCOURSE_OPA_RESULT_MESSAGES_KEY: specifies a key of messages in OPA returned result
    
    For example, if OPA returns the following result:
```
{
    "result": {
        "allow": true,
        "block": true,
        "reasons": ["foo", "bar"]
    }
}
```
    then CONCOURSE_OPA_RESULT_ALLOWED_KEY should be set to result.allow; CONCOURSE_OPA_RESULT_SHOULD_BLOCK_KEY should be result.block, and CONCOURSE_OPA_RESULT_MESSAGES_KEY should be result.reasons.
    
    allow and block in OPA result should be boolean type, because it is easy to convert other types to boolean in an OPA policy.
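Put together, the mapping described above could be expressed as environment settings on the web node. This sketch assumes a docker-compose style deployment with a service named web; the three variable names and their values come from the example above.

```yaml
# Sketch of a docker-compose service for the web (ATC) node.
services:
  web:                               # assumed service name
    image: concourse/concourse
    command: web
    environment:
      # Each value is the key path of the matching field in the OPA result.
      CONCOURSE_OPA_RESULT_ALLOWED_KEY: result.allow
      CONCOURSE_OPA_RESULT_SHOULD_BLOCK_KEY: result.block
      CONCOURSE_OPA_RESULT_MESSAGES_KEY: result.reasons
```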
Add teamName to concourse_steps_wait_duration metrics
Allow interpolation in the across step values
- The across step now supports dynamic interpolation of values. For instance, this can be combined with the set_pipeline step and instanced pipelines to set a dynamic list of pipelines:
```
- load_var: branches
  file: branches/branches.json
- across:
  - var: branch
    values: ((.:branches))
  set_pipeline: my-app
  file: ci/pipelines/my-app.yml
  instance_vars: {branch: ((.:branch))}
```
Cache the list of workers in memory
- Scheduling containers should be more performant by reducing the number of required database calls
Optimize build log collector
- Optimized a SQL statement used to remove build logs. This optimization will especially benefit large deployments that have a lot of pipelines.

Web UI

Build page shows name of who triggered the build in header line of build page
- The build page now shows the username of who triggers the build if the build is triggered manually.
Add page to view all builds/resource versions downstream/upstream from a root resource version
- Disabled by default since computing causality for large datasets can be expensive, use --enable-resource-causality or $CONCOURSE_ENABLE_RESOURCE_CAUSALITY=true to enable the web UI and API endpoint.
  - Most datasets (like the merge commit for this PR) have < 100 builds and/or resource versions and take < 100ms, but it's possible for some "slow paced" resource versions (i.e. very infrequent new versions) to generate extremely large datasets
  - There is an automatic cutoff at 5000 builds or 25000 resource versions. On our deployment, the call for our slowest paced resource took about 7 seconds to process, most of which was spent in the DB query
- The causality page can be navigated to from the resource page
- The causality page displays all the builds and resource versions that were generated from (downstream) or resulted in (upstream) the creation of a particular resource version
- The downstream graph will put the root resource version on the left whereas the upstream graph will put it on the right
- It takes into account all the intermediate resource versions when computing the final graph. In the picture above, while the resource page only shows that git version: 123 is a direct input to integrate #4 & #5, there is also an indirect link from git version: 123 → test #19 → ... → intermediate-3 version:123 → integrate #6 & #6.1
Add ability to comment on a build
- You can now leave comments on builds. For instance, this can be used to give context to your coworkers about why a particular build failed.
- If a build has a comment, it is displayed with a small marker to help you quickly find builds of interest. Hovering over the build displays a portion of the comment.
Use browser cache API for dashboard caching
- The cached API responses on the dashboard no longer need to get truncated, which was previously introduced to work around localStorage limits
Enable emitting dogstatsd metrics over uds
- The Datadog emitter can now be configured to communicate with the Datadog agent over Unix Domain Sockets

Resolved Issues

Resolved Core Functionality Issues

Handle 403 for vault preflight check of V2
atc: across step logs errors
- Across step emits an error event when one of the sub-steps errors
atc(fix): fixed a bug in resource check rate limiter.
- Fixed a bug in check rate limiter that caused slow checks.
Fix memory leak in notification bus
Fix algorithm considering reruns as new builds
- Fixes pipelines getting stuck with the same inputs when a job upstream of a job with version: every succeeds and is rerun
Fixed build log reaper not respecting when both Days and Builds are set
- The build log reaper has two options for determining when to reap logs. Before, if both options were set, it would reap if either of the two was satisfied, rather than requiring both of them to be satisfied
Apply a minimum rate limit for resource checking
- If CONCOURSE_MAX_CHECKS_PER_SECOND is unset, Concourse will try to distribute checks evenly over the course of the check interval to reduce the concurrent load on external systems.
- If there are few resources in a Concourse deployment (~1-20), checks may have to wait a substantial amount of time to run to space the checks out evenly. However, there's no real benefit to doing this, since having just a few resources doesn't cause significant load in the first place.
- Now, Concourse ensures that at least one check is allowed to run per second
atc/db: prevent creation of duplicate check builds
- Prevent duplicate checks from being created for a single resource
set_pipeline unpauses previously archived pipelines
- When an archived pipeline is un-archived via the set_pipeline step, it will be unpaused
GC task caches belonging to archived pipelines
Fix prometheus emitter not setting default attributes
- Additional metrics attributes configured by --metrics-attribute now propagate to the prometheus emitter correctly.
run check builds GC in batch
Fix BaseResourceType for streamed volumes

Resolved Web UI Issues

Fix browser back button after selecting a group
- Previously, if a pipeline group was selected in the UI, the back button would not work (you'd have to press it twice to go back)

Resolved Runtime Issues

containerd: properly populate /etc/hosts and /etc/hostname
containerd: Mount /dev/fuse to privileged containers
Fix worker restart issue with containerd daemon and beacon
- Fix worker stall issue when restarting with containerd. Exit the worker's beacon process gracefully if any other top level process like the containerd daemon fails. Wait for containerd daemon to come up before starting the containerd Garden server.
containerd: default to root if /etc/passwd is missing
- Fixes a regression introduced in 7.3.0 that prevented containers that don't have an /etc/passwd file from running
containerd: keep tasks running after concourse worker restarts gracefully
- The containerd runtime is now more resilient to the concourse worker process gracefully restarting (e.g. via monit restart)
  - Tasks that were started prior to restart will continue to run when the worker process comes back up
  - This matches the behaviour of the Guardian runtime
containerd: Clean up networking files in /tmp
- Fixed a bug where the containerd runtime would create networking-related files under /tmp and never delete them. They are now created under the --work-dir set for the worker and are cleaned up when the container is deleted. You can delete any lingering network files under your workers' /tmp directory after upgrading.

Bundled resource types and versions

bosh-io-release: v1.1.1
bosh-io-stemcell: v1.1.1
cf: v1.1.3
docker-image: v1.6.1
git: v1.14.0
github-release: v1.6.4
hg: v1.2.4
mock: v0.12.2
pool: v1.2.1
registry-image: v1.4.0
s3: v1.1.2
semver: v1.3.2
time: v1.6.1
tracker: v1.0.6

7.3.2

Release Date: June 14th, 2021

Resolved Issues

Resolved Core Functionality Issues

Fix memory leak in notification bus

7.3.1

Release Date: May 27th, 2021

Resolved Issues

Resolved Core Functionality Issues

Bump guardian to 1.19.28
- Fixes a bug where guardian would fail to start up when the kernel version contained an unexpected suffix

7.3.0

Release Date: May 25th, 2021

Breaking Changes

Bump opentelemetry to 0.19.0
- The service name for the Honeycomb tracing exporter is now configured via the more general --tracing-service-name (CONCOURSE_TRACING_SERVICE_NAME) rather than --tracing-honeycomb-service-name (CONCOURSE_TRACING_HONEYCOMB_SERVICE_NAME)

Features

Core Functionality

Cache streamed volumes and use local cache when looking for volumes
- Optimize resource cache streaming and the get step.
- Mark streamed resource cache volumes as resource caches, to avoid duplicate streaming in subsequent runs.
- If a resource from a get can be found on some workers, then the get step will do nothing. This reduces the number of times Concourse connects to external systems such as git and Docker Hub.
- This feature is currently opt-in and can be enabled using CONCOURSE_ENABLE_CACHE_STREAMED_VOLUMES flag.
Enhance syslog-drainer to make it more useful
- Add event_id into syslog-drainer entries, to get the correct order of "drained" build logs.
- Add more supported event_type for syslog-drainer to include more info for "drained" build logs.
Enhance webhook triggered checks
- When multiple pipelines share a common resource and a webhook is called against that resource, checks are sent to all pipelines at the same time. Without this enhancement, each webhook call caused a check to run. With this enhancement, only a single check runs, which is the expected behavior for a global resource.
Allow override of container limits in task config
- Pipeline authors can now set container_limits for reusable tasks in pipelines. Any limits set in the pipeline will override the limits set within the reusable task file.
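As an illustration, a pipeline could override the limits of a reusable task as sketched below. The step name and task file path are hypothetical, and the limit values are arbitrary examples (CPU shares and memory in bytes).

```yaml
# Sketch of a job plan; names, path, and limit values are hypothetical.
plan:
  - task: unit-tests
    file: ci/tasks/unit-tests.yml    # reusable task file, which may set its own limits
    container_limits:                # these values take precedence over the task file
      cpu: 512                       # CPU shares
      memory: 1073741824             # bytes (1 GiB)
```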

Web UI

Re-ordering instanced pipelines
- Instanced pipelines can be re-ordered within their group through the UI (using drag and drop) or using the fly command: fly -t dev oip -g groupName -p key1:var1 -p key2:var2
Use cursor-based pagination for build events
- Optimizes fetching build logs from the DB for builds with massive logs
Set Content-Security-Policy and Cache-Control Headers
- A Content-Security-Policy header is now set with a default value that will block framing of the Concourse web UI. This was already possible with the default value of the X-Frame-Options header.
  - The CSP header value is configurable with CONCOURSE_CONTENT_SECURITY_POLICY
- A Cache-Control header is set on every page with a default value of no-store, private. The value of the header is overwritten for some paths (i.e. web assets)

Resolved Issues

Resolved Core Functionality Issues

Add trigger for deleting pipeline
- Fix a bug that might leave orphaned pipeline_build_events_* tables in the DB when deleting a team. Pipelines belonging to the deleted team were destroyed by DELETE CASCADE, but the associated events tables were not cleaned up properly.
Scan unchecked resource-types
- Fixed an edge case where a put-only resource's parent-type would not be checked
Fix Postgres deadlock when frequently setting pipelines

Resolved Web UI Issues

Use display_user_id field to render username in web interface
Set autocomplete to off for login form
- Add autocomplete="off" to the top-level form and username tags.

Resolved Runtime Issues

Ensure stdin never errors when using containerd with TTY enabled
- Fixed a bug with the containerd runtime where a build would error out if it ran for a long time without any output
Fix volume GC query to not include volumes with children
- Fix the query that caused the error volume cannot be destroyed as children are present on web nodes and the error update or delete on table "volumes" violates foreign key constraint "volumes_parent_id_fkey" in the DB.
Ignore "not found" error on process deletion for Containerd runtime
worker: Set PATH based on UID instead of container's privileged state
- Containerd: fixed a bug where PATH did not contain directories for system tools (e.g. /sbin) when a user/process was root. Only affects unprivileged containers.
containerd: allow use of non-existent uids
- containerd supports running images with non-existent UIDs such as distroless images.

7.2.0

Release Date: April 13th, 2021

Breaking Changes

Wait for worker matching strategy when scheduling build steps
- Previously, if no workers satisfied the container placement strategy for a step (with the exception of task steps when using the limit-active-tasks placement strategy), the step would simply error the build
- Now, all steps will wait for a worker to become available
- The metric concourse_tasks_waiting was removed and replaced with concourse_steps_waiting{type="task"}
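Dashboards and alerts that referenced the removed metric need to switch to the new name and label. The following is a minimal sketch of a Prometheus alerting rule using the replacement metric; the group name, alert name, threshold, and duration are illustrative assumptions.

```yaml
# Hypothetical Prometheus rule file; only the metric name and the
# type="task" label come from the release notes.
groups:
- name: concourse
  rules:
  - alert: ConcourseTaskStepsWaiting
    # Previously this would have queried concourse_tasks_waiting.
    expr: 'concourse_steps_waiting{type="task"} > 0'
    for: 10m
    annotations:
      summary: Task steps have been waiting for a worker for 10 minutes
```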

Features

Core Functionality

Allow using LDAP as a password connector
- By setting --password-connector ($CONCOURSE_PASSWORD_CONNECTOR) to ldap, you can authenticate to Concourse with fly login -u ... -p ... using your LDAP credentials
  - Enabling this feature prohibits the use of local users
- If you use an attribute other than username for authenticating with LDAP (e.g. email address), you can now configure --username-prompt ($CONCOURSE_USERNAME_PROMPT) to change the help text when logging in via the UI
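A minimal sketch of the two settings on a web node, written as a Docker Compose style fragment. The service name and the prompt text are assumptions, and the LDAP connector itself still has to be configured separately.

```yaml
# Hypothetical Compose-style fragment for a web node.
services:
  web:
    environment:
      # Authenticate fly login -u ... -p ... against LDAP.
      CONCOURSE_PASSWORD_CONNECTOR: ldap
      # Help text shown on the login form; the wording is an example.
      CONCOURSE_USERNAME_PROMPT: "Email address"
```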
Optimize check creation in DB
feat(atc): add check build metrics.
- Fixed metrics BuildsStarted, BuildsRunning, BuildStarted, BuildFinished to exclude check builds.
- Added check build metrics: CheckBuildsStarted, CheckBuildsRunning, CheckBuildStarted, CheckBuildFinished

Web UI

Add ability to navigate to resources page from build page UI: clicking on the version text for a get/put step in the Build page will now navigate directly to the Resource page with the corresponding version expanded
Add DB index to optimize paginating job builds
enhance put.inputs detect to ignore prefixed . and ..
- inputs: detect can now handle paths prefixed by . and ..

Resolved Issues

Resolved Core Functionality Issues

Treat empty value for worker tags the same as not having any tags at all
Move migration table updating SQL into a migration transaction
- Fixed a bug where a completed migration was not recorded in the migrations_history table
Build image resource caches foreign key constraint to job ids should be on delete cascade
- This change fixes a bug that was introduced in v7.1.0 where deleting a pipeline could possibly result in a 500 error. This was caused by a foreign key constraint within the build_image_resource_caches table referencing a job in the jobs table.
update check metrics comments.
- Just update code comments, no release impact.

Resolved Web UI Issues

Prevent UI from stalling when you keep the resource page open for a while

Resolved Runtime Issues

runtime: check if swap limits is enabled
- The containerd runtime will conditionally set memory swap limits if it detects that memory swap limits are enabled
runtime: timeout set to 0 means there is no timeout
- Setting CONCOURSE_CONTAINERD_REQUEST_TIMEOUT to 0 now means there is no timeout
better handling for containerd error message
- Fixed a bug with the containerd runtime where gracefully stopping a container might have failed with an unhandled error. Now it gracefully shuts down.
Fix race condition in containerd runtime resulting in lost output for quickly printing-then-exiting processes

7.1.0

Release Date: March 11th, 2021

Features

Fly commands

Show warning for pipelines configured with 'set_pipeline' step
- fly set-pipeline now prints a warning message when the pipeline has already been configured through a set_pipeline step.

Web UI

Allow favoriting instance groups
Change SideBar "menu" icon
- Updated the visuals for the button to open and close the sidebar
Adjust spacing and padding for elements in pipeline card view in Dashboard

Runtime

Start non-privileged containers in their own cgroup namespace
Bump baggageclaim to v1.11.0
- Privileged container initialization will be much faster for workers using OverlayFS as the baggageclaim driver and if their kernel supports OverlayFS's metacopy feature

Resolved Issues

Resolved Fly Issues

Only interpolate static vars when it does not contain a source (#6619) @chenbh
- Fixed bug where static vars from fly set-pipeline -v ... -y ... were interpolated into local vars ((.:var))

Resolved Core Functionality Issues

Skip build log reaping process for paused jobs
Check parent resource types of resources that have set check_every: never (#6603) @taylorsilva
- Resources that had check_every: never and whose type was defined in resource_types in their pipeline would fail to check, because the parent resource type would never be checked

Resolved Web UI Issues

Fix reaped link in UI
Bump elm-ansi to support 8-bit and 24-bit ANSI colors
- Fixes a bug where ANSI escape codes for 8-bit/24-bit colors were misinterpreted, resulting in build logs blinking and other peculiarities

Resolved Runtime Issues

containerd: fix mount issues with certain images
- Fix an issue on the containerd runtime where processes fail to run with certain container images
containerd: infer MTU from host's network interface
- In prior versions of Concourse, the Containerd runtime always set the MTU of the container bridge network to the system default
- Now, the containerd runtime matches Guardian's behavior by:
  - Detecting the external IP of the host (can be set explicitly using CONCOURSE_CONTAINERD_EXTERNAL_IP)
  - Extracting the MTU from the network interface corresponding with that IP (can be set explicitly using CONCOURSE_CONTAINERD_MTU)
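If the detected values are not what you want, both can be pinned explicitly. The fragment below is a sketch in Docker Compose style; the service name, IP address, and MTU value are placeholders, not recommendations.

```yaml
# Hypothetical Compose-style fragment for a worker using containerd.
services:
  worker:
    environment:
      CONCOURSE_RUNTIME: containerd
      # Skip external IP detection and use this address instead.
      CONCOURSE_CONTAINERD_EXTERNAL_IP: 10.0.0.5
      # Skip MTU inference and use this value instead.
      CONCOURSE_CONTAINERD_MTU: "1460"
```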

7.0.0

Release Date: February 10th, 2021

Breaking Changes

Run checks as builds
- Breaking change: unique_version_history can no longer be configured on resource types. No one seemed to be using it, and it made internal architecture unnecessarily complicated. The need for it should go away entirely as we make progress on the v10 roadmap.
- Resource check operations, which collect and save versions for pipeline resources, are now run as builds.
  - This is largely an internal architecture refactor, but it also improves UX - check output can now be viewed on the resource page!
- fly check-resource and fly check-resource-type now stream the checking output to the user, just like fly watch and fly trigger-job.
- This change includes a migration to convert the id column of the builds table and all tables referencing build_id to a bigint. This is unfortunately a slow migration, so please anticipate downtime proportional to the number of builds in your database.
  - If the migration fails with deadlock detected, shut down the other web nodes first.
  - Our large-ish scale test environment took about an hour.
Remove aggregate step
- Removing the aggregate step as planned. It is succeeded by the in_parallel step.
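Pipelines that still use aggregate need to be updated before upgrading. A minimal sketch of the change, with hypothetical job and resource names:

```yaml
jobs:
- name: unit
  plan:
  # Before 7.0.0 (no longer accepted):
  # - aggregate:
  #   - get: repo
  #   - get: ci
  # From 7.0.0 onwards:
  - in_parallel:
    - get: repo
    - get: ci
```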

Features

Fly commands

Fallback fly intercept to sh when bash is missing
- If no command is specified, fly intercept will first try to use bash for an interactive shell, but if the container returns an error indicating bash is not available, fly will fall back to the more common (but more limited) sh
- If this fallback logic is not desired, the user can explicitly specify bash as the path argument to the fly intercept command
Add --team flag to fly order-pipelines command
Add --team option to fly get-pipeline command
Add --team option to fly expose-pipeline command
fly set-pipeline prints pipeline name and instance vars
fly: add autocomplete for fish.

Core Functionality

Perform image fetching using check/get sub-steps (#6153) @vito
- Image fetching for resources and resource types is now handled explicitly in the build plan using check and get steps, and can be inspected in the UI
Speed up database queries by adding a job_id column to build image resource caches table and adding an index for ordering builds of a job
Allow globs in groups
- groups in a pipeline can now match jobs based on globs e.g.:
```
groups:
- name: deploy
  jobs:
  - deploy-*
```
Implement support for Vault KV v2 backends
add support for exporting traces via OTLP
- Added support for OTLP as a target for traces to be exported to
Add index to speed up build deletion, fix up a few issues with checks as builds
Ensure pipelines contain at least one job
- Pipelines are now validated to ensure that they contain at least one job - pipeline configs with no jobs will be rejected
Experimental support for P2P Volume Streaming
- Support P2P volume streaming directly between two workers instead of through the ATC.
  - This is an opt-in feature enabled with --enable-p2p-volume-streaming or env var $CONCOURSE_ENABLE_P2P_VOLUME_STREAMING on the web nodes. When this feature is enabled, --baggageclaim-bind-ip on workers should be set to 0.0.0.0 so that baggageclaim can be accessed from other workers.
  - This should only be used for clusters where all workers can reach each other on the same local network.
  - Adds --baggageclaim-p2p-interface-name-pattern and --baggageclaim-p2p-interface-family to the worker command.
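The opt-in described above touches both node types. A minimal sketch in Docker Compose style follows; the service names and the choice to pass the worker flag through command are assumptions about the deployment.

```yaml
# Hypothetical Compose-style fragment.
services:
  web:
    environment:
      # Opt in to P2P volume streaming on the web nodes.
      CONCOURSE_ENABLE_P2P_VOLUME_STREAMING: "true"
  worker:
    # Bind baggageclaim to all interfaces so other workers can reach it.
    command: ["worker", "--baggageclaim-bind-ip", "0.0.0.0"]
```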
Log the worker name when creating a container fails
Support chained container placement strategies.
- Enhanced container placement strategy to support chained strategies, for example CONCOURSE_CONTAINER_PLACEMENT_STRATEGY=volume-locality,fewest-build-containers
Add new container placement strategies: limit-max-containers and limit-max-volumes
- These strategies prevent scheduling on workers that already have too many containers or volumes on them, according to the limits set by --max-active-containers-per-worker and --max-active-volumes-per-worker respectively
- A possible placement strategy chain to better balance workloads across workers could be [limit-max-containers, limit-max-volumes, volume-locality, fewest-build-containers]
  - This strategy chain first filters out workers that already have too many containers/volumes, then chooses all the workers with the most inputs already present locally, breaking ties by preferring the worker with fewer containers
- Workers are not guaranteed to stay within the specified maximum limits.
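The suggested chain could be configured roughly as follows. This is a sketch in Docker Compose style: the service name and the limit values are placeholders, and passing the two limit flags to the web command alongside the strategy is an assumption about where they belong.

```yaml
# Hypothetical Compose-style fragment for a web node.
services:
  web:
    command:
    - web
    - --max-active-containers-per-worker=200   # example limit
    - --max-active-volumes-per-worker=400      # example limit
    environment:
      # Strategies are applied left to right.
      CONCOURSE_CONTAINER_PLACEMENT_STRATEGY: limit-max-containers,limit-max-volumes,volume-locality,fewest-build-containers
```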
go-concourse surfaces error messages on saving pipelines
- Previously, fly set-pipeline would simply print forbidden when the underlying API call returned a 403 status. Now the body of the response is printed. In particular, errors originating from OPA policy check rejections will be printed.
Don't enforce timeouts during image fetching
Skip checking put-only resources
- An optimization which should lower the resource checking load on some instances: instead of checking all resources, only resources which are actually used as inputs will be checked. This feature was released in 6.0.0 and reverted in 6.6.0 because of its side effects. Now after resolving those side effects, it's back.
- The --enable-skip-checking-not-in-use-resources flag has been removed as it is no longer needed.
Give worker registration its own database connection pool
- Give the worker registration endpoint its own database connection pool to avoid the situation where the API connection pool is maxed out and workers fail to register and stall
Allow underscore in identifiers
Support for mTLS
- Added support for mTLS between Concourse and a reverse proxy that may be in front of Concourse
Allow configuring login and query timeouts for Vault
- These timeouts can be configured using CONCOURSE_VAULT_LOGIN_TIMEOUT and CONCOURSE_VAULT_QUERY_TIMEOUT respectively
- The new default login timeout is 60s
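A minimal sketch of setting both timeouts on a web node, in Docker Compose style. The service name is an assumption, and the query timeout shown is an arbitrary example rather than a documented default.

```yaml
# Hypothetical Compose-style fragment for a web node.
services:
  web:
    environment:
      # Matches the new default, stated explicitly.
      CONCOURSE_VAULT_LOGIN_TIMEOUT: 60s
      # Example value only.
      CONCOURSE_VAULT_QUERY_TIMEOUT: 30s
```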
Expose username of who manually triggered build to build metadata.
- fly builds has a new column created by that shows a user ID if a build is triggered manually.
- A new build metadata variable, BUILD_CREATED_BY, can be exposed to resources. It is not exposed by default; to turn it on, add expose_build_created_by when defining a resource:
```
resources:
- name: some-resource
  type: some-type
  expose_build_created_by: true
  source:
    ...
```
- As different authentication connectors populate different claims, a new concourse web CLI option --concourse-display-user-id-per-connector has been added that allows cluster administrators to configure which claims field should be considered the unique user id
  - Values of this option should be in format <connector>:<fieldname>
    - connector is one of: ldap, github, cf, bitbucket-cloud, gitlab, microsoft, oauth, oidc or saml
    - fieldname is one of:
      - user_id mapping to claims' user id field
      - name mapping to claims' username field
      - username mapping to claims' preferred username field
      - email mapping to claims' email field
Allow disabling resource checking for individual resources
- Automatic resource checking for individual resources can be disabled by setting check_every: never in a resource's definition
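For example, a resource definition with checking disabled might look like the sketch below. The resource name and type are placeholders.

```yaml
resources:
- name: some-resource
  type: some-type
  # Never check automatically; checks can still be run on demand
  # with fly check-resource.
  check_every: never
  source: {}   # placeholder; fill in the source for your resource type
```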
db: lidar checks put-only resources with failed checks
- Lidar now checks any put-only resources that ran a check which failed.
Add a flag to migrate to the latest db version
- Added a --migrate-to-latest-version flag to the migrate command. This flag makes concourse migrate the database to the latest version.
- The concourse web command will still automatically migrate the database
metrics: make tasks_wait_duration histogram record up to 1h
Removes unnecessary indexes from build events tables
Allow @ in vars path
- Allow "@" sign in a var's path e.g. ((var:"[email protected]".field))

Web UI

Show resource check build output in web UI
Enforce SetPipeline policy check in set_pipeline step
- When OPA integration is enabled, the set_pipeline step now respects the same policy check as fly set-pipeline
[experimental] Group instanced pipelines on UI
- Instanced pipelines (RFC) provide a mechanism for constructing multiple instances of a pipeline template that differ by some parameters
  - e.g. to support multiple release lines, you may have a collection of instanced pipelines called release that differ by the version line (1.0.x, 1.1.x, 2.0.x, etc.)
- All instanced pipelines with the same name (but different parameters) will be collected in the UI into a grouping of related pipelines, removing clutter from the dashboard when there are many instances of a pipeline template
- Instanced pipelines are currently experimental until we work out the UX, but if you'd like to play around with them, you can set the flag --enable-pipeline-instances ($CONCOURSE_ENABLE_PIPELINE_INSTANCES)
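As a sketch of the release-line example above, instances could be configured from a set_pipeline step using its instance_vars field. That field is taken from the Concourse pipeline documentation rather than from these notes, and the job name, resource name, and file path are hypothetical.

```yaml
jobs:
- name: set-release-pipelines
  plan:
  - get: ci
  - in_parallel:
    # Both steps set a pipeline named release; the instance vars
    # distinguish the instances, which the UI groups together.
    - set_pipeline: release
      file: ci/pipelines/release.yml
      instance_vars: {version: 1.0.x}
    - set_pipeline: release
      file: ci/pipelines/release.yml
      instance_vars: {version: 1.1.x}
```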
set_pipeline step prints 'no changes to apply'
- set_pipeline now prints "no changes to apply" and thereby behaves similarly to fly set-pipeline when a pipeline config contains no changes.
Update colours and contrast
Ignore paused jobs when displaying pipeline status in the UI
- The UI will no longer consider paused jobs when figuring out the overall status of a pipeline
Enhance search bar filtering and allow filtering by instance group
- Allow filtering by exact match on the dashboard by quoting search terms
- Allow applying multiple search filters simultaneously (e.g. team:"main" status:paused)
- Make search suggestions more intelligent
Add more tooltips for action buttons
- Many buttons in the UI now have a tooltip on hover to indicate what they do

Runtime

The formerly-experimental containerd runtime is now GA and is considered ready for production use
- We will be changing the default container runtime from Guardian to containerd in coming releases, but we encourage using the containerd runtime ASAP
- To enable the containerd runtime, set --runtime ($CONCOURSE_RUNTIME) to containerd on the concourse worker command
- You will also need to convert any --garden-* ($CONCOURSE_GARDEN_*) flags to their containerd counterparts:
  - --garden-request-timeout ($CONCOURSE_GARDEN_REQUEST_TIMEOUT) → --containerd-request-timeout ($CONCOURSE_CONTAINERD_REQUEST_TIMEOUT)
  - --garden-dns-proxy-enable ($CONCOURSE_GARDEN_DNS_PROXY_ENABLE) → --containerd-dns-proxy-enable ($CONCOURSE_CONTAINERD_DNS_PROXY_ENABLE)
  - --garden-network-pool ($CONCOURSE_GARDEN_NETWORK_POOL) → --containerd-network-pool ($CONCOURSE_CONTAINERD_NETWORK_POOL)
  - --garden-max-containers ($CONCOURSE_GARDEN_MAX_CONTAINERS) → --containerd-max-containers ($CONCOURSE_CONTAINERD_MAX_CONTAINERS)
  - $CONCOURSE_GARDEN_DENY_NETWORKS → --containerd-restricted-network ($CONCOURSE_CONTAINERD_RESTRICTED_NETWORK)
  - $CONCOURSE_GARDEN_DNS_SERVER → --containerd-dns-server ($CONCOURSE_CONTAINERD_DNS_SERVER)
- If you rely on any Garden config that is not yet supported on our containerd runtime, please open an issue
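The flag conversion above could look like this for a worker configured through environment variables. It is a sketch in Docker Compose style; the service name and all values are placeholders carried over from a hypothetical Guardian configuration.

```yaml
# Hypothetical Compose-style fragment for a worker.
services:
  worker:
    environment:
      CONCOURSE_RUNTIME: containerd
      # was CONCOURSE_GARDEN_REQUEST_TIMEOUT
      CONCOURSE_CONTAINERD_REQUEST_TIMEOUT: 5m
      # was CONCOURSE_GARDEN_DNS_PROXY_ENABLE
      CONCOURSE_CONTAINERD_DNS_PROXY_ENABLE: "true"
      # was CONCOURSE_GARDEN_DNS_SERVER
      CONCOURSE_CONTAINERD_DNS_SERVER: 8.8.8.8
      # was CONCOURSE_GARDEN_MAX_CONTAINERS
      CONCOURSE_CONTAINERD_MAX_CONTAINERS: "250"
```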
Add flag to concourse worker to overwrite init binary path for the containerd runtime
- The init binary can be configured using the --containerd-init-bin flag ($CONCOURSE_CONTAINERD_INIT_BIN)
Make CNI plugins directory configurable for the containerd runtime
- CNI plugins directory can be configured using the --containerd-cni-plugins-dir flag ($CONCOURSE_CONTAINERD_CNI_PLUGINS_DIR)
Update OTel module (go.opentelemetry.io/otel)
Remove legacy logic for dealing with resource versions that have a check order of zero
- Includes a migration that will delete any versions with a check order of 0. This should not affect anything because versions with a check order of 0 are invalid versions.
- Should speed up some queries that had legacy logic with filtering on versions with a check order of 0.
start containerd with low oom_score
- It is recommended that containerd be started with an oom_score of -999. We want it to be at the level of other system daemons. This is so that containerd never runs into an out of memory state before the containers it's managing are cleaned up. At the same time it should not be unkillable.
Bump BaggageClaim to v1.10.0
- Windows workers will now shell out to the much faster robocopy executable for copying local files. This should dramatically improve performance for Windows tasks which utilize caches: for caching a bunch of tiny files.

Resolved Issues

Resolved Fly Issues

fly pin-resource requires a version if the resource is unpinned
- Previously, you could run the command on an unpinned resource without passing a version -- it would run and succeed, but do nothing. Now the command will fail and print an error message.
set-pipeline prompted unpause-pipeline command should have --team option. (#6336) @evanchaoli
- Fixed a bug in fly set-pipeline where the --team option was missing from the prompted unpause-pipeline command.
Ensure task, set_pipeline, load_var steps have names
- Return an error when no identifier is provided for task, set_pipeline, and load_var steps

Resolved Core Functionality Issues

Fix quoting for var subkeys
- Fix interpolation of quoted variable fields containing special characters.
Prevent set_pipeline runtime error
- set_pipeline of a YML pipeline configuration file with no jobs: or resources: no longer causes a runtime error: invalid memory address or nil pointer dereference.
Respect limit configured by limit-active-tasks
Remove any existing guardian assets
- The worker will now clear out any existing Guardian assets on start-up (/var/gdn/assets)
- This fixes in-place upgrade scenarios where guardian was using old versions of runc
atc: abort a rerun build if input version gone
- A rerun build will be aborted automatically if the required version of any input is not available.
add lock for concourse migrate to latest version cmd

Resolved Web UI Issues

Fix pipeline cards being rendered off-screen when sidebar was open
- Fixes occasional bug where pipelines would be rendered off-screen after a refresh on the dashboard
Preserve whitespace within build output
- In v6.6.0, whitespace was collapsed to fix a bug with horizontal scrolling in the build output. This change will preserve all whitespace while also keeping the horizontal scrolling fix.

Resolved Runtime Issues

Use default uid:gid if passwd file does not exist and username is "root"
- The containerd runtime will now default to uid:gid 0:0 if username is "root" but /etc/passwd file does not exist
- This matches the behaviour of the default guardian backend
Prevent retrying on worker error when build is aborted
- Fixed an endless build retry bug
Fix mount issues on containerd
- Set the appropriate permissions for mounts in privileged containers.
- Use the Linux default size for /dev/shm (shared memory) mount.
Bump baggageclaim to 1.9.1 to fix deeply-nested volumes with overlay driver
- This was partially fixed by #5961, but that original fix did not solve the problem in all cases