In a mixed-version cluster (e.g. some nodes run 3.7.x and others run 3.8.x), nodes may support different sets of features, behave differently in certain scenarios, and otherwise not act exactly the same: they are different versions after all.
Feature flags are a mechanism that controls which features are considered to be enabled or available on all cluster nodes. If a feature flag is enabled, so is its associated feature (or behavior). If it is not, all nodes in the cluster keep the feature (or behavior) disabled.
The feature flag subsystem allows RabbitMQ nodes with different versions to determine if they are compatible and then communicate together, regardless of their version.
This subsystem was introduced in RabbitMQ 3.8.0 to allow rolling upgrades of cluster members without shutting down the entire cluster.
Feature flags are not meant to be used as a form of cluster configuration. After a successful rolling upgrade, users should enable all feature flags.
Each feature flag will become mandatory at some point. For example, RabbitMQ 3.11 requires feature flags introduced in 3.8 to be enabled prior to the upgrade.
RabbitMQ 3.7.x and 3.8.x nodes are compatible as long as no 3.8.x feature flags are enabled.
This subsystem does not guarantee that all future changes in RabbitMQ can be implemented as feature flags and entirely backwards compatible with older release series. Therefore, a future version of RabbitMQ might still require a cluster-wide shutdown for upgrading.
Please always read release notes to see if a rolling upgrade to the next minor or major RabbitMQ version is possible.
rabbitmqctl list_feature_flags
rabbitmqctl enable_feature_flag <all | name>
It is also possible to list and enable feature flags from the Management plugin UI, in "Admin > Feature flags".
As covered earlier, the feature flags subsystem's primary goal is to allow upgrades regardless of the version of RabbitMQ, if possible.
Therefore, as of RabbitMQ 3.8.0, it will be possible to upgrade to the next patch, minor or major release, except if it is stated otherwise in the release notes. Indeed, there are some changes which cannot be implemented as feature flags.
It is also possible to upgrade from RabbitMQ 3.7.x to 3.8.x. Indeed, RabbitMQ 3.7.x does not have the feature flags subsystem and RabbitMQ 3.8.x considers that a 3.7.x node has an empty list of feature flags. Therefore, as long as the 3.8.x node has all its feature flags disabled, it is compatible with a 3.7.x node.
However, note that only upgrading from one minor to the next minor or major is supported. To upgrade from e.g. 3.6.16 to 3.8.7, it is necessary to upgrade to 3.7.28 first. Likewise if there is one or more minor release branches between the minor version used and the next major release. That might work (i.e. there could be no incompatible changes between major releases), but this scenario is unsupported by design for the following reasons:
The deprecation/removal policy of feature flags is yet to be defined.
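The "one minor series at a time" rule can be checked mechanically before planning an upgrade. The following is a minimal sketch, not a RabbitMQ tool: `is_single_minor_step` is a hypothetical helper name, and it only covers upgrades within one major series. An upgrade that crosses a major version is reported as unsupported here, because whether it is allowed depends on the release notes.

```shell
# Succeeds (exit status 0) when FROM -> TO stays within one major series
# and moves forward by at most one minor series, e.g. 3.7.28 -> 3.8.7.
# Patch numbers are ignored. Cross-major upgrades always fail this check:
# consult the release notes for those.
is_single_minor_step() {
    from_major=${1%%.*}; from_rest=${1#*.}; from_minor=${from_rest%%.*}
    to_major=${2%%.*};   to_rest=${2#*.};   to_minor=${to_rest%%.*}
    [ "$from_major" -eq "$to_major" ] || return 1
    step=$((to_minor - from_minor))
    [ "$step" -ge 0 ] && [ "$step" -le 1 ]
}

# Example from the text: 3.6.16 -> 3.8.7 must go through 3.7.x first.
if is_single_minor_step 3.6.16 3.8.7; then
    echo "direct upgrade looks fine"
else
    echo "upgrade through the intermediate minor series first"
fi
```

With the versions from the text, the check rejects 3.6.16 to 3.8.7 and accepts both 3.6.16 to 3.7.28 and 3.7.28 to 3.8.7.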
When a node starts for the first time, all supported feature flags are enabled by default. When a node is upgraded to a newer version of RabbitMQ, new feature flags are enabled by default if it is a single isolated node, or remain disabled by default if it belongs to a cluster.
To list the feature flags, use rabbitmqctl list_feature_flags:

    rabbitmqctl list_feature_flags
    # => Listing feature flags ...
    # => name                         state
    # => empty_basic_get_metric       enabled
    # => implicit_default_bindings    enabled
    # => quorum_queue                 enabled

For improved table readability, switch to the pretty_table formatter:

    rabbitmqctl -q --formatter pretty_table list_feature_flags \
      name state provided_by desc doc_url

which would produce a table that looks like this:

    ┌───────────────────────────┬─────────┬───────────────────────────┬───────┬────────────┐
    │ name                      │ state   │ provided_by               │ desc  │ doc_url    │
    ├───────────────────────────┼─────────┼───────────────────────────┼───────┼────────────┤
    │ empty_basic_get_metric    │ enabled │ rabbitmq_management_agent │ (...) │            │
    ├───────────────────────────┼─────────┼───────────────────────────┼───────┼────────────┤
    │ implicit_default_bindings │ enabled │ rabbit                    │ (...) │            │
    ├───────────────────────────┼─────────┼───────────────────────────┼───────┼────────────┤
    │ quorum_queue              │ enabled │ rabbit                    │ (...) │ http://... │
    └───────────────────────────┴─────────┴───────────────────────────┴───────┴────────────┘

As shown in the example above, the list_feature_flags command accepts a list of columns to display. The available columns are:
- name: the name of the feature flag.
- state: enabled or disabled if the feature flag is enabled or disabled, unsupported if one or more nodes in the cluster do not know this feature flag (and therefore it cannot be enabled).
- provided_by: the RabbitMQ component or plugin which provides the feature flag.
- desc: the description of the feature flag.
- doc_url: the URL to a webpage to learn more about the feature flag.
- stability: indicates if the feature flag is stable or experimental.

After upgrading one node or the entire cluster, it will be possible to enable new feature flags. Note that it will be impossible to roll back the version or add a cluster member using the old version once new feature flags are enabled.
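After an upgrade it can be useful to see which flags are still disabled before deciding to enable them. The sketch below is a hypothetical helper (`disabled_flags` is not part of RabbitMQ) that filters the name and state columns of list_feature_flags output. It assumes whitespace-separated columns, as in the plain example output shown earlier, and it simply skips any banner or header line.

```shell
# Print the names of feature flags whose state column is "disabled".
# Reads the output of `rabbitmqctl list_feature_flags name state` on stdin.
# Banner and header lines are ignored because their second field is not
# the literal word "disabled".
disabled_flags() {
    awk '$2 == "disabled" { print $1 }'
}

# Intended usage against a running node (not executed here):
#   rabbitmqctl -q list_feature_flags name state | disabled_flags
```

Because enabling a flag cannot be undone, printing the list first and enabling flags in a separate, deliberate step is the safer design.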
To enable a feature flag, use rabbitmqctl enable_feature_flag:

    rabbitmqctl enable_feature_flag <name>

To enable all feature flags, use rabbitmqctl enable_feature_flag all:

    rabbitmqctl enable_feature_flag all

The list_feature_flags command can be used again to verify the feature flags' states. Assuming all feature flags were disabled initially, here is the state after enabling the quorum_queue feature flag:

    rabbitmqctl -q --formatter pretty_table list_feature_flags
    ┌───────────────────────────┬──────────┐
    │ name                      │ state    │
    ├───────────────────────────┼──────────┤
    │ empty_basic_get_metric    │ disabled │
    ├───────────────────────────┼──────────┤
    │ implicit_default_bindings │ disabled │
    ├───────────────────────────┼──────────┤
    │ quorum_queue              │ enabled  │
    └───────────────────────────┴──────────┘

It is also possible to list and enable feature flags from the Management plugin UI, in "Admin > Feature flags".
It is impossible to disable a feature flag once it is enabled.
By default a new and unclustered node will start with all supported feature flags enabled, but this setting can be overridden. There are two ways to configure the list of feature flags to enable out-of-the-box when starting a node for the first time:
- The RABBITMQ_FEATURE_FLAGS environment variable:

      RABBITMQ_FEATURE_FLAGS=quorum_queue,implicit_default_bindings

- The forced_feature_flags_on_init configuration parameter:

      {rabbit, [{forced_feature_flags_on_init, [quorum_queue, implicit_default_bindings]}]}

The environment variable has precedence over the configuration parameter.
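The precedence rule has one subtlety: an environment variable that is set to the empty string still counts as set, and means "start with no feature flags enabled". The sketch below models that resolution order. It illustrates the rule as described here and is not RabbitMQ's own code: `initial_feature_flags` is a hypothetical helper, and passing the configuration parameter as a comma-separated string argument is an assumption made for the example.

```shell
# Print which initial feature flag list would apply on first boot.
# $1: value of forced_feature_flags_on_init as a comma-separated string,
#     or empty if the parameter is absent (hypothetical representation).
initial_feature_flags() {
    if [ "${RABBITMQ_FEATURE_FLAGS+set}" = "set" ]; then
        # Set, possibly to the empty string (meaning: no flags enabled).
        printf 'env:%s\n' "$RABBITMQ_FEATURE_FLAGS"
    elif [ -n "$1" ]; then
        printf 'config:%s\n' "$1"
    else
        # Neither source is present: all supported flags are enabled.
        printf 'default:all supported\n'
    fi
}
```

The `${VAR+set}` expansion is what distinguishes "unset" from "set but empty"; a plain `-n` test would wrongly fall through to the configuration file when the variable is deliberately empty.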
The feature flags listed below are those provided by RabbitMQ core or one of the tier-1 plugins bundled with RabbitMQ.
Feature flag name | Description
---|---
empty_basic_get_metric | Count AMQP 0-9-1 basic.get issued on empty queues in statistics.
implicit_default_bindings | Cleans up explicit default exchange bindings now that they are managed implicitly.
quorum_queue | Enables quorum queue type.
stream_queue | Enables streams.
drop_unroutable_metric | Dropped unroutable message metrics.
maintenance_mode_status | Enable maintenance mode, so nodes can be drained for maintenance or upgrade operations.
user_limits | Enable connection and queue limits associated with a given user.
virtual_host_metadata | Enable virtual host metadata.

There are two times when an operator has to consider feature flags:
A node compares its own list of feature flags with remote nodes' list of feature flags to determine if it can join a cluster. The rules are defined as:
It is important to understand the difference between enabled and supported:
If one of those two conditions is not verified, the node cannot join or re-join the cluster.
However, if it can join the cluster, the state of enabled feature flags is synchronized between nodes: if a feature flag is enabled on one node, it is enabled on all other nodes.
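The join check can be pictured as two subset tests. The following is a minimal sketch, assuming the two conditions are that every feature flag enabled on the local node is supported by the remote node, and every feature flag enabled on the remote node is supported by the local node. `is_subset` and `can_join` are hypothetical helper names; the real check happens inside RabbitMQ. A 3.7.x node is modeled with empty lists, since it has no feature flags at all.

```shell
# Succeeds when every word of list $1 also appears in list $2.
# Both lists are space-separated feature flag names.
is_subset() {
    for flag in $1; do
        case " $2 " in
            *" $flag "*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# can_join LOCAL_ENABLED LOCAL_SUPPORTED REMOTE_ENABLED REMOTE_SUPPORTED
# Succeeds when both assumed conditions hold.
can_join() {
    is_subset "$1" "$4" && is_subset "$3" "$2"
}

# A 3.8.x node with all flags disabled can join a 3.7.x node ...
can_join "" "quorum_queue implicit_default_bindings" "" "" && echo "compatible"
# ... but not once quorum_queue is enabled locally.
can_join "quorum_queue" "quorum_queue implicit_default_bindings" "" "" || echo "incompatible"
```

Note that the supported lists do not have to match: a node may support flags the other side has never heard of, as long as they stay disabled.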
The feature flags subsystem covers inter-node communication only. This means the following scenarios are not covered and may not work as initially expected.
rabbitmqctl on a remote node
Controlling a remote node with rabbitmqctl is only supported if the remote node is running the same version of RabbitMQ as the one rabbitmqctl comes from.
If CLI tools from a different minor or major version of RabbitMQ are used on a remote node, they may fail to work as expected or even have unexpected side effects on the node.
If a request sent to the HTTP API exposed by the Management plugin goes through a load balancer, including one from the management plugin UI, the API's behavior and its response may be different, depending on the version of the node which handled the request. This is exactly the same if the domain name of the HTTP API resolves to multiple IP addresses.
This situation may happen during a rolling upgrade if the management UI is open in a browser with periodic automatic refresh.
For example, if the management UI was loaded from a RabbitMQ 3.7.x node but it then queries a RabbitMQ 3.8.x node, the JavaScript code running in the browser may fail with exceptions due to HTTP API changes.
When a feature flag is enabled with rabbitmqctl, here is what happens internally:
As an operator, the most important part of this procedure to remember is that if the migration takes time, some components and thus some operations in RabbitMQ might be blocked during the migration.
When working on a plugin or a RabbitMQ core contribution, feature flags should be used to make the new version of the code compatible with older versions of RabbitMQ.
It is the developer's responsibility to look at the list of existing and future (i.e. those added to the main branch) feature flags and see if the new code can be adapted to take advantage of them.
Here is an example. When developing a plugin which used to use the #amqqueue{} record defined in rabbit_common/include/rabbit.hrl, the plugin has to be adapted to use the new amqqueue API which hides the previous record (which is private now). However, there is no need to query feature flags for that: the plugin will be ABI-compatible (i.e. no need to recompile it) with RabbitMQ 3.8.0 and later. It should also be ABI-compatible with RabbitMQ 3.7.x once the amqqueue API appears in that branch.
However, if the plugin targets quorum queues introduced in RabbitMQ 3.8.0, it may have to query feature flags to determine what it can do. For instance, can it declare a quorum queue? Can it even expect the new fields added to amqqueue as part of the quorum queues implementation?
If the plugin carefully checks feature flags to avoid any incorrect expectations, it will be compatible with many versions of RabbitMQ: the user will not have to recompile anything or download another version-specific copy of the plugin.
If a plugin or core broker change modifies one of the following aspects:
Then compatibility with older versions of RabbitMQ becomes a concern. This is where a new feature flag can help ensure a smoother upgrade experience.
The two most important parts of a feature flag are:
The declaration is a module attribute which looks like this:

    -rabbit_feature_flag(
       {quorum_queue,
        #{desc          => "Support queues of type quorum",
          doc_url       => "https://www.rabbitmq.com/quorum-queues.html",
          stability     => stable,
          migration_fun => {?MODULE, quorum_queue_migration}
         }}).

The migration function is a stateless function which looks like this:

    quorum_queue_migration(FeatureName, _FeatureProps, enable) ->
        Tables = ?quorum_queue_tables,
        rabbit_table:wait(Tables),
        Fields = amqqueue:fields(amqqueue_v2),
        migrate_to_amqqueue_with_type(FeatureName, Tables, Fields);
    quorum_queue_migration(_FeatureName, _FeatureProps, is_enabled) ->
        Tables = ?quorum_queue_tables,
        rabbit_table:wait(Tables),
        Fields = amqqueue:fields(amqqueue_v2),
        mnesia:table_info(rabbit_queue, attributes) =:= Fields andalso
        mnesia:table_info(rabbit_durable_queue, attributes) =:= Fields.

More implementation docs can be found in the rabbit_feature_flags module source code.
Erlang's edoc reference can be generated locally from a RabbitMQ repository clone or source archive:

    gmake edoc
    # => ... Ignore warnings and errors...
    # Now open `doc/rabbit_feature_flags.html` in the browser.

When a feature or behavior depends on a feature flag (either in the core broker or in a plugin), the associated testsuites must be adapted to take this feature flag into account. It means that before running the actual testcase, the setup code must verify if the feature flag is supported and either enable it if it is, or skip the testcase. This is the same for setup code running at the group or suite level.
There are helper functions in rabbitmq-ct-helpers to ease that check. Here is an example, taken from the dynamic_qq_SUITE.erl testsuite in rabbitmq-server:

    init_per_testcase(Testcase, Config) ->
        % (...)

        % 1.
        % The broker or cluster is started: we rely on this to query feature
        % flags.
        Config1 = rabbit_ct_helpers:run_steps(
                    Config,
                    rabbit_ct_broker_helpers:setup_steps() ++
                    rabbit_ct_client_helpers:setup_steps()),

        % 2.
        % We try to enable the `quorum_queue` feature flag. The helper is
        % responsible for checking if the feature flag is supported and
        % enabling it.
        case rabbit_ct_broker_helpers:enable_feature_flag(Config1, quorum_queue) of
            ok ->
                % The feature flag is enabled at this point. The setup can
                % continue to play with `Config1` and the cluster.
                Config1;
            Skip ->
                % The feature flag is unavailable/unsupported. The setup
                % calls `end_per_testcase()` to stop the node/cluster and
                % skips the testcase.
                end_per_testcase(Testcase, Config1),
                Skip
        end.

It is possible to run testsuites locally in the context of a mixed-version cluster. If configured to do so, rabbitmq-ct-helpers will use a second version of RabbitMQ to start half of the nodes when starting a cluster.
To run a testsuite in the context of a mixed-version cluster:
1. Clone the rabbitmq-public-umbrella repository and check out the appropriate branch or tag. This will be the secondary Umbrella. In this example, the v3.7.x branch is used:

       git clone https://github.com/rabbitmq/rabbitmq-public-umbrella.git secondary-umbrella
       cd secondary-umbrella
       git checkout v3.7.x
       make co

   Currently, when using the `v3.7.x` branch, `deps/rabbit_common` and `deps/rabbit` must use the `v3.7.x-versions-compatibility` branch.

2. Compile RabbitMQ or the plugin being tested in the secondary Umbrella. The rabbitmq-federation plugin is used as an example:

       cd secondary-umbrella/deps/rabbitmq_federation
       make dist

3. Go to RabbitMQ or the same plugin in the primary copy:

       cd /path/to/primary/rabbitmq_federation

4. Run the testsuite. Here, two environment variables are specified to configure the "mixed-version cluster" mode:

       SECONDARY_UMBRELLA=/path/to/secondary-umbrella \
       RABBITMQ_FEATURE_FLAGS= \
       make tests

The first environment variable, SECONDARY_UMBRELLA, tells rabbitmq-ct-helpers where to find the secondary Umbrella, as the name suggests. This is how the mixed-version cluster mode is enabled.
The second environment variable, RABBITMQ_FEATURE_FLAGS, is set to the empty string and tells RabbitMQ to start with all feature flags disabled: this is mandatory for a newer node to be compatible with an older one.
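Since both variables must be set together, a small wrapper can make the mixed-version run harder to get wrong. This is a sketch only: `run_mixed_version_tests` is a hypothetical helper, the `deps` directory check is an assumption about what a checked-out secondary Umbrella looks like (based on the paths used in the steps above), and the `MAKE` override exists purely so the wrapper can be exercised without building anything.

```shell
# run_mixed_version_tests SECONDARY_UMBRELLA_PATH [extra make arguments...]
# Runs `make tests` from the current directory with the two variables
# required for the mixed-version cluster mode.
run_mixed_version_tests() {
    umbrella=$1
    if [ ! -d "$umbrella/deps" ]; then
        echo "secondary Umbrella not found or not checked out: $umbrella" >&2
        return 1
    fi
    shift
    # RABBITMQ_FEATURE_FLAGS is deliberately set to the empty string so
    # that the newer nodes start with all feature flags disabled.
    SECONDARY_UMBRELLA="$umbrella" RABBITMQ_FEATURE_FLAGS= ${MAKE:-make} tests "$@"
}

# Intended usage from the primary copy of the plugin (not executed here):
#   run_mixed_version_tests /path/to/secondary-umbrella
```

Failing early when the secondary Umbrella path is wrong avoids a long test run that would otherwise fail for reasons unrelated to the code under test.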