In a mixed-version cluster (e.g. some nodes run 3.11.x and some run 3.12.x) during an upgrade, some nodes will support a different set of features, behave differently in certain scenarios, and otherwise not act exactly the same: they are different versions after all.
Feature flags are a mechanism that controls which features are considered to be enabled or available on all cluster nodes. If a feature flag is enabled, so is its associated feature (or behavior). If it is not, all nodes in the cluster keep that feature (or behavior) disabled.
The feature flag subsystem allows RabbitMQ nodes with different versions to determine if they are compatible and then communicate together, despite having different versions and thus potentially having different feature sets or implementation details.
This subsystem was introduced in RabbitMQ 3.8.0 to allow rolling upgrades of cluster members without shutting down the entire cluster.
Feature flags are not meant to be used as a form of cluster configuration. After a successful rolling upgrade, users should enable all feature flags. Each feature flag will become mandatory at some point. For example, RabbitMQ 3.11 requires feature flags introduced in 3.8 to be enabled prior to the upgrade.
For example, RabbitMQ 3.12.x and 3.11.x nodes are compatible as long as no 3.12.x feature flags are enabled.
This subsystem does not guarantee that all future changes in RabbitMQ can be implemented as feature flags and entirely backwards compatible with older release series. Therefore, a future version of RabbitMQ might still require a cluster-wide shutdown for upgrading. Please always read release notes to see if a rolling upgrade to the next minor or major RabbitMQ version is possible.
rabbitmqctl list_feature_flags
rabbitmqctl enable_feature_flag <all | name>
It is also possible to list and enable feature flags from the Management plugin UI, in "Admin > Feature flags".
As covered earlier, the feature flags subsystem's primary goal is to allow upgrades regardless of the version of cluster members, to the extent possible.
Feature flags make it possible to safely perform a rolling upgrade to the next patch or minor release, except if it is stated otherwise in the release notes. Indeed, there are some changes which cannot be implemented as feature flags.
However, note that only upgrading from one minor to the next minor or major is supported. To upgrade from e.g. 3.9.16 to 3.12.3, it is necessary to upgrade to 3.9.29 first, then to the latest 3.10 patch release, then the latest 3.11 release, then 3.12.3. After certain steps in the upgrade process it will also be necessary to enable all stable feature flags available in that version. For example, 3.12.0 is a release that requires all feature flags to be enabled before a node can be upgraded to it.
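The stepwise path described above can be sketched in a few lines of Python. This is an illustration only: `minor_upgrade_path` is a hypothetical helper, not part of RabbitMQ or its CLI tools, and it assumes both versions belong to the same major series. It does not pick patch releases or tell you when feature flags must be enabled; the release notes remain the authority for that.

```python
def minor_upgrade_path(current, target):
    """Return the release series to pass through, one minor at a time.

    Hypothetical helper for illustration; versions are strings such as
    "3.9.16". Only the major and minor components are used.
    """
    cur_major, cur_minor = (int(p) for p in current.split(".")[:2])
    tgt_major, tgt_minor = (int(p) for p in target.split(".")[:2])
    if cur_major != tgt_major:
        raise ValueError("this sketch only handles one major series")
    if tgt_minor < cur_minor:
        raise ValueError("downgrades are not supported")
    # Every minor series between the two versions is a separate step.
    return ["%d.%d" % (cur_major, m) for m in range(cur_minor, tgt_minor + 1)]


path = minor_upgrade_path("3.9.16", "3.12.3")
print(" -> ".join(path))  # 3.9 -> 3.10 -> 3.11 -> 3.12
```

Each step in the returned list corresponds to one rolling upgrade, with the stable feature flags of that series enabled before moving on where the release notes require it.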
The same applies when there are one or more minor release branches between the minor version in use and the next major release. Skipping them might work (i.e. there could be no incompatible changes between major releases), but this scenario is unsupported by design for the following reasons:
The deprecation/removal policy of feature flags is yet to be defined.
When a node starts for the first time, all supported feature flags are enabled by default. When a node is upgraded to a newer version of RabbitMQ, new feature flags are enabled by default if it is a single isolated node, or remain disabled by default if it belongs to a cluster.
To list the feature flags, use rabbitmqctl list_feature_flags:
rabbitmqctl list_feature_flags
# => Listing feature flags ...
# => name                       state
# => empty_basic_get_metric     enabled
# => implicit_default_bindings  enabled
# => quorum_queue               enabled
For improved table readability, switch to the pretty_table formatter:
rabbitmqctl -q --formatter pretty_table list_feature_flags \
  name state provided_by desc doc_url
which would produce a table that looks like this:
┌───────────────────────────┬─────────┬───────────────────────────┬───────┬────────────┐
│ name                      │ state   │ provided_by               │ desc  │ doc_url    │
├───────────────────────────┼─────────┼───────────────────────────┼───────┼────────────┤
│ empty_basic_get_metric    │ enabled │ rabbitmq_management_agent │ (...) │            │
├───────────────────────────┼─────────┼───────────────────────────┼───────┼────────────┤
│ implicit_default_bindings │ enabled │ rabbit                    │ (...) │            │
├───────────────────────────┼─────────┼───────────────────────────┼───────┼────────────┤
│ quorum_queue              │ enabled │ rabbit                    │ (...) │ http://... │
└───────────────────────────┴─────────┴───────────────────────────┴───────┴────────────┘
As shown in the example above, the list_feature_flags command accepts a list of columns to display. The available columns are:
After upgrading one node or the entire cluster, it will be possible to enable new feature flags. Note that it will be impossible to roll back the version or add a cluster member using the old version once new feature flags are enabled.
To enable a feature flag, use rabbitmqctl enable_feature_flag:
rabbitmqctl enable_feature_flag <name>
To enable all feature flags, use rabbitmqctl enable_feature_flag all:
rabbitmqctl enable_feature_flag all
The list_feature_flags command can be used again to verify the feature flags' states. Assuming all feature flags were disabled initially, here is the state after enabling the quorum_queue feature flag:
rabbitmqctl -q --formatter pretty_table list_feature_flags
┌───────────────────────────┬──────────┐
│ name                      │ state    │
├───────────────────────────┼──────────┤
│ empty_basic_get_metric    │ disabled │
├───────────────────────────┼──────────┤
│ implicit_default_bindings │ disabled │
├───────────────────────────┼──────────┤
│ quorum_queue              │ enabled  │
└───────────────────────────┴──────────┘
It is also possible to list and enable feature flags from the Management Plugin UI, in "Admin > Feature flags".
It is impossible to disable a feature flag once it is enabled.
By default a new and unclustered node will start with all supported feature flags enabled, but this setting can be overridden. There are two ways to configure the list of feature flags to enable out-of-the-box when starting a node for the first time:
Using the RABBITMQ_FEATURE_FLAGS environment variable:
RABBITMQ_FEATURE_FLAGS=quorum_queue,implicit_default_bindings
Using the forced_feature_flags_on_init configuration parameter:
{rabbit, [{forced_feature_flags_on_init, [quorum_queue, implicit_default_bindings]}]}
The environment variable has precedence over the configuration parameter.
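The precedence between the two settings can be pictured with a small Python model. This is a sketch of the rules as stated in this guide, not RabbitMQ's actual startup code: `initial_feature_flags` and its parameters are hypothetical names, `env_value` stands in for `RABBITMQ_FEATURE_FLAGS`, and `config_value` stands in for `forced_feature_flags_on_init`. Validation of unknown flag names is not modeled.

```python
def initial_feature_flags(supported, env_value=None, config_value=None):
    """Pick the feature flags to enable when a node starts for the first time.

    env_value    models RABBITMQ_FEATURE_FLAGS (None when the variable is unset)
    config_value models forced_feature_flags_on_init (None when not configured)
    """
    if env_value is not None:
        # The environment variable takes precedence. An empty string
        # means "start with all feature flags disabled".
        return {name for name in env_value.split(",") if name}
    if config_value is not None:
        return set(config_value)
    # Neither is set: a new, unclustered node enables everything it supports.
    return set(supported)


supported = {"quorum_queue", "implicit_default_bindings", "empty_basic_get_metric"}
chosen = initial_feature_flags(
    supported,
    env_value="quorum_queue",
    config_value=["implicit_default_bindings"],
)
print(sorted(chosen))  # ['quorum_queue']
```

The model distinguishes an unset variable (`None`) from an empty one (`""`), because the empty string is a meaningful value that disables every feature flag at first start.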
After their initial introduction into RabbitMQ, feature flags are optional, that is, they only serve the purpose of allowing for a safe rolling cluster upgrade. Some feature flags do not have to be enabled unless a specific feature is used.
Over time, however, features become more mature, and future development of RabbitMQ assumes that a certain set of features is available and can be relied on by the users and developers alike. When that happens, feature flags graduate to core (required) features in the next minor feature release.
It is very important to enable all feature flags after performing a rolling cluster upgrade: these flags will eventually become mandatory, and proactively enabling them makes later upgrades smoother.
The feature flags listed below are provided by RabbitMQ core or one of the tier-1 plugins bundled with RabbitMQ.
Column Required shows the RabbitMQ version before which a feature flag MUST have been enabled. For example, if a feature flag is required in 3.12.0, this feature flag must be enabled in 3.11.x (or earlier) before upgrading to 3.12.x. Otherwise, if a RabbitMQ node is upgraded to 3.12.x while this feature flag is disabled, the RabbitMQ node will refuse to start in 3.12.x.
Column Stable shows the RabbitMQ version that introduced a feature flag. For example, if a feature flag is stable in 3.11.0, that feature flag SHOULD be enabled promptly after upgrading all nodes in a RabbitMQ cluster to version 3.11.x.
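The rule for the Required column can be checked mechanically before an upgrade. The Python sketch below is an illustration under stated assumptions: `flags_blocking_upgrade` is a hypothetical helper, and `REQUIRED_IN` holds only a handful of rows copied from the core feature flag table in this guide, not the full list.

```python
def parse_version(text):
    """Turn '3.12.0' into (3, 12, 0) so versions compare numerically.

    Expects full three-part versions.
    """
    return tuple(int(part) for part in text.split("."))


# A few rows from the core feature flag table: name -> "Required" version.
REQUIRED_IN = {
    "stream_queue": "3.12.0",
    "feature_flags_v2": "3.12.0",
    "quorum_queue": "3.11.0",
    "user_limits": "3.11.0",
}


def flags_blocking_upgrade(target_version, enabled_flags):
    """List the flags that must be enabled before upgrading to target_version."""
    target = parse_version(target_version)
    return sorted(
        name
        for name, required in REQUIRED_IN.items()
        if parse_version(required) <= target and name not in enabled_flags
    )


enabled = {"quorum_queue", "user_limits", "stream_queue"}
print(flags_blocking_upgrade("3.12.3", enabled))  # ['feature_flags_v2']
```

In practice the set of enabled flags would come from the output of `rabbitmqctl list_feature_flags` on the cluster being upgraded.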
The following feature flags are provided by RabbitMQ core.
| Required | Stable | Feature flag name | Description |
| --- | --- | --- | --- |
| | 3.12.0 | restart_streams | Support for restarting streams with optional preferred next leader argument. Used to implement stream leader rebalancing |
| | 3.12.0 | stream_sac_coordinator_unblock_group | Bug fix to unblock a group of consumers in a super stream partition |
| 3.12.0 | 3.11.0 | classic_mirrored_queue_version | Support setting version for classic mirrored queues |
| 3.12.0 | 3.11.0 | direct_exchange_routing_v2 | v2 direct exchange routing implementation |
| 3.12.0 | 3.11.0 | feature_flags_v2 | Feature flags subsystem v2 |
| 3.12.0 | 3.11.0 | listener_records_in_ets | Store listener records in ETS instead of Mnesia |
| 3.12.0 | 3.11.0 | stream_single_active_consumer | Single active consumer for streams |
| 3.12.0 | 3.11.0 | tracking_records_in_ets | Store tracking records in ETS instead of Mnesia |
| 3.12.0 | 3.10.9 | classic_queue_type_delivery_support | Bug fix for classic queue deliveries using mixed versions |
| 3.12.0 | 3.9.0 | stream_queue | Support queues of type stream |
| 3.11.0 | 3.8.10 | user_limits | Configure connection and channel limits for a user |
| 3.11.0 | 3.8.8 | maintenance_mode_status | Maintenance mode status |
| 3.11.0 | 3.8.0 | implicit_default_bindings | Default bindings are now implicit, instead of being stored in the database |
| 3.11.0 | 3.8.0 | quorum_queue | Support queues of type quorum |
| 3.11.0 | 3.8.0 | virtual_host_metadata | Virtual host metadata (description, tags, etc.) |
The following feature flags are provided by plugin rabbitmq_management_agent.
| Required | Stable | Feature flag name | Description |
| --- | --- | --- | --- |
| 3.12.0 | 3.8.10 | drop_unroutable_metric | Count unroutable publishes to be dropped in stats |
| 3.12.0 | 3.8.10 | empty_basic_get_metric | Count AMQP basic.get on empty queues in stats |
The following feature flags are provided by plugin rabbitmq_mqtt.
| Required | Stable | Feature flag name | Description |
| --- | --- | --- | --- |
| | 3.12.0 | delete_ra_cluster_mqtt_node | Delete Ra cluster mqtt_node since MQTT client IDs are tracked locally |
| | 3.12.0 | rabbit_mqtt_qos0_queue | Support pseudo queue type for MQTT QoS 0 subscribers omitting a queue process |
There are two times when an operator has to consider feature flags:
A node compares its own list of feature flags with remote nodes' list of feature flags to determine if it can join a cluster. The rules are defined as:
It is important to understand the difference between enabled and supported:
If one of those two conditions is not verified, the node cannot join or re-join the cluster.
However, if it can join the cluster, the state of enabled feature flags is synchronized between nodes: if a feature flag is enabled on one node, it is enabled on all other nodes.
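The join check and the synchronization that follows it can be modeled with plain sets. This Python sketch is not RabbitMQ's implementation, and it rests on an assumption about the two conditions: that every feature flag enabled on the joining node must be supported by the remote nodes, and that every feature flag enabled remotely must be supported by the joining node. `can_join` and `synchronized_state` are hypothetical names.

```python
def can_join(local_enabled, local_supported, remote_enabled, remote_supported):
    """Check the two assumed compatibility conditions for joining a cluster."""
    # Condition 1: everything enabled locally is supported remotely.
    local_ok = set(local_enabled) <= set(remote_supported)
    # Condition 2: everything enabled remotely is supported locally.
    remote_ok = set(remote_enabled) <= set(local_supported)
    return local_ok and remote_ok


def synchronized_state(local_enabled, remote_enabled):
    """After a successful join, a flag enabled on one node is enabled on all."""
    return set(local_enabled) | set(remote_enabled)


# An upgraded node supports one more flag than the rest of the cluster.
old_supported = {"quorum_queue", "stream_queue"}
new_supported = old_supported | {"restart_streams"}

# The new flag is still disabled, so the upgraded node may re-join.
print(can_join(old_supported, new_supported, old_supported, old_supported))  # True

# Once the new flag is enabled, older nodes no longer qualify.
print(can_join(new_supported, new_supported, old_supported, old_supported))  # False
```

The example mirrors the distinction between enabled and supported: the upgraded node supports `restart_streams` in both cases, but only enabling it makes the node incompatible with older cluster members.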
The feature flags subsystem covers inter-node communication only. This means the following scenarios are not covered and may not work as initially expected.
Controlling a remote node with rabbitmqctl is only supported if the remote node is running the same version of RabbitMQ as rabbitmqctl.
If CLI tools from a different minor/major version of RabbitMQ are used on a remote node, they may fail to work as expected or even have unexpected side effects on the node.
If a request sent to the HTTP API exposed by the Management plugin (including a request made by the management UI) goes through a load balancer, the API's behavior and its response may differ depending on the version of the node that handled the request. The same applies if the domain name of the HTTP API resolves to multiple IP addresses.
This situation may happen during a rolling upgrade if the management UI is open in a browser with periodic automatic refresh.
For example, if the management UI was loaded from a RabbitMQ 3.11.x node but it then queries a RabbitMQ 3.12.x node, the JavaScript code running in the browser may fail with exceptions due to HTTP API changes.
When a feature flag is enabled with rabbitmqctl, here is what happens internally:
As an operator, the most important part of this procedure to remember is that if the migration takes time, some components and thus some operations in RabbitMQ might be blocked during the migration.
When working on a plugin or a RabbitMQ core contribution, feature flags should be used to make the new version of the code compatible with older versions of RabbitMQ.
It is the developer's responsibility to look at the list of existing and future (i.e. those added to the main branch) feature flags and see if the new code can be adapted to take advantage of them.
Here is an example. When developing a plugin which used to use the #amqqueue{} record defined in rabbit_common/include/rabbit.hrl, the plugin has to be adapted to use the new amqqueue API which hides the previous record (which is private now). However, there is no need to query feature flags for that: the plugin will be ABI-compatible (i.e. no need to recompile it) with RabbitMQ 3.8.0 and later. It should also be ABI-compatible with RabbitMQ 3.7.x once the amqqueue API appears in that branch.
However if the plugin targets quorum queues introduced in RabbitMQ 3.8.0, it may have to query feature flags to determine what it can do. For instance, can it declare a quorum queue? Can it even expect the new fields added to amqqueue as part of the quorum queues implementation?
If the plugin carefully checks feature flags to avoid any incorrect expectations, it will be compatible with many versions of RabbitMQ: the user will not have to recompile anything or download another version-specific copy of the plugin.
If a plugin or core broker change modifies one of the following aspects:
Then compatibility with older versions of RabbitMQ becomes a concern. This is where a new feature flag can help ensure a smoother upgrade experience.
The two most important parts of a feature flag are:
The declaration is a module attribute which looks like this:
-rabbit_feature_flag(
   {quorum_queue,
    #{desc          => "Support queues of type quorum",
      doc_url       => "https://www.rabbitmq.com/quorum-queues.html",
      stability     => stable,
      migration_fun => {?MODULE, quorum_queue_migration}
     }}).
The migration function is a stateless function which looks like this:
quorum_queue_migration(FeatureName, _FeatureProps, enable) ->
    Tables = ?quorum_queue_tables,
    rabbit_table:wait(Tables),
    Fields = amqqueue:fields(amqqueue_v2),
    migrate_to_amqqueue_with_type(FeatureName, Tables, Fields);
quorum_queue_migration(_FeatureName, _FeatureProps, is_enabled) ->
    Tables = ?quorum_queue_tables,
    rabbit_table:wait(Tables),
    Fields = amqqueue:fields(amqqueue_v2),
    mnesia:table_info(rabbit_queue, attributes) =:= Fields andalso
    mnesia:table_info(rabbit_durable_queue, attributes) =:= Fields.
More implementation docs can be found in the rabbit_feature_flags module source code.
Erlang's edoc reference can be generated locally from a RabbitMQ repository clone or source archive:
gmake edoc
# => ... Ignore warnings and errors...

# Now open `doc/rabbit_feature_flags.html` in the browser.
When a feature or behavior depends on a feature flag (either in the core broker or in a plugin), the associated testsuites must be adapted to take this feature flag into account. It means that before running the actual testcase, the setup code must verify if the feature flag is supported and either enable it if it is, or skip the testcase. This is the same for setup code running at the group or suite level.
There are helper functions in rabbitmq-ct-helpers to ease that check. Here is an example, taken from the dynamic_qq_SUITE.erl testsuite in rabbitmq-server:
init_per_testcase(Testcase, Config) ->
    % (...)

    % 1.
    % The broker or cluster is started: we rely on this to query feature
    % flags.
    Config1 = rabbit_ct_helpers:run_steps(
                Config,
                rabbit_ct_broker_helpers:setup_steps() ++
                rabbit_ct_client_helpers:setup_steps()),

    % 2.
    % We try to enable the `quorum_queue` feature flag. The helper is
    % responsible for checking if the feature flag is supported and
    % enabling it.
    case rabbit_ct_broker_helpers:enable_feature_flag(Config1, quorum_queue) of
        ok ->
            % The feature flag is enabled at this point. The setup can
            % continue to play with `Config1` and the cluster.
            Config1;
        Skip ->
            % The feature flag is unavailable/unsupported. The setup
            % calls `end_per_testcase()` to stop the node/cluster and
            % skips the testcase.
            end_per_testcase(Testcase, Config1),
            Skip
    end.
It is possible to run testsuites locally in the context of a mixed-version cluster. If configured to do so, rabbitmq-ct-helpers will use a second version of RabbitMQ to start half of the nodes when starting a cluster:
To run a testsuite in the context of a mixed-version cluster:
Clone the rabbitmq-server repository and check out the appropriate branch or tag. This clone will be the secondary Umbrella. In this example, the v3.11.x branch is used:
git clone https://github.com/rabbitmq/rabbitmq-server.git secondary-umbrella
cd secondary-umbrella
git checkout v3.11.x
make co
Currently, when using the v3.11.x branch, deps/rabbit_common and deps/rabbit must use the v3.11.x-versions-compatibility branch.
Compile RabbitMQ or the plugin being tested in the secondary Umbrella. The rabbitmq-federation plugin is used as an example:
cd secondary-umbrella/deps/rabbitmq_federation
make dist
Go to RabbitMQ or the same plugin in the primary copy:
cd /path/to/primary/rabbitmq_federation
Run the testsuite. Here, two environment variables are specified to configure the "mixed-version cluster" mode:
SECONDARY_UMBRELLA=/path/to/secondary-umbrella \
RABBITMQ_FEATURE_FLAGS= \
make tests
The first environment variable, SECONDARY_UMBRELLA, tells rabbitmq-ct-helpers where to find the secondary Umbrella, as the name suggests. This is how the mixed-version cluster mode is enabled.
The second environment variable, RABBITMQ_FEATURE_FLAGS, is set to the empty string and tells RabbitMQ to start with all feature flags disabled: this is mandatory to have a newer node compatible with an older one.