Run a Sensu cluster
To deploy Sensu for use outside of a local development environment, first decide whether you want to run a Sensu cluster.
A Sensu cluster is a group of at least three sensu-backend nodes, each connected to a shared database provided either by Sensu’s embedded etcd or an external etcd cluster. Creating a Sensu cluster ultimately configures an etcd cluster.
Clustering improves Sensu’s availability, reliability, and durability. It allows you to absorb the loss of a backend node, prevent data loss, and distribute the network load of agents. If you have a healthy clustered backend, you only need to make Sensu API calls to any one of the cluster members. The cluster protocol will replicate your changes to all cluster members.
Scaling a single backend to a cluster or migrating a cluster from cleartext HTTP to encrypted HTTPS without downtime can require a number of tedious steps. For this reason, we recommend that you decide whether your deployment will require clustering as part of your initial planning effort.
No matter whether you deploy a single backend or a clustered configuration, begin by securing Sensu with transport layer security (TLS). The first step in setting up TLS is to generate the certificates you need. Then, follow our Secure Sensu guide to make Sensu production-ready.
After you’ve secured Sensu, continue reading this document to set up a clustered configuration.
NOTE: We recommend using a load balancer to evenly distribute agent connections across a cluster.
Configure a cluster
The sensu-backend arguments for its store mirror the etcd configuration flags, but the Sensu flags carry an etcd- prefix (for example, etcd-listen-client-urls).
For more detailed descriptions of the different arguments, see the etcd docs or the Sensu backend reference.
You can configure a Sensu cluster in a couple of different ways (we’ll show you a few below), but you should also adhere to some etcd cluster guidelines:
“The recommended etcd cluster size is 3, 5 or 7, which is decided by the fault tolerance requirement. A 7-member cluster can provide enough fault tolerance in most cases. While a larger cluster provides better fault tolerance, the write performance reduces since data needs to be replicated to more machines. It is recommended to have an odd number of members in a cluster. Having an odd cluster size doesn’t change the number needed for majority, but you gain a higher tolerance for failure by adding the extra member.”
Source: etcd2 Admin Guide
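The arithmetic behind this guideline can be sketched in a few lines of Python. This is an illustrative calculation only, assuming the usual majority rule for etcd (a write needs more than half of the members); the function names quorum and fault_tolerance are hypothetical helpers, not part of Sensu or etcd.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting nodes."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """Number of members that can fail while a majority is still available."""
    return members - quorum(members)


if __name__ == "__main__":
    for size in range(1, 8):
        print(
            f"{size} members: quorum {quorum(size)}, "
            f"tolerates {fault_tolerance(size)} failure(s)"
        )
```

Under this rule, a 4-member cluster needs a larger majority than a 3-member cluster but tolerates the same single failure, which is why even sizes add replication cost without adding tolerance.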
We also recommend using stable platforms to support your etcd instances (see etcd’s supported platforms).
NOTE: If a cluster member is started before it is configured to join a cluster, the member will persist its prior configuration to disk.
For this reason, you must remove any previously started member’s etcd data by stopping sensu-backend and deleting the contents of /var/lib/sensu/sensu-backend/etcd before proceeding with cluster configuration.
If you prefer to stand up your Sensu cluster within Docker containers, check out the Sensu Go Docker configuration. This configuration defines three sensu-backend containers and three sensu-agent containers.
Traditional computer instance
NOTE: The remainder of this guide describes on-disk configuration.
If you are using an ephemeral computer instance, you can use sensu-backend start --help to see examples of etcd command line flags.
The configuration file entries in the rest of this guide translate to the corresponding sensu-backend command line flags.
Sensu backend configuration
The examples in this section are configuration snippets from /etc/sensu/backend.yml using a three-node cluster.
The nodes are named backend-1, backend-2, and backend-3, with IP addresses 10.0.0.1, 10.0.0.2, and 10.0.0.3, respectively.
NOTE: This backend configuration assumes you have set up and installed the sensu-backend on all the nodes used in your cluster. Follow the Install Sensu guide if you have not already done this.
##
# store configuration for backend-1/10.0.0.1
##
etcd-advertise-client-urls: "http://10.0.0.1:2379"
etcd-listen-client-urls: "http://10.0.0.1:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "backend-1=http://10.0.0.1:2380,backend-2=http://10.0.0.2:2380,backend-3=http://10.0.0.3:2380"
etcd-initial-advertise-peer-urls: "http://10.0.0.1:2380"
etcd-initial-cluster-state: "new"
etcd-initial-cluster-token: ""
etcd-name: "backend-1"

##
# store configuration for backend-2/10.0.0.2
##
etcd-advertise-client-urls: "http://10.0.0.2:2379"
etcd-listen-client-urls: "http://10.0.0.2:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "backend-1=http://10.0.0.1:2380,backend-2=http://10.0.0.2:2380,backend-3=http://10.0.0.3:2380"
etcd-initial-advertise-peer-urls: "http://10.0.0.2:2380"
etcd-initial-cluster-state: "new"
etcd-initial-cluster-token: ""
etcd-name: "backend-2"

##
# store configuration for backend-3/10.0.0.3
##
etcd-advertise-client-urls: "http://10.0.0.3:2379"
etcd-listen-client-urls: "http://10.0.0.3:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "backend-1=http://10.0.0.1:2380,backend-2=http://10.0.0.2:2380,backend-3=http://10.0.0.3:2380"
etcd-initial-advertise-peer-urls: "http://10.0.0.3:2380"
etcd-initial-cluster-state: "new"
etcd-initial-cluster-token: ""
etcd-name: "backend-3"
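The three store configurations differ only in the node name and IP address, so they can be generated from one table of nodes. The following Python sketch is a hypothetical helper (NODES, initial_cluster, and store_config are names invented for illustration, not Sensu tooling) that reproduces the keys and values shown in this section; it assumes cleartext HTTP and the default ports 2379 and 2380 used in the examples.

```python
# Node names and IP addresses taken from the examples in this section.
NODES = {
    "backend-1": "10.0.0.1",
    "backend-2": "10.0.0.2",
    "backend-3": "10.0.0.3",
}


def initial_cluster(nodes: dict, scheme: str = "http", peer_port: int = 2380) -> str:
    """Build the comma-separated name=peer-URL list shared by every member."""
    return ",".join(
        f"{name}={scheme}://{ip}:{peer_port}" for name, ip in nodes.items()
    )


def store_config(name: str, nodes: dict, scheme: str = "http", state: str = "new") -> str:
    """Render the store entries for one member as backend.yml lines."""
    ip = nodes[name]
    entries = {
        "etcd-advertise-client-urls": f"{scheme}://{ip}:2379",
        "etcd-listen-client-urls": f"{scheme}://{ip}:2379",
        "etcd-listen-peer-urls": f"{scheme}://0.0.0.0:2380",
        "etcd-initial-cluster": initial_cluster(nodes, scheme),
        "etcd-initial-advertise-peer-urls": f"{scheme}://{ip}:2380",
        "etcd-initial-cluster-state": state,
        "etcd-initial-cluster-token": "",
        "etcd-name": name,
    }
    return "\n".join(f'{key}: "{value}"' for key, value in entries.items())


if __name__ == "__main__":
    for node in NODES:
        print(f"# store configuration for {node}/{NODES[node]}")
        print(store_config(node, NODES))
        print()
```

Generating the entries from a single table keeps etcd-initial-cluster identical on every node, which is the value most likely to drift when the files are edited by hand.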
After you configure each node as described in these examples, start each sensu-backend:
sudo systemctl start sensu-backend
Add Sensu agents to clusters
Each Sensu agent should have the following entries in /etc/sensu/agent.yml to ensure the agent is aware of all cluster members.
This allows the agent to reconnect to a working backend if the backend it is currently connected to goes into an unhealthy state.
##
# backend-url configuration for all agents connecting to cluster over ws
##
backend-url:
  - "ws://10.0.0.1:8081"
  - "ws://10.0.0.2:8081"
  - "ws://10.0.0.3:8081"
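Before rolling this list out to agents, it can be useful to confirm that each backend-url entry accepts TCP connections from the agent host. The Python sketch below is a hypothetical pre-flight check using only the standard library; it tests plain TCP reachability, not the WebSocket handshake or backend health, and accepting the wss scheme alongside ws is an assumption for TLS-enabled deployments.

```python
import socket
from urllib.parse import urlsplit


def parse_backend_url(url: str) -> tuple:
    """Split a ws://host:port (or wss://host:port) URL into (host, port)."""
    parts = urlsplit(url)
    if parts.scheme not in ("ws", "wss") or parts.hostname is None or parts.port is None:
        raise ValueError(f"expected ws://host:port or wss://host:port, got {url!r}")
    return parts.hostname, parts.port


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unreachable_backends(urls: list, timeout: float = 2.0) -> list:
    """Return the backend URLs that did not accept a TCP connection."""
    return [
        url for url in urls
        if not is_reachable(*parse_backend_url(url), timeout=timeout)
    ]
```

For the cluster in this guide, calling unreachable_backends(["ws://10.0.0.1:8081", "ws://10.0.0.2:8081", "ws://10.0.0.3:8081"]) from an agent host would return an empty list if all three backends are listening.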
You should now have a highly available Sensu cluster! Confirm cluster health and try other cluster management commands with sensuctl.
Manage and monitor clusters with sensuctl
Sensuctl includes several commands to help you manage and monitor your cluster.
Run sensuctl cluster -h for additional help information.
Get cluster health status
Get cluster health status and etcd alarm information:
sensuctl cluster health

ID                 Name        Error                                               Healthy
────────────────── ─────────── ─────────────────────────────────────────────────── ─────────
a32e8f613b529ad4   backend-1                                                       true
c3d9f4b8d0dd1ac9   backend-2   dial tcp 10.0.0.2:2379: connect: connection refused false
c8f63ae435a5e6bf   backend-3                                                       true
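If you want to act on this output in a script, the table can be parsed with a few lines of Python. This is a minimal sketch that assumes the column layout shown in the example above (ID, Name, Error, Healthy, with member rows ending in true or false); parse_cluster_health and SAMPLE are hypothetical names, and the real output may contain additional lines that this parser simply skips.

```python
SAMPLE = """\
ID                 Name        Error                                               Healthy
────────────────── ─────────── ─────────────────────────────────────────────────── ─────────
a32e8f613b529ad4   backend-1                                                       true
c3d9f4b8d0dd1ac9   backend-2   dial tcp 10.0.0.2:2379: connect: connection refused false
c8f63ae435a5e6bf   backend-3                                                       true
"""


def parse_cluster_health(output: str) -> list:
    """Turn the text table into a list of dicts, one per cluster member."""
    members = []
    for line in output.splitlines():
        tokens = line.split()
        # Member rows have an ID, a name, and end in "true" or "false";
        # the header, rule line, and any other text are skipped.
        if len(tokens) < 3 or tokens[-1] not in ("true", "false"):
            continue
        members.append({
            "id": tokens[0],
            "name": tokens[1],
            "error": " ".join(tokens[2:-1]),  # empty string for healthy rows
            "healthy": tokens[-1] == "true",
        })
    return members


if __name__ == "__main__":
    for member in parse_cluster_health(SAMPLE):
        if not member["healthy"]:
            print(f"{member['name']} ({member['id']}): {member['error']}")
```

The ID of an unhealthy member is the value that the member-remove command expects, so a parser like this could feed a replacement workflow.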
Add a cluster member
Add a new member node to an existing cluster:
sensuctl cluster member-add backend-4 https://10.0.0.4:2380

added member 2f7ae42c315f8c2d to cluster

ETCD_NAME="backend-4"
ETCD_INITIAL_CLUSTER="backend-4=https://10.0.0.4:2380,backend-1=https://10.0.0.1:2380,backend-2=https://10.0.0.2:2380,backend-3=https://10.0.0.3:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"
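The ETCD_ variables printed by member-add correspond to store entries in the new member’s backend.yml. As a rough illustration, the Python sketch below converts those lines into the etcd-prefixed keys used elsewhere in this guide. The function name member_add_to_backend_yml is hypothetical, and the lowercase-and-hyphenate mapping is an assumption that holds for the three variables shown above (etcd-name, etcd-initial-cluster, etcd-initial-cluster-state); the remaining store entries still need to be written by hand.

```python
import re

# Matches lines such as ETCD_NAME="backend-4"
_ASSIGNMENT = re.compile(r'^(ETCD_[A-Z_]+)="(.*)"$')


def member_add_to_backend_yml(output: str) -> str:
    """Convert ETCD_* assignments from member-add output into backend.yml lines."""
    lines = []
    for raw in output.splitlines():
        match = _ASSIGNMENT.match(raw.strip())
        if match:
            # ETCD_INITIAL_CLUSTER_STATE -> etcd-initial-cluster-state
            key = match.group(1).lower().replace("_", "-")
            lines.append(f'{key}: "{match.group(2)}"')
    return "\n".join(lines)


if __name__ == "__main__":
    sample = (
        "added member 2f7ae42c315f8c2d to cluster\n"
        "\n"
        'ETCD_NAME="backend-4"\n'
        'ETCD_INITIAL_CLUSTER_STATE="existing"\n'
    )
    print(member_add_to_backend_yml(sample))
```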
List cluster members
List the ID, name, peer URLs, and client URLs of all nodes in a cluster:
sensuctl cluster member-list

ID                 Name        Peer URLs                 Client URLs
────────────────── ─────────── ───────────────────────── ─────────────────────────
a32e8f613b529ad4   backend-1   https://10.0.0.1:2380     https://10.0.0.1:2379
c3d9f4b8d0dd1ac9   backend-2   https://10.0.0.2:2380     https://10.0.0.2:2379
c8f63ae435a5e6bf   backend-3   https://10.0.0.3:2380     https://10.0.0.3:2379
2f7ae42c315f8c2d   backend-4   https://10.0.0.4:2380     https://10.0.0.4:2379
Remove a cluster member
Remove a faulty or decommissioned member node from a cluster:
sensuctl cluster member-remove 2f7ae42c315f8c2d

Removed member 2f7ae42c315f8c2d from cluster
Replace a faulty cluster member
To replace a faulty cluster member and restore a cluster’s health, start by running sensuctl cluster health to identify the faulty cluster member.
For a faulty cluster member, the Error column will include an error message and the Healthy column will list false.
In this example, cluster member backend-4 is faulty:
sensuctl cluster health

ID                 Name        Error                                               Healthy
────────────────── ─────────── ─────────────────────────────────────────────────── ─────────
a32e8f613b529ad4   backend-1                                                       true
c3d9f4b8d0dd1ac9   backend-2                                                       true
c8f63ae435a5e6bf   backend-3                                                       true
2f7ae42c315f8c2d   backend-4   dial tcp 10.0.0.4:2379: connect: connection refused false
Then, delete the faulty cluster member.
To continue this example, you will delete cluster member backend-4 using its ID:
sensuctl cluster member-remove 2f7ae42c315f8c2d

Removed member 2f7ae42c315f8c2d from cluster
Finally, add a newly created member to the cluster.
You can use the same name and IP address as the faulty member you deleted, with one change to the configuration: specify the etcd-initial-cluster-state as existing.

etcd-advertise-client-urls: "http://10.0.0.4:2379"
etcd-listen-client-urls: "http://10.0.0.4:2379"
etcd-listen-peer-urls: "http://0.0.0.0:2380"
etcd-initial-cluster: "backend-1=http://10.0.0.1:2380,backend-2=http://10.0.0.2:2380,backend-3=http://10.0.0.3:2380,backend-4=http://10.0.0.4:2380"
etcd-initial-advertise-peer-urls: "http://10.0.0.4:2380"
etcd-initial-cluster-state: "existing"
etcd-initial-cluster-token: ""
etcd-name: "backend-4"
If replacing the faulty cluster member does not resolve the problem, see the etcd operations guide for more information.
Update a cluster member
Update the peer URLs of a member in a cluster:
sensuctl cluster member-update c8f63ae435a5e6bf https://10.0.0.4:2380

Updated member with ID c8f63ae435a5e6bf in cluster
See Secure Sensu for information about cluster security.
Use an external etcd cluster
To use Sensu with an external etcd cluster, you must have etcd 3.3.2 or newer. To stand up an external etcd cluster, follow etcd’s clustering guide using the same store configuration.
etcd \
  --listen-client-urls "https://10.0.0.1:2379" \
  --advertise-client-urls "https://10.0.0.1:2379" \
  --listen-peer-urls "https://10.0.0.1:2380" \
  --initial-cluster "backend-1=https://10.0.0.1:2380,backend-2=https://10.0.0.2:2380,backend-3=https://10.0.0.3:2380" \
  --initial-advertise-peer-urls "https://10.0.0.1:2380" \
  --initial-cluster-state "new" \
  --name "backend-1" \
  --trusted-ca-file=./ca.pem \
  --cert-file=./backend-1.pem \
  --key-file=./backend-1-key.pem \
  --client-cert-auth \
  --peer-trusted-ca-file=./ca.pem \
  --peer-cert-file=./backend-1.pem \
  --peer-key-file=./backend-1-key.pem \
  --peer-client-cert-auth \
  --auto-compaction-mode revision \
  --auto-compaction-retention 2
NOTE: The auto-compaction-mode and auto-compaction-retention flags are important.
Without these settings, your database may quickly reach etcd’s maximum database size limit.
To tell Sensu to use this external etcd data source, add the --no-embed-etcd flag to the original configuration, along with the path to a client certificate created using your CA:
sensu-backend start \
  --etcd-trusted-ca-file=./ca.pem \
  --etcd-cert-file=./client.pem \
  --etcd-key-file=./client-key.pem \
  --etcd-client-urls='https://10.0.0.1:2379 https://10.0.0.2:2379 https://10.0.0.3:2379' \
  --no-embed-etcd
NOTE: The etcd-client-urls value must be a space-delimited list or a YAML array.
See the etcd failure modes documentation for information about cluster failure modes.
See the etcd recovery guide for disaster recovery information.