Planning your Sensu Go deployment

This guide describes various deployment considerations and recommendations, including details related to communication security and common deployment architectures.

What is etcd?

etcd is a key-value store used by applications of varying complexity, from simple web apps to Kubernetes. The Sensu backend uses an embedded etcd instance for storing both configuration and event data, so you can get Sensu up and running without external dependencies.

By building atop etcd, Sensu’s backend inherits a number of characteristics that should be considered when planning for a Sensu deployment.

Hardware sizing

Because etcd’s design prioritizes consistency across a cluster, the speed with which write operations can be completed is very important to the performance of a Sensu cluster.

This means that Sensu backend infrastructure should be provisioned to sustain the input/output operations per second (IOPS) appropriate for the rate of monitoring events the system will be required to process.

For more detail, our hardware requirements document describes the minimum and recommended hardware specifications for running the Sensu backend.
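One way to gauge whether a disk can sustain etcd's write pattern is the fio benchmark suggested by the etcd maintainers, which measures fdatasync latency for small sequential writes. This is a sketch; the target directory below is a placeholder for the backend's state directory and should be on the same volume the backend will use:

```
fio --rw=write --ioengine=sync --fdatasync=1 \
    --directory=/var/lib/sensu/sensu-backend \
    --size=22m --bs=2300 \
    --name=etcd-disk-check
```

As a rule of thumb from the etcd documentation, the 99th percentile of fdatasync durations reported by fio should be under 10ms for etcd to perform well.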

Communications security

Whether you run a single Sensu backend or multiple backends in a cluster, communication with the backend's various network ports (web UI, HTTP API, websocket API, etcd client and peer) occurs in cleartext by default. Encrypting network communications via TLS is highly recommended, and requires both some planning and explicit configuration.

Planning TLS for etcd

The URLs for each member of an etcd cluster are persisted to the database after initialization. As a result, moving a cluster from cleartext to encrypted communications requires resetting the cluster, which destroys all configuration and event data in the database. Therefore, we recommend planning for encryption before initiating a clustered Sensu backend deployment.

WARNING: Reconfiguring a Sensu cluster for TLS post-deployment will require resetting all etcd cluster members, resulting in the loss of all data.

As described in our guide for securing Sensu, the backend uses a shared certificate and key for web UI and agent communications. Communications with etcd can be secured using the same certificate and key; the certificate's common name or subject alternative names must include the network interfaces and DNS names that will point to those systems.
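To illustrate reusing the same certificate and key for etcd, the relevant backend configuration settings might look like the following sketch; the file paths are placeholders for wherever your certificates are installed:

```yml
# /etc/sensu/backend.yml (paths are illustrative)
cert-file: "/etc/sensu/tls/backend.pem"
key-file: "/etc/sensu/tls/backend-key.pem"
trusted-ca-file: "/etc/sensu/tls/ca.pem"

# Reuse the same certificate and key for etcd client and peer communication
etcd-cert-file: "/etc/sensu/tls/backend.pem"
etcd-key-file: "/etc/sensu/tls/backend-key.pem"
etcd-trusted-ca-file: "/etc/sensu/tls/ca.pem"
etcd-peer-cert-file: "/etc/sensu/tls/backend.pem"
etcd-peer-key-file: "/etc/sensu/tls/backend-key.pem"
etcd-peer-trusted-ca-file: "/etc/sensu/tls/ca.pem"
```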

See our clustering guide and the etcd docs for more info on setup and configuration, including a walk-through for generating TLS certificates for your cluster.

Common Sensu architectures

Depending on your infrastructure and the type of environments you’ll be monitoring, you may use one or a combination of these architectures to best fit your needs.

Single backend using embedded etcd

This architecture requires minimal resources, but provides no redundancy in the event of failure.

Sensu standalone architecture with embedded etcd

A single backend can later be reconfigured as a member of a cluster, but this operation is destructive: it requires destroying the existing database.

Use cases

The simplicity of this architecture may make it a good fit for small to medium-sized deployments, such as monitoring a remote office or datacenter, deploying alongside individual auto-scaling groups, or covering segments of a logical environment that spans multiple cloud providers.

For example, in environments with unreliable WAN connectivity, having agents connect to a local backend may be more reliable than having those agents connect over a WAN link or VPN tunnel to a backend running in a central location.

NOTE: Multiple Sensu backends can relay their events to a central backend using the sensu-relay-handler.

Clustered backend with embedded etcd

The embedded etcd databases of multiple Sensu backend instances can be joined together in a cluster, providing increased availability and replication of both configuration and data. Please see our clustering guide for more information.
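To give a sense of what joining backends into a cluster involves, each backend's embedded etcd is configured with its own addresses and the full list of initial peers. The sketch below shows one member's configuration; the member names, IP addresses, and file path are placeholders:

```yml
# /etc/sensu/backend.yml on backend-1 (addresses are placeholders)
etcd-name: "backend-1"
etcd-listen-client-urls: "https://10.0.0.1:2379"
etcd-listen-peer-urls: "https://10.0.0.1:2380"
etcd-advertise-client-urls: "https://10.0.0.1:2379"
etcd-initial-advertise-peer-urls: "https://10.0.0.1:2380"
etcd-initial-cluster: "backend-1=https://10.0.0.1:2380,backend-2=https://10.0.0.2:2380,backend-3=https://10.0.0.3:2380"
etcd-initial-cluster-state: "new"
```

Each of the other members uses the same etcd-initial-cluster value, with etcd-name and the listen/advertise URLs adjusted to match its own identity and addresses.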

Sensu clustered architecture with embedded etcd

Clustering requires an odd number of backend instances. While larger clusters provide better fault tolerance, write performance suffers because data must be replicated across more machines. Following the advice of the etcd maintainers, clusters of 3, 5, or 7 backends are the only recommended sizes. See the etcd docs for more info.
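The reason odd sizes are recommended follows from etcd's majority-quorum requirement: a cluster of n members needs a majority to commit writes, so adding an even member increases the quorum without increasing the number of failures tolerated. A quick sketch of the arithmetic:

```python
def fault_tolerance(cluster_size: int) -> int:
    """Number of member failures an etcd cluster can survive.

    etcd requires a majority (quorum) of members to commit writes,
    so a cluster of n members tolerates n - quorum failures.
    """
    quorum = cluster_size // 2 + 1
    return cluster_size - quorum

for n in (1, 3, 4, 5, 7):
    print(f"{n} backends -> quorum {n // 2 + 1}, tolerates {fault_tolerance(n)} failure(s)")
```

Note that a 4-member cluster tolerates only one failure, the same as a 3-member cluster, while adding replication overhead, which is why even sizes are avoided.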

Scaling cluster performance with Postgres

To achieve the high rate of event processing required by many enterprises, Sensu offers support for Postgres event storage as a licensed feature. See the Datastore reference documentation for details on configuring the Sensu backend to use Postgres for event storage.
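Postgres event storage is enabled by creating a datastore resource with sensuctl. A minimal sketch, assuming placeholder connection details; see the Datastore reference for the full set of options:

```yml
---
type: PostgresConfig
api_version: store/v1
metadata:
  name: my-postgres
spec:
  dsn: "postgresql://user:secret@postgres.example.com:5432/sensu_events"
  pool_size: 20
```

Saving this as a file and applying it with sensuctl create --file makes the backend begin writing events to Postgres instead of etcd.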

Sensu clustered architecture with embedded etcd and Postgres event storage

In load testing, Sensu Go has proven capable of processing 36,000 events per second when using Postgres as the event store. See the sensu-perf project repository for a detailed explanation of our testing methodology and results.

Cluster creation and maintenance

Sensu’s embedded etcd supports initial cluster creation via a static list of peer URLs. Once the cluster is created, members can be added or removed using etcdctl tooling. See our clustering guide and the etcd docs for more info.
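As a sketch of what member maintenance with etcdctl looks like, the commands below list, add, and remove cluster members; the endpoints, certificate paths, member name, and member ID are all placeholders:

```
# etcdctl v3 reads connection settings from these environment variables;
# endpoints and certificate paths are placeholders
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://10.0.0.1:2379
export ETCDCTL_CACERT=/etc/sensu/tls/ca.pem
export ETCDCTL_CERT=/etc/sensu/tls/backend.pem
export ETCDCTL_KEY=/etc/sensu/tls/backend-key.pem

# List current cluster members
etcdctl member list

# Register a new member before starting the new backend
etcdctl member add backend-4 --peer-urls=https://10.0.0.4:2380

# Remove a member by the ID shown in "member list"
etcdctl member remove 8e9e05c52164694d
```

New members must be registered with member add before the new backend process is started, so that the existing cluster expects the join.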

Networking considerations

Clustered deployments benefit from a fast and reliable network. Ideally, all nodes should be co-located in the same network segment with the lowest possible latency between them. Clustering backends across disparate subnets or WAN connections is not recommended.

While a 1 GbE network is sufficient for common deployments, larger deployments will benefit from a 10 GbE network, which allows for a reduced mean time to recovery.

As the number of agents connected to a backend cluster grows, so will the communication between members of the cluster required for data replication. With this in mind, it is recommended that clusters with a thousand or more agents use a discrete network interface for peer communication.

Load balancing

Although each Sensu agent can be configured with the URLs for multiple backend instances, we recommend configuring agents to connect to a load balancer. This approach provides operators with greater control over agent connection distribution and makes it possible to replace members of the backend cluster without requiring updates to agent configuration.
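In the agent configuration, pointing at a load balancer means listing a single websocket URL; the hostname below is a placeholder for your load balancer's address:

```yml
# /etc/sensu/agent.yml (load balancer hostname is a placeholder)
backend-url:
  - "wss://sensu-lb.example.com:8081"
```

The wss scheme assumes TLS has been configured on the backends and the load balancer is passing the websocket connections through.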

Unlike the agent, the sensuctl command-line utility cannot be configured with multiple backend URLs. Under normal conditions, it is desirable to route both sensuctl communications and browser access to the web UI through a load balancer as well.
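Pointing sensuctl at the load balancer is a one-time configuration step; the URL and credentials below are placeholders:

```
sensuctl configure -n \
  --url https://sensu-lb.example.com:8080 \
  --username admin \
  --password 'P@ssw0rd!'
```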