mirror of https://github.com/hashicorp/consul
Daniel Nephin
3 years ago
5 changed files with 138 additions and 22 deletions
@ -1,4 +1,4 @@
|
||||
{ |
||||
"theme": "default", |
||||
"themeCSS": ".node rect { fill: rgb(220, 71, 125); stroke-width: 1; stroke: black; } .node .label { color: white; } " |
||||
"themeCSS": ".node rect { fill: rgb(220, 71, 125); stroke-width: 1; stroke: black; } .node .label { color: white; }; .cluster rect { fill: #f0f0f0; stroke-width: 1px; stroke: #333}; .edgeLabel { background-color: #f0f0f0; }" |
||||
} |
||||
|
@ -1,15 +1,108 @@
|
||||
# Persistence |
||||
# Cluster Persistence |
||||
|
||||
- hashicorp/raft |
||||
- state.Store - hashicorp/go-memdb |
||||
- FSM |
||||
- boltdb - https://github.com/boltdb/bolt (https://github.com/etcd-io/bbolt) |
||||
- snapshot and restore |
||||
The cluster persistence subsystem runs entirely in Server Agents. It handles both read and |
||||
write requests from the [RPC] subsystem. See the [Consul Architecture Guide] for an |
||||
introduction to the Consul deployment architecture and the [Consensus Protocol] used by |
||||
the cluster persistence subsystem. |
||||
|
||||
[RPC]: ../rpc |
||||
[Consul Architecture Guide]: https://www.consul.io/docs/architecture |
||||
[Consensus Protocol]: https://www.consul.io/docs/architecture/consensus |
||||
|
||||
## Additional documentation |
||||
|
||||
The following links provide more detail about cluster persistence. |
||||
![Overview](./overview.svg) |
||||
|
||||
* [Consul Architecture Guide](https://www.consul.io/docs/architecture) |
||||
<sup>[source](./overview.mmd)</sup> |
||||
|
||||
|
||||
## Raft and FSM |
||||
|
||||
[hashicorp/raft] is at the core of cluster persistence. Raft requires an [FSM], a |
||||
finite-state machine implementation, to persist state changes. The Consul FSM is |
||||
implemented in [agent/consul/fsm] as a set of commands. |
||||
|
||||
[FSM]: https://pkg.go.dev/github.com/hashicorp/raft#FSM |
||||
[hashicorp/raft]: https://github.com/hashicorp/raft |
||||
[agent/consul/fsm]: https://github.com/hashicorp/consul/tree/main/agent/consul/fsm |
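
The "set of commands" idea can be sketched with the standard library alone. This is a minimal illustration, not the code in [agent/consul/fsm]: it does not import hashicorp/raft, its `Apply` takes raw bytes rather than a raft log value, and the message types, request structs, and the one-byte-type-plus-JSON encoding are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MessageType identifies which command a log entry carries.
// The names and values are hypothetical, chosen for illustration.
type MessageType uint8

const (
	RegisterType MessageType = iota
	DeregisterType
)

// registerRequest and deregisterRequest are hypothetical command payloads.
type registerRequest struct {
	Node    string
	Address string
}

type deregisterRequest struct {
	Node string
}

// store stands in for the in-memory state store.
type store struct {
	nodes map[string]string
}

// command applies one decoded log entry to the store.
type command func(s *store, payload []byte) error

// FSM maps each message type to the command that handles it.
type FSM struct {
	state    *store
	commands map[MessageType]command
}

func NewFSM() *FSM {
	return &FSM{
		state: &store{nodes: map[string]string{}},
		commands: map[MessageType]command{
			RegisterType:   applyRegister,
			DeregisterType: applyDeregister,
		},
	}
}

// Apply reads the type from the first byte (an assumed encoding) and
// dispatches the rest of the entry to the matching command.
func (f *FSM) Apply(entry []byte) error {
	if len(entry) == 0 {
		return fmt.Errorf("empty log entry")
	}
	cmd, ok := f.commands[MessageType(entry[0])]
	if !ok {
		return fmt.Errorf("unknown message type %d", entry[0])
	}
	return cmd(f.state, entry[1:])
}

func applyRegister(s *store, payload []byte) error {
	var req registerRequest
	if err := json.Unmarshal(payload, &req); err != nil {
		return err
	}
	s.nodes[req.Node] = req.Address
	return nil
}

func applyDeregister(s *store, payload []byte) error {
	var req deregisterRequest
	if err := json.Unmarshal(payload, &req); err != nil {
		return err
	}
	delete(s.nodes, req.Node)
	return nil
}

// encode builds a log entry: one type byte followed by the JSON payload.
func encode(t MessageType, v interface{}) ([]byte, error) {
	body, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	return append([]byte{byte(t)}, body...), nil
}

func main() {
	fsm := NewFSM()
	entry, err := encode(RegisterType, registerRequest{Node: "node1", Address: "10.0.0.1"})
	if err != nil {
		panic(err)
	}
	if err := fsm.Apply(entry); err != nil {
		panic(err)
	}
	fmt.Println(fsm.state.nodes["node1"])
}
```

Keeping the dispatch table separate from the commands means a new command only needs a new message type and one handler function.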
||||
|
||||
Raft also requires a [LogStore] to persist logs to disk. Consul uses [hashicorp/raft-boltdb] |
||||
which implements [LogStore] using [boltdb]. We expect to update to [bbolt] in the near |
||||
future. |
||||
|
||||
|
||||
[LogStore]: https://pkg.go.dev/github.com/hashicorp/raft#LogStore |
||||
[hashicorp/raft-boltdb]: https://github.com/hashicorp/raft-boltdb |
||||
[boltdb]: https://github.com/boltdb/bolt |
||||
[bbolt]: https://github.com/etcd-io/bbolt |
||||
|
||||
|
||||
## State Store |
||||
|
||||
Consul stores the full state of the cluster in memory using the state store. The state store is |
||||
implemented in [agent/consul/state] and uses [hashicorp/go-memdb] to maintain indexes of |
||||
data stored in a set of tables. The main entrypoint to the state store is [NewStateStore]. |
||||
|
||||
[agent/consul/state]: https://github.com/hashicorp/consul/tree/main/agent/consul/state |
||||
[hashicorp/go-memdb]: https://github.com/hashicorp/go-memdb |
||||
[NewStateStore]: https://github.com/hashicorp/consul/blob/main/agent/consul/state/state_store.go |
||||
|
||||
### Tables, Schemas, and Indexes |
||||
|
||||
The state store is organized as a set of tables, and each table has a set of indexes. |
||||
`newDBSchema` in [schema.go] shows the full list of tables, and each schema function shows |
||||
the full list of indexes. |
||||
|
||||
[schema.go]: https://github.com/hashicorp/consul/blob/main/agent/consul/state/schema.go |
||||
|
||||
There are two styles for defining table indexes. The original style uses generic indexer |
||||
implementations from [hashicorp/go-memdb] (e.g. `StringFieldIndex`). These indexes use |
||||
[reflect] to find values for an index. These generic indexers work well when the index |
||||
value is a single value available directly from the struct field, and there are no |
||||
oss/enterprise differences. |
||||
|
||||
The second style uses custom indexers implemented using only functions and |
||||
based on the types defined in [indexer.go]. This style of index works well when the index |
||||
value is a value derived from one or multiple fields, or when there are oss/enterprise |
||||
differences between the indexes. |
||||
|
||||
[reflect]: https://golang.org/pkg/reflect/ |
||||
[indexer.go]: https://github.com/hashicorp/consul/blob/main/agent/consul/state/indexer.go |
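
The difference between the two styles can be sketched without go-memdb. This is an illustration only: the `Indexer` interface, `fieldIndexer`, `funcIndexer`, and the `Service` type below are hypothetical stand-ins, not the types from [hashicorp/go-memdb] or [indexer.go], and the null-terminated key layout is an assumption made for the example.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Service is a hypothetical record stored in a table.
type Service struct {
	Node string
	Name string
}

// Indexer turns a stored object into the bytes used as its index key.
type Indexer interface {
	FromObject(obj interface{}) ([]byte, error)
}

// fieldIndexer is the generic style: it looks up a single string field
// by name using reflection, so it works for any struct type.
type fieldIndexer struct {
	Field string
}

func (i fieldIndexer) FromObject(obj interface{}) ([]byte, error) {
	v := reflect.Indirect(reflect.ValueOf(obj))
	if v.Kind() != reflect.Struct {
		return nil, fmt.Errorf("expected a struct, got %s", v.Kind())
	}
	f := v.FieldByName(i.Field)
	if !f.IsValid() || f.Kind() != reflect.String {
		return nil, fmt.Errorf("field %q is not a string field", i.Field)
	}
	return []byte(f.String() + "\x00"), nil
}

// funcIndexer is the function style: a plain function adapted to Indexer.
type funcIndexer func(obj interface{}) ([]byte, error)

func (f funcIndexer) FromObject(obj interface{}) ([]byte, error) {
	return f(obj)
}

// nodeServiceIndex derives one key from two fields, which the
// single-field reflective indexer above cannot express.
func nodeServiceIndex(obj interface{}) ([]byte, error) {
	s, ok := obj.(*Service)
	if !ok {
		return nil, fmt.Errorf("unexpected type %T", obj)
	}
	key := strings.ToLower(s.Node) + "\x00" + strings.ToLower(s.Name) + "\x00"
	return []byte(key), nil
}

func main() {
	svc := &Service{Node: "Node1", Name: "web"}
	indexers := []Indexer{
		fieldIndexer{Field: "Name"},
		funcIndexer(nodeServiceIndex),
	}
	for _, idx := range indexers {
		key, err := idx.FromObject(svc)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q\n", key)
	}
}
```

The function style avoids reflection and makes the key derivation explicit, at the cost of one function per index.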
||||
|
||||
|
||||
## Snapshot and Restore |
||||
|
||||
Snapshots are the primary mechanism used to back up the data stored by cluster persistence. |
||||
If all Consul servers fail, a snapshot can be used to restore the cluster back |
||||
to its previous state. |
||||
|
||||
Note that there are two different snapshot and restore concepts that exist at different |
||||
layers. First, there are the `Snapshot` and `Restore` methods on the raft [FSM] interface, |
||||
which Consul must implement. These methods are implemented mostly as a passthrough to the |
||||
state store. These methods may be called internally by raft to perform log compaction |
||||
(snapshot) or to bootstrap a new follower (restore). Consul implements snapshot and |
||||
restore using the `Snapshot` and `Restore` types in [agent/consul/state]. |
||||
|
||||
Snapshot and restore also exist as actions that a user may perform. There are [CLI] |
||||
commands, [HTTP API] endpoints, and [RPC] endpoints that allow a user to capture an |
||||
archive which contains a snapshot of the state, and restore that state to a running |
||||
cluster. The [consul/snapshot] package provides some of the logic for creating and reading |
||||
the snapshot archives for users. See [commands/snapshot] for a reference to these |
||||
user-facing operations. |
||||
|
||||
[CLI]: ../cli |
||||
[HTTP API]: ../http-api |
||||
[commands/snapshot]: https://www.consul.io/commands/snapshot |
||||
[consul/snapshot]: https://github.com/hashicorp/consul/tree/main/snapshot |
||||
|
||||
Finally, there is also a [snapshot agent] (enterprise only) that uses the snapshot API |
||||
endpoints to periodically capture a snapshot, and optionally send it somewhere for |
||||
storage. |
||||
|
||||
[snapshot agent]: https://www.consul.io/commands/snapshot/agent |
||||
|
||||
## Raft Autopilot |
||||
|
||||
[hashicorp/raft-autopilot] is used by Consul to automate some parts of the upgrade process. |
||||
|
||||
|
||||
[hashicorp/raft-autopilot]: https://github.com/hashicorp/raft-autopilot |
||||
|
@ -0,0 +1,34 @@
|
||||
graph TB |
||||
|
||||
requestLeader[request] --> RPCLeader |
||||
requestFollower[request] --> RPCFollower |
||||
|
||||
class requestLeader,requestFollower req; |
||||
classDef req fill:transparent,color:#000,stroke-width:1; |
||||
|
||||
subgraph Leader |
||||
RPCLeader[RPC] |
||||
RaftLeader[Raft] |
||||
StateStoreLeader[State Store] |
||||
FSMLeader[FSM] |
||||
end |
||||
|
||||
RPCLeader -->|write| RaftLeader |
||||
RPCLeader -->|read| StateStoreLeader |
||||
RaftLeader ---> FSMLeader |
||||
FSMLeader --> StateStoreLeader |
||||
|
||||
subgraph Follower |
||||
RPCFollower[RPC] |
||||
RaftFollower[Raft] |
||||
StateStoreFollower[State Store] |
||||
FSMFollower[FSM] |
||||
end |
||||
|
||||
RaftLeader <-.->|consensus and replication| RaftFollower |
||||
|
||||
RPCFollower -->|forward write to leader| RPCLeader |
||||
RPCFollower -->|read| StateStoreFollower |
||||
RaftFollower --> FSMFollower |
||||
FSMFollower --> StateStoreFollower |
||||
|