Nomad
Autopilot Operator HTTP API
The /operator/autopilot
endpoints allow for automatic operator-friendly
management of Nomad servers including cleanup of dead servers, monitoring the
state of the Raft cluster, and stable server introduction.
Read Autopilot Configuration
This endpoint retrieves its latest Autopilot configuration.
Method | Path | Produces |
---|---|---|
GET | /v1/operator/autopilot/configuration | application/json |
The table below shows this endpoint's support for blocking queries and required ACLs.
Blocking Queries | ACL Required |
---|---|
NO | operator:read |
Sample Request
$ nomad operator api /v1/operator/autopilot/configuration
Sample Response
{
"LastContactThreshold": "200ms",
"ServerStabilizationTime": "10s",
"CleanupDeadServers": true,
"MaxTrailingLogs": 250,
"MinQuorum": 2,
"EnableRedundancyZones": false,
"DisableUpgradeMigration": false,
"EnableCustomUpgrades": false,
"CreateIndex": 68,
"ModifyIndex": 68
}
For more information about the Autopilot configuration options, see the agent configuration section.
Update Autopilot Configuration
This endpoint updates the Autopilot configuration of the cluster.
Method | Path | Produces |
---|---|---|
PUT | /v1/operator/autopilot/configuration | application/json |
The table below shows this endpoint's support for blocking queries and required ACLs.
Blocking Queries | ACL Required |
---|---|
NO | operator:write |
Parameters
cas
(int: 0)
- Specifies to use a Check-And-Set operation. The update will only happen if the given index matches theModifyIndex
of the configuration at the time of writing.
Sample Payload
{
"LastContactThreshold": "200ms",
"ServerStabilizationTime": "10s",
"CleanupDeadServers": true,
"MaxTrailingLogs": 250,
"MinQuorum": 2,
"EnableRedundancyZones": false,
"DisableUpgradeMigration": false,
"EnableCustomUpgrades": false
}
CleanupDeadServers
(bool: true)
- Specifies automatic removal of dead server nodes periodically and whenever a new server is added to the cluster.LastContactThreshold
(string: "200ms")
- Specifies the maximum amount of time a server can go without contact from the leader before being considered unhealthy. Must be a duration value such as10s
.MaxTrailingLogs
(int: 250)
specifies the maximum number of log entries that a server can trail the leader by before being considered unhealthy.MinQuorum
(int: 0)
- Sets the minimum number of servers necessary in a cluster. Autopilot will stop pruning dead servers when this minimum is reached. There is no default.ServerStabilizationTime
(string: "10s")
- Specifies the minimum amount of time a server must be stable in the 'healthy' state before being added to the cluster. Only takes effect if all servers are running Raft protocol version 3 or higher. Must be a duration value such as30s
.EnableRedundancyZones
(bool: false)
- (Enterprise-only) Specifies whether to enable redundancy zones.DisableUpgradeMigration
(bool: false)
- (Enterprise-only) Disables Autopilot's upgrade migration strategy in Nomad Enterprise of waiting until enough newer-versioned servers have been added to the cluster before promoting any of them to voters.EnableCustomUpgrades
(bool: false)
- (Enterprise-only) Specifies whether to enable using custom upgrade versions when performing migrations.
Read Health
This endpoint queries the health of the autopilot status.
Method | Path | Produces |
---|---|---|
GET | /v1/operator/autopilot/health | application/json |
The table below shows this endpoint's support for blocking queries and required ACLs.
Blocking Queries | ACL Required |
---|---|
NO | operator:read |
Sample Request
$ nomad operator api /v1/operator/autopilot/health
Sample response
{
"FailureTolerance": 1,
"Healthy": true,
"Leader": "021e5b05-7100-6982-19a3-bb4bd3765e1c",
"Servers": [
{
"LastContact": "28.900799ms",
"ID": "01dc237a-638b-0532-c548-f29f2269feaf",
"Name": "server2.global",
"Address": "192.168.50.139:4657",
"SerfStatus": "alive",
"Version": "1.9.3",
"Leader": false,
"LastTerm": 2,
"LastIndex": 7,
"Healthy": true,
"Voter": true,
"StableSince": "2024-11-22T00:54:11Z"
},
{
"LastContact": "4.824885ms",
"ID": "eae21027-b4ed-ea94-73c3-d25b7fd26c13",
"Name": "server3.global",
"Address": "192.168.50.139:4667",
"SerfStatus": "alive",
"Version": "1.9.3",
"Leader": false,
"LastTerm": 2,
"LastIndex": 7,
"Healthy": true,
"Voter": true,
"StableSince": "2024-11-22T00:54:11Z"
},
{
"LastContact": "0s",
"ID": "021e5b05-7100-6982-19a3-bb4bd3765e1c",
"Name": "server1.global",
"Address": "192.168.50.139:4647",
"SerfStatus": "alive",
"Version": "1.9.3",
"Leader": true,
"LastTerm": 2,
"LastIndex": 7,
"Healthy": true,
"Voter": true,
"StableSince": "2024-11-22T00:54:11Z"
}
],
"Voters": [
"021e5b05-7100-6982-19a3-bb4bd3765e1c",
"01dc237a-638b-0532-c548-f29f2269feaf",
"eae21027-b4ed-ea94-73c3-d25b7fd26c13"
]
}
Healthy
is whether all the servers are currently healthy.FailureTolerance
is the number of redundant healthy servers that could be fail without causing an outage (this would be 2 in a healthy cluster of 5 servers).Servers
holds detailed health information on each server:ID
is the Raft ID of the server.Name
is the node name of the server.Address
is the address of the server.SerfStatus
is the SerfHealth check status for the server.Version
is the Nomad version of the server.Leader
is whether this server is currently the leader.LastContact
is the time elapsed since this server's last contact with the leader.LastTerm
is the server's last known Raft leader term.LastIndex
is the index of the server's last committed Raft log entry.Healthy
is whether the server is healthy according to the current Autopilot configuration.Voter
is whether the server is a voting member of the Raft cluster.StableSince
is the time this server has been in its currentHealthy
state.
The HTTP status code will indicate the health of the cluster. If Healthy
is true, then a
status of 200 will be returned. If Healthy
is false, then a status of 429 will be returned.
Enterprise
This API endpoint return with more information in Nomad Enterprise. This is not present in Nomad Community Edition.
When using Nomad Enterprise this endpoint will return more information about the underlying actions and state of Autopilot.
{
...
"ReadReplicas": null,
"RedundancyZones": {},
"Upgrade": {
"Status": "idle",
"TargetVersion": "1.7.5+ent",
"TargetVersionVoters": [
"e349749b-3303-3ddf-959c-b5885a0e1f6e",
"e36ee410-cc3c-0a0c-c724-63817ab30303"
],
"TargetVersionNonVoters": null,
"TargetVersionReadReplicas": null,
"OtherVersionVoters": null,
"OtherVersionNonVoters": null,
"OtherVersionReadReplicas": null,
"RedundancyZones": {}
},
"OptimisticFailureTolerance": 0
}