# ANSIBLE
## How to proceed
For each machine, **one by one** do:
- Check that the cluster is healthy
  - Check Garage
    - Check that all nodes are online: `docker exec -ti xxx /garage status`
    - Check that tables are in sync: `docker exec -ti xxx /garage repair --yes tables`
    - Check the Garage logs
      - There must be no unknown errors and no resync in progress
      - The following line must appear: `INFO garage_util::background > Worker exited: Repair worker`
  - Check that Nomad is healthy
    - `nomad server members`
    - `nomad node status`
  - Check that Consul is healthy
    - `consul members`
  - Check that Postgres is healthy
- Run `ansible-playbook -i production.yml --limit <machine> -u <username> site.yml`
- Run `nomad node drain -enable -force -self`
- Reboot
- Run `nomad node drain -self -disable`
- Check that the cluster is healthy again (basically the whole first point)
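
Since the health checks are run twice per machine, they could be gathered into a small shell helper. The following is only a sketch under assumptions: the function names `run` and `health_checks`, the `DRY_RUN` variable, and passing the Garage container ID as an argument are all hypothetical and not part of the existing tooling. By default it only prints the commands. Inspecting the Garage logs and checking Postgres stay manual, because no command is given for them above.

```shell
#!/bin/sh
# Hypothetical helper: prints (default) or runs the cluster health checks.
set -eu

# DRY_RUN=1 (assumed default) only prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# $1: ID or name of the Garage container on this machine (an assumption;
# the steps above use a literal container ID).
health_checks() {
    container="$1"
    # Note: "docker exec -ti" expects a TTY, so run this from an
    # interactive shell when DRY_RUN=0.
    run docker exec -ti "$container" /garage status
    run docker exec -ti "$container" /garage repair --yes tables
    run nomad server members
    run nomad node status
    run consul members
}
```

After sourcing the file, `health_checks xxx` prints the five commands for review, and `DRY_RUN=0 health_checks xxx` would execute them on the machine being checked.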