author | Alex <alex@adnab.me> | 2024-03-15 13:17:53 +0000 |
---|---|---|
committer | Alex <alex@adnab.me> | 2024-03-15 13:17:53 +0000 |
commit | fd2e19bf1bf301bc03aa29ffa3fe1e71008cbe50 (patch) | |
tree | c92172dee172941c3daf32a08927f8ebab0ded9e /doc/book/operations/recovering.md | |
parent | a80ce6ab5ad9834c3721eeb4f626d53c9a8bb1f4 (diff) | |
parent | 8cf3d24875d41d79ab08d637cd38d2a5b9e527dd (diff) | |
Merge pull request 'metadata db snapshotting' (#775) from db-snapshot into main
Reviewed-on: https://git.deuxfleurs.fr/Deuxfleurs/garage/pulls/775
Diffstat (limited to 'doc/book/operations/recovering.md')
-rw-r--r-- | doc/book/operations/recovering.md | 54 |
1 files changed, 54 insertions, 0 deletions
diff --git a/doc/book/operations/recovering.md b/doc/book/operations/recovering.md
index 7a830788..6e19db0e 100644
--- a/doc/book/operations/recovering.md
+++ b/doc/book/operations/recovering.md
@@ -108,3 +108,57 @@ garage layout apply # once satisfied, apply the changes
 
 Garage will then start synchronizing all required data on the new node.
 This process can be monitored using the `garage stats -a` command.
+
+## Replacement scenario 3: corrupted metadata {#corrupted_meta}
+
+In some cases, your metadata DB file might become corrupted, for instance if
+your node suffered a power outage and did not shut down properly. In this case,
+you can recover without having to change the node ID and rebuilding a cluster
+layout. This means that data blocks will not need to be shuffled around, you
+must simply find a way to repair the metadata file. The best way is generally
+to discard the corrupted file and recover it from another source.
+
+First of all, start by locating the database file in your metadata directory,
+which [depends on your `db_engine`
+choice](@/documentation/reference-manual/configuration.md#db_engine). Then,
+your recovery options are as follows:
+
+- **Option 1: resyncing from other nodes.** In case your cluster is replicated
+  with two or three copies, you can simply delete the database file, and Garage
+  will resync from other nodes. To do so, stop Garage, delete the database file
+  or directory, and restart Garage. Then, do a full table repair by calling
+  `garage repair -a --yes tables`. This will take a bit of time to complete as
+  the new node will need to receive copies of the metadata tables from the
+  network.
+
+- **Option 2: restoring a snapshot taken by Garage.** Since v0.9.4, Garage can
+  [automatically take regular
+  snapshots](@/documentation/reference-manual/configuration.md#metadata_auto_snapshot_interval)
+  of your metadata DB file. This file or directory should be located under
+  `<metadata_dir>/snapshots`, and is named according to the UTC time at which it
+  was taken. Stop Garage, discard the database file/directory and replace it by the
+  snapshot you want to use. For instance, in the case of LMDB:
+
+  ```bash
+  cd $METADATA_DIR
+  mv db.lmdb db.lmdb.bak
+  cp -r snapshots/2024-03-15T12:13:52Z db.lmdb
+  ```
+
+  And for Sqlite:
+
+  ```bash
+  cd $METADATA_DIR
+  mv db.sqlite db.sqlite.bak
+  cp snapshots/2024-03-15T12:13:52Z db.sqlite
+  ```
+
+  Then, restart Garage and run a full table repair by calling `garage repair -a
+  --yes tables`. This should run relatively fast as only the changes that
+  occurred since the snapshot was taken will need to be resynchronized. Of
+  course, if your cluster is not replicated, you will lose all changes that
+  occurred since the snapshot was taken.
+
+- **Option 3: restoring a filesystem-level snapshot.** If you are using ZFS or
+  BTRFS to snapshot your metadata partition, refer to their specific
+  documentation on rolling back or copying files from an old snapshot.
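
Reviewer note: the Option 2 steps in this change could be scripted roughly as follows for the LMDB case. This is only a sketch under stated assumptions, not part of the commit. The helper names `latest_snapshot` and `restore_lmdb_snapshot` are hypothetical, and the script assumes every snapshot name uses the same fixed-width UTC timestamp format as the example (`2024-03-15T12:13:52Z`), so that sorting names as plain strings also sorts them chronologically. Garage must already be stopped before running it.

```bash
#!/bin/sh
# Hypothetical helper: print the name of the newest entry in a snapshots
# directory. Assumes fixed-width UTC timestamp names, which sort
# chronologically when compared as plain strings.
latest_snapshot() {
    ls -1 "$1" | LC_ALL=C sort | tail -n 1
}

# Hypothetical helper: set the current LMDB database aside and copy the
# newest snapshot into its place. $1 is the metadata directory.
restore_lmdb_snapshot() {
    meta_dir=$1
    snap=$(latest_snapshot "$meta_dir/snapshots")
    if [ -z "$snap" ]; then
        echo "no snapshot found in $meta_dir/snapshots" >&2
        return 1
    fi
    # Refuse to overwrite an earlier backup of the database.
    if [ -e "$meta_dir/db.lmdb.bak" ]; then
        echo "$meta_dir/db.lmdb.bak already exists" >&2
        return 1
    fi
    mv "$meta_dir/db.lmdb" "$meta_dir/db.lmdb.bak" &&
        cp -r "$meta_dir/snapshots/$snap" "$meta_dir/db.lmdb"
}
```

After a call such as `restore_lmdb_snapshot "$METADATA_DIR"`, the documented procedure continues unchanged: restart Garage and run `garage repair -a --yes tables`.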