aboutsummaryrefslogtreecommitdiff
path: root/doc/book/cookbook/real-world.md
diff options
context:
space:
mode:
authorAlex Auvolat <alex@adnab.me>2024-03-14 16:29:42 +0100
committerAlex Auvolat <alex@adnab.me>2024-03-14 16:31:22 +0100
commita7dddebedd59cfb00955b8648aa0b022581472ba (patch)
tree149b43d7aaa02552f8b3e4fa33019f5815b2a85a /doc/book/cookbook/real-world.md
parent238545e56486b857fab41e0703fc9bccbcd50a2c (diff)
downloadgarage-a7dddebedd59cfb00955b8648aa0b022581472ba.tar.gz
garage-a7dddebedd59cfb00955b8648aa0b022581472ba.zip
[doc-updates] doc: be slightly more critical of LMDB
Diffstat (limited to 'doc/book/cookbook/real-world.md')
-rw-r--r--doc/book/cookbook/real-world.md47
1 files changed, 28 insertions, 19 deletions
diff --git a/doc/book/cookbook/real-world.md b/doc/book/cookbook/real-world.md
index c15ea384..15a58b9b 100644
--- a/doc/book/cookbook/real-world.md
+++ b/doc/book/cookbook/real-world.md
@@ -27,7 +27,7 @@ To run a real-world deployment, make sure the following conditions are met:
[Yggdrasil](https://yggdrasil-network.github.io/) are approaches to consider
in addition to building out your own VPN tunneling.
-- This guide will assume you are using Docker containers to deploy Garage on each node.
+- This guide will assume you are using Docker containers to deploy Garage on each node.
Garage can also be run independently, for instance as a [Systemd service](@/documentation/cookbook/systemd.md).
You can also use an orchestrator such as Nomad or Kubernetes to automatically manage
Docker containers on a fleet of nodes.
@@ -53,9 +53,9 @@ to store 2 TB of data in total.
### Best practices
-- If you have fast dedicated networking between all your nodes, and are planing to store
- very large files, bump the `block_size` configuration parameter to 10 MB
- (`block_size = 10485760`).
+- If you have reasonably fast networking between all your nodes, and are planing to store
+ mostly large files, bump the `block_size` configuration parameter to 10 MB
+ (`block_size = "10M"`).
- Garage stores its files in two locations: it uses a metadata directory to store frequently-accessed
small metadata items, and a data directory to store data blocks of uploaded objects.
@@ -68,20 +68,29 @@ to store 2 TB of data in total.
EXT4 is not recommended as it has more strict limitations on the number of inodes,
which might cause issues with Garage when large numbers of objects are stored.
-- If you only have an HDD and no SSD, it's fine to put your metadata alongside the data
- on the same drive. Having lots of RAM for your kernel to cache the metadata will
- help a lot with performance. Make sure to use the LMDB database engine,
- instead of Sled, which suffers from quite bad performance degradation on HDDs.
- Sled is still the default for legacy reasons, but is not recommended anymore.
-
-- For the metadata storage, Garage does not do checksumming and integrity
- verification on its own. If you are afraid of bitrot/data corruption,
- put your metadata directory on a ZFS or BTRFS partition. Otherwise, just use regular
- EXT4 or XFS.
-
- Servers with multiple HDDs are supported natively by Garage without resorting
to RAID, see [our dedicated documentation page](@/documentation/operations/multi-hdd.md).
+- For the metadata storage, Garage does not do checksumming and integrity
+ verification on its own. Users have reported that when using the LMDB
+ database engine (the default), database files have a tendency of becoming
+ corrupted after an unclean shutdown (e.g. a power outage), so you should use
+ a robust filesystem such as BTRFS or ZFS for the metadata partition, and take
+ regular snapshots so that you can restore to a recent known-good state in
+ case of an incident. If you cannot do so, you might want to switch to Sqlite
+ which is more robust.
+
+- LMDB is the fastest and most tested database engine, but it has the following
+ weaknesses: 1/ data files are not architecture-independent, you cannot simply
+ move a Garage metadata directory between nodes running different architectures,
+ and 2/ LMDB is not suited for 32-bit platforms. Sqlite is a viable alternative
+ if any of these are of concern.
+
+- If you only have an HDD and no SSD, it's fine to put your metadata alongside
+ the data on the same drive, but then consider your filesystem choice wisely
+ (see above). Having lots of RAM for your kernel to cache the metadata will
+ help a lot with performance.
+
## Get a Docker image
Our docker image is currently named `dxflrs/garage` and is stored on the [Docker Hub](https://hub.docker.com/r/dxflrs/garage/tags?page=1&ordering=last_updated).
@@ -187,7 +196,7 @@ upgrades. With the containerized setup proposed here, the upgrade process
will require stopping and removing the existing container, and re-creating it
with the upgraded version.
-## Controling the daemon
+## Controlling the daemon
The `garage` binary has two purposes:
- it acts as a daemon when launched with `garage server`
@@ -245,7 +254,7 @@ You can then instruct nodes to connect to one another as follows:
Venus$ garage node connect 563e1ac825ee3323aa441e72c26d1030d6d4414aeb3dd25287c531e7fc2bc95d@[fc00:1::1]:3901
```
-You don't nead to instruct all node to connect to all other nodes:
+You don't need to instruct all node to connect to all other nodes:
nodes will discover one another transitively.
Now if your run `garage status` on any node, you should have an output that looks as follows:
@@ -328,8 +337,8 @@ Given the information above, we will configure our cluster as follow:
```bash
garage layout assign 563e -z par1 -c 1T -t mercury
garage layout assign 86f0 -z par1 -c 2T -t venus
-garage layout assign 6814 -z lon1 -c 2T -t earth
-garage layout assign 212f -z bru1 -c 1.5T -t mars
+garage layout assign 6814 -z lon1 -c 2T -t earth
+garage layout assign 212f -z bru1 -c 1.5T -t mars
```
At this point, the changes in the cluster layout have not yet been applied.