From ebd21b325e7c30b58d6b3ab621f08cd1bffb0c6d Mon Sep 17 00:00:00 2001 From: Alex Auvolat Date: Fri, 28 May 2021 18:00:59 +0200 Subject: Write documentation on configuration file and other improvements --- doc/book/src/SUMMARY.md | 16 +- doc/book/src/cookbook/website.md | 2 + doc/book/src/getting_started/01_binary.md | 44 ++++ doc/book/src/getting_started/02_test_deployment.md | 107 ++++++++++ .../getting_started/03_real_world_deployment.md | 154 ++++++++++++++ doc/book/src/getting_started/04_control.md | 75 +++++++ doc/book/src/getting_started/05_cluster.md | 82 ++++++++ doc/book/src/getting_started/06_bucket.md | 74 +++++++ doc/book/src/getting_started/07_files.md | 45 +++++ doc/book/src/getting_started/binary.md | 44 ---- doc/book/src/getting_started/bucket.md | 74 ------- doc/book/src/getting_started/cluster.md | 73 ------- doc/book/src/getting_started/control.md | 77 ------- doc/book/src/getting_started/daemon.md | 222 --------------------- doc/book/src/getting_started/files.md | 42 ---- doc/book/src/reference_manual/cli.md | 4 + doc/book/src/reference_manual/configuration.md | 196 ++++++++++++++++++ doc/book/src/reference_manual/s3_compatibility.md | 6 +- doc/book/src/working_documents/load_balancing.md | 12 +- 19 files changed, 801 insertions(+), 548 deletions(-) create mode 100644 doc/book/src/getting_started/01_binary.md create mode 100644 doc/book/src/getting_started/02_test_deployment.md create mode 100644 doc/book/src/getting_started/03_real_world_deployment.md create mode 100644 doc/book/src/getting_started/04_control.md create mode 100644 doc/book/src/getting_started/05_cluster.md create mode 100644 doc/book/src/getting_started/06_bucket.md create mode 100644 doc/book/src/getting_started/07_files.md delete mode 100644 doc/book/src/getting_started/binary.md delete mode 100644 doc/book/src/getting_started/bucket.md delete mode 100644 doc/book/src/getting_started/cluster.md delete mode 100644 doc/book/src/getting_started/control.md delete mode 100644 
doc/book/src/getting_started/daemon.md delete mode 100644 doc/book/src/getting_started/files.md create mode 100644 doc/book/src/reference_manual/cli.md create mode 100644 doc/book/src/reference_manual/configuration.md (limited to 'doc/book') diff --git a/doc/book/src/SUMMARY.md b/doc/book/src/SUMMARY.md index 18fad2cd..b88ebb4c 100644 --- a/doc/book/src/SUMMARY.md +++ b/doc/book/src/SUMMARY.md @@ -3,12 +3,13 @@ [The Garage Data Store](./intro.md) - [Getting Started](./getting_started/index.md) - - [Get a binary](./getting_started/binary.md) - - [Configure the daemon](./getting_started/daemon.md) - - [Control the daemon](./getting_started/control.md) - - [Configure a cluster](./getting_started/cluster.md) - - [Create buckets and keys](./getting_started/bucket.md) - - [Handle files](./getting_started/files.md) + - [Get a binary](./getting_started/01_binary.md) + - [Configuring a test deployment](./getting_started/02_test_deployment.md) + - [Configure a real-world deployment](./getting_started/03_real_world_deployment.md) + - [Control the daemon](./getting_started/04_control.md) + - [Configure a cluster](./getting_started/05_cluster.md) + - [Create buckets and keys](./getting_started/06_bucket.md) + - [Handle files](./getting_started/07_files.md) - [Cookbook](./cookbook/index.md) - [Host a website](./cookbook/website.md) @@ -17,7 +18,8 @@ - [Recovering from failures](./cookbook/recovering.md) - [Reference Manual](./reference_manual/index.md) - - [Garage CLI]() + - [Garage configuration file](./reference_manual/configuration.md) + - [Garage CLI](./reference_manual/cli.md) - [S3 API](./reference_manual/s3_compatibility.md) - [Design](./design/index.md) diff --git a/doc/book/src/cookbook/website.md b/doc/book/src/cookbook/website.md index 2ea82a9a..b3dd1b51 100644 --- a/doc/book/src/cookbook/website.md +++ b/doc/book/src/cookbook/website.md @@ -1 +1,3 @@ # Host a website + +TODO diff --git a/doc/book/src/getting_started/01_binary.md 
b/doc/book/src/getting_started/01_binary.md new file mode 100644 index 00000000..2719d959 --- /dev/null +++ b/doc/book/src/getting_started/01_binary.md @@ -0,0 +1,44 @@ +# Get a binary + +Currently, only two installation procedures are supported for Garage: from Docker (x86\_64 for Linux) and from source. +In the future, we plan to add a third one, by publishing a compiled binary (x86\_64 for Linux). +We have not tested other architectures or operating systems but, as long as yours is supported by Rust, you should be able to run Garage (feel free to report your tests!). + +## From Docker + +Our docker image is currently named `lxpz/garage_amd64` and is stored on the [Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated). +We encourage you to use a fixed tag (e.g. `v0.3.0`) and not the `latest` tag. +For this example, we will use the latest published version at the time of writing, which is `v0.3.0`, but it's up to you +to check [the most recent versions on the Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated). + +For example: + +``` +sudo docker pull lxpz/garage_amd64:v0.3.0 +``` + +## From source + +Garage is a standard Rust project. +First, you need `rust` and `cargo`. +On Debian: + +```bash +sudo apt-get update +sudo apt-get install -y rustc cargo +``` + +Then, you can ask cargo to install the binary for you: + +```bash +cargo install garage +``` + +That's all, `garage` should be in `$HOME/.cargo/bin`. +You can add this folder to your `$PATH` or copy the binary somewhere else on your system. 
+In the rest of this guide, we will assume you copied it to `/usr/local/bin/garage`: + +```bash +sudo cp $HOME/.cargo/bin/garage /usr/local/bin/garage +``` + diff --git a/doc/book/src/getting_started/02_test_deployment.md b/doc/book/src/getting_started/02_test_deployment.md new file mode 100644 index 00000000..16f40dce --- /dev/null +++ b/doc/book/src/getting_started/02_test_deployment.md @@ -0,0 +1,107 @@ +# Configuring a test deployment + +This section describes how to run a simple test Garage deployment with a single node. +Note that this kind of deployment should not be used in production, as it provides +no redundancy for your data! +We will also skip intra-cluster TLS configuration, meaning that if you add nodes +to your cluster, communication between them will not be secure. + +First, make sure that you have Garage installed in your command line environment. +We will explain how to launch Garage in a Docker container; however, we still +recommend that you install the `garage` CLI on your host system in order to control +the daemon. + +## Writing a first configuration file + +This first configuration file should allow you to get started easily with the simplest +possible Garage deployment: + +```toml +metadata_dir = "/tmp/meta" +data_dir = "/tmp/data" + +replication_mode = "none" + +rpc_bind_addr = "[::]:3901" + +bootstrap_peers = [] + +[s3_api] +s3_region = "garage" +api_bind_addr = "[::]:3900" + +[s3_web] +bind_addr = "[::]:3902" +root_domain = ".web.garage" +index = "index.html" +``` + +Save your configuration file as `garage.toml`. + +As you can see in the `metadata_dir` and `data_dir` parameters, we are saving Garage's data +in `/tmp`, which gets erased when your system reboots. This means that data stored on this +Garage server will not be persistent. Change these to locations on your HDD if you want +your data to be persisted properly.
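The switch to persistent directories described above can be sketched as follows, for the direct (non-Docker) launch. This is only an illustration: `GARAGE_BASE` is a hypothetical variable introduced here, and its default of `$HOME/garage` is an assumption, not a path that Garage requires. The rest of the file repeats the test configuration unchanged.

```shell
# Assumption: GARAGE_BASE points at a directory on a persistent disk.
# The variable name and its default are illustrative only.
GARAGE_BASE="${GARAGE_BASE:-$HOME/garage}"

# Create the two directories Garage will write to.
mkdir -p "$GARAGE_BASE/meta" "$GARAGE_BASE/data"

# Write the single-node test configuration; only the two paths differ
# from the /tmp-based example.
cat > garage.toml <<EOF
metadata_dir = "$GARAGE_BASE/meta"
data_dir = "$GARAGE_BASE/data"

replication_mode = "none"

rpc_bind_addr = "[::]:3901"

bootstrap_peers = []

[s3_api]
s3_region = "garage"
api_bind_addr = "[::]:3900"

[s3_web]
bind_addr = "[::]:3902"
root_domain = ".web.garage"
index = "index.html"
EOF
```

When running Garage in a container, the chosen directories would additionally have to be mounted into it.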
+ +## Launching the Garage server + +#### Option 1: directly (without Docker) + +Use the following command to launch the Garage server with our configuration file: + +``` +garage server -c garage.toml +``` + +By default, Garage displays almost no output. You can tune Garage's verbosity as follows +(from less verbose to more verbose): + +``` +RUST_LOG=garage=info garage server -c garage.toml +RUST_LOG=garage=debug garage server -c garage.toml +RUST_LOG=garage=trace garage server -c garage.toml +``` + +Log level `info` is recommended for most use cases. +Log level `debug` can help you check why your S3 API calls are not working. + +#### Option 2: in a Docker container + +Use the following command to start Garage in a docker container: + +``` +docker run -d \ + -p 3901:3901 -p 3902:3902 -p 3900:3900 \ + -v $(pwd)/garage.toml:/garage/config.toml \ + lxpz/garage_amd64:v0.3.0 +``` + +To tune Garage's verbosity level, set the `RUST_LOG` environment variable in the container +at launch time. For instance: + +``` +docker run -d \ + -p 3901:3901 -p 3902:3902 -p 3900:3900 \ + -v $(pwd)/garage.toml:/garage/config.toml \ + -e RUST_LOG=garage=info \ + lxpz/garage_amd64:v0.3.0 +``` + +## Checking that Garage runs correctly + +The `garage` utility is also used as a CLI tool to configure your Garage deployment. +It tries to connect to a Garage server through the RPC protocol, by default looking +for a Garage server at `localhost:3901`. + +Since our deployment already binds to port 3901, the following command should be sufficient +to show Garage's status, provided that you installed the `garage` binary on your host system: + +``` +garage status +``` + +Move on to [controlling the Garage daemon](04_control.md) to learn more about how to +use the Garage CLI to control your cluster. + +Move on to [configuring your cluster](05_cluster.md) in order to configure +your single-node deployment for actual use!
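If `garage status` cannot reach the daemon, one quick check is whether anything accepts TCP connections on the RPC port at all. The sketch below assumes `bash` (for its `/dev/tcp` redirection) and the coreutils `timeout` command; `port_open` is a hypothetical helper defined here, not part of Garage.

```shell
# port_open HOST PORT: succeeds if a TCP connection to HOST:PORT
# can be opened within 2 seconds. Hypothetical helper, not a Garage command.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

if port_open localhost 3901; then
  echo "something is listening on localhost:3901"
else
  echo "nothing is listening on localhost:3901"
fi
```

A successful connection only shows that some process is listening on that port, not that it is Garage; `garage status` remains the real check.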
diff --git a/doc/book/src/getting_started/03_real_world_deployment.md b/doc/book/src/getting_started/03_real_world_deployment.md new file mode 100644 index 00000000..81b929c1 --- /dev/null +++ b/doc/book/src/getting_started/03_real_world_deployment.md @@ -0,0 +1,154 @@ +# Configuring a real-world Garage deployment + +To run Garage in cluster mode, we recommend having at least 3 nodes. +This will allow you to set up Garage for three-way replication of your data, +the safest and most available mode available. + +## Generating a TLS Certificate + +You first need to generate TLS certificates to encrypt traffic between Garage nodes +(referred to as RPC traffic). + +To generate your TLS certificates, run on your machine: + +``` +wget https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/branch/master/genkeys.sh +chmod +x genkeys.sh +./genkeys.sh +``` + +It will create a folder named `pki/` containing the keys that you will use for the cluster. + +## Real-world deployment + +To run a real-world deployment, make sure the following conditions are met: + +- You have at least three machines with sufficient storage space available + +- Each machine has a public IP address which is reachable by other machines. + Running behind a NAT is possible, but having several Garage nodes behind a single NAT + is slightly more involved as each will have to have a different RPC port number + (the local port number of a node must be the same as the port number exposed publicly + by the NAT). + +- Ideally, each machine should have an SSD available in addition to the HDD you are dedicating + to Garage. This will allow for faster access to metadata and has the potential + to drastically reduce Garage's response times. + +Before deploying Garage on your infrastructure, you must inventory your machines.
+For our example, we will suppose the following infrastructure with IPv6 connectivity: + +| Location | Name | IP Address | Disk Space | +|----------|---------|------------|------------| +| Paris | Mercury | fc00:1::1 | 1 To | +| Paris | Venus | fc00:1::2 | 2 To | +| London | Earth | fc00:B::1 | 2 To | +| Brussels | Mars | fc00:F::1 | 1.5 To | + + +On each machine, we will have a similar setup. +In particular, you must consider the following folders and files: + + - `/etc/garage/config.toml`: Garage daemon's configuration (see below) + - `/etc/garage/pki/`: Folder containing Garage certificates, must be generated on your computer and copied to the servers + - `/var/lib/garage/meta/`: Folder containing Garage's metadata, put this folder on an SSD if possible + - `/var/lib/garage/data/`: Folder containing Garage's data, this folder will grow and must be on large storage, possibly big HDDs. + - `/etc/systemd/system/garage.service`: Service file to start Garage at boot automatically (defined below, not required if you use docker) + +A valid `/etc/garage/config.toml` for our cluster would be: + +```toml +metadata_dir = "/var/lib/garage/meta" +data_dir = "/var/lib/garage/data" + +replication_mode = "3" + +rpc_bind_addr = "[::]:3901" + +bootstrap_peers = [ + "[fc00:1::1]:3901", + "[fc00:1::2]:3901", + "[fc00:B::1]:3901", + "[fc00:F::1]:3901", +] + +[rpc_tls] +ca_cert = "/etc/garage/pki/garage-ca.crt" +node_cert = "/etc/garage/pki/garage.crt" +node_key = "/etc/garage/pki/garage.key" + +[s3_api] +s3_region = "garage" +api_bind_addr = "[::]:3900" + +[s3_web] +bind_addr = "[::]:3902" +root_domain = ".web.garage" +index = "index.html" +``` + +Please make sure to change `bootstrap_peers` to **your** IP addresses! + +Check the [configuration file reference documentation](../reference_manual/configuration.md) +to learn more about all available configuration options.
+ +### For docker users + +On each machine, you can run the daemon with: + +```bash +docker run \ + -d \ + --name garaged \ + --restart always \ + --network host \ + -v /etc/garage/pki:/etc/garage/pki \ + -v /etc/garage/config.toml:/garage/config.toml \ + -v /var/lib/garage/meta:/var/lib/garage/meta \ + -v /var/lib/garage/data:/var/lib/garage/data \ + lxpz/garage_amd64:v0.3.0 +``` + +It should restart automatically at each reboot. +Please note that we use host networking as otherwise Docker containers +cannot communicate over IPv6. + +Upgrading between Garage versions should be supported transparently, +but please check the release notes before doing so! +To upgrade, simply stop and remove this container and +run the command again with a new version of Garage. + +### For systemd/raw binary users + +Create a file named `/etc/systemd/system/garage.service`: + +```toml +[Unit] +Description=Garage Data Store +After=network-online.target +Wants=network-online.target + +[Service] +Environment='RUST_LOG=garage=info' 'RUST_BACKTRACE=1' +ExecStart=/usr/local/bin/garage server -c /etc/garage/config.toml + +[Install] +WantedBy=multi-user.target +``` + +To start the service, then enable it automatically at boot: + +```bash +sudo systemctl start garage +sudo systemctl enable garage +``` + +To see if the service is running and to browse its logs: + +```bash +sudo systemctl status garage +sudo journalctl -u garage +``` + +If you want to modify the service file, do not forget to run `systemctl daemon-reload` +to inform `systemd` of your modifications.
diff --git a/doc/book/src/getting_started/04_control.md b/doc/book/src/getting_started/04_control.md new file mode 100644 index 00000000..018d3268 --- /dev/null +++ b/doc/book/src/getting_started/04_control.md @@ -0,0 +1,75 @@ +# Control the daemon + +The `garage` binary has two purposes: + - it acts as a daemon when launched with `garage server ...` + - it acts as a control tool for the daemon when launched with any other command + +In this section, we will see how to use the `garage` binary as a control tool for the daemon we just started. +You first need to get a shell having access to this binary; how to do so depends on your configuration: + + - with `docker`, run `sudo docker exec -ti garaged bash`, you will now have a shell + where the Garage binary is available as `/garage/garage` + - with `systemd`, simply run `/usr/local/bin/garage` if you followed previous instructions + +*You can also install the binary on your machine to remotely control the cluster.* + +## Talk to the daemon and create an alias + +`garage` requires 4 options to talk with the daemon: + +``` +--ca-cert +--client-cert +--client-key +-h, --rpc-host +``` + +The first three are certificates and keys needed by TLS, the last one is simply the address of Garage's RPC endpoint. +Because we configure Garage directly from the server, we do not need to set `--rpc-host`. +To avoid typing the first three options each time we want to run a command, we will create an alias. + +### test deployment + +If you have simply deployed Garage on your local machine, without TLS, you can invoke +`garage` directly without any of these parameters and without making a `garagectl` alias +(replace mentions of `garagectl` in the next sections by `garage`).
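As an alternative to replacing the name by hand, a trivial alias lets the `garagectl` commands of the next sections run unchanged on a test deployment. This is a minimal sketch, assuming an interactive shell and a `garage` binary on your `$PATH`:

```shell
# Test deployment without TLS: garagectl is just another name for garage.
alias garagectl='garage'
```

Aliases are normally expanded only in interactive shells, so inside a script you would call `garage` directly.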
+ + +### `docker` alias + +```bash +alias garagectl='/garage/garage \ + --ca-cert /etc/garage/pki/garage-ca.crt \ + --client-cert /etc/garage/pki/garage.crt \ + --client-key /etc/garage/pki/garage.key' +``` + +### raw binary alias + +```bash +alias garagectl='/usr/local/bin/garage \ + --ca-cert /etc/garage/pki/garage-ca.crt \ + --client-cert /etc/garage/pki/garage.crt \ + --client-key /etc/garage/pki/garage.key' +``` + +Of course, if your deployment does not exactly match one of these aliases, feel free to adapt them to your needs! + +## Test the alias + +You can test your alias by running a simple command such as: + +``` +garagectl status +``` + +You should get something like this as a result: + +``` +Healthy nodes: +2a638ed6c775b69a… 37f0ba978d27 [::ffff:172.20.0.101]:3901 UNCONFIGURED/REMOVED +68143d720f20c89d… 9795a2f7abb5 [::ffff:172.20.0.103]:3901 UNCONFIGURED/REMOVED +8781c50c410a41b3… 758338dde686 [::ffff:172.20.0.102]:3901 UNCONFIGURED/REMOVED +``` + +...which means that you are ready to [configure your cluster](05_cluster.md)! diff --git a/doc/book/src/getting_started/05_cluster.md b/doc/book/src/getting_started/05_cluster.md new file mode 100644 index 00000000..83beb662 --- /dev/null +++ b/doc/book/src/getting_started/05_cluster.md @@ -0,0 +1,82 @@ +# Configure a cluster + +*We use a command named `garagectl` which is in fact an alias you must define as explained in the [Control the daemon](./04_control.md) section.* + +In this section, we will inform Garage of the disk space available on each node of the cluster +as well as the site (think datacenter) of each machine.
+ +## Test cluster + +As this part is not relevant for a test cluster, you can use this three-liner to create a basic topology: + +```bash +garagectl status | grep UNCONFIGURED | grep -Po '^[0-9a-f]+' | while read id; do + garagectl node configure -z dc1 -c 1 $id +done +``` + +## Real-world cluster + +For our example, we will suppose we have the following infrastructure (Capacity, Identifier and Zone are values specific to Garage, described below): + +| Location | Name | Disk Space | `Capacity` | `Identifier` | `Zone` | +|----------|---------|------------|------------|--------------|--------------| +| Paris | Mercury | 1 To | `2` | `8781c5` | `par1` | +| Paris | Venus | 2 To | `4` | `2a638e` | `par1` | +| London | Earth | 2 To | `4` | `68143d` | `lon1` | +| Brussels | Mars | 1.5 To | `3` | `212f75` | `bru1` | + +### Identifier + +After its first launch, Garage generates a random and unique identifier for each node, such as: + +``` +8781c50c410a41b363167e9d49cc468b6b9e4449b6577b64f15a249a149bdcbc +``` + +Often a shorter form can be used, containing only the beginning of the identifier, like `8781c5`, +which identifies the server "Mercury" located in "Paris" according to our previous table. + +The simplest way to match an identifier to a node is to run: + +``` +garagectl status +``` + +It will display the IP address associated with each node; from the IP address you will be able to recognize the node. + +### Zones + +A zone is simply a user-chosen identifier for a group of servers that are grouped together logically. +It is up to the system administrator deploying Garage to decide what "grouped together" means. + +In most cases, a zone will correspond to a geographical location (i.e. a datacenter). +Behind the scenes, Garage uses zone definitions to try to store copies of the same data in different zones, +in order to provide high availability despite the failure of a zone.
+ +### Capacity + +Garage reasons about disk storage using an arbitrary metric called the *capacity* of a node. +The capacity configured in Garage must be proportional to the disk space dedicated to the node. +Additionally, the capacity values used in Garage should be as small as possible, with +1 ideally representing the size of your smallest server. + +Here we chose that 1 unit of capacity = 0.5 To, so that we can express servers of size +1 To and 2 To, as well as the intermediate size 1.5 To. + +Note that the amount of data stored by Garage on each server may not be strictly proportional to +its capacity value, as Garage will prioritize having 3 copies of data in different zones, +even if this means that capacities will not be strictly respected. For example, in our setup above, +nodes Earth and Mars will each always store a copy of everything, and the third copy will +have a 66% chance of being stored by Venus and a 33% chance of being stored by Mercury. + +### Inject the topology + +Given the information above, we will configure our cluster as follows: + +``` +garagectl node configure -z par1 -c 2 -t mercury 8781c5 +garagectl node configure -z par1 -c 4 -t venus 2a638e +garagectl node configure -z lon1 -c 4 -t earth 68143d +garagectl node configure -z bru1 -c 3 -t mars 212f75 +``` diff --git a/doc/book/src/getting_started/06_bucket.md b/doc/book/src/getting_started/06_bucket.md new file mode 100644 index 00000000..b4a2d81d --- /dev/null +++ b/doc/book/src/getting_started/06_bucket.md @@ -0,0 +1,74 @@ +# Create buckets and keys + +*We use a command named `garagectl` which is in fact an alias you must define as explained in the [Control the daemon](./04_control.md) section.* + +In this section, we will suppose that we want to create a bucket named `nextcloud-bucket` +that will be accessed through a key named `nextcloud-app-key`. + +Don't forget that `help` command and `--help` subcommands can help you anywhere, the CLI tool is self-documented!
Two examples: + +``` +garagectl help +garagectl bucket allow --help +``` + +## Create a bucket + +Fine, now let's create a bucket (we imagine that you want to deploy Nextcloud): + +``` +garagectl bucket create nextcloud-bucket +``` + +Check that everything went well: + +``` +garagectl bucket list +garagectl bucket info nextcloud-bucket +``` + +## Create an API key + +Now we will generate an API key to access this bucket. +Note that API keys are independent of buckets: one key can access multiple buckets, multiple keys can access one bucket. + +Now, let's start by creating a key only for our PHP application: + +``` +garagectl key new --name nextcloud-app-key +``` + +You will have the following output (this one is fake, `key_id` and `secret_key` were generated with the openssl CLI tool): + +``` +Key name: nextcloud-app-key +Key ID: GK3515373e4c851ebaad366558 +Secret key: 7d37d093435a41f2aab8f13c19ba067d9776c90215f56614adad6ece597dbb34 +Authorized buckets: +``` + +Check that everything works as intended: + +``` +garagectl key list +garagectl key info nextcloud-app-key +``` + +## Allow a key to access a bucket + +Now that we have a bucket and a key, we need to give permissions to the key on the bucket! + +``` +garagectl bucket allow \ + --read \ + --write \ + nextcloud-bucket \ + --key nextcloud-app-key +``` + +You can check at any time which keys are allowed on your bucket with: + +``` +garagectl bucket info nextcloud-bucket +``` + diff --git a/doc/book/src/getting_started/07_files.md b/doc/book/src/getting_started/07_files.md new file mode 100644 index 00000000..cdd5d945 --- /dev/null +++ b/doc/book/src/getting_started/07_files.md @@ -0,0 +1,45 @@ +# Handle files + +We recommend using the MinIO Client (`mc`) to interact with files stored in Garage. +Instructions to install it and use it are provided on the [MinIO website](https://docs.min.io/docs/minio-client-quickstart-guide.html). +Before reading the following, you need a working `mc` command on your path.
+ +Note that on certain Linux distributions such as Arch Linux, the MinIO client binary +is called `mcli` instead of `mc` (to avoid name clashes with the Midnight Commander). + +## Configure `mc` + +You need your access key and secret key created in the [previous section](06_bucket.md). +You also need to set the endpoint: it must match the IP address of one of the nodes of the cluster and the API port (3900 by default). +You must also choose an alias name for this configuration: we chose `my-garage`, which you will use in all commands. + +Adapt the following command accordingly and run it: + +```bash +mc alias set \ + my-garage \ + http://172.20.0.101:3900 \ + <access key> \ + <secret key> \ + --api S3v4 +``` + +You must also add an environment variable to your configuration to inform MinIO of our region (`garage` by default). +The best way is to add the following snippet to your `$HOME/.bash_profile` or `$HOME/.bashrc` file: + +```bash +export MC_REGION=garage +``` + +## Use `mc` + +You cannot list buckets from `mc` currently. + +But the following commands and many more should work: + +```bash +mc cp image.png my-garage/nextcloud-bucket +mc cp my-garage/nextcloud-bucket/image.png . +mc ls my-garage/nextcloud-bucket +mc mirror localdir/ my-garage/another-bucket +``` diff --git a/doc/book/src/getting_started/binary.md b/doc/book/src/getting_started/binary.md deleted file mode 100644 index e48500ac..00000000 --- a/doc/book/src/getting_started/binary.md +++ /dev/null @@ -1,44 +0,0 @@ -# Get a binary - -Currently, only two installations procedures are supported for Garage: from Docker (x86\_64 for Linux) and from source. -In the future, we plan to add a third one, by publishing a compiled binary (x86\_64 for Linux). -We did not test other architecture/operating system but, as long as your architecture/operating system is supported by Rust, you should be able to run Garage (feel free to report your tests!).
- -## From Docker - -Our docker image is currently named `lxpz/garage_amd64` and is stored on the [Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated). -We encourage you to use a fixed tag (eg. `v0.2.1`) and not the `latest` tag. -For this example, we will use the latest published version at the time of the writing which is `v0.2.1` but it's up to you -to check [the most recent versions on the Docker Hub](https://hub.docker.com/r/lxpz/garage_amd64/tags?page=1&ordering=last_updated). - -For example: - -``` -sudo docker pull lxpz/garage_amd64:v0.2.1 -``` - -## From source - -Garage is a standard Rust project. -First, you need `rust` and `cargo`. -On Debian: - -```bash -sudo apt-get update -sudo apt-get install -y rustc cargo -``` - -Then, you can ask cargo to install the binary for you: - -```bash -cargo install garage -``` - -That's all, `garage` should be in `$HOME/.cargo/bin`. -You can add this folder to your `$PATH` or copy the binary somewhere else on your system. -For the following, we will assume you copied it in `/usr/local/bin/garage`: - -```bash -sudo cp $HOME/.cargo/bin/garage /usr/local/bin/garage -``` - diff --git a/doc/book/src/getting_started/bucket.md b/doc/book/src/getting_started/bucket.md deleted file mode 100644 index b4a2d81d..00000000 --- a/doc/book/src/getting_started/bucket.md +++ /dev/null @@ -1,74 +0,0 @@ -# Create buckets and keys - -*We use a command named `garagectl` which is in fact an alias you must define as explained in the [Control the daemon](./daemon.md) section.* - -In this section, we will suppose that we want to create a bucket named `nextcloud-bucket` -that will be accessed through a key named `nextcloud-app-key`. - -Don't forget that `help` command and `--help` subcommands can help you anywhere, the CLI tool is self-documented! 
Two examples: - -``` -garagectl help -garagectl bucket allow --help -``` - -## Create a bucket - -Fine, now let's create a bucket (we imagine that you want to deploy nextcloud): - -``` -garagectl bucket create nextcloud-bucket -``` - -Check that everything went well: - -``` -garagectl bucket list -garagectl bucket info nextcloud-bucket -``` - -## Create an API key - -Now we will generate an API key to access this bucket. -Note that API keys are independent of buckets: one key can access multiple buckets, multiple keys can access one bucket. - -Now, let's start by creating a key only for our PHP application: - -``` -garagectl key new --name nextcloud-app-key -``` - -You will have the following output (this one is fake, `key_id` and `secret_key` were generated with the openssl CLI tool): - -``` -Key name: nextcloud-app-key -Key ID: GK3515373e4c851ebaad366558 -Secret key: 7d37d093435a41f2aab8f13c19ba067d9776c90215f56614adad6ece597dbb34 -Authorized buckets: -``` - -Check that everything works as intended: - -``` -garagectl key list -garagectl key info nextcloud-app-key -``` - -## Allow a key to access a bucket - -Now that we have a bucket and a key, we need to give permissions to the key on the bucket! - -``` -garagectl bucket allow \ - --read \ - --write - nextcloud-bucket \ - --key nextcloud-app-key -``` - -You can check at any times allowed keys on your bucket with: - -``` -garagectl bucket info nextcloud-bucket -``` - diff --git a/doc/book/src/getting_started/cluster.md b/doc/book/src/getting_started/cluster.md deleted file mode 100644 index c9c18684..00000000 --- a/doc/book/src/getting_started/cluster.md +++ /dev/null @@ -1,73 +0,0 @@ -# Configure a cluster - -*We use a command named `garagectl` which is in fact an alias you must define as explained in the [Control the daemon](./daemon.md) section.* - -In this section, we will inform garage of the disk space available on each node of the cluster -as well as the site (think datacenter) of each machine. 
- -## Test cluster - -As this part is not relevant for a test cluster, you can use this one-liner to create a basic topology: - -```bash -garagectl status | grep UNCONFIGURED | grep -Po '^[0-9a-f]+' | while read id; do - garagectl node configure -d dc1 -c 1 $id -done -``` - -## Real-world cluster - -For our example, we will suppose we have the following infrastructure (Capacity, Identifier and Datacenter are specific values to garage described in the following): - -| Location | Name | Disk Space | `Capacity` | `Identifier` | `Datacenter` | -|----------|---------|------------|------------|--------------|--------------| -| Paris | Mercury | 1 To | `2` | `8781c5` | `par1` | -| Paris | Venus | 2 To | `4` | `2a638e` | `par1` | -| London | Earth | 2 To | `4` | `68143d` | `lon1` | -| Brussels | Mars | 1.5 To | `3` | `212f75` | `bru1` | - -### Identifier - -After its first launch, garage generates a random and unique identifier for each nodes, such as: - -``` -8781c50c410a41b363167e9d49cc468b6b9e4449b6577b64f15a249a149bdcbc -``` - -Often a shorter form can be used, containing only the beginning of the identifier, like `8781c5`, -which identifies the server "Mercury" located in "Paris" according to our previous table. - -The most simple way to match an identifier to a node is to run: - -``` -garagectl status -``` - -It will display the IP address associated with each node; from the IP address you will be able to recognize the node. - -### Capacity - -Garage reasons on an arbitrary metric about disk storage that is named the *capacity* of a node. -The capacity configured in Garage must be proportional to the disk space dedicated to the node. -Additionaly, the capacity values used in Garage should be as small as possible, with -1 ideally representing the size of your smallest server. - -Here we chose that 1 unit of capacity = 0.5 To, so that we can express servers of size -1 To and 2 To, as wel as the intermediate size 1.5 To. 
- -### Datacenter - -Datacenter are simply a user-chosen identifier that identify a group of server that are located in the same place. -It is up to the system administrator deploying garage to identify what does "the same place" means. -Behind the scene, garage will try to store the same data on different sites to provide high availability despite a data center failure. - -### Inject the topology - -Given the information above, we will configure our cluster as follow: - -``` -garagectl node configure --datacenter par1 -c 2 -t mercury 8781c5 -garagectl node configure --datacenter par1 -c 4 -t venus 2a638e -garagectl node configure --datacenter lon1 -c 4 -t earth 68143d -garagectl node configure --datacenter bru1 -c 3 -t mars 212f75 -``` diff --git a/doc/book/src/getting_started/control.md b/doc/book/src/getting_started/control.md deleted file mode 100644 index 9a66a0dc..00000000 --- a/doc/book/src/getting_started/control.md +++ /dev/null @@ -1,77 +0,0 @@ -# Control the daemon - -The `garage` binary has two purposes: - - it acts as a daemon when launched with `garage server ...` - - it acts as a control tool for the daemon when launched with any other command - -In this section, we will see how to use the `garage` binary as a control tool for the daemon we just started. 
-You first need to get a shell having access to this binary, which depends of your configuration: - - with `docker-compose`, run `sudo docker-compose exec g1 bash` then `/garage/garage` - - with `docker`, run `sudo docker exec -ti garaged bash` then `/garage/garage` - - with `systemd`, simply run `/usr/local/bin/garage` if you followed previous instructions - -*You can also install the binary on your machine to remotely control the cluster.* - -## Talk to the daemon and create an alias - -`garage` requires 4 options to talk with the daemon: - -``` ---ca-cert ---client-cert ---client-key --h, --rpc-host -``` - -The 3 first ones are certificates and keys needed by TLS, the last one is simply the address of garage's RPC endpoint. -Because we configure garage directly from the server, we do not need to set `--rpc-host`. -To avoid typing the 3 first options each time we want to run a command, we will create an alias. - -### `docker-compose` alias - -```bash -alias garagectl='/garage/garage \ - --ca-cert /pki/garage-ca.crt \ - --client-cert /pki/garage.crt \ - --client-key /pki/garage.key' -``` - -### `docker` alias - -```bash -alias garagectl='/garage/garage \ - --ca-cert /etc/garage/pki/garage-ca.crt \ - --client-cert /etc/garage/pki/garage.crt \ - --client-key /etc/garage/pki/garage.key' -``` - - -### raw binary alias - -```bash -alias garagectl='/usr/local/bin/garage \ - --ca-cert /etc/garage/pki/garage-ca.crt \ - --client-cert /etc/garage/pki/garage.crt \ - --client-key /etc/garage/pki/garage.key' -``` - -Of course, if your deployment does not match exactly one of this alias, feel free to adapt it to your needs! 
- -## Test the alias - -You can test your alias by running a simple command such as: - -``` -garagectl status -``` - -You should get something like that as result: - -``` -Healthy nodes: -2a638ed6c775b69a… 37f0ba978d27 [::ffff:172.20.0.101]:3901 UNCONFIGURED/REMOVED -68143d720f20c89d… 9795a2f7abb5 [::ffff:172.20.0.103]:3901 UNCONFIGURED/REMOVED -8781c50c410a41b3… 758338dde686 [::ffff:172.20.0.102]:3901 UNCONFIGURED/REMOVED -``` - -...which means that you are ready to configure your cluster! diff --git a/doc/book/src/getting_started/daemon.md b/doc/book/src/getting_started/daemon.md deleted file mode 100644 index 0f45daee..00000000 --- a/doc/book/src/getting_started/daemon.md +++ /dev/null @@ -1,222 +0,0 @@ -# Configure the daemon - -Garage is a software that can be run only in a cluster and requires at least 3 instances. -In our getting started guide, we document two deployment types: - - [Test deployment](#test-deployment) though `docker-compose` - - [Real-world deployment](#real-world-deployment) through `docker` or `systemd` - -In any case, you first need to generate TLS certificates, as traffic is encrypted between Garage's nodes. - -## Generating a TLS Certificate - -To generate your TLS certificates, run on your machine: - -``` -wget https://git.deuxfleurs.fr/Deuxfleurs/garage/raw/branch/master/genkeys.sh -chmod +x genkeys.sh -./genkeys.sh -``` - -It will creates a folder named `pki` containing the keys that you will used for the cluster. - -## Test deployment - -Single machine deployment is only described through `docker-compose`. 
- -Before starting, we recommend you create a folder for our deployment: - -```bash -mkdir garage-single -cd garage-single -``` - -We start by creating a file named `docker-compose.yml` describing our network and our containers: - -```yml -version: '3.4' - -networks: { virtnet: { ipam: { config: [ subnet: 172.20.0.0/24 ]}}} - -services: - g1: - image: lxpz/garage_amd64:v0.1.1d - networks: { virtnet: { ipv4_address: 172.20.0.101 }} - volumes: - - "./pki:/pki" - - "./config.toml:/garage/config.toml" - - g2: - image: lxpz/garage_amd64:v0.1.1d - networks: { virtnet: { ipv4_address: 172.20.0.102 }} - volumes: - - "./pki:/pki" - - "./config.toml:/garage/config.toml" - - g3: - image: lxpz/garage_amd64:v0.1.1d - networks: { virtnet: { ipv4_address: 172.20.0.103 }} - volumes: - - "./pki:/pki" - - "./config.toml:/garage/config.toml" -``` - -*We define a static network here which is not considered as a best practise on Docker. -The rational is that Garage only supports IP address and not domain names in its configuration, so we need to know the IP address in advance.* - -and then create the `config.toml` file next to it as follow: - -```toml -metadata_dir = "/garage/meta" -data_dir = "/garage/data" -rpc_bind_addr = "[::]:3901" -bootstrap_peers = [ - "172.20.0.101:3901", - "172.20.0.102:3901", - "172.20.0.103:3901", -] - -[rpc_tls] -ca_cert = "/pki/garage-ca.crt" -node_cert = "/pki/garage.crt" -node_key = "/pki/garage.key" - -[s3_api] -s3_region = "garage" -api_bind_addr = "[::]:3900" - -[s3_web] -bind_addr = "[::]:3902" -root_domain = ".web.garage" -index = "index.html" -``` - -*Please note that we have not mounted `/garage/meta` or `/garage/data` on the host: data will be lost when the container will be destroyed.* - -And that's all, you are ready to launch your cluster! - -``` -sudo docker-compose up -``` - -While your daemons are up, your cluster is still not configured yet. 
-However, you can check that your services are still listening as expected by querying them from your host: - -```bash -curl http://172.20.0.{101,102,103}:3902 -``` - -which should give you: - -``` -Not found -Not found -Not found -``` - -That's all, you are ready to [configure your cluster!](./cluster.md). - -## Real-world deployment - -Before deploying garage on your infrastructure, you must inventory your machines. -For our example, we will suppose the following infrastructure: - -| Location | Name | IP Address | Disk Space | -|----------|---------|------------|------------| -| Paris | Mercury | fc00:1::1 | 1 To | -| Paris | Venus | fc00:1::2 | 2 To | -| London | Earth | fc00:B::1 | 2 To | -| Brussels | Mars | fc00:F::1 | 1.5 To | - -On each machine, we will have a similar setup, especially you must consider the following folders/files: - - `/etc/garage/pki`: Garage certificates, must be generated on your computer and copied on the servers - - `/etc/garage/config.toml`: Garage daemon's configuration (defined below) - - `/etc/systemd/system/garage.service`: Service file to start garage at boot automatically (defined below, not required if you use docker) - - `/var/lib/garage/meta`: Contains Garage's metadata, put this folder on a SSD if possible - - `/var/lib/garage/data`: Contains Garage's data, this folder will grows and must be on a large storage, possibly big HDDs. 
- -A valid `/etc/garage/config.toml` for our cluster would be: - -```toml -metadata_dir = "/var/lib/garage/meta" -data_dir = "/var/lib/garage/data" -rpc_bind_addr = "[::]:3901" -bootstrap_peers = [ - "[fc00:1::1]:3901", - "[fc00:1::2]:3901", - "[fc00:B::1]:3901", - "[fc00:F::1]:3901", -] - -[rpc_tls] -ca_cert = "/etc/garage/pki/garage-ca.crt" -node_cert = "/etc/garage/pki/garage.crt" -node_key = "/etc/garage/pki/garage.key" - -[s3_api] -s3_region = "garage" -api_bind_addr = "[::]:3900" - -[s3_web] -bind_addr = "[::]:3902" -root_domain = ".web.garage" -index = "index.html" -``` - -Please make sure to change `bootstrap_peers` to **your** IP addresses! - -### For docker users - -On each machine, you can run the daemon with: - -```bash -docker run \ - -d \ - --name garaged \ - --restart always \ - --network host \ - -v /etc/garage/pki:/etc/garage/pki \ - -v /etc/garage/config.toml:/garage/config.toml \ - -v /var/lib/garage/meta:/var/lib/garage/meta \ - -v /var/lib/garage/data:/var/lib/garage/data \ - lxpz/garage_amd64:v0.1.1d -``` - -It should be restart automatically at each reboot. -Please note that we use host networking as otherwise Docker containers can no communicate with IPv6. - -To upgrade, simply stop and remove this container and start again the command with a new version of garage. 
- -### For systemd/raw binary users - -Create a file named `/etc/systemd/system/garage.service`: - -```toml -[Unit] -Description=Garage Data Store -After=network-online.target -Wants=network-online.target - -[Service] -Environment='RUST_LOG=garage=info' 'RUST_BACKTRACE=1' -ExecStart=/usr/local/bin/garage server -c /etc/garage/config.toml - -[Install] -WantedBy=multi-user.target -``` - -To start the service then automatically enable it at boot: - -```bash -sudo systemctl start garage -sudo systemctl enable garage -``` - -To see if the service is running and to browse its logs: - -```bash -sudo systemctl status garage -sudo journalctl -u garage -``` - -If you want to modify the service file, do not forget to run `systemctl daemon-reload` -to inform `systemd` of your modifications. diff --git a/doc/book/src/getting_started/files.md b/doc/book/src/getting_started/files.md deleted file mode 100644 index 0e3939ce..00000000 --- a/doc/book/src/getting_started/files.md +++ /dev/null @@ -1,42 +0,0 @@ -# Handle files - -We recommend the use of MinIO Client to interact with Garage files (`mc`). -Instructions to install it and use it are provided on the [MinIO website](https://docs.min.io/docs/minio-client-quickstart-guide.html). -Before reading the following, you need a working `mc` command on your path. - -## Configure `mc` - -You need your access key and secret key created in the [previous section](bucket.md). -You also need to set the endpoint: it must match the IP address of one of the node of the cluster and the API port (3900 by default). -For this whole configuration, you must set an alias name: we chose `my-garage`, that you will used for all commands. - -Adapt the following command accordingly and run it: - -```bash -mc alias set \ - my-garage \ - http://172.20.0.101:3900 \ - \ - \ - --api S3v4 -``` - -You must also add an environment variable to your configuration to inform MinIO of our region (`garage` by default). 
-The best way is to add the following snippet to your `$HOME/.bash_profile` or `$HOME/.bashrc` file: - -```bash -export MC_REGION=garage -``` - -## Use `mc` - -You can not list buckets from `mc` currently. - -But the following commands and many more should work: - -```bash -mc cp image.png my-garage/nextcloud-bucket -mc cp my-garage/nextcloud-bucket/image.png . -mc ls my-garage/nextcloud-bucket -mc mirror localdir/ my-garage/another-bucket -``` diff --git a/doc/book/src/reference_manual/cli.md b/doc/book/src/reference_manual/cli.md new file mode 100644 index 00000000..80789b9d --- /dev/null +++ b/doc/book/src/reference_manual/cli.md @@ -0,0 +1,4 @@ +# Garage CLI + +The Garage CLI is mostly self-documented. Make use of the `help` subcommand +and the `--help` flag to discover all available options. diff --git a/doc/book/src/reference_manual/configuration.md b/doc/book/src/reference_manual/configuration.md new file mode 100644 index 00000000..6c8d5ebc --- /dev/null +++ b/doc/book/src/reference_manual/configuration.md @@ -0,0 +1,196 @@ +# Garage configuration file format reference + +Here is an example `garage.toml` configuration file that illustrates all of the possible options: + +```toml +metadata_dir = "/var/lib/garage/meta" +data_dir = "/var/lib/garage/data" + +block_size = 1048576 + +replication_mode = "3" + +rpc_bind_addr = "[::]:3901" + +bootstrap_peers = [ + "[fc00:1::1]:3901", + "[fc00:1::2]:3901", + "[fc00:B::1]:3901", + "[fc00:F::1]:3901", +] + +consul_host = "consul.service" +consul_service_name = "garage-daemon" + +max_concurrent_rpc_requests = 12 + +sled_cache_capacity = 134217728 +sled_flush_every_ms = 2000 + +[rpc_tls] +ca_cert = "/etc/garage/pki/garage-ca.crt" +node_cert = "/etc/garage/pki/garage.crt" +node_key = "/etc/garage/pki/garage.key" + +[s3_api] +s3_region = "garage" +api_bind_addr = "[::]:3900" + +[s3_web] +bind_addr = "[::]:3902" +root_domain = ".web.garage" +index = "index.html" +``` + +The following gives details about each available 
configuration option. + +## Available configuration options + +#### `metadata_dir` + +The directory in which Garage will store its metadata. This contains the node identifier, +the network configuration and the peer list, the list of buckets and keys as well +as the index of all objects, object versions and object blocks. + +Store this folder on a fast SSD drive if possible to maximize Garage's performance. + +#### `data_dir` + +The directory in which Garage will store the data blocks of objects. +This folder can be placed on an HDD. The space available for `data_dir` +should be counted to determine a node's capacity +when [configuring it](../getting_started/05_cluster.md). + +#### `block_size` + +Garage splits stored objects into consecutive chunks of size `block_size` (except the last +one which might be smaller). The default size is 1MB and should work in most cases. +If you are interested in tuning this, feel free to do so (and remember to report your +findings to us!) + +#### `replication_mode` + +Garage supports the following replication modes: + +- `none` or `1`: data stored on Garage is stored on a single node. There is no redundancy, + and data will be unavailable as soon as one node fails or its network is disconnected. + Do not use this for anything other than test deployments. + +- `2`: data stored on Garage will be stored on two different nodes, if possible in different + zones. Garage tolerates one node failure before losing data. Data should be available + read-only when one node is down, but write operations will fail. + Use this only if you really have to. + +- `3`: data stored on Garage will be stored on three different nodes, if possible each in + a different zone. + Garage tolerates two node failures before losing data. Data should be available + read-only when two nodes are down, and writes should be possible if only a single node + is down.
+ +Note that in modes `2` and `3`, +if at least the same number of zones are available, an arbitrary number of failures in +any given zone is tolerated as copies of data will be spread over several zones. + +**Make sure `replication_mode` is the same in the configuration files of all nodes. +Never run a Garage cluster where that is not the case.** + +Changing the `replication_mode` of a cluster might work (make sure to shut down all nodes +and change it everywhere at the same time), but is not officially supported. + +#### `rpc_bind_addr` + +The address and port on which to bind for intra-cluster communications +(referred to as RPC for remote procedure calls). +The port specified here should be the same one that other nodes will use to contact +the node, even in the case of a NAT: the NAT should be configured to forward the external +port number to the same internal port number. This means that if you have several nodes running +behind a NAT, they should each use a different RPC port number. + +#### `bootstrap_peers` + +A list of IPs and ports on which to contact other Garage peers of this cluster. +This should correspond to the RPC ports set up with `rpc_bind_addr`. + +#### `consul_host` and `consul_service_name` + +Garage supports discovering other nodes of the cluster using Consul. +This works only when nodes are announced in Consul by an orchestrator such as Nomad, +as Garage is not able to announce itself. + +The `consul_host` parameter should be set to the hostname of the Consul server, +and `consul_service_name` should be set to the service name under which Garage's +RPC ports are announced. + +#### `max_concurrent_rpc_requests` + +Garage implements rate limiting for RPC requests: no more than +`max_concurrent_rpc_requests` concurrent outbound RPC requests will be made +by a Garage node (additional requests will be put in a waiting queue).
+ +#### `sled_cache_capacity` + +This parameter can be used to tune the capacity of the cache used by +[sled](https://sled.rs), the database Garage uses internally to store metadata. +Tune this to fit the RAM you wish to make available to your Garage instance. +More cache means faster Garage, but the default value (128MB) should be plenty +for most use cases. + +#### `sled_flush_every_ms` + +This parameter can be used to tune the flushing interval of sled. +Increase this if sled is thrashing your SSD, at the risk of losing more data in case +of a power outage (though this should not matter much as data is replicated on other +nodes). The default value, 2000ms, should be appropriate for most use cases. + + +## The `[rpc_tls]` section + +This section should be used to configure the TLS certificates used to encrypt +intra-cluster traffic (RPC traffic). The following parameters should be set: + +- `ca_cert`: the certificate of the CA that is allowed to sign individual node certificates +- `node_cert`: the node certificate for the current node +- `node_key`: the key associated with the node certificate + +Note that several nodes may use the same node certificate, as long as it is signed +by the CA. + +If this section is absent, TLS is not used to encrypt intra-cluster traffic. + + +## The `[s3_api]` section + +#### `api_bind_addr` + +The IP and port on which to bind for accepting S3 API calls. +This endpoint does not support TLS: a reverse proxy should be used to provide it. + +#### `s3_region` + +Garage will accept S3 API calls that are targeted to the S3 region defined here. +API calls targeted to other regions will fail with an AuthorizationHeaderMalformed error +message that redirects the client to the correct region. + + +## The `[s3_web]` section + +Garage can publish the content of buckets as websites. This section configures the +behaviour of this module.
+ +#### `bind_addr` + +The IP and port on which to bind for accepting HTTP requests to buckets configured +for website access. +This endpoint does not support TLS: a reverse proxy should be used to provide it. + +#### `root_domain` + +The optional suffix appended to bucket names for the corresponding HTTP Host. + +For instance, if `root_domain` is `web.garage.eu`, a bucket called `deuxfleurs.fr` +will be accessible either with hostname `deuxfleurs.fr.web.garage.eu` +or with hostname `deuxfleurs.fr`. + +#### `index` + +The name of the index file to return for requests ending with `/` (usually `index.html`). diff --git a/doc/book/src/reference_manual/s3_compatibility.md b/doc/book/src/reference_manual/s3_compatibility.md index c0fc2863..5f9f527a 100644 --- a/doc/book/src/reference_manual/s3_compatibility.md +++ b/doc/book/src/reference_manual/s3_compatibility.md @@ -1,6 +1,6 @@ -## S3 Compatibility status +# S3 Compatibility status -### Global S3 features +## Global S3 features Implemented: @@ -18,7 +18,7 @@ Not implemented: - most `x-amz-` headers -### Endpoint implementation +## Endpoint implementation All APIs that are not mentionned are not implemented and will return a 400 bad request. diff --git a/doc/book/src/working_documents/load_balancing.md b/doc/book/src/working_documents/load_balancing.md index 583b6086..c436fdcb 100644 --- a/doc/book/src/working_documents/load_balancing.md +++ b/doc/book/src/working_documents/load_balancing.md @@ -1,8 +1,8 @@ -## Load Balancing Data (planned for version 0.2) +# Load Balancing Data (planned for version 0.2) I have conducted a quick study of different methods to load-balance data over different Garage nodes using consistent hashing.
-### Requirements +## Requirements - *good balancing*: two nodes that have the same announced capacity should receive close to the same number of items @@ -15,9 +15,9 @@ I have conducted a quick study of different methods to load-balance data over di replicas, independently of the order in which nodes were added/removed (this is to keep the implementation simple) -### Methods +## Methods -#### Naive multi-DC ring walking strategy +### Naive multi-DC ring walking strategy This strategy can be used with any ring-like algorithm to make it aware of the *multi-datacenter* requirement: @@ -38,7 +38,7 @@ This method was implemented in the first version of Garage, with the basic ring construction from Dynamo DB that consists in associating `n_token` random positions to each node (I know it's not optimal, the Dynamo paper already studies this). -#### Better rings +### Better rings The ring construction that selects `n_token` random positions for each nodes gives a ring of positions that is not well-balanced: the space between the tokens varies a lot, and some partitions are thus bigger than others. @@ -150,7 +150,7 @@ removing grisou gipsie : 49.22% 36.52% 12.79% 1.46% on average: 62.94% 27.89% 8.61% 0.57% <-- WORSE THAN PREVIOUSLY ``` -#### The magical solution: multi-DC aware MagLev +### The magical solution: multi-DC aware MagLev Suppose we want to select three replicas for each partition (this is what we do in our simulation and in most Garage deployments). We apply MagLev three times consecutively, one for each replica selection. -- cgit v1.2.3