Merge branch 'main' into optimal-layout

author: Alex Auvolat <alex@adnab.me> 2022-11-07 12:20:59 +0100
committer: Alex Auvolat <alex@adnab.me> 2022-11-07 12:20:59 +0100
commit: 28d7a49f6365fadaffaa903cc10434c1ed28d564 (patch)
tree: 8da5b3213b7ff199af80e64af29a7a1395b9d02d /doc
parent: 3039bb5d431532f0ec907eab5e00f94acc4a3472 (diff)
parent: 66f2daa0259538c64508b37cec89d76a74a71a02 (diff)
4 files changed, 224 insertions, 63 deletions
diff --git a/doc/book/connect/apps/index.md b/doc/book/connect/apps/index.md
index 2b642049..05e7cad9 100644
--- a/doc/book/connect/apps/index.md
+++ b/doc/book/connect/apps/index.md
@@ -9,7 +9,7 @@ In this section, we cover the following web applications:
 |------|--------|------|
 | [Nextcloud](#nextcloud)     | ✅       |  Both Primary Storage and External Storage are supported    |
 | [Peertube](#peertube)     | ✅       | Must be configured with the website endpoint     |
-| [Mastodon](#mastodon)     | ❓       | Not yet tested     |
+| [Mastodon](#mastodon)     | ✅       | Natively supported    |
 | [Matrix](#matrix)     | ✅       |  Tested with `synapse-s3-storage-provider`    |
 | [Pixelfed](#pixelfed)     | ❓       |  Not yet tested    |
 | [Pleroma](#pleroma)     | ❓       |  Not yet tested    |
@@ -224,7 +224,135 @@ You can now reload the page and see in your browser console that data are fetche
 
 ## Mastodon
 
-https://docs.joinmastodon.org/admin/config/#cdn
+Mastodon natively supports the S3 protocol to store media files, and it works out-of-the-box with Garage.
+You will need to expose your Garage bucket as a website: that way, media files will be served directly from Garage.
+
+### Performance considerations
+
+Mastodon tends to store many small objects over time: expect hundreds of thousands of objects,
+with average object size ranging from 50 KB to 150 KB.
+
+As such, your Garage cluster should be configured appropriately for good performance:
+
+- use Garage v0.8.0 or higher with the [LMDB database engine](@documentation/reference-manual/configuration.md#db-engine-since-v0-8-0).
+  With the default Sled database engine, your database could quickly end up taking tens of GB of disk space.
+- the Garage database should be stored on an SSD
+
+### Creating your bucket
+
+This is the usual Garage setup:
+
+```bash
+garage key new --name mastodon-key
+garage bucket create mastodon-data
+garage bucket allow mastodon-data --read --write --key mastodon-key
+```
+
+Note the Key ID and Secret Key.
+
+### Exposing your bucket as a website
+
+Create a DNS name to serve your media files, such as `my-social-media.mydomain.tld`.
+This name will be publicly exposed to the users of your Mastodon instance: they
+will load images directly from this DNS name.
+
+As [documented here](@/documentation/cookbook/exposing-websites.md),
+add this DNS name as an alias to your bucket, and expose it as a website:
+
+```bash
+garage bucket alias mastodon-data my-social-media.mydomain.tld
+garage bucket website --allow mastodon-data
+```
+
+Then you will likely need to [set up a reverse proxy](@/documentation/cookbook/reverse-proxy.md)
+in front of it to serve your media files over HTTPS.
+
+### Cleaning up old media files before migration
+
+Mastodon instances quickly accumulate a lot of media files from the federation.
+Most of them are not strictly necessary because they can be fetched again from
+other servers.  As such, it is highly recommended to clean them up before
+migration: this will greatly reduce the migration time.
+
+From the [official Mastodon documentation](https://docs.joinmastodon.org/admin/tootctl/#media):
+
+```bash
+$ RAILS_ENV=production bin/tootctl media remove --days 3
+$ RAILS_ENV=production bin/tootctl media remove-orphans
+$ RAILS_ENV=production bin/tootctl preview_cards remove --days 15
+```
+
+Here is a typical disk usage for a small but multi-year instance after cleanup:
+
+```bash
+$ RAILS_ENV=production bin/tootctl media usage
+Attachments:	5.67 GB (1.14 GB local)
+Custom emoji:	295 MB (0 Bytes local)
+Preview cards:	154 MB
+Avatars:	3.77 GB (127 KB local)
+Headers:	8.72 GB (242 KB local)
+Backups:	0 Bytes
+Imports:	1.7 KB
+Settings:	0 Bytes
+```
+
+Unfortunately, [old avatars and headers cannot currently be cleaned up](https://github.com/mastodon/mastodon/issues/9567).
+
+### Migrating your data
+
+Data migration should be done with an efficient S3 client.
+The [minio client](@documentation/connect/cli.md#minio-client) is a good choice
+thanks to its mirror mode:
+
+```bash
+mc mirror ./public/system/ garage/mastodon-data
+```
+
+Here is a typical bucket usage after all data has been migrated:
+
+```bash
+$ garage bucket info mastodon-data
+
+Size: 20.3 GiB (21.8 GB)
+Objects: 175968
+```
+
+### Configuring Mastodon
+
+In your `.env.production` configuration file:
+
+```bash
+S3_ENABLED=true
+# Internal access to Garage
+S3_ENDPOINT=http://my-garage-instance.mydomain.tld:3900
+S3_REGION=garage
+S3_BUCKET=mastodon-data
+# Change this (Key ID and Secret Key of your Garage key)
+AWS_ACCESS_KEY_ID=GKe88df__CHANGETHIS__c5145
+AWS_SECRET_ACCESS_KEY=a2f7__CHANGETHIS__77fcfcf7a58f47a4aa4431f2e675c56da37821a1070000
+# What name gets exposed to users (HTTPS is implicit)
+S3_ALIAS_HOST=my-social-media.mydomain.tld
+```
+
+For more details, see the [reference Mastodon documentation](https://docs.joinmastodon.org/admin/config/#cdn).
+
+Restart all Mastodon services and everything should now be using Garage!
+You can check the URLs of images in the Mastodon web client: they should start
+with `https://my-social-media.mydomain.tld`.
+
+### Last migration sync
+
+After Mastodon is successfully using Garage, you can run a last sync from the local filesystem to Garage:
+
+```bash
+mc mirror --newer-than "3h" ./public/system/ garage/mastodon-data
+```
+
+### References
+
+[cybrespace's guide to migrate to S3](https://github.com/cybrespace/cybrespace-meta/blob/master/s3.md)
+(the guide is for Amazon S3, so the configuration is a bit different, but the rest is similar)
+
 
 ## Matrix
 
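Reviewer note on the Mastodon hunk above: the `.env.production` fragment is easy to paste with the `__CHANGETHIS__` placeholders still in place. The following is a minimal Python sketch of a pre-restart sanity check, not part of Garage or Mastodon. The helper names `parse_env` and `check_s3_settings` are hypothetical, the list of required variables is simply the set shown in the hunk, and the assumption that `S3_ALIAS_HOST` must be a bare host name is inferred from the "HTTPS is implicit" comment.

```python
# Variables that appear in the `.env.production` example of the hunk above.
REQUIRED_KEYS = (
    "S3_ENABLED",
    "S3_ENDPOINT",
    "S3_REGION",
    "S3_BUCKET",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "S3_ALIAS_HOST",
)


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip()
    return env


def check_s3_settings(env):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not env.get(key)]
    for key in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"):
        # The documentation example marks the values to replace with CHANGETHIS.
        if "CHANGETHIS" in env.get(key, ""):
            problems.append(f"{key} still contains the placeholder")
    # Assumption: the alias is a bare host name, since HTTPS is implicit.
    if "://" in env.get("S3_ALIAS_HOST", ""):
        problems.append("S3_ALIAS_HOST should not include a URL scheme")
    return problems
```

A possible use is `check_s3_settings(parse_env(open(".env.production").read()))` before restarting the Mastodon services, printing each returned problem.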
diff --git a/doc/book/reference-manual/configuration.md b/doc/book/reference-manual/configuration.md
index 97da0e0e..2d9c3f0c 100644
--- a/doc/book/reference-manual/configuration.md
+++ b/doc/book/reference-manual/configuration.md
@@ -13,6 +13,9 @@ db_engine = "lmdb"
 
 block_size = 1048576
 
+sled_cache_capacity = 134217728
+sled_flush_every_ms = 2000
+
 replication_mode = "3"
 
 compression_level = 1
@@ -28,15 +31,20 @@ bootstrap_peers = [
     "212fd62eeaca72c122b45a7f4fa0f55e012aa5e24ac384a72a3016413fa724ff@[fc00:F::1]:3901",
 ]
 
-consul_host = "consul.service"
-consul_service_name = "garage-daemon"
 
-kubernetes_namespace = "garage"
-kubernetes_service_name = "garage-daemon"
-kubernetes_skip_crd = false
+[consul_discovery]
+consul_http_addr = "http://127.0.0.1:8500"
+service_name = "garage-daemon"
+ca_cert = "/etc/consul/consul-ca.crt"
+client_cert = "/etc/consul/consul-client.crt"
+client_key = "/etc/consul/consul-key.crt"
+tls_skip_verify = false
+
+[kubernetes_discovery]
+namespace = "garage"
+service_name = "garage-daemon"
+skip_crd = false
 
-sled_cache_capacity = 134217728
-sled_flush_every_ms = 2000
 
 [s3_api]
 api_bind_addr = "[::]:3900"
@@ -129,6 +137,21 @@ files will remain available. This however means that chunks from existing files
 will not be deduplicated with chunks from newly uploaded files, meaning you
 might use more storage space that is optimally possible.
 
+### `sled_cache_capacity`
+
+This parameter can be used to tune the capacity of the cache used by
+[sled](https://sled.rs), the database Garage uses internally to store metadata.
+Tune this to fit the RAM you wish to make available to your Garage instance.
+This value has a conservative default (128MB) so that Garage doesn't use too much
+RAM by default, but feel free to increase this for higher performance.
+
+### `sled_flush_every_ms`
+
+This parameter can be used to tune the flushing interval of sled.
+Increase this if sled is thrashing your SSD, at the risk of losing more data in case
+of a power outage (though this should not matter much as data is replicated on other
+nodes). The default value, 2000ms, should be appropriate for most use cases.
+
 ### `replication_mode`
 
 Garage supports the following replication modes:
@@ -276,47 +299,57 @@ be obtained by running `garage node id` and then included directly in the
 key will be returned by `garage node id` and you will have to add the IP
 yourself.
 
-### `consul_host` and `consul_service_name`
+
+## The `[consul_discovery]` section
 
 Garage supports discovering other nodes of the cluster using Consul.  For this
 to work correctly, nodes need to know their IP address by which they can be
 reached by other nodes of the cluster, which should be set in `rpc_public_addr`.
 
-The `consul_host` parameter should be set to the hostname of the Consul server,
-and `consul_service_name` should be set to the service name under which Garage's
+### `consul_http_addr`
+
+The `consul_http_addr` parameter should be set to the full HTTP(S) address of the Consul server.
+
+### `service_name`
+
+`service_name` should be set to the service name under which Garage's
 RPC ports are announced.
 
-Garage does not yet support talking to Consul over TLS.
+### `client_cert`, `client_key`
 
-### `kubernetes_namespace`, `kubernetes_service_name` and `kubernetes_skip_crd`
+TLS client certificate and client key to use when communicating with Consul over TLS. Both are mandatory when doing so.
 
-Garage supports discovering other nodes of the cluster using kubernetes custom
-resources. For this to work `kubernetes_namespace` and `kubernetes_service_name`
-need to be configured.
+### `ca_cert`
 
-`kubernetes_namespace` sets the namespace in which the custom resources are
-configured. `kubernetes_service_name` is added as a label to these resources to
-filter them, to allow for multiple deployments in a single namespace.
+TLS CA certificate to use when communicating with Consul over TLS.
 
-`kubernetes_skip_crd` can be set to true to disable the automatic creation and
-patching of the `garagenodes.deuxfleurs.fr` CRD. You will need to create the CRD
-manually.
+### `tls_skip_verify`
 
-### `sled_cache_capacity`
+Skip server hostname verification in TLS handshake.
+`ca_cert` is ignored when this is set.
 
-This parameter can be used to tune the capacity of the cache used by
-[sled](https://sled.rs), the database Garage uses internally to store metadata.
-Tune this to fit the RAM you wish to make available to your Garage instance.
-This value has a conservative default (128MB) so that Garage doesn't use too much
-RAM by default, but feel free to increase this for higher performance.
 
-### `sled_flush_every_ms`
+## The `[kubernetes_discovery]` section
 
-This parameters can be used to tune the flushing interval of sled.
-Increase this if sled is thrashing your SSD, at the risk of losing more data in case
-of a power outage (though this should not matter much as data is replicated on other
-nodes). The default value, 2000ms, should be appropriate for most use cases.
+Garage supports discovering other nodes of the cluster using kubernetes custom
+resources. For this to work, a `[kubernetes_discovery]` section must be present
+with at least the `namespace` and `service_name` parameters.
+
+### `namespace`
+
+`namespace` sets the namespace in which the custom resources are
+configured.
 
+### `service_name`
+
+`service_name` is added as a label to the advertised resources to
+filter them, to allow for multiple deployments in a single namespace.
+
+### `skip_crd`
+
+`skip_crd` can be set to true to disable the automatic creation and
+patching of the `garagenodes.deuxfleurs.fr` CRD. You will need to create the CRD
+manually.
 
 
 ## The `[s3_api]` section
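Reviewer note on the configuration hunks above: the flat `consul_*` and `kubernetes_*` keys move into the `[consul_discovery]` and `[kubernetes_discovery]` sections. The following Python sketch illustrates that renaming on an already-parsed configuration dictionary. It is an illustration under stated assumptions, not Garage's own migration logic: the function names are hypothetical, and the default conversion of the old `consul_host` value into `http://<host>:8500` is an assumption (8500 is Consul's usual HTTP port), so pass `consul_http_addr` explicitly if your setup differs.

```python
# Old flat key -> new key inside [kubernetes_discovery], as shown in the diff.
KUBERNETES_RENAMES = {
    "kubernetes_namespace": "namespace",
    "kubernetes_service_name": "service_name",
    "kubernetes_skip_crd": "skip_crd",
}


def migrate_discovery_keys(old, consul_http_addr=None):
    """Return a copy of `old` with discovery keys moved into their sections."""
    new = dict(old)
    consul = {}
    host = new.pop("consul_host", None)
    if host is not None:
        # Assumption: plain HTTP on port 8500 unless the caller says otherwise.
        consul["consul_http_addr"] = consul_http_addr or f"http://{host}:8500"
    service = new.pop("consul_service_name", None)
    if service is not None:
        consul["service_name"] = service

    kube = {}
    for old_key, new_key in KUBERNETES_RENAMES.items():
        if old_key in new:
            kube[new_key] = new.pop(old_key)

    if consul:
        new["consul_discovery"] = consul
    if kube:
        new["kubernetes_discovery"] = kube
    return new


def check_consul_tls(section):
    """Report the 'both are mandatory' rule for client_cert and client_key."""
    has_cert = "client_cert" in section
    has_key = "client_key" in section
    if has_cert != has_key:
        return ["client_cert and client_key must be set together"]
    return []
```

Keys unrelated to discovery (for example `db_engine`) pass through unchanged, so the result can be serialized back to TOML with whatever writer you already use.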
diff --git a/doc/book/reference-manual/features.md b/doc/book/reference-manual/features.md
index d2d28946..1f21af6e 100644
--- a/doc/book/reference-manual/features.md
+++ b/doc/book/reference-manual/features.md
@@ -106,7 +106,7 @@ to be manually connected to one another.
 
 ### Support for changing IP addresses
 
-As long as all of your nodes don't thange their IP address at the same time,
+As long as all of your nodes don't change their IP address at the same time,
 Garage should be able to tolerate nodes with changing/dynamic IP addresses,
 as nodes will regularly exchange the IP addresses of their peers and try to
 reconnect using newer addresses when existing connections are broken.
diff --git a/doc/drafts/k2v-spec.md b/doc/drafts/k2v-spec.md
index 175bb02e..9d41b2c0 100644
--- a/doc/drafts/k2v-spec.md
+++ b/doc/drafts/k2v-spec.md
@@ -206,8 +206,8 @@ and responses need to be translated.
 
 Query parameters:
 
-| name | default value | meaning |
-| - | - | - |
+| name       | default value | meaning                          |
+|------------|---------------|----------------------------------|
 | `sort_key` | **mandatory** | The sort key of the item to read |
 
 Returns the item with specified partition key and sort key. Values can be
@@ -317,11 +317,11 @@ an HTTP 304 NOT MODIFIED is returned.
 
 Query parameters:
 
-| name | default value | meaning |
-| - | - | - |
-| `sort_key` | **mandatory** | The sort key of the item to read |
-| `causality_token` | **mandatory** | The causality token of the last known value or set of values |
-| `timeout` | 300 | The timeout before 304 NOT MODIFIED is returned if the value isn't updated |
+| name              | default value | meaning                                                                    |
+|-------------------|---------------|----------------------------------------------------------------------------|
+| `sort_key`        | **mandatory** | The sort key of the item to read                                           |
+| `causality_token` | **mandatory** | The causality token of the last known value or set of values               |
+| `timeout`         | 300           | The timeout before 304 NOT MODIFIED is returned if the value isn't updated |
 
 The timeout can be set to any number of seconds, with a maximum of 600 seconds (10 minutes).
 
@@ -346,7 +346,7 @@ myblobblahblahblah
 Example response:
 
 ```
-HTTP/1.1 200 OK
+HTTP/1.1 204 No Content
 ```
 
 **DeleteItem: `DELETE /<bucket>/<partition key>?sort_key=<sort_key>`**
@@ -382,13 +382,13 @@ as these values are asynchronously updated, and thus eventually consistent.
 
 Query parameters:
 
-| name | default value | meaning |
-| - | - | - |
-| `prefix` | `null` | Restrict listing to partition keys that start with this prefix |
-| `start` | `null` | First partition key to list, in lexicographical order |
-| `end` | `null` | Last partition key to list (excluded) |
-| `limit` | `null` | Maximum number of partition keys to list |
-| `reverse` | `false` | Iterate in reverse lexicographical order |
+| name      | default value | meaning                                                        |
+|-----------|---------------|----------------------------------------------------------------|
+| `prefix`  | `null`        | Restrict listing to partition keys that start with this prefix |
+| `start`   | `null`        | First partition key to list, in lexicographical order          |
+| `end`     | `null`        | Last partition key to list (excluded)                          |
+| `limit`   | `null`        | Maximum number of partition keys to list                       |
+| `reverse` | `false`       | Iterate in reverse lexicographical order                       |
 
 The response consists in a JSON object that repeats the parameters of the query and gives the result (see below).
 
@@ -512,7 +512,7 @@ POST /my_bucket HTTP/1.1
 Example response:
 
 ```
-HTTP/1.1 200 OK
+HTTP/1.1 204 No Content
 ```
 
 
@@ -525,17 +525,17 @@ The request body is a JSON list of searches, that each specify a range of
 items to get (to get single items, set `singleItem` to `true`). A search is a
 JSON struct with the following fields:
 
-| name | default value | meaning |
-| - | - | - |
-| `partitionKey` | **mandatory** | The partition key in which to search |
-| `prefix` | `null` | Restrict items to list to those whose sort keys start with this prefix |
-| `start` | `null` | The sort key of the first item to read |
-| `end` | `null` | The sort key of the last item to read (excluded) |
-| `limit` | `null` | The maximum number of items to return |
-| `reverse` | `false` | Iterate in reverse lexicographical order on sort keys |
-| `singleItem` | `false` | Whether to return only the item with sort key `start` |
-| `conflictsOnly` | `false` | Whether to return only items that have several concurrent values |
-| `tombstones` | `false` | Whether or not to return tombstone lines to indicate the presence of old deleted items |
+| name            | default value | meaning                                                                                |
+|-----------------|---------------|----------------------------------------------------------------------------------------|
+| `partitionKey`  | **mandatory** | The partition key in which to search                                                   |
+| `prefix`        | `null`        | Restrict items to list to those whose sort keys start with this prefix                 |
+| `start`         | `null`        | The sort key of the first item to read                                                 |
+| `end`           | `null`        | The sort key of the last item to read (excluded)                                       |
+| `limit`         | `null`        | The maximum number of items to return                                                  |
+| `reverse`       | `false`       | Iterate in reverse lexicographical order on sort keys                                  |
+| `singleItem`    | `false`       | Whether to return only the item with sort key `start`                                  |
+| `conflictsOnly` | `false`       | Whether to return only items that have several concurrent values                       |
+| `tombstones`    | `false`       | Whether or not to return tombstone lines to indicate the presence of old deleted items |
 
 
 For each of the searches, triplets are listed and returned separately. The
@@ -683,7 +683,7 @@ POST /my_bucket?delete HTTP/1.1
 
 Example response:
 
-```
+```json
 HTTP/1.1 200 OK
 
 [
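Reviewer note on the k2v-spec hunks above: the reformatted table for batch searches lists one mandatory field (`partitionKey`) and eight optional fields with defaults. The Python sketch below builds such a request body with those defaults filled in. The helper names `make_search` and `read_batch_body` are hypothetical, and it is an assumption that sending a default value explicitly is equivalent to omitting the field.

```python
import json

# Optional search fields and their defaults, copied from the table in the diff.
SEARCH_DEFAULTS = {
    "prefix": None,
    "start": None,
    "end": None,
    "limit": None,
    "reverse": False,
    "singleItem": False,
    "conflictsOnly": False,
    "tombstones": False,
}


def make_search(partition_key, **fields):
    """Build one search struct; unknown field names are rejected early."""
    unknown = set(fields) - set(SEARCH_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown search fields: {sorted(unknown)}")
    return {"partitionKey": partition_key, **SEARCH_DEFAULTS, **fields}


def read_batch_body(searches):
    """Serialize a list of searches as the JSON list the spec describes."""
    return json.dumps(searches)
```

For instance, `read_batch_body([make_search("mailboxes", start="inbox", singleItem=True)])` yields a one-element JSON list that asks for the single item whose sort key is `inbox`; the partition and sort key names here are made up for illustration.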