diff options
author | Quentin Dufour <quentin@deuxfleurs.fr> | 2021-03-17 14:44:14 +0100 |
---|---|---|
committer | Quentin Dufour <quentin@deuxfleurs.fr> | 2021-03-17 14:44:14 +0100 |
commit | 0afc701a698c4891ea0f09fae668cb06b16757d7 (patch) | |
tree | e256fcc3c5fff777ae30f97dfecb322b4e56d40b /doc/book/src | |
parent | 6a3dcf39740cda27e61b93582b6fea66991ec4f2 (diff) | |
download | garage-0afc701a698c4891ea0f09fae668cb06b16757d7.tar.gz garage-0afc701a698c4891ea0f09fae668cb06b16757d7.zip |
Doc skeleton + intro
Diffstat (limited to 'doc/book/src')
-rw-r--r-- | doc/book/src/S3_Compatibility.md | 84 | ||||
-rw-r--r-- | doc/book/src/SUMMARY.md | 26 | ||||
-rw-r--r-- | doc/book/src/compatibility.md | 1 | ||||
-rw-r--r-- | doc/book/src/devenv.md | 17 | ||||
-rw-r--r-- | doc/book/src/getting_started.md | 5 | ||||
-rw-r--r-- | doc/book/src/getting_started/bucket.md | 1 | ||||
-rw-r--r-- | doc/book/src/getting_started/cluster.md | 1 | ||||
-rw-r--r-- | doc/book/src/getting_started/files.md | 1 | ||||
-rw-r--r-- | doc/book/src/getting_started/install.md | 3 | ||||
-rw-r--r-- | doc/book/src/img/logo.svg | 44 | ||||
-rw-r--r-- | doc/book/src/internals.md | 158 | ||||
-rw-r--r-- | doc/book/src/intro.md | 62 | ||||
-rw-r--r-- | doc/book/src/quickstart_bucket.md | 71 | ||||
-rw-r--r-- | doc/book/src/related_work.md | 38 | ||||
-rw-r--r-- | doc/book/src/website.md | 1 |
15 files changed, 513 insertions, 0 deletions
diff --git a/doc/book/src/S3_Compatibility.md b/doc/book/src/S3_Compatibility.md new file mode 100644 index 00000000..c0fc2863 --- /dev/null +++ b/doc/book/src/S3_Compatibility.md @@ -0,0 +1,84 @@ +## S3 Compatibility status + +### Global S3 features + +Implemented: + +- path-style URLs (`garage.tld/bucket/key`) +- putting and getting objects in buckets +- multipart uploads +- listing objects +- access control on a per-key-per-bucket basis + +Not implemented: + +- vhost-style URLs (`bucket.garage.tld/key`) +- object-level ACL +- encryption +- most `x-amz-` headers + + +### Endpoint implementation + +All APIs that are not mentionned are not implemented and will return a 400 bad request. + +#### AbortMultipartUpload + +Implemented. + +#### CompleteMultipartUpload + +Implemented badly. Garage will not check that all the parts stored correspond to the list given by the client in the request body. This means that the multipart upload might be completed with an invalid size. This is a bug and will be fixed. + +#### CopyObject + +Implemented. + +#### CreateBucket + +Garage does not accept creating buckets or giving access using API calls, it has to be done using the CLI tools. CreateBucket will return a 200 if the bucket exists and user has write access, and a 403 Forbidden in all other cases. + +#### CreateMultipartUpload + +Implemented. + +#### DeleteBucket + +Garage does not accept deleting buckets using API calls, it has to be done using the CLI tools. This request will return a 403 Forbidden. + +#### DeleteObject + +Implemented. + +#### DeleteObjects + +Implemented. + +#### GetObject + +Implemented. + +#### HeadBucket + +Implemented. + +#### HeadObject + +Implemented. + +#### ListObjects + +Implemented, but there isn't a very good specification of what `encoding-type=url` covers so there might be some encoding bugs. In our implementation the url-encoded fields are in the same in ListObjects as they are in ListObjectsV2. + +#### ListObjectsV2 + +Implemented. + +#### PutObject + +Implemented. + +#### UploadPart + +Implemented. + diff --git a/doc/book/src/SUMMARY.md b/doc/book/src/SUMMARY.md new file mode 100644 index 00000000..3e4618a1 --- /dev/null +++ b/doc/book/src/SUMMARY.md @@ -0,0 +1,26 @@ +# Summary + +[The Garage Data Store](./intro.md) + +- [Getting Started](./getting_started.md) + - [Installation](./getting_started/install.md) + - [Configure a cluster](./getting_started/cluster.md) + - [Create buckets and keys](./getting_started/bucket.md) + - [Handle files](./getting_started/files.md) + +# Cookbooks + +- [Host a website](./website.md) + +# Reference Manual + +- [S3 API](./compatibility.md) + +# Design + +- [Related Work](./related_work.md) +- [Internals](./internals.md) + +# Development + +- [Setup your environment](./devenv.md) diff --git a/doc/book/src/compatibility.md b/doc/book/src/compatibility.md new file mode 100644 index 00000000..acf9968b --- /dev/null +++ b/doc/book/src/compatibility.md @@ -0,0 +1 @@ +# S3 API diff --git a/doc/book/src/devenv.md b/doc/book/src/devenv.md new file mode 100644 index 00000000..6cb7c554 --- /dev/null +++ b/doc/book/src/devenv.md @@ -0,0 +1,17 @@ +# Setup your development environment + +We propose the following quickstart to setup a full dev. environment as quickly as possible: + + 1. Setup a rust/cargo environment. eg. `dnf install rust cargo` + 2. Install awscli v2 by following the guide [here](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html). + 3. Run `cargo build` to build the project + 4. Run `./script/dev-cluster.sh` to launch a test cluster (feel free to read the script) + 5. Run `./script/dev-configure.sh` to configure your test cluster with default values (same datacenter, 100 tokens) + 6. Run `./script/dev-bucket.sh` to create a bucket named `eprouvette` and an API key that will be stored in `/tmp/garage.s3` + 7. Run `source ./script/dev-env-aws.sh` to configure your CLI environment + 8. You can use `garage` to manage the cluster. Try `garage --help`. + 9. You can use the `awsgrg` alias to add, remove, and delete files. Try `awsgrg help`, `awsgrg cp /proc/cpuinfo s3://eprouvette/cpuinfo.txt`, or `awsgrg ls s3://eprouvette`. `awsgrg` is a wrapper on the `aws s3` command pre-configured with the previously generated API key (the one in `/tmp/garage.s3`) and localhost as the endpoint. + +Now you should be ready to start hacking on garage! + + diff --git a/doc/book/src/getting_started.md b/doc/book/src/getting_started.md new file mode 100644 index 00000000..7d3efd8b --- /dev/null +++ b/doc/book/src/getting_started.md @@ -0,0 +1,5 @@ +# Gettin Started + +Let's start your Garage journey! +In this chapter, we explain how to deploy a simple garage cluster and start interacting with it. +Our goal is to introduce you to Garage's workflows. diff --git a/doc/book/src/getting_started/bucket.md b/doc/book/src/getting_started/bucket.md new file mode 100644 index 00000000..c3b39d8d --- /dev/null +++ b/doc/book/src/getting_started/bucket.md @@ -0,0 +1 @@ +# Create buckets and keys diff --git a/doc/book/src/getting_started/cluster.md b/doc/book/src/getting_started/cluster.md new file mode 100644 index 00000000..868379b9 --- /dev/null +++ b/doc/book/src/getting_started/cluster.md @@ -0,0 +1 @@ +# Configure a cluster diff --git a/doc/book/src/getting_started/files.md b/doc/book/src/getting_started/files.md new file mode 100644 index 00000000..c8042dd3 --- /dev/null +++ b/doc/book/src/getting_started/files.md @@ -0,0 +1 @@ +# Handle files diff --git a/doc/book/src/getting_started/install.md b/doc/book/src/getting_started/install.md new file mode 100644 index 00000000..39557f02 --- /dev/null +++ b/doc/book/src/getting_started/install.md @@ -0,0 +1,3 @@ +# Installation + + diff --git a/doc/book/src/img/logo.svg b/doc/book/src/img/logo.svg new file mode 100644 index 00000000..fb02c40b --- /dev/null +++ b/doc/book/src/img/logo.svg @@ -0,0 +1,44 @@ +<?xml version="1.0" encoding="UTF-8"?> +<svg width="128" height="128" version="1.1" viewBox="0 0 33.867 33.867" xmlns="http://www.w3.org/2000/svg" xmlns:cc="http://creativecommons.org/ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"> + <metadata> + <rdf:RDF> + <cc:Work rdf:about=""> + <dc:format>image/svg+xml</dc:format> + <dc:type rdf:resource="http://purl.org/dc/dcmitype/StillImage"/> + <dc:title/> + </cc:Work> + </rdf:RDF> + </metadata> + <g stroke-width=".14689"> + <path d="m20.613 10.981a2.2034 2.2034 0 0 1-0.73445-0.07638l-9.2042-2.4839a2.2342 2.2342 0 0 1-0.69332-0.32757z"/> + <g fill="#4e4e4e"> + <path class="cls-1" d="m6.6028 26.612 1.3661-0.0088h0.01763q0.75796 0 0.75796 0.71389v2.3003a6.5748 6.5748 0 0 1-2.2886 0.37898q-1.2515 0-1.8861-0.8505t-0.63457-2.3179q0-1.4689 0.7888-2.2827a2.5823 2.5823 0 0 1 1.9301-0.81524 3.5371 3.5371 0 0 1 2.0667 0.64338 1.0385 1.0385 0 0 1-0.18068 0.46711 1.2603 1.2603 0 0 1-0.33932 0.35254 2.5926 2.5926 0 0 0-1.5027-0.51999 1.4175 1.4175 0 0 0-1.1854 0.54203q-0.42304 0.53909-0.42304 1.6966 0 2.1769 1.604 2.1769a4.4743 4.4743 0 0 0 0.97829-0.11457v-0.83728q0-0.3966 0.01763-0.58756h-0.64633a0.60519 0.60519 0 0 1-0.40101-0.11018 0.44067 0.44067 0 0 1-0.12779-0.35254 1.51 1.51 0 0 1 0.088134-0.47446z"/> + <path class="cls-1" d="m13.401 29.379a1.1413 1.1413 0 0 1-0.14689 0.31288 1.0664 1.0664 0 0 1-0.22474 0.25118 0.99592 0.99592 0 0 1-0.80937-0.51705 1.7847 1.7847 0 0 1-1.2603 0.56406q-0.67863 0-1.0282-0.3966a1.3573 1.3573 0 0 1-0.34372-0.9166q0-0.73445 0.48033-1.1149a1.9404 1.9404 0 0 1 1.2354-0.3687q0.40542 0 0.76677 0.03525v-0.2644q0-0.69626-0.66982-0.69626-0.47592 0-1.3485 0.31728a1.2368 1.2368 0 0 1-0.29378-0.78439 4.9164 4.9164 0 0 1 1.9096-0.3966 1.5526 1.5526 0 0 1 1.0752 0.37016q0.41423 0.37016 0.41423 1.1193v1.7979q-0.0029 0.48474 0.24384 0.68745zm-2.2122-0.22034a1.2471 1.2471 0 0 0 0.88134-0.42304v-0.77852a5.9182 5.9182 0 0 0-0.66982-0.03525 0.73445 0.73445 0 0 0-0.54643 0.18214 0.6331 0.6331 0 0 0-0.18508 0.46711 0.62282 0.62282 0 0 0 0.14689 0.44067 0.48768 0.48768 0 0 0 0.3731 0.14689z"/> + <path class="cls-1" d="m14.115 26.012a1.0547 1.0547 0 0 1 0.14689-0.32169 0.88134 0.88134 0 0 1 0.22474-0.25118 1.1017 1.1017 0 0 1 0.92982 0.78439q0.35254-0.78439 1.1369-0.78439a2.7028 2.7028 0 0 1 0.51118 0.06169 1.9786 1.9786 0 0 1-0.2644 1.0282 2.2357 2.2357 0 0 0-0.3966-0.05288q-0.53762 0-0.86372 0.57287v2.8174a3.0627 3.0627 0 0 1-0.53762 0.04407 3.3785 3.3785 0 0 1-0.55525-0.04407v-2.9525q-0.0059-0.6375-0.33197-0.90191z"/> + <path class="cls-1" d="m21.157 29.379a1.1413 1.1413 0 0 1-0.15423 0.31288 1.0664 1.0664 0 0 1-0.22474 0.25118 0.99592 0.99592 0 0 1-0.8079-0.51705 1.7847 1.7847 0 0 1-1.2603 0.56406q-0.67864 0-1.0282-0.3966a1.3573 1.3573 0 0 1-0.34372-0.9166q0-0.73445 0.48033-1.1149a1.9404 1.9404 0 0 1 1.2295-0.37457q0.40542 0 0.76677 0.03525v-0.2644q0-0.69626-0.66982-0.69626-0.47592 0-1.3485 0.31728a1.2368 1.2368 0 0 1-0.29378-0.7844 4.9164 4.9164 0 0 1 1.9096-0.3966 1.5526 1.5526 0 0 1 1.0752 0.37016q0.41423 0.37016 0.41423 1.1193v1.8038q0.0088 0.48474 0.25559 0.68745zm-2.2151-0.22034a1.2471 1.2471 0 0 0 0.88134-0.42304v-0.77852a5.9182 5.9182 0 0 0-0.66982-0.03525 0.73445 0.73445 0 0 0-0.54643 0.18508 0.6331 0.6331 0 0 0-0.18508 0.46711 0.62282 0.62282 0 0 0 0.14689 0.44067 0.48768 0.48768 0 0 0 0.3731 0.14395z"/> + <path class="cls-1" d="m22.241 29.344q-0.3966-0.60813-0.3966-1.679t0.50236-1.679a1.5188 1.5188 0 0 1 1.2074-0.60813 1.7039 1.7039 0 0 1 1.1898 0.44067 0.99739 0.99739 0 0 1 0.69626-0.37898 0.82552 0.82552 0 0 1 0.23356 0.24677 1.0282 1.0282 0 0 1 0.14689 0.30847q-0.24678 0.21152-0.24678 0.75796v2.4971q0 1.4013-0.4583 1.983-0.4583 0.58169-1.5071 0.58756a4.2598 4.2598 0 0 1-1.5776-0.29378 1.1854 1.1854 0 0 1 0.27322-0.80202 2.882 2.882 0 0 0 1.1854 0.27322q0.57728 0 0.79761-0.29378a1.322 1.322 0 0 0 0.22034-0.81084v-0.35254a1.6936 1.6936 0 0 1-1.1017 0.41423 1.3014 1.3014 0 0 1-1.1648-0.61106zm2.2651-0.71389v-2.0447a1.1355 1.1355 0 0 0-0.75796-0.36135 0.63604 0.63604 0 0 0-0.57728 0.37898 2.2988 2.2988 0 0 0-0.20712 1.0841q0 0.70508 0.18949 1.04a0.56406 0.56406 0 0 0 0.49796 0.33491 1.1193 1.1193 0 0 0 0.8549-0.43186z"/> + <path class="cls-1" d="m30.105 28.039h-2.4678a1.4924 1.4924 0 0 0 0.23356 0.80643q0.20712 0.28644 0.72711 0.28644a2.6778 2.6778 0 0 0 1.1546-0.30847 1.159 1.159 0 0 1 0.31728 0.66982 2.8467 2.8467 0 0 1-1.6966 0.50236q-0.99151 0-1.4234-0.64338-0.43186-0.64338-0.43186-1.6657 0-1.0282 0.47592-1.6657a1.5923 1.5923 0 0 1 1.3617-0.64338q0.88134 0 1.3617 0.53321a1.9434 1.9434 0 0 1 0.47593 1.344 3.4519 3.4519 0 0 1-0.08813 0.7844zm-1.701-1.8684q-0.7227 0-0.77558 1.0929h1.5335v-0.10576a1.25 1.25 0 0 0-0.18508-0.71389 0.64338 0.64338 0 0 0-0.567-0.27321z"/> + </g> + <path d="m17.034 3.0341a2.9114 2.9114 0 0 0-1.1462 0.24753l-11.697 5.1749a0.42304 0.42304 0 0 0-0.22169 0.56586 0.20418 0.20418 0 0 0 0.01757 0.04702l1.8769 3.7099h1.6288l-0.23151-1.2935c-0.0191-0.10429-0.18819-0.84337-0.3483-1.3751l5.4746 1.71c0.07196 0.34089 0.16746 0.65935 0.28112 0.9586h8.8765c0.0978-0.29932 0.17499-0.61834 0.22738-0.9586l5.4627-1.7053c-0.16011 0.53174-0.32713 1.2662-0.34623 1.3705l-0.23151 1.2935h1.6283l1.8593-3.6763 0.01757-0.03359 0.0181-0.04547a0.027909 0.027909 0 0 0 0-0.01188 0.39367 0.39367 0 0 0 0.01757-0.13643 0.41864 0.41864 0 0 0-0.26303-0.4191l-11.697-5.1749a2.9114 2.9114 0 0 0-1.2041-0.24753z" fill="#ffd952"/> + <path d="m17.034 5.4825a2.9114 2.9114 0 0 0-1.1462 0.24753l-11.697 5.1749a0.42304 0.42304 0 0 0-0.22169 0.56534 0.20418 0.20418 0 0 0 0.01757 0.04703l1.018 2.0118h2.1632c-0.068234-0.28802-0.15662-0.64282-0.25528-0.97049l3.1073 0.97048h14.121l3.0939-0.96583c-0.09841 0.32682-0.18541 0.67924-0.25321 0.96583h2.1627l1.0005-1.9782 0.01757-0.03359 0.0181-0.04547a0.027909 0.027909 0 0 0 0-0.01188 0.39367 0.39367 0 0 0 0.01757-0.13643 0.41864 0.41864 0 0 0-0.26303-0.41858l-11.697-5.1749a2.9114 2.9114 0 0 0-1.2041-0.24753z" fill="#49c8fa"/> + <path class="cls-2" d="m30.198 13.82a0.39367 0.39367 0 0 1-0.01762 0.13661 0.027909 0.027909 0 0 1 0 0.01175l-0.01762 0.04554-0.01762 0.03379-2.8306 5.5965c-0.39367 0.77705-1.1178 0.75355-0.99592-0.03232l0.56993-3.1817c0.0191-0.10429 0.18655-0.83874 0.34666-1.3705l-5.4629 1.7054c-0.85784 5.5716-8.1891 5.6641-9.3848 0l-5.4746-1.7098c0.16011 0.53174 0.32904 1.2706 0.34813 1.3749l0.56994 3.1816c0.12192 0.78586-0.60225 0.80937-0.99592 0.03232l-2.8482-5.6303a0.20418 0.20418 0 0 1-0.01763-0.04701 0.42304 0.42304 0 0 1 0.2218-0.56553l11.697-5.175a2.9114 2.9114 0 0 1 2.3502 0l11.697 5.175a0.41864 0.41864 0 0 1 0.26294 0.41864z" fill="#ffd952"/> + <path class="cls-3" d="m20.801 14.796 5.0574-2.0359a0.21446 0.21446 0 0 0 0-0.39807c-0.58756-0.24531-1.3132-0.52734-2.0242-0.82259-0.13073-0.05435-1.369 0.83434-1.4821 0.92541l-2.1799 1.7421c-0.52734 0.44214-0.07051 0.86959 0.62869 0.58903z" fill="#45c8ff"/> + <circle class="cls-3" cx="17.135" cy="16.785" r="2.6367" fill="#45c8ff"/> + <path d="m20.613 10.981a2.2034 2.2034 0 0 1-0.73445-0.07638l-9.2042-2.4839a2.2342 2.2342 0 0 1-0.69332-0.32757z"/> + <g fill="#4e4e4e"> + <path class="cls-1" d="m6.6028 26.612 1.3661-0.0088h0.01763q0.75796 0 0.75796 0.71389v2.3003a6.5748 6.5748 0 0 1-2.2886 0.37898q-1.2515 0-1.8861-0.8505t-0.63457-2.3179q0-1.4689 0.7888-2.2827a2.5823 2.5823 0 0 1 1.9301-0.81524 3.5371 3.5371 0 0 1 2.0667 0.64338 1.0385 1.0385 0 0 1-0.18068 0.46711 1.2603 1.2603 0 0 1-0.33932 0.35254 2.5926 2.5926 0 0 0-1.5027-0.51999 1.4175 1.4175 0 0 0-1.1854 0.54203q-0.42304 0.53909-0.42304 1.6966 0 2.1769 1.604 2.1769a4.4743 4.4743 0 0 0 0.97829-0.11457v-0.83728q0-0.3966 0.01763-0.58756h-0.64633a0.60519 0.60519 0 0 1-0.40101-0.11018 0.44067 0.44067 0 0 1-0.12779-0.35254 1.51 1.51 0 0 1 0.088134-0.47446z"/> + <path class="cls-1" d="m13.401 29.379a1.1413 1.1413 0 0 1-0.14689 0.31288 1.0664 1.0664 0 0 1-0.22474 0.25118 0.99592 0.99592 0 0 1-0.80937-0.51705 1.7847 1.7847 0 0 1-1.2603 0.56406q-0.67863 0-1.0282-0.3966a1.3573 1.3573 0 0 1-0.34372-0.9166q0-0.73445 0.48033-1.1149a1.9404 1.9404 0 0 1 1.2354-0.3687q0.40542 0 0.76677 0.03525v-0.2644q0-0.69626-0.66982-0.69626-0.47592 0-1.3485 0.31728a1.2368 1.2368 0 0 1-0.29378-0.78439 4.9164 4.9164 0 0 1 1.9096-0.3966 1.5526 1.5526 0 0 1 1.0752 0.37016q0.41423 0.37016 0.41423 1.1193v1.7979q-0.0029 0.48474 0.24384 0.68745zm-2.2122-0.22034a1.2471 1.2471 0 0 0 0.88134-0.42304v-0.77852a5.9182 5.9182 0 0 0-0.66982-0.03525 0.73445 0.73445 0 0 0-0.54643 0.18214 0.6331 0.6331 0 0 0-0.18508 0.46711 0.62282 0.62282 0 0 0 0.14689 0.44067 0.48768 0.48768 0 0 0 0.3731 0.14689z"/> + <path class="cls-1" d="m14.115 26.012a1.0547 1.0547 0 0 1 0.14689-0.32169 0.88134 0.88134 0 0 1 0.22474-0.25118 1.1017 1.1017 0 0 1 0.92982 0.78439q0.35254-0.78439 1.1369-0.78439a2.7028 2.7028 0 0 1 0.51118 0.06169 1.9786 1.9786 0 0 1-0.2644 1.0282 2.2357 2.2357 0 0 0-0.3966-0.05288q-0.53762 0-0.86372 0.57287v2.8174a3.0627 3.0627 0 0 1-0.53762 0.04407 3.3785 3.3785 0 0 1-0.55525-0.04407v-2.9525q-0.0059-0.6375-0.33197-0.90191z"/> + <path class="cls-1" d="m21.157 29.379a1.1413 1.1413 0 0 1-0.15423 0.31288 1.0664 1.0664 0 0 1-0.22474 0.25118 0.99592 0.99592 0 0 1-0.8079-0.51705 1.7847 1.7847 0 0 1-1.2603 0.56406q-0.67864 0-1.0282-0.3966a1.3573 1.3573 0 0 1-0.34372-0.9166q0-0.73445 0.48033-1.1149a1.9404 1.9404 0 0 1 1.2295-0.37457q0.40542 0 0.76677 0.03525v-0.2644q0-0.69626-0.66982-0.69626-0.47592 0-1.3485 0.31728a1.2368 1.2368 0 0 1-0.29378-0.7844 4.9164 4.9164 0 0 1 1.9096-0.3966 1.5526 1.5526 0 0 1 1.0752 0.37016q0.41423 0.37016 0.41423 1.1193v1.8038q0.0088 0.48474 0.25559 0.68745zm-2.2151-0.22034a1.2471 1.2471 0 0 0 0.88134-0.42304v-0.77852a5.9182 5.9182 0 0 0-0.66982-0.03525 0.73445 0.73445 0 0 0-0.54643 0.18508 0.6331 0.6331 0 0 0-0.18508 0.46711 0.62282 0.62282 0 0 0 0.14689 0.44067 0.48768 0.48768 0 0 0 0.3731 0.14395z"/> + <path class="cls-1" d="m22.241 29.344q-0.3966-0.60813-0.3966-1.679t0.50236-1.679a1.5188 1.5188 0 0 1 1.2074-0.60813 1.7039 1.7039 0 0 1 1.1898 0.44067 0.99739 0.99739 0 0 1 0.69626-0.37898 0.82552 0.82552 0 0 1 0.23356 0.24677 1.0282 1.0282 0 0 1 0.14689 0.30847q-0.24678 0.21152-0.24678 0.75796v2.4971q0 1.4013-0.4583 1.983-0.4583 0.58169-1.5071 0.58756a4.2598 4.2598 0 0 1-1.5776-0.29378 1.1854 1.1854 0 0 1 0.27322-0.80202 2.882 2.882 0 0 0 1.1854 0.27322q0.57728 0 0.79761-0.29378a1.322 1.322 0 0 0 0.22034-0.81084v-0.35254a1.6936 1.6936 0 0 1-1.1017 0.41423 1.3014 1.3014 0 0 1-1.1648-0.61106zm2.2651-0.71389v-2.0447a1.1355 1.1355 0 0 0-0.75796-0.36135 0.63604 0.63604 0 0 0-0.57728 0.37898 2.2988 2.2988 0 0 0-0.20712 1.0841q0 0.70508 0.18949 1.04a0.56406 0.56406 0 0 0 0.49796 0.33491 1.1193 1.1193 0 0 0 0.8549-0.43186z"/> + <path class="cls-1" d="m30.105 28.039h-2.4678a1.4924 1.4924 0 0 0 0.23356 0.80643q0.20712 0.28644 0.72711 0.28644a2.6778 2.6778 0 0 0 1.1546-0.30847 1.159 1.159 0 0 1 0.31728 0.66982 2.8467 2.8467 0 0 1-1.6966 0.50236q-0.99151 0-1.4234-0.64338-0.43186-0.64338-0.43186-1.6657 0-1.0282 0.47592-1.6657a1.5923 1.5923 0 0 1 1.3617-0.64338q0.88134 0 1.3617 0.53321a1.9434 1.9434 0 0 1 0.47593 1.344 3.4519 3.4519 0 0 1-0.08813 0.7844zm-1.701-1.8684q-0.7227 0-0.77558 1.0929h1.5335v-0.10576a1.25 1.25 0 0 0-0.18508-0.71389 0.64338 0.64338 0 0 0-0.567-0.27321z"/> + </g> + <g> + <path d="m17.034 3.0341a2.9114 2.9114 0 0 0-1.1462 0.24753l-11.697 5.1749a0.42304 0.42304 0 0 0-0.22169 0.56586 0.20418 0.20418 0 0 0 0.01757 0.04702l1.8769 3.7099h1.6288l-0.23151-1.2935c-0.0191-0.10429-0.18819-0.84337-0.3483-1.3751l5.4746 1.71c0.07196 0.34089 0.16746 0.65935 0.28112 0.9586h8.8765c0.0978-0.29932 0.17499-0.61834 0.22738-0.9586l5.4627-1.7053c-0.16011 0.53174-0.32713 1.2662-0.34623 1.3705l-0.23151 1.2935h1.6283l1.8593-3.6763 0.01757-0.03359 0.0181-0.04547a0.027909 0.027909 0 0 0 0-0.01188 0.39367 0.39367 0 0 0 0.01757-0.13643 0.41864 0.41864 0 0 0-0.26303-0.4191l-11.697-5.1749a2.9114 2.9114 0 0 0-1.2041-0.24753z" fill="#ff9329"/> + <path d="m17.034 5.4825a2.9114 2.9114 0 0 0-1.1462 0.24753l-11.697 5.1749a0.42304 0.42304 0 0 0-0.22169 0.56534 0.20418 0.20418 0 0 0 0.01757 0.04703l1.018 2.0118h2.1632c-0.068234-0.28802-0.15662-0.64282-0.25528-0.97049l3.1073 0.97048h14.121l3.0939-0.96583c-0.09841 0.32682-0.18541 0.67924-0.25321 0.96583h2.1627l1.0005-1.9782 0.01757-0.03359 0.0181-0.04547a0.027909 0.027909 0 0 0 0-0.01188 0.39367 0.39367 0 0 0 0.01757-0.13643 0.41864 0.41864 0 0 0-0.26303-0.41858l-11.697-5.1749a2.9114 2.9114 0 0 0-1.2041-0.24753z" fill="#4e4e4e"/> + <path class="cls-2" d="m30.198 13.82a0.39367 0.39367 0 0 1-0.01762 0.13661 0.027909 0.027909 0 0 1 0 0.01175l-0.01762 0.04554-0.01762 0.03379-2.8306 5.5965c-0.39367 0.77705-1.1178 0.75355-0.99592-0.03232l0.56993-3.1817c0.0191-0.10429 0.18655-0.83874 0.34666-1.3705l-5.4629 1.7054c-0.85784 5.5716-8.1891 5.6641-9.3848 0l-5.4746-1.7098c0.16011 0.53174 0.32904 1.2706 0.34813 1.3749l0.56994 3.1816c0.12192 0.78586-0.60225 0.80937-0.99592 0.03232l-2.8482-5.6303a0.20418 0.20418 0 0 1-0.01763-0.04701 0.42304 0.42304 0 0 1 0.2218-0.56553l11.697-5.175a2.9114 2.9114 0 0 1 2.3502 0l11.697 5.175a0.41864 0.41864 0 0 1 0.26294 0.41864z" fill="#ff9329"/> + <path class="cls-3" d="m20.801 14.796 5.0574-2.0359a0.21446 0.21446 0 0 0 0-0.39807c-0.58756-0.24531-1.3132-0.52734-2.0242-0.82259-0.13073-0.05435-1.369 0.83434-1.4821 0.92541l-2.1799 1.7421c-0.52734 0.44214-0.07051 0.86959 0.62869 0.58903z" fill="#4e4e4e"/> + <circle class="cls-3" cx="17.135" cy="16.785" r="2.6367" fill="#4e4e4e"/> + </g> + </g> +</svg> diff --git a/doc/book/src/internals.md b/doc/book/src/internals.md new file mode 100644 index 00000000..e712ae07 --- /dev/null +++ b/doc/book/src/internals.md @@ -0,0 +1,158 @@ +**WARNING: this documentation is more a "design draft", which was written before Garage's actual implementation. The general principle is similar but details have not yet been updated.** + +#### Modules + +- `membership/`: configuration, membership management (gossip of node's presence and status), ring generation --> what about Serf (used by Consul/Nomad) : https://www.serf.io/? Seems a huge library with many features so maybe overkill/hard to integrate +- `metadata/`: metadata management +- `blocks/`: block management, writing, GC and rebalancing +- `internal/`: server to server communication (HTTP server and client that reuses connections, TLS if we want, etc) +- `api/`: S3 API +- `web/`: web management interface + +#### Metadata tables + +**Objects:** + +- *Hash key:* Bucket name (string) +- *Sort key:* Object key (string) +- *Sort key:* Version timestamp (int) +- *Sort key:* Version UUID (string) +- Complete: bool +- Inline: bool, true for objects < threshold (say 1024) +- Object size (int) +- Mime type (string) +- Data for inlined objects (blob) +- Hash of first block otherwise (string) + +*Having only a hash key on the bucket name will lead to storing all file entries of this table for a specific bucket on a single node. At the same time, it is the only way I see to rapidly being able to list all bucket entries...* + +**Blocks:** + +- *Hash key:* Version UUID (string) +- *Sort key:* Offset of block in total file (int) +- Hash of data block (string) + +A version is defined by the existence of at least one entry in the blocks table for a certain version UUID. +We must keep the following invariant: if a version exists in the blocks table, it has to be referenced in the objects table. +We explicitly manage concurrent versions of an object: the version timestamp and version UUID columns are index columns, thus we may have several concurrent versions of an object. +Important: before deleting an older version from the objects table, we must make sure that we did a successfull delete of the blocks of that version from the blocks table. + +Thus, the workflow for reading an object is as follows: + +1. Check permissions (LDAP) +2. Read entry in object table. If data is inline, we have its data, stop here. + -> if several versions, take newest one and launch deletion of old ones in background +3. Read first block from cluster. If size <= 1 block, stop here. +4. Simultaneously with previous step, if size > 1 block: query the Blocks table for the IDs of the next blocks +5. Read subsequent blocks from cluster + +Workflow for PUT: + +1. Check write permission (LDAP) +2. Select a new version UUID +3. Write a preliminary entry for the new version in the objects table with complete = false +4. Send blocks to cluster and write entries in the blocks table +5. Update the version with complete = true and all of the accurate information (size, etc) +6. Return success to the user +7. Launch a background job to check and delete older versions + +Workflow for DELETE: + +1. Check write permission (LDAP) +2. Get current version (or versions) in object table +3. Do the deletion of those versions NOT IN A BACKGROUND JOB THIS TIME +4. Return succes to the user if we were able to delete blocks from the blocks table and entries from the object table + +To delete a version: + +1. List the blocks from Cassandra +2. For each block, delete it from cluster. Don't care if some deletions fail, we can do GC. +3. Delete all of the blocks from the blocks table +4. Finally, delete the version from the objects table + +Known issue: if someone is reading from a version that we want to delete and the object is big, the read might be interrupted. I think it is ok to leave it like this, we just cut the connection if data disappears during a read. + +("Soit P un problème, on s'en fout est une solution à ce problème") + +#### Block storage on disk + +**Blocks themselves:** + +- file path = /blobs/(first 3 hex digits of hash)/(rest of hash) + +**Reverse index for GC & other block-level metadata:** + +- file path = /meta/(first 3 hex digits of hash)/(rest of hash) +- map block hash -> set of version UUIDs where it is referenced + +Usefull metadata: + +- list of versions that reference this block in the Casandra table, so that we can do GC by checking in Cassandra that the lines still exist +- list of other nodes that we know have acknowledged a write of this block, usefull in the rebalancing algorithm + +Write strategy: have a single thread that does all write IO so that it is serialized (or have several threads that manage independent parts of the hash space). When writing a blob, write it to a temporary file, close, then rename so that a concurrent read gets a consistent result (either not found or found with whole content). + +Read strategy: the only read operation is get(hash) that returns either the data or not found (can do a corruption check as well and return corrupted state if it is the case). Can be done concurrently with writes. + +**Internal API:** + +- get(block hash) -> ok+data/not found/corrupted +- put(block hash & data, version uuid + offset) -> ok/error +- put with no data(block hash, version uuid + offset) -> ok/not found plz send data/error +- delete(block hash, version uuid + offset) -> ok/error + +GC: when last ref is deleted, delete block. +Long GC procedure: check in Cassandra that version UUIDs still exist and references this block. + +Rebalancing: takes as argument the list of newly added nodes. + +- List all blocks that we have. For each block: +- If it hits a newly introduced node, send it to them. + Use put with no data first to check if it has to be sent to them already or not. + Use a random listing order to avoid race conditions (they do no harm but we might have two nodes sending the same thing at the same time thus wasting time). +- If it doesn't hit us anymore, delete it and its reference list. + +Only one balancing can be running at a same time. It can be restarted at the beginning with new parameters. + +#### Membership management + +Two sets of nodes: + +- set of nodes from which a ping was recently received, with status: number of stored blocks, request counters, error counters, GC%, rebalancing% + (eviction from this set after say 30 seconds without ping) +- set of nodes that are part of the system, explicitly modified by the operator using the web UI (persisted to disk), + is a CRDT using a version number for the value of the whole set + +Thus, three states for nodes: + +- healthy: in both sets +- missing: not pingable but part of desired cluster +- unused/draining: currently present but not part of the desired cluster, empty = if contains nothing, draining = if still contains some blocks + +Membership messages between nodes: + +- ping with current state + hash of current membership info -> reply with same info +- send&get back membership info (the ids of nodes that are in the two sets): used when no local membership change in a long time and membership info hash discrepancy detected with first message (passive membership fixing with full CRDT gossip) +- inform of newly pingable node(s) -> no result, when receive new info repeat to all (reliable broadcast) +- inform of operator membership change -> no result, when receive new info repeat to all (reliable broadcast) + +Ring: generated from the desired set of nodes, however when doing read/writes on the ring, skip nodes that are known to be not pingable. +The tokens are generated in a deterministic fashion from node IDs (hash of node id + token number from 1 to K). +Number K of tokens per node: decided by the operator & stored in the operator's list of nodes CRDT. Default value proposal: with node status information also broadcast disk total size and free space, and propose a default number of tokens equal to 80%Free space / 10Gb. (this is all user interface) + + +#### Constants + +- Block size: around 1MB ? --> Exoscale use 16MB chunks +- Number of tokens in the hash ring: one every 10Gb of allocated storage +- Threshold for storing data directly in Cassandra objects table: 1kb bytes (maybe up to 4kb?) +- Ping timeout (time after which a node is registered as unresponsive/missing): 30 seconds +- Ping interval: 10 seconds +- ?? + +#### Links + +- CDC: <https://www.usenix.org/system/files/conference/atc16/atc16-paper-xia.pdf> +- Erasure coding: <http://web.eecs.utk.edu/~jplank/plank/papers/CS-08-627.html> +- [Openstack Storage Concepts](https://docs.openstack.org/arch-design/design-storage/design-storage-concepts.html) +- [RADOS](https://ceph.com/wp-content/uploads/2016/08/weil-rados-pdsw07.pdf) diff --git a/doc/book/src/intro.md b/doc/book/src/intro.md new file mode 100644 index 00000000..5455ae71 --- /dev/null +++ b/doc/book/src/intro.md @@ -0,0 +1,62 @@ +![Garage's Logo](img/logo.svg) + +# The Garage Geo-Distributed Data Store + +Garage is a lightweight geo-distributed data store. +It comes from the observation that despite numerous object stores +many people have broken data management policies (backup/replication on a single site or none at all). +To promote better data management policies, with focused on the following desirable properties: + + - **Self-contained & lightweight**: works everywhere and integrates well in existing environments to target hyperconverged infrastructures + - **Highly resilient**: highly resilient to network failures, network latency, disk failures, sysadmin failures + - **Simple**: simple to understand, simple to operate, simple to debug + - **Internet enabled**: Made for multi-sites (eg. datacenter, offices, etc.) interconnected through a regular internet connection. + +We also noted that the pursuit of some other goals are detrimental to our initial goals. +The following have been identified has non-goals, if it matters to you, you should not use Garage: + + - **Extreme performances**: high performances constrain a lot the design and the deployment. We always prioritize + - **Feature extensiveness**: Complete implementation of the S3 API + - **Storage optimizations**: Erasure coding (our replication model is simply to copy the data as is on several nodes, in different datacenters if possible) + - **POSIX/Filesystem compatibility**: We do not aim at being POSIX compatible or to emulate any kind of filesystem. Indeed, in a distributed environment, such syncronizations are translated in network messages that impose severe constraints on the deployment. + +## Integration in environments + +Garage speaks (or will speak) the following protocols: + + - [S3](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html) - *SUPPORTED* - Enable applications to store large blobs such as pictures, video, images, documents, etc. S3 is versatile enough to also be used to publish a static website. + - [IMAP](https://github.com/go-pluto/pluto) - *PLANNED* - email storage is quite complex to get good oerformances. +To keep performances optimals, most imap servers only support on-disk storage. +We plan to add logic to Garage to make it a viable solution for email storage. + - *More to come* + +## Use Cases + +**[Deuxfleurs](https://deuxfleurs.fr) :** Garage is used by Deuxfleurs which is a non-profit hosting organization. +Especially, it is used to host their main website, this documentation and some of its members's blogs. Additionally, +Garage is used as a [backend for Nextcloud](https://docs.nextcloud.com/server/20/admin_manual/configuration_files/primary_storage.html). Deuxfleurs also plans to use Garage as their [Matrix's media backend](https://github.com/matrix-org/synapse-s3-storage-provider) and has the backend of [OCIS](https://github.com/owncloud/ocis). + +*Are you using Garage? Open a pull request to add your organization here!* + +## Comparisons to existing software + +**[Minio](https://min.io/) :** Minio shares our *self-contained & lightweight* goal but selected two of our non-goals: *storage optimizations* through erasure coding and *POSIX/Filesystem compatibility* through strong consistency. +However, by pursuing these two non-goals, minio do not reach our desirable properties. +First, it fails on the *simple* property: due to the erasure coding, minio has severe limitations on how drives can be added or deleted from a cluster. +Second, it fails on the *interned enabled* property: due to its strong consistency, minio is latency sensitive. +Furthermore, minio has no knowledge of "sites" and thus can not distribute data to minimize the failure of a given site. + +**[Openstack Swift](https://docs.openstack.org/swift/latest/)** +OpenStack Swift at least fails on the *self-contained & lightweight* goal. +Starting it requires around 8Gb of RAM, which is too much especially in an hyperconverged infrastructure. +It seems also to be far from *Simple*. + +**[Pithos](https://github.com/exoscale/pithos)** +Pithos has been abandonned and should probably not used yet, in the following we explain why we did not pick their design. +Pithos was relying as a S3 proxy in front of Cassandra (and was working with Scylla DB too). +From its designers' mouth, storing data in Cassandra has shown its limitations justifying the project abandonment. +They built a closed-source version 2 that does not store blobs in the database (only metadata) but did not communicate further on it. +We considered there v2's design but concluded that it does not fit both our *Self-contained & lightweight* and *Simple* properties. It makes the development, the deployment and the operations more complicated while reducing the flexibility. + +**[IPFS](https://ipfs.io/)** +*Not written yet* diff --git a/doc/book/src/quickstart_bucket.md b/doc/book/src/quickstart_bucket.md new file mode 100644 index 00000000..bd574e14 --- /dev/null +++ b/doc/book/src/quickstart_bucket.md @@ -0,0 +1,71 @@ +# Configuring a cluster + +First, chances are that your garage deployment is secured by TLS. +All your commands must be prefixed with their certificates. +I will define an alias once and for all to ease future commands. +Please adapt the path of the binary and certificates to your installation! + +``` +alias grg="/garage/garage --ca-cert /secrets/garage-ca.crt --client-cert /secrets/garage.crt --client-key /secrets/garage.key" +``` + +Now we can check that everything is going well by checking our cluster status: + +``` +grg status +``` + +Don't forget that `help` command and `--help` subcommands can help you anywhere, the CLI tool is self-documented! Two examples: + +``` +grg help +grg bucket allow --help +``` + +Fine, now let's create a bucket (we imagine that you want to deploy nextcloud): + +``` +grg bucket create nextcloud-bucket +``` + +Check that everything went well: + +``` +grg bucket list +grg bucket info nextcloud-bucket +``` + +Now we will generate an API key to access this bucket. +Note that API keys are independent of buckets: one key can access multiple buckets, multiple keys can access one bucket. + +Now, let's start by creating a key only for our PHP application: + +``` +grg key new --name nextcloud-app-key +``` + +You will have the following output (this one is fake, `key_id` and `secret_key` were generated with the openssl CLI tool): + +``` +Key { key_id: "GK3515373e4c851ebaad366558", secret_key: "7d37d093435a41f2aab8f13c19ba067d9776c90215f56614adad6ece597dbb34", name: "nextcloud-app-key", name_timestamp: 1603280506694, deleted: false, authorized_buckets: [] } +``` + +Check that everything works as intended (be careful, info works only with your key identifier and not with its friendly name!): + +``` +grg key list +grg key info GK3515373e4c851ebaad366558 +``` + +Now that we have a bucket and a key, we need to give permissions to the key on the bucket! + +``` +grg bucket allow --read --write nextcloud-bucket --key GK3515373e4c851ebaad366558 +``` + +You can check at any times allowed keys on your bucket with: + +``` +grg bucket info nextcloud-bucket +``` + diff --git a/doc/book/src/related_work.md b/doc/book/src/related_work.md new file mode 100644 index 00000000..c1a4eed4 --- /dev/null +++ b/doc/book/src/related_work.md @@ -0,0 +1,38 @@ +## Context + +Data storage is critical: it can lead to data loss if done badly and/or on hardware failure. +Filesystems + RAID can help on a single machine but a machine failure can put the whole storage offline. +Moreover, it put a hard limit on scalability. Often this limit can be pushed back far away by buying expensive machines. +But here we consider non specialized off the shelf machines that can be as low powered and subject to failures as a raspberry pi. + +Distributed storage may help to solve both availability and scalability problems on these machines. +Many solutions were proposed, they can be categorized as block storage, file storage and object storage depending on the abstraction they provide. + +## Related work + +Block storage is the most low level one, it's like exposing your raw hard drive over the network. +It requires very low latencies and stable network, that are often dedicated. +However it provides disk devices that can be manipulated by the operating system with the less constraints: it can be partitioned with any filesystem, meaning that it supports even the most exotic features. +We can cite [iSCSI](https://en.wikipedia.org/wiki/ISCSI) or [Fibre Channel](https://en.wikipedia.org/wiki/Fibre_Channel). +Openstack Cinder proxy previous solution to provide an uniform API. + +File storage provides a higher abstraction, they are one filesystem among others, which means they don't necessarily have all the exotic features of every filesystem. +Often, they relax some POSIX constraints while many applications will still be compatible without any modification. +As an example, we are able to run MariaDB (very slowly) over GlusterFS... +We can also mention CephFS (read [RADOS](https://ceph.com/wp-content/uploads/2016/08/weil-rados-pdsw07.pdf) whitepaper), Lustre, LizardFS, MooseFS, etc. +OpenStack Manila proxy previous solutions to provide an uniform API. + +Finally object storages provide the highest level abstraction. +They are the testimony that the POSIX filesystem API is not adapted to distributed filesystems. +Especially, the strong concistency has been dropped in favor of eventual consistency which is way more convenient and powerful in presence of high latencies and unreliability. +We often read about S3 that pioneered the concept that it's a filesystem for the WAN. +Applications must be adapted to work for the desired object storage service. +Today, the S3 HTTP REST API acts as a standard in the industry. +However, Amazon S3 source code is not open but alternatives were proposed. +We identified Minio, Pithos, Swift and Ceph. +Minio/Ceph enforces a total order, so properties similar to a (relaxed) filesystem. +Swift and Pithos are probably the most similar to AWS S3 with their consistent hashing ring. +However Pithos is not maintained anymore. More precisely the company that published Pithos version 1 has developped a second version 2 but has not open sourced it. +Some tests conducted by the [ACIDES project](https://acides.org/) have shown that Openstack Swift consumes way more resources (CPU+RAM) that we can afford. Furthermore, people developing Swift have not designed their software for geo-distribution. + +There were many attempts in research too. I am only thinking to [LBFS](https://pdos.csail.mit.edu/papers/lbfs:sosp01/lbfs.pdf) that was used as a basis for Seafile. But none of them have been effectively implemented yet. diff --git a/doc/book/src/website.md b/doc/book/src/website.md new file mode 100644 index 00000000..2ea82a9a --- /dev/null +++ b/doc/book/src/website.md @@ -0,0 +1 @@ +# Host a website |