From f6aebefcc9747bf5afad3767e9ae6f9f3aba30ae Mon Sep 17 00:00:00 2001 From: Alex Auvolat Date: Wed, 14 Sep 2022 19:31:13 +0200 Subject: Some work on documentation towards v0.8 --- doc/book/design/benchmarks/index.md | 2 +- doc/book/design/goals.md | 4 ++-- doc/book/design/internals.md | 43 +++++++++++++++++++++++++++++++++++++ doc/book/design/related-work.md | 2 +- 4 files changed, 47 insertions(+), 4 deletions(-) (limited to 'doc/book/design') diff --git a/doc/book/design/benchmarks/index.md b/doc/book/design/benchmarks/index.md index c2215a4a..79cc5d62 100644 --- a/doc/book/design/benchmarks/index.md +++ b/doc/book/design/benchmarks/index.md @@ -1,6 +1,6 @@ +++ title = "Benchmarks" -weight = 10 +weight = 40 +++ With Garage, we wanted to build a software defined storage service that follow the [KISS principle](https://en.wikipedia.org/wiki/KISS_principle), diff --git a/doc/book/design/goals.md b/doc/book/design/goals.md index dea1d2c8..b97d73a9 100644 --- a/doc/book/design/goals.md +++ b/doc/book/design/goals.md @@ -1,13 +1,13 @@ +++ title = "Goals and use cases" -weight = 5 +weight = 10 +++ ## Goals and non-goals Garage is a lightweight geo-distributed data store that implements the [Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html) -object storage protocole. It enables applications to store large blobs such +object storage protocol. It enables applications to store large blobs such as pictures, video, images, documents, etc., in a redundant multi-node setting. S3 is versatile enough to also be used to publish a static website. diff --git a/doc/book/design/internals.md b/doc/book/design/internals.md index 05d852e2..777e017d 100644 --- a/doc/book/design/internals.md +++ b/doc/book/design/internals.md @@ -20,6 +20,49 @@ In the meantime, you can find some information at the following links: - [an old design draft](@/documentation/working-documents/design-draft.md) +## Request routing logic + +Data retrieval requests to Garage endpoints (S3 API and websites) are resolved +to an individual object in a bucket. Since objects are replicated to multiple nodes +Garage must ensure consistency before answering the request. + +### Using quorum to ensure consistency + +Garage ensures consistency by attempting to establish a quorum with the +data nodes responsible for the object. When a majority of the data nodes +have provided metadata on a object Garage can then answer the request. + +When a request arrives Garage will, assuming the recommended 3 replicas, perform the following actions: + +- Make a request to the two preferred nodes for object metadata +- Try the third node if one of the two initial requests fail +- Check that the metadata from at least 2 nodes match +- Check that the object hasn't been marked deleted +- Answer the request with inline data from metadata if object is small enough +- Or get data blocks from the preferred nodes and answer using the assembled object + +Garage dynamically determines which nodes to query based on health, preference, and +which nodes actually host a given data. Garage has no concept of "primary" so any +healthy node with the data can be used as long as a quorum is reached for the metadata. + +### Node health + +Garage keeps a TCP session open to each node in the cluster and periodically pings them. If a connection +cannot be established, or a node fails to answer a number of pings, the target node is marked as failed. +Failed nodes are not used for quorum or other internal requests. + +### Node preference + +Garage prioritizes which nodes to query according to a few criteria: + +- A node always prefers itself if it can answer the request +- Then the node prioritizes nodes in the same zone +- Finally the nodes with the lowest latency are prioritized + + +For further reading on the cluster structure look at the [gateway](@/documentation/cookbook/gateways.md) +and [cluster layout management](@/documentation/reference-manual/layout.md) pages. + ## Garbage collection A faulty garbage collection procedure has been the cause of diff --git a/doc/book/design/related-work.md b/doc/book/design/related-work.md index ade298ec..f96c6618 100644 --- a/doc/book/design/related-work.md +++ b/doc/book/design/related-work.md @@ -1,6 +1,6 @@ +++ title = "Related work" -weight = 15 +weight = 50 +++ ## Context -- cgit v1.2.3