author Mendes <mendes.oulamara@pm.me> 2022-10-04 18:14:49 +0200
committer Mendes <mendes.oulamara@pm.me> 2022-10-04 18:14:49 +0200
commit 829f815a897b04986559910bbcbf53625adcdf20
tree 6db3c27cff2aded754a641d1f2b05c83be701267 /doc/book/design
parent 99f96b9564c9c841dc6c56f1255a6e70ff884d46
parent a096ced35562bd0a8877a1ee2f755be1edafe343
Merge remote-tracking branch 'origin/main' into optimal-layout
Diffstat (limited to 'doc/book/design')
-rw-r--r-- doc/book/design/benchmarks/index.md | 2
-rw-r--r-- doc/book/design/goals.md | 6
-rw-r--r-- doc/book/design/internals.md | 43
-rw-r--r-- doc/book/design/related-work.md | 2
4 files changed, 48 insertions, 5 deletions
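
Note on the "Request routing logic" section that this commit adds to `internals.md`: the read path it describes (rank the healthy nodes by preference, then collect metadata until two replicas agree) can be sketched as follows. This is a minimal illustration, not Garage's actual implementation, which is written in Rust. The names `Node`, `Meta`, `order_by_preference`, `read_metadata_with_quorum`, and the `fetch` callback are all hypothetical, and the sketch queries nodes one at a time where Garage contacts the two preferred nodes up front.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass(frozen=True)
class Node:
    # Hypothetical view of a cluster node, for illustration only.
    name: str
    zone: str
    latency_ms: float
    healthy: bool = True


@dataclass(frozen=True)
class Meta:
    # Hypothetical object metadata: a version and a deletion marker.
    version: int
    deleted: bool = False


def order_by_preference(nodes: Sequence[Node], local: Node) -> List[Node]:
    """Drop failed nodes, then rank: self first, same zone next, lowest latency last."""
    healthy = [n for n in nodes if n.healthy]
    return sorted(
        healthy,
        key=lambda n: (n.name != local.name, n.zone != local.zone, n.latency_ms),
    )


def read_metadata_with_quorum(
    nodes: Sequence[Node],
    local: Node,
    fetch: Callable[[Node], Meta],
    quorum: int = 2,
) -> Optional[Meta]:
    """Return metadata once `quorum` nodes agree on it.

    Returns None when no quorum is reached or when the agreed-upon
    metadata marks the object as deleted.
    """
    votes: Counter = Counter()
    for node in order_by_preference(nodes, local):
        try:
            meta = fetch(node)
        except ConnectionError:
            # A failed request simply moves on to the next preferred node.
            continue
        votes[meta] += 1
        if votes[meta] >= quorum:
            return None if meta.deleted else meta
    return None
```

With three replicas and a `fetch` that raises `ConnectionError` for one of the two preferred nodes, the loop falls through to the third node and still returns the metadata as soon as two answers match. If two of the three nodes are unreachable, no quorum is possible and the function returns `None`.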
diff --git a/doc/book/design/benchmarks/index.md b/doc/book/design/benchmarks/index.md
index c2215a4a..79cc5d62 100644
--- a/doc/book/design/benchmarks/index.md
+++ b/doc/book/design/benchmarks/index.md
@@ -1,6 +1,6 @@
+++
title = "Benchmarks"
-weight = 10
+weight = 40
+++
With Garage, we wanted to build a software defined storage service that follow the [KISS principle](https://en.wikipedia.org/wiki/KISS_principle),
diff --git a/doc/book/design/goals.md b/doc/book/design/goals.md
index dea1d2c8..9c2d89f0 100644
--- a/doc/book/design/goals.md
+++ b/doc/book/design/goals.md
@@ -1,23 +1,23 @@
+++
title = "Goals and use cases"
-weight = 5
+weight = 10
+++
## Goals and non-goals
Garage is a lightweight geo-distributed data store that implements the
[Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/API/Welcome.html)
-object storage protocole. It enables applications to store large blobs such
+object storage protocol. It enables applications to store large blobs such
as pictures, video, images, documents, etc., in a redundant multi-node
setting. S3 is versatile enough to also be used to publish a static
website.
Garage is an opinionated object storage solutoin, we focus on the following **desirable properties**:
+ - **Internet enabled**: made for multi-sites (eg. datacenters, offices, households, etc.) interconnected through regular Internet connections.
- **Self-contained & lightweight**: works everywhere and integrates well in existing environments to target [hyperconverged infrastructures](https://en.wikipedia.org/wiki/Hyper-converged_infrastructure).
- **Highly resilient**: highly resilient to network failures, network latency, disk failures, sysadmin failures.
- **Simple**: simple to understand, simple to operate, simple to debug.
- - **Internet enabled**: made for multi-sites (eg. datacenters, offices, households, etc.) interconnected through regular Internet connections.
We also noted that the pursuit of some other goals are detrimental to our initial goals.
The following has been identified as **non-goals** (if these points matter to you, you should not use Garage):
diff --git a/doc/book/design/internals.md b/doc/book/design/internals.md
index 05d852e2..777e017d 100644
--- a/doc/book/design/internals.md
+++ b/doc/book/design/internals.md
@@ -20,6 +20,49 @@ In the meantime, you can find some information at the following links:
- [an old design draft](@/documentation/working-documents/design-draft.md)
+## Request routing logic
+
+Data retrieval requests to Garage endpoints (S3 API and websites) are resolved
+to an individual object in a bucket. Since objects are replicated to multiple nodes,
+Garage must ensure consistency before answering the request.
+
+### Using quorum to ensure consistency
+
+Garage ensures consistency by attempting to establish a quorum with the
+data nodes responsible for the object. Once a majority of the data nodes
+have provided metadata on an object, Garage can answer the request.
+
+When a request arrives, Garage performs the following actions, assuming the recommended 3 replicas:
+
+- Make a request to the two preferred nodes for object metadata
+- Try the third node if one of the two initial requests fails
+- Check that the metadata from at least 2 nodes match
+- Check that the object hasn't been marked deleted
+- Answer the request with inline data from the metadata if the object is small enough
+- Otherwise, get the data blocks from the preferred nodes and answer with the assembled object
+
+Garage dynamically determines which nodes to query based on health, preference, and
+which nodes actually host a given piece of data. Garage has no concept of a "primary" node, so any
+healthy node with the data can be used as long as a quorum is reached for the metadata.
+
+### Node health
+
+Garage keeps a TCP session open to each node in the cluster and periodically pings them. If a connection
+cannot be established, or a node fails to answer a number of pings, the target node is marked as failed.
+Failed nodes are not used for quorum or other internal requests.
+
+### Node preference
+
+Garage prioritizes which nodes to query according to a few criteria:
+
+- A node always prefers itself if it can answer the request
+- Then the node prioritizes nodes in the same zone
+- Finally the nodes with the lowest latency are prioritized
+
+
+For further reading on the cluster structure, see the [gateway](@/documentation/cookbook/gateways.md)
+and [cluster layout management](@/documentation/reference-manual/layout.md) pages.
+
## Garbage collection
A faulty garbage collection procedure has been the cause of
diff --git a/doc/book/design/related-work.md b/doc/book/design/related-work.md
index ade298ec..f96c6618 100644
--- a/doc/book/design/related-work.md
+++ b/doc/book/design/related-work.md
@@ -1,6 +1,6 @@
+++
title = "Related work"
-weight = 15
+weight = 50
+++
## Context