Refactor file organization

author: Quentin Dufour <quentin@deuxfleurs.fr> 2021-03-17 16:15:18 +0100
committer: Quentin Dufour <quentin@deuxfleurs.fr> 2021-03-17 16:15:18 +0100
commit: 002538f92c1d9f95f2d699337f7d891c6aa0c9a4 (patch)
tree: 054aac5ce5e637c7baf3d15238c8c0c1ed8e97f4 /doc/book/src/design/related_work.md
parent: c50113acf3fd61dcb77bc01bd6e9f226f813bf76 (diff)
download: garage-002538f92c1d9f95f2d699337f7d891c6aa0c9a4.tar.gz
garage-002538f92c1d9f95f2d699337f7d891c6aa0c9a4.zip
1 files changed, 56 insertions, 0 deletions
diff --git a/doc/book/src/design/related_work.md b/doc/book/src/design/related_work.md
new file mode 100644
index 00000000..bae4691c
--- /dev/null
+++ b/doc/book/src/design/related_work.md
@@ -0,0 +1,56 @@
+# Related Work
+
+## Context
+
+Data storage is critical: poor design or a hardware failure can lead to data loss.
+A filesystem on top of RAID helps on a single machine, but a failure of that machine can still take the whole storage offline.
+Moreover, a single machine puts a hard limit on scalability. This limit can often be pushed far back by buying expensive machines.
+But here we consider non-specialized, off-the-shelf machines that can be as low-powered and failure-prone as a Raspberry Pi.
+
+Distributed storage may help to solve both availability and scalability problems on these machines.
+Many solutions have been proposed; they can be categorized as block storage, file storage and object storage, depending on the abstraction they provide.
+
+## Overview
+
+Block storage is the lowest-level abstraction: it is like exposing your raw hard drive over the network.
+It requires very low latency and a stable network, which is often dedicated.
+However, it provides disk devices that the operating system can manipulate with the fewest constraints: they can be partitioned and formatted with any filesystem, meaning that even the most exotic features are supported.
+We can cite [iSCSI](https://en.wikipedia.org/wiki/ISCSI) or [Fibre Channel](https://en.wikipedia.org/wiki/Fibre_Channel).
+OpenStack Cinder proxies the previous solutions to provide a uniform API.
+
+File storage provides a higher-level abstraction: each solution is one filesystem among others, which means it does not necessarily have all the exotic features of every filesystem.
+Often, these solutions relax some POSIX constraints, while many applications remain compatible without any modification.
+As an example, we are able to run MariaDB (very slowly) over GlusterFS...
+We can also mention CephFS (read the [RADOS](https://ceph.com/wp-content/uploads/2016/08/weil-rados-pdsw07.pdf) whitepaper), Lustre, LizardFS, MooseFS, etc.
+OpenStack Manila proxies the previous solutions to provide a uniform API.
+
+Finally, object storage provides the highest-level abstraction.
+Its existence is a testimony that the POSIX filesystem API is not well suited to distributed filesystems.
+In particular, strong consistency has been dropped in favor of eventual consistency, which is far more convenient and powerful in the presence of high latency and unreliability.
+S3, which pioneered the concept, is often described as a filesystem for the WAN.
+Applications must be adapted to work with the desired object storage service.
+Today, the S3 HTTP REST API acts as a standard in the industry.
+Amazon S3's source code is not open, but alternatives have been proposed.
+We identified Minio, Pithos, Swift and Ceph.
+Minio and Ceph enforce a total order, and thus offer properties similar to those of a (relaxed) filesystem.
+Swift and Pithos are probably the most similar to AWS S3 with their consistent hashing ring.
+However, Pithos is no longer maintained. More precisely, the company that published Pithos version 1 has developed a version 2 but has not open-sourced it.
+Some tests conducted by the [ACIDES project](https://acides.org/) have shown that OpenStack Swift consumes far more resources (CPU+RAM) than we can afford. Furthermore, the people developing Swift have not designed their software for geo-distribution.
+
+There have been many attempts in research too. I am thinking in particular of [LBFS](https://pdos.csail.mit.edu/papers/lbfs:sosp01/lbfs.pdf), which was used as a basis for Seafile. But none of them has been effectively implemented yet.
+
+## Existing software
+
+**[Pithos](https://github.com/exoscale/pithos) :** 
+Pithos has been abandoned and should probably not be used; in the following we explain why we did not pick its design.
+Pithos worked as an S3 proxy in front of Cassandra (and also worked with ScyllaDB).
+According to its designers, storing data in Cassandra showed its limitations, which justified abandoning the project.
+They built a closed-source version 2 that does not store blobs in the database (only metadata) but did not communicate further about it.
+We considered the v2 design but concluded that it does not fit our *Self-contained & lightweight* and *Simple* properties. It makes development, deployment and operations more complicated while reducing flexibility.
+
+**[IPFS](https://ipfs.io/) :**
+*Not written yet*
+
+## Specific research papers
+
+*Not yet written*