Diffstat (limited to 'doc/book/benchmarks/failure_recovery.md')
-rw-r--r-- doc/book/benchmarks/failure_recovery.md | 18
1 files changed, 18 insertions, 0 deletions
diff --git a/doc/book/benchmarks/failure_recovery.md b/doc/book/benchmarks/failure_recovery.md
new file mode 100644
index 00000000..59c9399d
--- /dev/null
+++ b/doc/book/benchmarks/failure_recovery.md
@@ -0,0 +1,18 @@
++++
+title = "Failure & recovery"
+weight = 50
++++
+
+# Failure impact
+
+Failures lead to timeouts, which in turn can cause
+failed requests (a bug if the failure is within what Garage is designed to tolerate)
+and increased latency, since some requests may be retried.
+
+How we proceed: we pause one Garage process (`kill -STOP xxx`, where `xxx` is its process ID).
+A stopped process keeps its TCP connections open; closing them would signal too easily
+that a crash occurred. Instead, we want to simulate a network error
+or an overloaded process, i.e. a 'non-collaborating' crash.
+
+
+# Recovery impact