KEEP IT SIMPLE, STUPID!
-----------------------

Ideas are all over the place for this project. When adding something, let's
make sure it's reasonable to do so and doesn't burden the user with
overcomplicated concepts. We want the minimal and most elegant solution that
achieves our goals.


TASK LIST
=========

- High priority: invite, dht, groups
- Medium priority: net


Invitation system (invite, MED)
-------------------------------

Current issue: if a user PMs another user, that user does not know it! They
have to pull the PM shard voluntarily to get the messages.

A user may generate invitation tokens: an invite token is a signature of the
invited user's pk by the inviter user's sk. The user that redeems the token
writes some data that is saved temporarily in the user's identity shard, which
might be stored by a 3rd-party node, for instance if we are on mobile devices
that cannot connect directly. (A rough token sketch is included below, after
the Trust lists task.)


Networking improvements (net, HARD)
-----------------------------------

Here are some things to keep in mind that we want at some point:

- RPS (random peer sampling)
- Congestion control, proper multiplexing of feeds
- Proper management of open connections to peers

RPS question: can we integrate a preference for connections to peers that
share the same shards, while still preventing the network from becoming
disconnected? Example: keep 100 total open connections that are sampled by
proximity on the set of requested shards (bloom filter), plus 2 or 5 fully
random ones for all shards. (A peer-selection sketch is included below, after
the Trust lists task.)


DHT to find peers for a given shard (dht, EASY)
-----------------------------------------------

First option: use a library for MLDHT. This makes everything simple, but makes
us use UDP, which does not work with Tor (we can fix this later).

Second option: a custom DHT protocol (we probably won't be doing this anytime
soon, if ever).


Partial merges, background pulls, caching (cache, req: bg, HARD)
----------------------------------------------------------------

Don't even bother to pull all pages in the background, and don't require
storing all depended-upon pages. Replace that with a cache of the pages we
recently/frequently used, plus a way of distributing the storage of pages over
all nodes.

To distribute the pages over all peers, we can use a DHT for example, or some
kind of rendez-vous hashing. Rendez-vous hashing is reliable but requires full
connectivity; to alleviate that, we can have only a subset of nodes
participate in the distributed storage, so they become the supernodes that
everyone calls to get pages. Pages can still be broadcast between secondary
peers to take load off the superpeers. Basically, the superpeers are only
called for infrequently used pages, for example those of old data that is only
kept for archival purposes. (A rendez-vous hashing sketch is included below,
after the Trust lists task.)


User groups and access control (groups, req: sign, HARD)
---------------------------------------------------------

Groups with member lists, roles, etc. Use these as access control lists for
some shards. Enforce access control in two ways: only push information to
peers that have proven they are a certain identity, and use a secret key
shared by all group members to encrypt this data.


Trust lists (trust, req: sign, MED)
-----------------------------------

In their profile (identity shard), people can rate their trust of other
people. This information can be combined transitively to evaluate the trust of
any individual. Maybe we can make a distributed algorithm for a more efficient
calculation of these trust values; open research question.
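The invite-token idea above could look roughly like this. This is a minimal
sketch only, assuming Ed25519 keys and OTP's :crypto EdDSA support; the Invite
module and its function names are made up for illustration:

    defmodule Invite do
      @moduledoc "Sketch only: an invite token is the inviter's signature over the invited user's pk."

      # The inviter signs the invited user's public key with their secret key.
      def generate(invited_pk, inviter_sk) do
        :crypto.sign(:eddsa, :none, invited_pk, [inviter_sk, :ed25519])
      end

      # Anyone can check that the token was issued by inviter_pk for invited_pk.
      def valid?(token, invited_pk, inviter_pk) do
        :crypto.verify(:eddsa, :none, invited_pk, token, [inviter_pk, :ed25519])
      end
    end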
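The peer-selection idea from the net task (sample most connections by proximity
on the requested-shard set, plus a few fully random links) could look roughly
like this. Sketch only: the bloom-filter proximity is replaced by a simple
MapSet overlap, and the module/function names are made up:

    defmodule PeerSampling do
      # Slots filled by shard proximity vs. fully random slots (quota of 100 total).
      @proximity_slots 95
      @random_slots 5

      # candidates: list of {peer, shard_set}, where shard_set is a MapSet of
      # shard ids (a stand-in for the bloom filter of requested shards).
      def select(candidates, our_shards) do
        by_proximity =
          candidates
          |> Enum.sort_by(fn {_peer, shards} -> -overlap(shards, our_shards) end)
          |> Enum.take(@proximity_slots)
          |> Enum.map(fn {peer, _} -> peer end)

        # A few fully random peers keep the overlay connected across shard interests.
        random =
          candidates
          |> Enum.map(fn {peer, _} -> peer end)
          |> Kernel.--(by_proximity)
          |> Enum.take_random(@random_slots)

        by_proximity ++ random
      end

      defp overlap(a, b), do: MapSet.size(MapSet.intersection(a, b))
    end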
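For the cache task, rendez-vous hashing over a known supernode list could look
roughly like this. Sketch only, assuming node ids and page hashes are binaries;
the module and function names are made up:

    defmodule Rendezvous do
      # Every peer that knows the same supernode list computes the same answer:
      # hash (node_id, page_hash) for each node and keep the k highest scores.
      def nodes_for_page(supernodes, page_hash, k \\ 3) do
        supernodes
        |> Enum.sort_by(fn node_id -> :crypto.hash(:sha256, [node_id, page_hash]) end, :desc)
        |> Enum.take(k)
      end
    end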
Automated access control based on trust (auto, req: trust, groups, HARD)
-------------------------------------------------------------------------

Automated algorithms that take the trust values into account in access control
decisions (obviously these can only run when an identity with admin privilege
is running).


COMPLETED TASKS
===============

Block store root & GC handling (gc, QUITE EASY)
-----------------------------------------------

We want the block store app to be aware of which blocks are needed and which
are not. The Page protocol already implements dependencies between blocks. The
block store keeps all pages that have been put for a given delay. Once the
delay has passed, the pages are purged if they are not required by a root we
want to keep.


Partial sync/background pull for big objects (bg, req: gc, QUITE EASY)
----------------------------------------------------------------------

Implement the copy protocol as a lazy call that launches the copy in the
background. Remove the callback possibility in MerkleSearchTree.merge so that
pulling all the data is not required for a merge. The callback can then only
be called on the items that are new among the last n (e.g. 100) items of the
resulting tree; this is not implemented in the MST but in the app that uses
it, since it is application specific.


Fix page hash semantics (pagehash, EASY)
----------------------------------------

Store and transmit the term binary and not the term itself, so that we are
sure the serialization is unique and the hash is the same everywhere. (A small
sketch is included at the end of this file.)

Terminology: stop using the word "block", use "page" everywhere.


Signed stuff, identity management (sign, MED)
---------------------------------------------

We want all messages that are stored in our data structures to have a correct
signature from a certain identity. We can have a special "identity" shard type
that enables storing profile information such as a nickname or other
information that we might want to make public.


Architecture for communication primitives (comm, MED)
------------------------------------------------------

- Encrypted point-to-point communication (to communicate private info after an
  ACL check) -> SHS
- Flooding, gossip -> netgroups


Private chat (privchat, MED)
----------------------------

Proof-of-concept for private things: a shard for private chat between two
people.


Epidemic broadcast (ep, EASY)
-----------------------------

When a shard receives new information from a peer, transfer that information
to some other neighbors. How to select such neighbors?

a. All those that we know of
b. Those that we are currently connected to
c. A random number of known peers

Best option: those that we are connected to, plus some random ones to reach a
quota (for example 10 or so). (A neighbor-selection sketch is included at the
end of this file.)

Implementation: netgroups (lib/net/groups.ex)


Shard lifetime and dependency management (dep, MED)
---------------------------------------------------

Some shards may pull other shards in, under certain conditions. For example, a
stored folder shard will just be a list of other shards that we all pull in.
We want a way to have toplevel shards, shards that are dependencies of
toplevel shards, and shards that we keep for a number of days as a cache but
that will expire automatically.
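The pagehash rule amounts to something like the sketch below: the page id is
the hash of the exact binary we store and transmit, and the term is only ever
decoded from that binary, never re-encoded. Sketch only; the module name and
the choice of sha256 are illustrative assumptions:

    defmodule PageHash do
      def encode(term), do: :erlang.term_to_binary(term)

      # Hash the stored/transmitted binary itself, so the id is identical everywhere.
      def hash(binary) when is_binary(binary), do: :crypto.hash(:sha256, binary)

      def decode(binary), do: :erlang.binary_to_term(binary)
    end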
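The neighbor selection chosen for the ep task (connected peers plus random
known peers up to a quota) lives in lib/net/groups.ex; the toy sketch below
only illustrates the idea and is independent of that implementation, with
made-up names:

    defmodule Epidemic do
      @quota 10

      # Forward to currently connected peers first, then fill up the quota with
      # random known peers.
      def forward_targets(connected, known) do
        extra =
          known
          |> Kernel.--(connected)
          |> Enum.take_random(max(@quota - length(connected), 0))

        connected ++ extra
      end
    end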