Status Communities: Protocol and Product Points of View

The goal of this post is to: 1) synchronize the views of the Vac team and the product team regarding the community history problem, 2) serve as a reference that guides the implementation at a broad level, and 3) map the problems (from the product point of view) to the solutions (from the protocol perspective).

Data consistency

From a product/user point of view, data consistency refers to the desire of community members to get a consistent view of the messages they share in any messaging context provided in the community, e.g., 1:1 chats, group chats, public channels, etc.

Providing data consistency is a multi-variable problem with the following non-exhaustive list of factors (extracted from the input provided by the product team):

  • User availability, i.e., the limitations users may have regarding their connectivity and online time
  • User bandwidth limitations
  • User storage limitations
  • Available infrastructure: whether there is centralized infrastructure, such as Status fleet nodes, that can help with the problem
  • User trustworthiness: whether users follow the protocol or may act maliciously
  • Network reliability: whether messages delivered by the underlying network (i.e., Waku v1 or v2) reliably arrive at every network participant

Each combination of these factors defines a user story under which data consistency should hold.

Below, I start with the current user story (user story 1, which reflects the current assumptions and constraints) and present the available Waku v2 protocols that can address data consistency under it.
Two other user stories are also explained, as the next meaningful user stories to focus on, in order. The available Waku v2 protocols and research efforts for these future user stories are discussed in their respective sections.

User stories are ordered from the least restrictive setting (an ideal-world setting) to the most restrictive one (a real-world setting).
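
For concreteness, the factors above can be thought of as a configuration record, with each user story being one assignment of values. Below is a minimal sketch in TypeScript; the type and field names are illustrative only and are not part of any Waku spec:

```typescript
// Illustrative model of the factors that define a user story.
// Field names are hypothetical and only mirror the factor list above.
interface UserStory {
  alwaysOnNodes: boolean;          // are there nodes with 24/7 availability?
  collectiveAvailability: boolean; // can a group of nodes collectively cover 24/7?
  bandwidthConstrained: boolean;
  storageDays: number;             // how long the best-provisioned nodes persist messages
  fleetAvailable: boolean;         // is centralized infrastructure (Status fleet) available?
  allNodesHonest: boolean;
  reliableNetwork: boolean;
}

const story1: UserStory = {
  alwaysOnNodes: true, collectiveAvailability: true, bandwidthConstrained: false,
  storageDays: 30, fleetAvailable: true, allNodesHonest: true, reliableNetwork: true,
};
// Story 2 removes always-on nodes and centralized infrastructure;
// story 3 additionally removes network reliability.
const story2: UserStory = { ...story1, alwaysOnNodes: false, fleetAvailable: false };
const story3: UserStory = { ...story2, reliableNetwork: false };
```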

User story 1 (current)

  • There are some special nodes with 24/7 availability
  • There are some nodes with intermittent availability
  • No bandwidth constraint
  • Nodes with high availability also have enough storage to persist 30 days of historical messages
  • No nodes have storage to persist messages older than 30 days
  • Status fleet nodes are available
  • All the nodes are honest
  • Network is reliable

User story 1 interpretation from protocol pov

Network reliability means there is no message loss; hence, nodes that are available 24/7 and listen to the network will ultimately obtain a consistent view of all the messages in the system. Nodes with lower availability, however, won't get a consistent view. This can be tackled by the WAKU2-STORE protocol, explained below.
The lack of storage to persist messages older than a month signifies the need for an external storage layer. The current proposed solution is based on BitTorrent and is explained below.

  • In WAKU2-STORE, nodes with high availability persist chat messages for the last 30 days and provide those historical messages to the less available users upon request (see the first sketch after this list).

    • Assumptions (this protocol provides data consistency thanks to the following items of the user story):
      • Network reliability
      • Highly available nodes with a storage limit of 30 days
    • Tradeoffs:
      • Privacy issues
    • Side requirements: discovery method to locate store nodes
    • The spec status: implemented in nim-waku, js-waku, go-waku
  • An external storage layer is required to store messages older than 30 days. The current solution proposal is based on BitTorrent. The community owner, as a highly available node with enough storage for 30 days of historical messages (the owner may use Status fleet nodes instead), coordinates the protocol. The community owner creates a magnet link for every group of messages covering the last 7 days, uploads that data to the torrent network, and distributes the magnet link to community members via a hidden channel. Community members receive the magnet link and download the data through their torrent clients (see the second sketch after this list).

    • The spec status: the high-level idea is sketched and the spec is in progress; no implementation is available
    • Tradeoffs:
      • in its current version, it does not contribute to the reliability of the store nodes; it solves the problem at the Status app level
      • as the solution lives at the Status app level, it is not composable with other Waku v2 protocols (its interaction with them is unclear)
      • no support for browser clients
      • battery and data usage concerns for mobile clients due to running a BitTorrent client
      • relies heavily on available infrastructure, i.e., the community owner
      • bandwidth-inefficient (nodes may end up downloading the same data multiple times, i.e., both by listening to the network and by downloading the data associated with the magnet link they receive from the community owner)
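
To make the WAKU2-STORE flow above concrete, here is a minimal sketch of a less available node requesting the last 30 days of history from a store node, page by page. The types and the sendStoreQuery transport function are hypothetical stand-ins, not the actual nim-waku/js-waku/go-waku APIs:

```typescript
// Hypothetical wire types; the real WAKU2-STORE protocol exchanges
// protobuf messages over libp2p.
interface HistoryQuery {
  contentTopics: string[];  // which chat content to fetch
  startTime: number;        // Unix time, ms
  endTime: number;
  cursor?: string;          // opaque paging cursor returned by the store node
}

interface HistoryResponse {
  messages: { timestamp: number; payload: Uint8Array }[];
  cursor?: string;          // present if more pages remain
}

// Fetch the full 30-day retention window from one store node.
// `sendStoreQuery` abstracts the request/response round trip.
async function fetchHistory(
  sendStoreQuery: (q: HistoryQuery) => Promise<HistoryResponse>,
  contentTopics: string[],
): Promise<HistoryResponse["messages"]> {
  const now = Date.now();
  const query: HistoryQuery = {
    contentTopics,
    startTime: now - 30 * 24 * 3600 * 1000, // 30-day retention window
    endTime: now,
  };
  const messages: HistoryResponse["messages"] = [];
  do {
    const page = await sendStoreQuery(query);
    messages.push(...page.messages);
    query.cursor = page.cursor; // continue from where the last page ended
  } while (query.cursor);
  return messages;
}
```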
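
Similarly, the BitTorrent-based proposal can be sketched as a weekly loop run by the community owner. Everything here (the function names and the hidden-channel distribution call) is a hypothetical illustration of the flow described above, not the in-progress spec:

```typescript
// Hypothetical sketch of the community owner's coordinator role.
// `collectMessages`, `createTorrent`, and `publishToHiddenChannel` stand in
// for the owner's message store, a real BitTorrent client, and a Waku publish.
interface ArchiveDeps {
  collectMessages: (fromMs: number, toMs: number) => Promise<Uint8Array>;
  createTorrent: (data: Uint8Array) => Promise<{ magnetLink: string }>;
  publishToHiddenChannel: (magnetLink: string) => Promise<void>;
}

const WEEK_MS = 7 * 24 * 3600 * 1000;

// Run once per week: bundle the last 7 days of messages, seed them on the
// torrent network, and announce the magnet link to community members.
async function archiveLastWeek(deps: ArchiveDeps): Promise<void> {
  const now = Date.now();
  const bundle = await deps.collectMessages(now - WEEK_MS, now);
  const { magnetLink } = await deps.createTorrent(bundle); // upload/seed
  await deps.publishToHiddenChannel(magnetLink);           // distribute link
}
```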

Note about network reliability: Another focus of this user story should be the integration of Waku v2 into the Status app, as well as switching to Waku v2 (WAKU2-RELAY) as the default transport layer. This is especially needed to get closer to the network reliability assumption set out in the current user story. In fact, Waku v2 has a lower amplification factor than v1 and is more scalable, hence providing a more reliable network layer for a large number of communities.

User story 2: nodes with low availability

In this user story, we relax some of the assumptions of user story 1 that are hard to achieve in practice. One major assumption made in the prior user story was the high availability of some nodes. In this user story, no node is highly available. This eliminates the reliance on centralized infrastructure, like Status fleet nodes or the community owner's resources, and brings our setting closer to a real p2p messaging system.
Below is user story 2; its differences from user story 1 are shown in bold.

  • **There are no nodes with 24/7 availability**
  • **A group of nodes can collectively, but not individually, provide 24/7 availability**
  • No bandwidth constraint
  • **There are nodes with enough storage to persist 30 days of historical messages**
  • No nodes have storage to persist messages older than 30 days
  • **No infrastructure is available to aid message persistence**
  • All the nodes are honest
  • Network is reliable

User story 2 interpretation from protocol pov

WAKU2-STORE can still be used to provide data consistency for live data (messages up to 30 days old); however, it should be extended to address the intermittent availability of the store nodes, i.e., the fact that they may go offline and miss some of the messages. To address this, a time-based synchronization method relying on the WAKU2-FTSTORE protocol is proposed.

  • In a nutshell, the WAKU2-FTSTORE protocol allows a store node to go offline and then fetch the historical messages of that offline period from another store node that was online during it (see the first sketch after this list).

    • Tradeoffs: relies on the nodes' clocks being synchronized (to within ±20s)
    • The spec status: implemented in nim-waku, go-waku, js-waku
  • The time-based synchronization deploys WAKU2-FTSTORE to provide data consistency using the aggregate availability of a group of store nodes. That group of store nodes forms a full mesh; each time one of them goes offline and comes back online, it runs WAKU2-FTSTORE with every other member of the group to fill the gap in its history (see the second sketch after this list). The solution proposal can be found in this issue

    • Tradeoffs:
      • not scalable to a large and unknown number of store nodes
      • only good for a known and small set of store nodes
    • Side requirements: relies on a capability discovery method that is not yet available
    • The spec status: not yet implemented (though very easy to implement)
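
As an illustration of the WAKU2-FTSTORE idea, a store node that was offline can fetch exactly its missed window from a peer. Below is a minimal sketch in the style of the earlier store query; `queryPeer` is a hypothetical stand-in for a WAKU2-FTSTORE time-range query, not the real API:

```typescript
// Hypothetical sketch: fill the gap [wentOfflineAt, cameOnlineAt] from one peer.
// Clocks are assumed to be synchronized to within ~20 s, as noted above.
const CLOCK_TOLERANCE_MS = 20_000;

async function fillGapFromPeer(
  queryPeer: (startMs: number, endMs: number) => Promise<Uint8Array[]>,
  wentOfflineAt: number,
  cameOnlineAt: number,
): Promise<Uint8Array[]> {
  // Widen the window by the clock tolerance so boundary messages are not
  // missed; duplicates can be dropped afterwards.
  return queryPeer(wentOfflineAt - CLOCK_TOLERANCE_MS, cameOnlineAt + CLOCK_TOLERANCE_MS);
}
```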
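
The full-mesh synchronization described in the second bullet then amounts to running this gap-fill (fillGapFromPeer, from the previous sketch) against every other group member and deduplicating the results. A hedged sketch, where the peer handles and the dedup key are assumptions:

```typescript
// Hypothetical sketch of the full-mesh sync: on coming back online, query
// every other store node in the (small, known) group for the missed window.
async function syncWithMesh(
  peers: ((startMs: number, endMs: number) => Promise<Uint8Array[]>)[],
  wentOfflineAt: number,
  cameOnlineAt: number,
): Promise<Uint8Array[]> {
  const seen = new Set<string>();
  const merged: Uint8Array[] = [];
  for (const queryPeer of peers) {
    const msgs = await fillGapFromPeer(queryPeer, wentOfflineAt, cameOnlineAt);
    for (const m of msgs) {
      const key = m.join(","); // crude dedup key; a real node would hash the message
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(m);
      }
    }
  }
  return merged;
}
```

Note that querying every member of the mesh is exactly why this approach only suits a small, known set of store nodes, as the tradeoffs above state.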

User story 3: unreliable network

  • There are no nodes with 24/7 availability
  • A group of nodes can collectively but not individually provide 24/7 availability
  • No bandwidth constraint
  • There are nodes with enough storage to persist 30 days of historical messages
  • No nodes have storage to persist messages older than 30 days
  • No infrastructure is available to aid message persistence
  • All the nodes are honest
  • Network is not reliable

User story 3 interpretation from protocol pov

Network unreliability means messages may get lost at the network layer and never reach a subset of nodes. Since there is no synchronization across nodes, they won't know whether they are missing a message. This especially affects the reliability of the WAKU2-STORE protocol: store nodes will need a synchronization mechanism to identify their missing messages and fetch them from other store nodes.

Waku provides two protocols for this user story, MVDS and MVDS++:

  • MVDS can provide data consistency for small chat groups or 1:1 chats

    • In MVDS, a group of users who belong to, say, a closed group chat can frequently synchronize their messages with each other and make sure all published messages are seen by everyone in that group. Users who go offline can resume the synchronization when they come back online (see the sketch after this list).
    • Tradeoffs:
      • poor scalability for a large group of peers
      • Communication/bandwidth-heavy
    • The spec status: MVDS specs; an implementation is available in Go (other implementations may exist)
  • MVDS++ is an extended version of MVDS that provides synchronization across store nodes. The MVDS++ proposal is available in this issue.

    • Tradeoffs:
      • not scalable to a large and unknown number of store nodes
      • only good for a known and small set of store nodes.
      • bandwidth/communication heavy
    • The spec status: no spec yet; only the solution proposal is available in this issue. This solution is subject to change due to the scalability issue.
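
To illustrate the MVDS mechanics referenced above: each peer keeps per-peer send state and retransmits every epoch until it receives an acknowledgement. The sketch below is a heavily simplified model of that retransmission loop; the real MVDS spec defines OFFER/REQUEST/MESSAGE/ACK payloads, which are omitted here, and all names are illustrative:

```typescript
// Simplified model of MVDS-style retransmission: keep unacknowledged
// messages per peer and re-send them each epoch until acked.
type MessageId = string;

interface PeerState {
  // messages sent to this peer that have not been acknowledged yet
  unacked: Map<MessageId, Uint8Array>;
}

class MvdsLikeSync {
  private peers = new Map<string, PeerState>();

  constructor(
    private send: (peerId: string, id: MessageId, payload: Uint8Array) => void,
  ) {}

  addPeer(peerId: string): void {
    this.peers.set(peerId, { unacked: new Map() });
  }

  // Queue a message for every peer in the group.
  broadcast(id: MessageId, payload: Uint8Array): void {
    for (const state of this.peers.values()) state.unacked.set(id, payload);
  }

  // Called when a peer acknowledges a message.
  onAck(peerId: string, id: MessageId): void {
    this.peers.get(peerId)?.unacked.delete(id);
  }

  // Called once per epoch: retransmit everything still unacknowledged.
  // This per-peer retransmission is why the protocol is bandwidth-heavy
  // and scales poorly with group size, as noted in the tradeoffs above.
  tick(): void {
    for (const [peerId, state] of this.peers) {
      for (const [id, payload] of state.unacked) this.send(peerId, id, payload);
    }
  }
}
```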

Final notes

Please leave your comments on whether you agree with the suggested user stories and their order of importance.

Hi @sanaz,

great write-up! I think what you’ve laid out here makes a lot of sense, also with regards to the importance and order of user stories / steps taken.

I do have a comment / question:

User story 1 maps the proposed solution for the MVP (the one provided by John) to what this could look like with Waku2 store nodes (the actual first implementation will use Waku1 and its mailservers). Then, at the end, it mentions:

Another focus of this user story should be the integration of waku v2 into Status app as well as switching to waku v2 (WAKU2-Relay) as the default transportation layer.

I might be wrong here, but isn’t the usage of Waku2 store nodes in this user story already implying that Waku2 relay is used? Saying another focus of this user story should be the integration of waku v2 tells me we could implement this using Waku2 store nodes but without Waku2 relay, which in my understanding isn’t possible.

So I guess my question is: should it be made explicit that this user story requires Waku2 relay?

If this applies, then would it make sense to add a user story 0 which kinda does what user story 1 does, but with Waku1 mailservers?

Then, user story 2 explicitly describes the usage of Waku2-FTSTORE, implying that we’d go with user story 1 first and only then move to story 2, which enables FTSTORE. I wonder if this is something we want to do right away. If we already have Waku2 store nodes in place, what reason would we have not to enable the FTSTORE protocol?

It might be that these questions are too (implementation-)specific and may not really apply to your outlined roadmap here. Please let me know if that’s the case.

Hope this makes sense!
