Waku v2 discv5 Roadmap Discussion
I am planning to write a GitHub issue describing and tracking milestones of a Waku v2 discv5 roadmap.
This post is for discussing the content of this issue.
As this issue will track specific milestone sub-issues, it could also be set up as an epic (as discussed in the 2022 Q1 strategy meeting).
The next section is a first draft of this issue.
For the current state of Waku v2 peer discovery, see current state of waku v2 peer discovery.
Organizational
- Should the waku discv5 roadmap issue be in
- the nim-waku repo and track only nim-waku implementation issues
- vacp2p/research and track all research and implementation issues
- vacp2p/research and only track research issues (no implementation issues)
- Should the issue
- list our current view on future stages, or
- should we manage discussion about future stages in the forum post and only update the roadmap issue once we decide on the respective next stage
separate Waku discv5 network
The Roadmap draft below reflects the approach of having a separate Waku discv5 network.
Advantages
query efficiency (strong)
A separate network avoids the needle-in-the-haystack problem and allows for efficient queries.
Having to search for random nodes until finding one that supports Waku does not leverage the DHT structure to its full extent.
This carries even more weight once capability discovery is introduced, because that involves searching for specific nodes. With Waku v2 nodes randomly distributed over the Ethereum discv5 network, the O(log(n)) search time cannot be guaranteed.
(DHTs allow retrieving random samples from a distributed set of nodes.
Being part of the Ethereum discv5 network corresponds to using the DHT as a means of sampling from a large and relatively resilient set of nodes, of which only a fraction supports Waku.
On the other hand, having a separate network allows directly sampling from a set of Waku nodes.)
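As a rough illustration of the needle-in-the-haystack argument, the sketch below compares the expected number of uniformly random node draws needed to hit a Waku node with the order of magnitude of a structured DHT lookup. The function names and the example fraction are assumptions made for illustration only; real discv5 lookups return several nodes per query, so the numbers show the trend rather than actual costs.

```python
import math


def expected_random_samples(waku_fraction: float) -> float:
    """Expected number of uniformly random node draws until the first
    Waku node is found (mean of a geometric distribution)."""
    if not 0.0 < waku_fraction <= 1.0:
        raise ValueError("waku_fraction must be in (0, 1]")
    return 1.0 / waku_fraction


def dht_lookup_hops(network_size: int) -> float:
    """Order-of-magnitude hop count of a Kademlia-style lookup: log2(n)."""
    if network_size < 1:
        raise ValueError("network_size must be at least 1")
    return math.log2(network_size)


if __name__ == "__main__":
    # Hypothetical numbers: 1024 nodes, a quarter of which support Waku.
    print("random draws:", expected_random_samples(0.25))
    print("lookup hops: ", dht_lookup_hops(1024))
```

The smaller the share of Waku nodes in the shared network, the larger the first number grows, while the second depends only on the network size.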
not “polluting” the Ethereum discv5 network (weak)
Waku nodes that do not explicitly want to take part in the Ethereum discv5 network will not waste resources of the Ethereum network.
Disadvantages
loss of anonymity (weak)
This is a weak argument, imho, as the weak anonymity gain comes at the cost of making querying less efficient. Anonymity is only gained by “obscurity” because nodes supporting Waku can still be listed.
Imho, it is better to research methods of protecting anonymity in later stages.
easier to attack with DoS attacks, e.g. eclipse attack
The smaller the DHT, the fewer resources are necessary for mounting a successful DoS/eclipse attack.
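A crude way to see this scaling, assuming attacker strength is measured only by the share of node identities it controls (a simplification; the function name and numbers are hypothetical):

```python
import math


def sybils_needed(honest_nodes: int, target_share: float) -> int:
    """Smallest number of attacker identities s such that
    s / (honest_nodes + s) >= target_share."""
    if honest_nodes < 0:
        raise ValueError("honest_nodes must be non-negative")
    if not 0.0 <= target_share < 1.0:
        raise ValueError("target_share must be in [0, 1)")
    return math.ceil(target_share * honest_nodes / (1.0 - target_share))


if __name__ == "__main__":
    # Hypothetical sizes: a small separate DHT vs. a much larger shared one.
    for honest in (1_000, 100_000):
        print(honest, "honest nodes ->", sybils_needed(honest, 0.5), "Sybils")
```

Under this model the required number of attacker identities grows linearly with the number of honest nodes, which is why a small separate network is cheaper to attack.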
We should (theoretically) analyze the query efficiency loss when being part of the Ethereum discv5 network and come back to this option if we cannot find more efficient means of preventing these types of attacks in the future.
Eclipse attacks target the DHT structure, so they become harder when Waku nodes are not managed in a DHT structure of their own (i.e. when there is no DHT discoverability of the Waku capability).
As said above, when being part of the discv5 network, the DHT structure is only used to sample from a set of random nodes that support Waku with a low probability.
Waku nodes are not actually managed in a DHT structure; they are just scattered within a much larger set of nodes which are in a DHT structure.
Waku v2 Discv5 Roadmap (issue draft)
(All of the following is open for discussion and subject to change. The direct phrasing is just to keep it simpler.)
stage 1 :: Working discv5 based Peer Discovery for Waku v2
The goal of the first stage is a Waku v2 node implementation supporting peer discovery via discv5.
We envision a Waku v2 discv5 DHT separate from the Ethereum discv5 DHT.
The implementation for this stage will be based on a nim-eth discv5 feature branch, which generalizes the discv5 setup by allowing a different protocol-id to be chosen, as suggested by @kdeme in nim-waku issue #770.
Waku v2 will use the protocol ID d5waku.
Having a different protocol-id allows an easy check of all incoming messages; messages with a protocol-id different from d5waku are ignored.
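A minimal sketch of such a check, written in Python for illustration rather than taken from nim-eth. It assumes the packet header has already been unmasked and that the protocol-id occupies the first six bytes of the static header, as in the discv5 wire format; `accepts_packet` is a hypothetical helper name.

```python
WAKU_PROTOCOL_ID = b"d5waku"  # protocol ID proposed in this post


def accepts_packet(unmasked_static_header: bytes,
                   expected_protocol_id: bytes = WAKU_PROTOCOL_ID) -> bool:
    """Return True only if the header starts with the expected protocol-id.

    Assumption: the caller has already removed the header masking, so the
    protocol-id is readable in the leading bytes.
    """
    id_len = len(expected_protocol_id)
    if len(unmasked_static_header) < id_len:
        return False  # too short to carry a protocol-id
    return unmasked_static_header[:id_len] == expected_protocol_id


if __name__ == "__main__":
    # Static header layout: protocol-id, version, flag, nonce, authdata-size.
    rest = b"\x00\x01" + b"\x00" + bytes(12) + b"\x00\x20"
    print(accepts_packet(b"d5waku" + rest))  # Waku packet
    print(accepts_packet(b"discv5" + rest))  # packet from another network
```

Because both identifiers have the same length, the rest of the header layout stays unchanged and the check is a plain prefix comparison.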
milestones + related issues and PRs
- [ ] nim-eth discv5 feature branch supporting a configurable protocol-id
- nim-waku integration :: #770 which has been addressed in PR #435
introduced validating nodes based on ENRs, which is not sufficient to avoid leakage of Waku nodes into other discv5 networks (see #770 discussion).
- [ ] nim-waku implementation using nim-eth discv5 feature branch
- [ ] nim-waku local test network successfully running Waku v2 discv5
- [ ] …
stage 2 :: Efficiency Considerations
reduce the load on low power devices
Mobile nodes and browsers should use the DHT only to query for peers and should not answer queries.
They should indicate this behavior in their messages and should not be added to routing tables.
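One way to picture this rule is a routing table that refuses to store nodes announcing themselves as query-only. The sketch below is purely illustrative: the `query_only` flag and the class names are hypothetical, and how such a flag would be signalled in discv5 messages is left open here.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class NodeInfo:
    node_id: str
    query_only: bool = False  # hypothetical flag set by mobile/browser nodes


@dataclass
class RoutingTable:
    nodes: Dict[str, NodeInfo] = field(default_factory=dict)

    def observe(self, node: NodeInfo) -> bool:
        """Store the node unless it announced itself as query-only.

        Returns True if the node was added to the table.
        """
        if node.query_only:
            return False  # never route queries towards low-power nodes
        self.nodes[node.node_id] = node
        return True


if __name__ == "__main__":
    table = RoutingTable()
    table.observe(NodeInfo("full-1"))
    table.observe(NodeInfo("mobile-1", query_only=True))
    print(sorted(table.nodes))
```

Since query-only nodes never enter any routing table, other nodes will not direct lookups at them, which is what keeps the load off low-power devices.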
Weak nodes can retrieve bootstrap nodes via DNS or other external sources.
An idea worth considering is stronger nodes offering a service similar to DNS cache resolvers, which perform iterative querying on behalf of the weak node.
In this scenario the mobile node just asks one of the bootstrap nodes retrieved via DNS and directly gets the answer in the following response. This, however, raises privacy considerations.
Mobile nodes might also maintain a simple routing table for caching strong nodes that are not managed in the DNS. Nodes could indicate their interest in offering this service in their responses.
This would require incentivization to be feasible (see future stages).
future stages
The purpose of this section is to list research tasks that (might) have to be addressed to provide a reliable and secure peer discovery layer for Waku.
We will decide on the order these should be addressed in after completing the first two stages.
Still, we will keep security and privacy in mind when addressing the first two stages; however, the focus in the first two stages is on getting working results.
incentivization
Nodes that are capable of running the full DHT protocol should be incentivised to do so.
capability discovery
find nodes holding messages of a certain time range
security
defense against eclipse attacks
- eclipse attacks + Sybil attack
- can be run by a less powerful attacker controlling a small number of nodes
- eclipse attack :: attacker controls p% of the peers in a network
- defense goal :: a retrieved list of randomly selected peers should only contain O(p%) evil nodes
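The defense goal can be phrased as a check on a retrieved peer sample. The sketch below is a toy formulation: the `slack` constant stands in for the hidden constant of the O(p%) bound, and it assumes the set of malicious node IDs is known, which only holds in a simulation or analysis setting.

```python
from typing import Iterable, Set


def malicious_share(sample: Iterable[str], malicious_ids: Set[str]) -> float:
    """Fraction of peers in the sample that belong to the attacker."""
    peers = list(sample)
    if not peers:
        raise ValueError("sample must not be empty")
    return sum(1 for peer in peers if peer in malicious_ids) / len(peers)


def meets_defense_goal(sample: Iterable[str], malicious_ids: Set[str],
                       p: float, slack: float = 2.0) -> bool:
    """True if the malicious share of the sample stays within slack * p,
    where p is the attacker's share of the whole network."""
    return malicious_share(sample, malicious_ids) <= slack * p


if __name__ == "__main__":
    attacker = {"a"}
    print(meets_defense_goal(["a", "b", "c", "d"], attacker, p=0.2))
    print(meets_defense_goal(["a", "a", "a", "a"], attacker, p=0.2))
```

In an eclipsed view the sample consists almost entirely of attacker nodes, so the check fails even though the attacker controls only a fraction p of the network.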
defense against other DoS attacks
privacy
- hiding the query target
security analysis and attacker model formalization
- Tor model (20% of the nodes are malicious)
- AS-level passive attacker
- Dolev-Yao model