Notes on distributed pub/sub

I did a recent brief yet fruitful dive into pubsub, mainly with the goal of finding some useful research of pubsub over Kademlia. This write-up serves as a summary of the findings related to vacp2p/research#32.

So the first part of the research was to investigate PSS, which used the term pubsub through out their documentation leading us to the idea that they may have implemented a pubsub system over Forwarding-Kademlia network. However, this was sadly not the case. What they call pubsub, is simply a message filter that a node can use to retrieve messages sent to them that they want (I opened a PR to make the naming more appropriate ethersphere/swarm#2170).

When looking for pubsub over Kademlia, the research seemed rather sparse. This probably has to do with the fact that messages aren’t routed by intermediate nodes, so there isn’t really an effective way to form broadcast trees.

The first paper I looked into was scribe, the full notes of which you can find here. The summary of this paper is that it forms a multicast tree over the nodes in a network creating an effective dessemination tree.


Above is an example of one of these trees, node 1 is where all our pubsub messages are deseminated from. Additionally, if a peer wants to send a message to our group, he must send it through one. This gives us a very nice property, which is that node 1 can manage access control. Making group messaging rights a little more simple. The group state, which would be things like access rights are shared between the nodes closest to our topic, so 1 would replicate these things with its neighbours. Whats nice here is no node needs to know the entire tree and must only track its parents and children, additionally the tree is self repairing.

Due to the fact that scribe is built on pastry, the amount of messages a single node has to send is low, as intermediate nodes route and forward the message.

The second paper I looked at was called “Effective Manycast Messaging for Kademlia Network”, the notes of which can be found here. This paper essentially presents how scribes pubsub could work over Kademlia, however there are some considerations due to the fact that messages are not routed by intermediate nodes. The root peer needs to do a lot more stuff, meaning uses a higher bandwidth.

Additionally, libp2p implements a form of pubsub, however from my current understanding of the protocol. Even when a structured discovery is initially used, the routing of a message is then unstructured. Meaning that even if we use Kademlia to discover peers, the message is then gossiped to all peers whereas in our structured pubsub examples it is only routed to the next closest node in our tree. There is of course a trade-off between maintaining the tree used vs gossiping.

Finally, it is important to note, that some of the considerations in the kademlia pubsub example are invalidated if we use forwarding kademlia. Meaning it may operate far more similarly to scribe than we can currently know. So bandwidth for a peer may not be a consideration. What needs to be thought about here is incentives of routing, a thought I had was instead of incentivizing the routing itself, we could simply incentivize the root nodes in the tree. Given a large enough distribution of both groups and nodes, every node would have an equal amount of topics that they are the roots of, so every node in the network would be incentivized.

This is just a quick summary of some of the things read, there’s probably a lot more thinking to do on the issue.

Quick question on something that stuck out:

Additionally, if a peer wants to send a message to our group, he must send it through one. This gives us a very nice property, which is that node 1 can manage access control.

  1. What happens as nodes churn?
  2. Is this [manage access control] a desirable property, or a tool for possible for censorship and control?
  1. The tree is self reparing, the root node shares the state with its neighbours so if it dies the children of the root automatically switch to the next closest node as the root.

  2. It could possibly lead to censorship too and there may need to be a mechanism to resolve this.

Interesting note, due to libp2ps unstructured nature, it looks like the mesh created does not converge: