Optimizing Community Description

The routing part is similar to the previous suggestion of using light push + store queries for these large messages (no propagation on relay).

In essence, only the last version of CommunityDescription matters, which opens opportunities for bandwidth and disk-space savings.

We have identified three strategies to leverage this opportunity, which could be used together (a sketch follows the list):

  • lazy pull: messages should not be broadcast on relay but pulled by users. This is justified because only the most recent message matters; it does not make sense for a user to come online and pull every version of the message
  • last version only: since only the latest version matters, the system from which users pull should not store previous versions. This implies the system is aware of the specific artefact and its versions
  • update: we may also want to enable updates where the sender, instead of pushing the full description, only updates specific fields, to save on upload
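
To make the first two strategies concrete, here is a minimal Go sketch of a store combining lazy pull with last-version-only retention. All names (`ArtefactID`, `Envelope`, `LatestStore`) are hypothetical and not part of any existing Waku API; the point is only that storage stays bounded by the number of artefacts rather than the number of updates.

```go
// Hypothetical "latest only" store: at most one envelope per artefact, so
// disk usage is bounded by the number of artefacts, not the number of updates.
package communitystore

import "sync"

// ArtefactID identifies a versioned artefact, e.g. one community's
// CommunityDescription.
type ArtefactID string

// Envelope carries a payload plus a monotonically increasing version
// (e.g. a Lamport clock).
type Envelope struct {
	Artefact ArtefactID
	Version  uint64
	Payload  []byte
}

type LatestStore struct {
	mu     sync.RWMutex
	latest map[ArtefactID]Envelope
}

func NewLatestStore() *LatestStore {
	return &LatestStore{latest: make(map[ArtefactID]Envelope)}
}

// Put keeps an incoming envelope only if it is newer than the stored one,
// so out-of-order delivery cannot roll an artefact back.
func (s *LatestStore) Put(e Envelope) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if cur, ok := s.latest[e.Artefact]; !ok || e.Version > cur.Version {
		s.latest[e.Artefact] = e
	}
}

// Get is the lazy pull: a user coming online queries once and receives
// only the most recent version, never the history.
func (s *LatestStore) Get(id ArtefactID) (Envelope, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	e, ok := s.latest[id]
	return e, ok
}
```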

From a high level, we can:

  • use light push + store (gives us lazy pull)
  • move the CommunityDescription message to a diff mechanism where, instead of pushing the full message every 8 hours, the sender pushes a diff from the last version and users have to grab all diffs to rebuild the message (see the sketch after this list)
  • build a new Waku protocol that enables those strategies (e.g. a specialized store)
  • integrate an existing protocol that enables those strategies
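
For illustration, a rough sketch of what the diff mechanism could look like, with CommunityDescription reduced to a hypothetical flat field map (the real message is a structured protobuf); `Diff`, `Apply`, and `Rebuild` are made-up names, not an existing protocol.

```go
// Hypothetical field-level diffs for a versioned description.
package communitydiff

type Description map[string]string

// Diff carries only what changed since the previous version.
type Diff struct {
	Version uint64            // version this diff produces
	Set     map[string]string // fields added or changed
	Del     []string          // fields removed
}

// Apply folds a single diff into the description.
func Apply(d Description, diff Diff) Description {
	for k, v := range diff.Set {
		d[k] = v
	}
	for _, k := range diff.Del {
		delete(d, k)
	}
	return d
}

// Rebuild replays diffs, assumed sorted by version with no gaps, on top of
// the last full snapshot the user holds.
func Rebuild(snapshot Description, diffs []Diff) Description {
	d := snapshot
	for _, diff := range diffs {
		d = Apply(d, diff)
	}
	return d
}
```

The trade-off is visible in `Rebuild`: a user who misses a single diff cannot reconstruct the latest state, so the sender would still need to publish periodic full snapshots, and the savings apply mainly to upload bandwidth rather than retrieval complexity.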

I believe it still makes sense to use light push + store and shard segregation as short-term strategies.

For the long-term solution, we need to look at existing protocols (Codex, BitTorrent, IPFS, GunDB) and understand whether or not they are fit for purpose. If they are not, we can review whether this is something that would make sense to add to the Waku protocol family.

edit: the missing requirements to make an informed decision are (a back-of-envelope sketch follows the list):

  • data size
  • frequency of update
  • frequency of retrieval
  • latency requirements for updates and retrievals
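
These requirements combine roughly as: upload cost ≈ data size × update frequency, and download cost ≈ data size × retrieval frequency. A tiny sketch below; the 8-hour interval is the current push frequency mentioned above, while the 100 KiB description size is a placeholder, not a measured value.

```go
// Back-of-envelope cost model for one community. The description size is
// a hypothetical placeholder; the 8-hour interval is current behaviour.
package main

import (
	"fmt"
	"time"
)

func main() {
	descSize := 100 * 1024    // hypothetical: 100 KiB per description
	interval := 8 * time.Hour // current full-push interval
	pushesPerDay := int(24 * time.Hour / interval)

	// Upload cost per community per day when pushing the full description.
	fmt.Printf("upload per community: %d KiB/day\n", descSize*pushesPerDay/1024)

	// With relay, every node on the shard also downloads each push, so the
	// network-wide cost scales with node count; lazy pull replaces that
	// with one store query per user session.
}
```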