Sorry, I still don’t understand what “replacement” means here. Can you describe the behaviour with and without replacement, or provide a reference, please?
Yes, so we agree that the spam protection mentioned here is out of scope of mixnet, as it does not protect the mixnet.
The question is then, how is the mixnet protected?
Our PoC encodes full multiaddresses (including peerID) for each hop in the Sphinx packet, eliminating the need for mid-transmission discovery (as mentioned in (6) above).
To enhance reliability without significantly compromising anonymity, we plan to send messages over 3-4 redundant paths. Finding the optimal number of paths is a work in progress. As noted earlier, for complex use cases such as GossipSub, anonymizing only the first hop may suffice, potentially reducing the overall overhead.
Random selection without replacement means that once a node is selected, it cannot be selected again for the same path. That is, after a node is selected, it’s removed from the pool of available nodes for that specific path. This process continues until the required path length is reached. This ensures each node in the path is unique.
This approach prevents any single node from appearing multiple times in a path, which is crucial for mitigating traffic correlation attacks.
In contrast, random selection with replacement would allow the same node to be potentially selected multiple times for a single path.
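To make the difference concrete, here is a minimal Python sketch of both selection modes. The function names and the node identifiers are made up for illustration and are not taken from the PoC.

```python
import random


def select_path(pool, path_len, rng=random):
    """Pick path_len distinct nodes: random selection WITHOUT replacement."""
    if path_len > len(pool):
        raise ValueError("not enough nodes to build a path of unique hops")
    # random.sample draws without replacement, so no node can repeat.
    return rng.sample(pool, path_len)


def select_path_with_replacement(pool, path_len, rng=random):
    """For contrast: every hop is drawn independently, so repeats can occur."""
    return [rng.choice(pool) for _ in range(path_len)]


nodes = ["A", "B", "C", "D", "E"]  # hypothetical node identifiers
path = select_path(nodes, 3)       # three distinct nodes, in random order
```

With replacement, a path such as A, C, A is possible. Without replacement it cannot occur, because A leaves the pool for that path as soon as it is picked.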
The Sphinx packet format provides strong protection against traffic analysis, offering unlinkability and resistance to tagging attacks. Its per-hop integrity checks effectively prevent malformed or spoofed packets from propagating, reducing the risk of such attacks on downstream nodes. Random delays and dummy traffic, while pluggable, significantly enhance protection against timing-based attacks and traffic correlation.
These mechanisms obscure traffic patterns, making targeted attacks harder. However, you’re right that the mixnet doesn’t inherently defend against Sybil or large-scale DoS attacks.
Exploring complementary protection mechanisms like proof-of-stake, rate-limiting, or reputation systems could add value—though they come with trade-offs, like increased complexity or reliance on blockchain infrastructure.
Thank you for the discussion.
This is a good example of working towards finding consensus on an RFC, aligning with our goals for RFC culture to foster more consensus-finding discussions similar to the IETF. cc @Phil
The RFC is currently in its preliminary stage, serving as a starting point for discussion rather than a finalized proposal.
Our goal is to collaboratively refine the ideas presented, seeking consensus to develop the most practical and effective solution.
Here are my thoughts on the various aspects of the libp2p-mix protocol:
Protocol vs Transport
(tbd, will edit)
Routing
The Sphinx packet should contain complete routing information, including the full multiaddress of each hop (potentially allowing multiple addresses per hop, up to a fixed limit).
We can modify the Sphinx format to accommodate this.
While mid-transmission discovery might have desirable properties in certain niche situations, it’s worth mentioning in the RFC but not exploring further at this stage.
For architectural consistency within libp2p, the inner-protocol endpoint should remain separate from the mix exit node.
This separation aligns better with libp2p’s modular design principles and allows for greater flexibility in protocol implementation and network topology (it requires thorough anonymity analysis though).
Discovery
Discovery should indeed be separate from the mix protocol definition.
We can move the discovery section to the appendix as an example implementation.
The RFC should focus on defining the interface to the discovery service, explaining what information about peers the discovery service has to deliver.
This approach should remain agnostic to ENRs, with ENR implementation being just one possible method.
For the proof of concept, it’s reasonable to leave discovery out of scope initially.
Ideally, we should aim for a discovery system allowing random sampling from the full peer set, with Discv5 being a close approximation.
Efficient capability discovery, a topic we previously considered researching, could be revisited in Vac ACZ.
This is crucial for mitigating the issue of non-functional peers.
My suggestion is to extend libp2p-kaddht with efficient capability discovery, replacing ENRs and the dependency on a libp2p-external discovery service.
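As a rough illustration of what "defining the interface to the discovery service" could look like, here is a Python sketch. Everything in it is an assumption for discussion: the names `MixPeerInfo`, `MixDiscovery`, `sample_mix_peers` and `StaticDiscovery`, and the choice of fields. The only grounded parts are that a sender needs a full multiaddress per hop and a per-hop public key for Sphinx wrapping.

```python
import random
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass(frozen=True)
class MixPeerInfo:
    """What a sender might need to know about one mix node (hypothetical fields)."""
    multiaddr: str         # full multiaddress, including the peer ID
    mix_public_key: bytes  # key used to wrap the Sphinx layer for this hop


class MixDiscovery(Protocol):
    """Backend-agnostic interface; ENR/Discv5 or a DHT would be implementations."""

    def sample_mix_peers(self, count: int) -> Sequence[MixPeerInfo]:
        """Return up to `count` distinct mix-capable peers, ideally sampled uniformly."""
        ...


class StaticDiscovery:
    """Toy backend over a fixed peer list, e.g. for a proof of concept."""

    def __init__(self, peers):
        self._peers = list(peers)

    def sample_mix_peers(self, count):
        # Sample without replacement; return fewer peers if the list is small.
        return random.sample(self._peers, min(count, len(self._peers)))
```

The mix layer would depend only on `MixDiscovery`, so swapping ENRs for an extended libp2p-kaddht would not touch the mix protocol definition.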
Message Pushing
To enhance resilience against non-functional peers, messages pushed through gossipsub or lightpush should be transmitted via multiple diverse paths, similar to the approach used in tor-push.
This can be added as a recommendation in the RFC, though not as part of the core libp2p-mix specification.
Nodes implementing mix SHOULD follow this approach for message push protocols.
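A minimal Python sketch of the multi-path idea, under stated assumptions: `select_paths`, `push_over_paths` and the `send` callback are hypothetical names, and a failing path is modelled as a `ConnectionError`. The paths here are drawn independently, so two paths may share nodes; "diverse" paths in the sense above may need a stricter rule.

```python
import random


def select_paths(pool, path_len, num_paths, rng=random):
    """Draw num_paths paths; hops are unique within each path."""
    if path_len > len(pool):
        raise ValueError("not enough nodes to build a path of unique hops")
    return [rng.sample(pool, path_len) for _ in range(num_paths)]


def push_over_paths(message, paths, send):
    """Send one copy of `message` along every path; return how many sends succeeded."""
    delivered = 0
    for path in paths:
        try:
            send(message, path)
            delivered += 1
        except ConnectionError:
            # A non-functional peer on this path; the other paths may still deliver.
            continue
    return delivered
```

The receiving side then has to tolerate duplicates, which gossipsub already does through its message IDs.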
Spam Protection
Spam protection is not part of the core mix protocol.
libp2p-mix only defines how mix nodes wrap and unwrap packets, serving as one building block of a broader mix architecture.
Of course, spam protection is a crucial aspect of an architecture using the libp2p-mix protocol in practice and will be addressed as part of future work.
For now, the raw RFC includes a simple PoW mechanism in the appendix as an example. We could remove this in future more mature versions of the RFC.
After establishing a running testnet with the core libp2p-mix protocol, we will prioritize either discovery or spam protection based on feedback and practical needs (in case nothing even more pressing comes up).
Spam protection will be designed as a pluggable component and most likely defined in a separate document to maintain modularity and flexibility.
This approach allows for adapting spam protection mechanisms to different use cases without overcomplicating the core protocol.
We could explore combining RLN with mix for Waku.
The mix protocol would allow decoupling network parameters from the RLN identity, which could offer desirable privacy properties.
Capturing thoughts on spam protection as I dedicated some brain cells to it.
One attack to prevent is when L is excessive: taking up network resources by forcing a single message to be mixed by a great number of nodes.
As previously mentioned, RLN Relay applied on exit (or even applied on entry) would not protect against this attack.
It may be interesting to consider using RLN to limit the number of unwraps, instead of messages, per epoch.
To prevent cyclic paths, so that at worst a message goes through all nodes of the mixnet, but only once, a rate limit of 1 per epoch could be interesting.
The epoch may need to be large enough so that it does not renew by the time the message loops around.
This would also force the user to use a different node for each message sent within the epoch.
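To pin down the semantics I have in mind, here is a plain bookkeeping sketch in Python. It is not RLN: there are no proofs involved, and `tag` is only a stand-in for whatever per-epoch value a real scheme would let a mix node see. The class and method names are made up.

```python
class UnwrapLimiter:
    """Allow at most `limit` unwraps per tag per epoch at one mix node."""

    def __init__(self, epoch_seconds, limit=1):
        self.epoch_seconds = epoch_seconds
        self.limit = limit
        self._counts = {}  # (tag, epoch index) -> unwraps seen so far

    def epoch_of(self, timestamp):
        """Map a timestamp in seconds to its epoch index."""
        return int(timestamp // self.epoch_seconds)

    def allow_unwrap(self, tag, timestamp):
        """Return True and count the unwrap if the tag is still under its limit."""
        key = (tag, self.epoch_of(timestamp))
        seen = self._counts.get(key, 0)
        if seen >= self.limit:
            return False  # same tag already unwrapped here in this epoch
        self._counts[key] = seen + 1
        return True


limiter = UnwrapLimiter(epoch_seconds=10)
first = limiter.allow_unwrap("tag-1", 0.0)       # True: first unwrap in epoch 0
looped = limiter.allow_unwrap("tag-1", 5.0)      # False: looped back within epoch 0
next_epoch = limiter.allow_unwrap("tag-1", 10.0) # True: the counter has renewed
```

The last call shows why the epoch length matters: if a message can loop around more slowly than the epoch renews, the limit no longer catches the cycle.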
One attack to prevent is when L is excessive: taking up network resources by forcing a single message to be mixed by a great number of nodes.
Good point! The Sphinx packet provides some protection here. The packet size is determined by the maximum path length r, which limits the number of hops (L) to a maximum of r. For most real-time use cases, r = 5 should be sufficient, preventing a loop from exceeding 5 hops. We could even set L = r = 3 to strike a balance between efficiency and good anonymity protection.
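A small Python sketch of that bound, assuming r = 5. It only models the idea that the header has a fixed number of routing slots; it is not the actual Sphinx header layout, and `routing_slots` is a hypothetical helper.

```python
MAX_PATH_LEN = 5  # r, the maximum path length (assumed value from the text above)


def routing_slots(path, r=MAX_PATH_LEN):
    """Place the hops of `path` into exactly r slots, padding unused slots with None."""
    if not 1 <= len(path) <= r:
        raise ValueError(f"path length must be between 1 and {r}, got {len(path)}")
    # The slot count depends only on r, so a sender cannot encode more than r hops.
    return list(path) + [None] * (r - len(path))


slots = routing_slots(["A", "B", "C"])  # L = 3 hops in r = 5 slots
```

Because the number of slots is fixed by r, a path with more than r hops simply has nowhere to put the extra routing information.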
It may be interesting to consider using RLN to limit the number of unwraps, instead of messages, per epoch.
This is an interesting idea. Mix nodes can’t distinguish between packets, so they wouldn’t be able to tell if a packet is being unwrapped for the second time. We’d need to look closely at RLN to see if it could help limit the number of unwraps per epoch.
This would also force the user to use a different node for each message sent within the epoch.
If a mix node could figure out whether the same user is behind two messages (in an epoch), it could lead to unwanted correlation attacks. Additionally, restricting node usage across paths could limit the available paths, reducing overall usability of the system.