The Cost of Multiple Waku Implementations

fryorcraken · November 17, 2023, 3:36am

Members of the Waku team regularly bring up that the development of multiple Waku clients (nwaku, go-waku, js-waku) and the usage of Nim are hindering progress. Despite allocating resources to each client, having more engineers does not result in faster delivery due to the high churn in terms of protocol research, design and implementation.

Therefore, I am writing this post to provide a factual view of the benefits and disadvantages of having several implementations and of the impact of using Nim, and to initiate a discussion on the subject.

My expectation is that, at a minimum, this post encourages a fact-based discussion on the place of Nim in the Waku, and possibly Logos, ecosystems and a better alignment across the organisation.
It is my opinion that whether the proposed strategy is accepted is less valuable than strong alignment across the organisation.

Note that I hesitated between keeping this discussion internal and making it public. In an effort to be in line with our principle of transparency and our commitment to build in the open, I have opted to publish this post here.

Nim Benefits

The use of Nim is a topic of debate when it comes to the efficiency of developing multiple Waku clients. In this summary, I will do my best to explain the strategic value of Nim.

In-house Nim Expertise

The Status organization has developed Nim expertise through the previous Nimbus project. Therefore, it is logical to utilize this expertise for new projects like Waku.

However, there is limited transfer of Nim knowledge between teams, as observed within the potential Nim team’s scope.

Talent Pool

Using a specific language can sometimes help attract specific talent. The understanding and enthusiasm for a particular language may indicate deeper analytical skills or system knowledge. This has been observed in the blockchain domain, as seen with IOHK starting with a Haskell-only codebase.

However, this has not yielded benefits for Waku, as none of the current Waku team members had prior experience with Nim before contributing to Waku.

Logos Projects Cohesion

The Logos infrastructure project should be seen as a cohesive tech stack for the Logos network state. Using a common system language to achieve this goal is likely to facilitate inter-project integration.

However, Waku is currently focused on resolving the chat use-case for the Status Community, with the core library (status-go) written in Golang. Meanwhile, Codex uses Nim and Nomos uses Rust. Finally, while it is beneficial to integrate a library in the same language, it is also possible to expose c-bindings and enable inter-project integration this way.

The Nim Language

The properties of the Nim language, such as native dependency-free executables and suitability for embedded and hard-realtime systems, make it well-suited for the Waku use case: supporting resource-restricted devices and environments.

Status’ Influence on the Nim Ecosystem

As the primary user of Nim, Status has the opportunity to influence the direction of the Nim ecosystem.
However, this comes with several downsides, as it means most of the battle-testing is done by Status and not by other projects or organisations. Moreover, it is not clear how much this opportunity is being leveraged. The lack of a dedicated Nim ecosystem team is an example of such a missed opportunity.

Nim-libp2p

The maintainers of nim-libp2p are core contributors, which brings several benefits, including:

The ability to influence the nim-libp2p roadmap, such as implementing websocket and WebRTC transports or specific optimizations.
Priority support from maintainers when encountering bugs.
Access to in-house libp2p expertise.

It is important to note that nim-libp2p c-bindings are a possibility, meaning that the aforementioned benefits could still be retained regardless of the Waku language implementation. Additionally, nim-libp2p cannot deviate completely from the libp2p protocol, and most of the features required by Waku (e.g. websocket, webrtc) were already available in other libp2p implementations (such as go, rust, and js).

Nim Hindrances

What are the disadvantages of using Nim and how do they impact delivery?

Ref: Justification - Nim Integration/Support engineer (apologies to readers who are not CCs as this document is internal).

Tooling

Ref: Nim Tooling Wishlist

The lack of expected tooling for a modern language is a daily obstacle to productive development.

In most languages, IDEs provide features such as browsing, auto-completion, hinting, refactoring, and error highlighting, which enable developers to write code faster and reduce their mental load (by relying on the IDE to provide hints on type names, for example).

Linting, static analysis, and dynamic analysis tools help reduce bugs and performance issues in a codebase, saving time spent on investigating and fixing these issues.

“Standard” Library Maintenance

Several “standard” libraries do not have strong ownership and maintainers. While critical bugs are being fixed by the libraries’ original authors, extending them is not always prioritized. Critical libraries such as nim-libp2p have clear ownership, but other libraries originally developed by the Nimbus team remain in limbo in terms of roadmap and feature extension.

Some libraries simply do not exist, such as code generation from proto files for protobuf.

Hiring and Retaining Talent

While we expect and observe that senior software engineers see Nim as a tool being used towards a goal, this is not the case for junior and mid-level engineers who, once familiar with Nim and its lack of tooling, see it as a daily source of frustration. In the short lifetime of Waku, we have witnessed engineers departing mainly due to their distaste for Nim.

This is also an issue when hiring talent, as a strong emphasis needs to be placed on the fact that Nim is the language of choice, in order to filter out any candidate who may leave because of it.

Ecosystem Maturity

Tooling and library maintenance are symptoms of the low maturity of the Nim ecosystem in general. Another consequence of Nim not being a popular language is that the nwaku team sometimes has to be the first to attempt something in Nim, such as:

Packaging a Nim library for NodeJS (attempted)
Using websocket (in Nim in general and within the nim-libp2p context)
Using WebRTC, WebTransport (TBD)
Interpreting WASM in Nim (TBD for application level message validation)
Embedding a Nim library on mobile, including using Nim c-bindings for Swift/Kotlin/Flutter/etc (TBD, assuming we are not using go-waku)

Not only does this mean added effort for the Waku and nim-libp2p teams to deliver, but also a higher chance of encountering bugs from dependencies, as demonstrated with the nim-websocket and nim-libp2p websocket transports.

Overhead Cost

In this google sheet I attempted to summarize the cost of the usage of Nim and the R&D of an additional Waku implementation.

Our rough estimation is that additional Waku implementations slow down research effort by 10 to 15% due to the time spent by research engineers to support software engineers in duplicating a new protocol in their codebase.

We also estimate that the usage of Nim itself necessitates an additional 3 contributors to reach the same result as with another modern language: 1 on nim-libp2p and 2 on nwaku, assuming a nwaku team of 4 developers.

The estimates are based on the specific experience of developing nwaku in the past 2-3 years. Feedback on the calculation of the estimates is of course welcome.

I believe it best to keep the actual dollar figures internal, and will share them on the Logos Discord internal channel.
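As a rough illustration of how these estimates combine, here is a minimal sketch that uses only the figures quoted above (a nwaku team of 4, 2 additional contributors on nwaku, 1 on nim-libp2p, and a 10 to 15% research slowdown). The function names are hypothetical, and reading the "2" as applying to the nwaku team is an assumption; the actual spreadsheet may be structured differently.

```go
package main

import "fmt"

// extraHeadcountRatio returns the additional contributors as a fraction of
// the base team size (hypothetical helper, not taken from the spreadsheet).
func extraHeadcountRatio(baseTeam, extra int) float64 {
	return float64(extra) / float64(baseTeam)
}

// remainingResearchCapacity returns the fraction of research capacity left
// once a given slowdown is applied.
func remainingResearchCapacity(slowdown float64) float64 {
	return 1 - slowdown
}

func main() {
	// Figures quoted in the post; the split between teams is an assumption.
	const nwakuTeam, extraNwaku, extraLibp2p = 4, 2, 1

	fmt.Printf("additional contributors overall: %d\n", extraNwaku+extraLibp2p)
	fmt.Printf("extra headcount relative to the nwaku team: %.0f%%\n",
		100*extraHeadcountRatio(nwakuTeam, extraNwaku))

	// Research slowdown attributed to supporting additional implementations.
	for _, slowdown := range []float64{0.10, 0.15} {
		fmt.Printf("slowdown %.0f%% leaves %.0f%% of research capacity\n",
			100*slowdown, 100*remainingResearchCapacity(slowdown))
	}
}
```

The point of the sketch is only that the two costs are separate: the headcount overhead comes from the language, while the research slowdown comes from the number of implementations.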

The Case for Multiple Native Implementations

From a long term perspective, it is evident that multiple native implementations of Waku will be needed to ensure widespread adoption. While c-bindings are an acceptable temporary solution, mature projects are likely to need the flexibility brought by a library in the same language as their product, especially considering that Waku sits low in the tech stack.

Ideally, and similarly to libp2p, new implementations would be started by other projects, for which Waku’s value to their product is so undeniable that they have a clear incentive of bootstrapping a new Waku library.

Waku is near MVP stage, with Status and The Waku Network Gen 0 launching soon. There is still a high churn in terms of research and engineering. This high churn is happening on all implementations, which is why now is the right time to discuss the cost of multiple Waku implementations.

Once Waku reaches a stage of protocol maturity where the core protocols are widely used and functional, and new research only exists to enable specific use cases, then it would make sense for the Waku team to revisit the maintenance of several native clients and measure the ROI in terms of effort vs reward (onboarding specific mature projects).

This could include maintaining a Nim library for the likes of Codex, Nimbus or Nomos.

Potential Outcomes

What outcomes can we expect from this discussion?

No change

The mandate of the Waku team is to provide a suite of censorship-resistant, privacy-minded, and portable communication protocols. These protocols enable Status Communities, as well as other Status features and Logos projects.

The development of multiple Waku implementations hinders this goal, as stated in this discussion.

Having a “no change” outcome is likely to be the worst outcome when various solutions have been proposed to improve the situation.

More support for Nim Projects and Developers from the Logos Organization

One possible solution is to provide increased organizational support for Nim developers within the Logos Organization.

The obstacles faced by Nim developers have been clearly identified by various Nim teams (Waku, Codex, Nimbus). Many of these obstacles can be addressed by allocating dedicated resources to develop the necessary tools and libraries, as mentioned in the Justification - Nim Integration/Support engineer (internal document).

However, despite all parties recognizing the potential for significant improvement, and despite initial efforts being made to coordinate and define the responsibilities of a Nim team, no concrete progress has been made in terms of commitment to fund and establish such a team, as discussed in this Discord conversation (internal, Logos Discord).

It is important to note that considering the scope of the changes required in terms of tooling and libraries, it is likely to take 3 to 12 months before the project team sees any benefits from the establishment of such a team. Furthermore, the process of hiring and setting up the team itself may take an additional 3 to 6 months (unless we can source it from current CCs).

Regardless of the outcome for Waku, it is important to consider that such a team can still provide benefits to the Logos Organization and should be seen as a long-term solution to the issues outlined in this discussion.

Scoping Each Implementation and Reducing Overlap

One solution that has been implemented to address this issue is defining a specific scope for each Waku implementation, with the aim of avoiding redundant work. An example of this scoping can be found at this link.

However, it has become clear that this solution has limitations when it comes to mitigating redundant work:

Managing discrepancies due to a different scope: The nwaku implementation serves as the reference implementation and service node implementation. Consequently, less effort is dedicated to the development of the “light client” feature. On the other hand, go-waku is the library used for Status applications, including Status mobile which utilizes the light client protocol. This discrepancy creates challenges when trying to align the behavior of both implementations, such as providing a common REST API.
Redundant work is still necessary: Despite this, it is still necessary to implement all protocols in nwaku, as it serves as the reference implementation. This means that any protocol work done on go-waku and js-waku is duplicate work (already done in nwaku), regardless of the specific scope of each client.

Dropping Go-Waku

One possible outcome is to drop go-waku. Instead, the research and development effort could be focused on the nwaku client, making it the preferred Waku native library.

However, it has already been assessed that integrating the nwaku lib into the status-go codebase is too difficult. This difficulty led to the creation of go-waku.

Months of work have been dedicated to attempting the integration of nwaku into status-go. It is unclear what could be done differently for a second attempt to be successful. Furthermore, considering the timeline pressure to publish Status, dropping go-waku is not a viable option without jeopardizing the success of the Status app.

Finally, Golang is a widely used language in the web3 ecosystem and in distributed systems that, if not for Status, would likely be needed by other projects. This also has the benefit of encouraging contributions to the Waku ecosystem, with go-waku being the repo with the most PRs opened by non-Status CCs.

Dropping nwaku in Favour of go-waku

Dropping nwaku is another possible alternative. The benefits are:

Removing support for one client and focusing all efforts on go-waku and js-waku.
Eliminating frictions caused by the immaturity of the Nim ecosystem and other Nim hindrances.
Golang is designed to build scalable distributed systems.
- This would bring an advantage when running a Waku node on multi-core hardware, where it can use the available resources (like all the cores/CPUs) more efficiently. This would address the high resource end of the Waku adaptive node concept, which may not be efficiently addressed by Nim (as it is single threaded).
- Golang’s stdlib is well-documented, and one of the major strengths of the language is its first-class support for concurrency with goroutines and channels. For Nim, by contrast, chronos is used due to the limitations of Nim’s support for concurrency.
C-Bindings for go-waku are mature and already available to provide Waku to other languages.
There is some evidence of coding efficiency in Golang: over 2 years, Richard not only single-handedly caught up with nwaku in terms of feature parity, but also delivered several c-bindings wrappers while continuing his status-go commitments.
Thanks to go-mobile and integration in Status Mobile, go-waku is already available and tested on mobile platforms.

Note that go-libp2p is the reference implementation for libp2p, so there is no risk of specific libp2p protocols being unavailable.

In relation to the Nim language benefits, note that go-waku is already statically compiled.

However, there are downsides to consider:

Rerunning simulations (RLN, etc.) that were done with nwaku. These simulations should be performed regardless to ensure that the performance for Status meets expectations.
Losing in-house nim-libp2p support:
- As mentioned earlier, integrating nim-libp2p via c-bindings in go-waku is an option.
- The expertise available from the p2p team can still be utilized.
- go-libp2p is the reference implementation for libp2p, which means that any libp2p protocol we need would be implemented there first, reducing the risk of relying on a feature that is lagging (e.g., this would be a concern if we were to use rust-libp2p).
Losing a native Waku library for potential integration in Codex or Nimbus.
Adaptation period for engineers and researchers who do not have previous Golang experience, and for all nwaku engineers to get used to the go-waku codebase.

Dropping js-waku

Status Communities needs to access the Waku network from the browser for a couple of web apps.

Decentralization is one of Status’ principles. Therefore, relying on a proxy, like a Web3 RPC Provider, to access the Waku network is not a viable option.

The Nim language lists “compiles to JavaScript” as the 3rd feature on their website. Attempting to compile nwaku to JavaScript and using it instead of js-waku could be an option.

However, this approach falls under the ecosystem maturity risk category, which includes the following considerations:

Can we make it run in the browser? NodeJS? React Native?
Can we bundle the library in a small package size?
Can it interface with WASM (zerokit)?
Can it be split into several packages for modular and composable usage?

Going down this path would incur an initial steep cost, similar to the nwaku c-binding experiment or the integration of nwaku in status-go, before we can understand whether it is feasible and practical.

Dropping Two Implementations

Taking it a step further, we can consider dropping two implementations. This would consolidate the risks mentioned in previous sections.

Dropping All Implementations in Favour of Using Rust

An alternative approach would be to start fresh with a single codebase that can be used across all languages, facilitated by c-bindings and WASM export.

This would mitigate the risks associated with abandoning each individual implementation.

Proposed Strategy

The previous section listed all possible outcomes with associated benefits and risks.

Only two outcomes have risks that can be easily mitigated:

Creating a Nim team
Dropping nwaku and nim in favour of go-waku for service node and native library needs.

The Nim team outcome has already been discussed in detail in More support for Nim Projects and Developers from the Logos Organization. As mentioned, this would only yield benefits 3 to 12 months down the path, mitigating most, but not all, of Nim’s cons.
However, it would not address the issue of maintaining several clients during this high-churn period.

This leaves only dropping nwaku as a viable option to remove the friction, frustration and slow-down created by maintaining 3 clients and using Nim.

Overview

How do we move all research and engineering activities from nwaku to go-waku?

Features: go-waku is mostly equal in terms of features; there may be some specific CLI flags that need to be sorted out. Various discrepancies have already been flagged and addressed thanks to the QA effort done by Vac/DST.
Performance: Static analysis is in place for go-waku but some further memory profiling could be done. Relay and RLN simulation should be done for go-waku either way, as it is the main client for Status.
PostgreSQL: Delivering a PostgreSQL backend is a Status requirement to deal with a centralized/federated architecture. The research work to provide a more distributed store service has already started. Consideration will be needed to decide whether to switch the Status fleet to go-waku + PostgreSQL (already implemented but stress testing would be needed) or keep nwaku and deliver distributed Waku store with go-waku.
nwaku maintenance: as the set service node for the Status fleet, nwaku maintenance would need to continue, or stress testing of the PostgreSQL go-waku implementation would need to be done to enable a switch to a go-waku only fleet.
Testing: in terms of interoperability testing, nwaku and go-waku are both being tested against js-waku and included in the work done by the DST team. Both go-waku and nwaku also have dedicated test engineers working on test improvements. The DST team is now working on an interop framework that is meant for both go-waku and nwaku from the start.

Milestone impact

10k/1mil milestones: The remaining work is mainly around testing and simulation. This was mainly done for nwaku, with plans to do it for go-waku once finalized. We are facing delays in terms of DST simulations. It is likely to reduce the overall work if simulations only need to be done for one client, go-waku, the one being used in Status Communities.

PostgreSQL simulation done in nwaku and other “nwaku as a service” node work would have to be replicated in go-waku. However, this would not be a show stopper for the Status launch, since nwaku as it is would be enough for the launch. This work would be needed as we want to move Status to go-waku service nodes for new features such as distributed store, RLN, etc. Do note there is also the option to keep nwaku and PostgreSQL, as research is in progress to distribute the Waku store service.
Gen 0: There is mostly feature parity between go-waku and nwaku; only local simulations performed by the Waku Research team would have to be re-run.

Team impact

We currently have 5 engineers in the nwaku team and 3 in the go-waku team. One of the nwaku engineers is moving to a solution role and another is moving to a research role. The combined team would then be 6 engineers. This is a reasonable size and no management overhead is to be expected.

Impact on Other Logos Projects

As previously stated, this would have minimum impact on the Status Communities milestone. If anything, it would free time to run more simulation, stress testing and QA efforts on the go-waku library.

There is impact for a potential integration of Waku in Codex or Nimbus. To mitigate this impact, using c-bindings is an option. For example, TheGraphcast successfully uses go-waku via a Rust wrapper of go-waku c-bindings.

If Nomos were to use Waku, then the Rust bindings already use go-waku so there would be no difference.

Proposed Roadmap

Review the proposal above within the Waku team (Nov 2023)
Push this proposal to Logos’ leadership and founders (Dec 2023)
If buy in, review with Status app leadership (Dec 2023)
If agree, go-waku becomes reference implementation (Jan 2024)
1. Onboarding of researcher and nwaku engineers
2. New protocols are now implemented in go-waku
3. Re-organize and re-distribute upcoming go-waku work to nwaku engineers
Move all nwaku specific DST activities to go-waku (10k node simulation)
Re-run simulations for go-waku:
1. RLN+Relay
2. PostgreSQL stress testing may be re-done depending on discussion and agreement with Status app team
Deploy go-waku fleets to replace existing nwaku fleets (some go-waku fleets are already running for Status), dogfood fleet, monitor
go-waku fleets become “default” fleets for js-waku, Status (if applicable)

rymnc · November 18, 2023, 9:41am

I personally agree with dropping one implementation. Over the past year or so I have seen multiple conversations about how hard it is to work with Nim (which is not a skill issue, there are many footguns). If we move to just go-waku+js-waku, we can also drop zerokit, and develop the rln circuit in go using gnark. It is the fastest groth16 implementation, so we can expect a performance boost for RLN. Additionally, we would not have 2 translation layers between zerokit and golang.

We can benefit so much from the tooling available for the language as well.

Disagree however, on dropping all implementations and using Rust, too much overhead to ship gen 0.

Just my 2 cents, I’m not on the Waku team

arnetheduck · November 20, 2023, 12:14pm

Thanks for this long post. Long in, long out

Generally, the decision to own the full stack is strategic, to ensure that we build and retain ownership, licensing and competence for all the parts of infrastructure that we rely on - this is reflected in investments across the board, from, like Waku represents, messaging and privacy, across storage (like codex), consensus (like nimbus and nomos), user experience (like app / desktop) and, in the case of logos, the broader philosophical movement from which our efforts are derived.

From a historical point of view, it remains a contributing reason to why we went with Nim in the first place.

The strategy is also in place as insurance against changes in the nature of the projects that we depend upon as well as our seat at the table affecting the direction of their development.

A key piece of the strategy is also to encourage protocol-first development - ie every time we create an implementation, it should be possible to replicate it in any language by anyone without complication (as exemplified by the ease at which for example go-waku, a second implementation, can be done when a first implementation that has already solved the architectural problems already exists) - this puts additional stress on documentation and well-written standards.

In terms of alignment, the main direction of language support for Research / Logos is Rust / Nim / wasm (and by extension JS) for the time being.

We retain and make exceptions for go-lang in order to support Status via status-go, to avoid costly rewrites of the existing codebase, even though a refactoring of status-go is high on the wishlist - it is a poorly understood codebase that has grown significant amount of feature and tech debt as priorities have shifted in app development.

Our Nim needs have until recently been scaled roughly according to the needs of the Nimbus team (which itself is small) and a research contingent on the waku side - the Nimbus team in turn has delivered not only an extremely efficient implementation of an Ethereum client, but thanks to the seat at the table, also changed the direction of the Ethereum protocol (via the light client, portal, etc initiatives) - as a side effect of that development, they have also delivered an increasing number of high-quality Nim libraries aimed at chain development in general - not only the code, but also the research and understanding that comes with.

Ditto our libp2p implementation - efforts like IDONTWANT, although they are being shipped as part of the EIP-4844 package on Ethereum, address a core need to render gossip more efficient for larger transfers (such as in-chat images) at a protocol level whose raison d’etre can be traced back to early status app and privacy discussions - unlike sharding, it does so without compromising on the anonymity set. This is a good example of the benefit of working across many “zoom levels” of abstraction and layers.

IDONTWANT is currently unique to nim-libp2p - this highlights another important point: the work to improve the core infrastructure protocols and implementation will not go away if we move to a different implementation - they will remain, meaning that if we want to pursue a different implementation, we need to take a seat at the table of that implementation and do the work there. We have done this in the past and continue to do so where applicable, but such seats do not come without compromises - when the upstream projects have different priorities, the priorities of the project owner win - the strategy of owning not just bits and pieces, but significant parts of the stack, dates back to practical experience of this risk turning into reality in areas dear to us.

This cross-pollination works both ways - as waku drives the priority of implementing webrtc and websockets in libp2p, for example, projects like nimbus and codex will benefit from the increased resilience and flexibility, but above all this work puts us in a position to better and more practically understand topics like validator privacy, resilient and flexible networking and connectivity etc when we scale up efforts to logos as well.

The implementation of IDONTWANT was done in deep collaboration with the EF, libp2p and others - their research shows up in the quality of the protocol, but, as this post is focusing on the budgetary aspects, it doesn’t show up in our budget - this is the power of community development that we’re looking to harness - our skills and efforts combined with those of other likeminded developers and orgs to create an environment in which all projects benefit. Incredibly powerful, and something to harness both within status and outside. I like to rant about this point, BTW, when prompted.

It is indeed important for efficiency that projects support each other in their efforts and that they take our collective needs into account when developing their priorities. As we’re growing, that prioritization obviously must take new forms, hence the desire to start a new core Nim infrastructure team. Starting a new team takes time - hiring, organising, building understanding etc. That said, it is happening with hires going through their eval periods and the team being formed as an independent entity, instead of being an afterthought of the Nimbus team.

This highlights another key point - the Nimbus team has, as the outcome of their efforts, solved their most acute needs in terms of Nim infrastructure support. This no longer makes them the best candidates for driving further prioritisation in this area - this responsibility now falls, as evidenced by this post, in part on the waku team which is where growth and active new development is happening - it is thus not enough for waku developers to say “nim tooling is bad”, but rather it becomes both an opportunity and a responsibility of each developer to articulate their needs and/or take action, as can be seen in the writing of this post - the Nimbus team would not, at this stage, be in the right spot to articulate this as efficiently. We do not expect contributors to be passive consumers in this aspect, even if that’s a comfortable place to be - when something itches, we scratch the itch then go back to its root cause and fix that too.

With that out of the way, on to more practical stuff:

Duplicate work

As a reminder from past discussions, it’s good to reiterate that each implementation we currently have has broadly orthogonal use cases, and as such, should have minimal overlap in terms of features and maintenance outside of the protocol, which thanks to being well documented, doesn’t require much additional effort - when those priorities are not adhered to, we create the bed we lie in, ie one of overlapping maintenance efforts. As an example, go-waku exists to serve the needs of status-go - if anything outside of what status-go needs gets developed, that is a direct violation of this mandate and the consequence is inefficiency and increased maintenance for the organisation. Locally, this might have seemed like a reasonable step which is often why it ends up happening - globally, perhaps not.

We assume in our approach, that each developer understands this point and directs their efforts accordingly where the outcome aligns with this broad understanding, something that can be hard to remember in the heat of the moment because “it looks so easy to do” - see xkcd: Automation for a classic introduction to this oft-seen engineering phenomenon - for those of you in lead positions, recognising and directing effort is paramount and a key part of your responsibilities but ultimately, to thrive in an organisation such as ours, this is a skill that needs to be honed by every engineer. As many project management books will tell you, what you don’t do is often more critical to success than what you do - close to home, there’s an infinite amount of things the Nimbus team has not done (including not following up on all bells and optional whistles of auxiliary libraries), in order to be where they are today (ie shipping with a comparatively small team on a monthly basis in tune with the rust, go, java and other implementations that have the supposed “language advantage”).

Native implementations

The core architectural granularity of integration in the web3 space remains REST/JSON-RPC in various shapes and forms - ie the interface between applications can well be resolved at this level, including for example our RLN.

This granularity is strategic and important to maintain - it allows us to focus on what we do well and outsource difficult problems to users that more intimately are familiar with their specific use case. RLN is a good example here - it is an example of a “hard problem” (sybil resistance) where we have one proposal (zk proofs) but other users may have others more tailored to their needs (ie private lists etc). By outsourcing this to an RPC protocol, we retain flexibility and can black-box the problem from our perspective but also gain implementation independence.

This same RPC interface can be used either via a loose coupling (ie http) or tight (in-process) and ultimately via well-documented protocols which serve as the basis for advanced consumers. As correctly highlighted above, the step where a native implementation with deep language integration, as opposed to coarse-grained RPC, comes at a separate stage but it is generally not necessary for us to develop useful products - it is however an important point that needs to be taken into account when developing the product and our choices here, if deliberate, do not result in locking in one particular implementation or language for a specific component. Whether we get go-lang waku PR:s or RPC/API extension requests is downstream from the architectural decisions in how we present and develop the product, not upstream.
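
A minimal sketch of what "the same interface, loose or tight" can look like in practice: application code depends only on a small interface, and the node behind it can live in-process or behind HTTP. Everything here is hypothetical, including the `Publisher` interface, the `/publish` path, the `topic` query parameter and the demo topic name; it is not the actual Waku REST or JSON-RPC API.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"sync"
)

// Publisher is a hypothetical, minimal view of a messaging node.
type Publisher interface {
	Publish(topic string, payload []byte) error
}

// inProcessNode is the tight coupling: the node lives in the caller's process.
type inProcessNode struct {
	mu       sync.Mutex
	received map[string][][]byte
}

func (n *inProcessNode) Publish(topic string, payload []byte) error {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.received[topic] = append(n.received[topic], payload)
	return nil
}

// count reports how many messages were stored for a topic.
func (n *inProcessNode) count(topic string) int {
	n.mu.Lock()
	defer n.mu.Unlock()
	return len(n.received[topic])
}

// httpNode is the loose coupling: the same call travels over HTTP.
type httpNode struct{ baseURL string }

func (n httpNode) Publish(topic string, payload []byte) error {
	// The "/publish" path and "topic" parameter are made up for this sketch.
	endpoint := n.baseURL + "/publish?topic=" + url.QueryEscape(topic)
	resp, err := http.Post(endpoint, "application/octet-stream", bytes.NewReader(payload))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("publish failed: %s", resp.Status)
	}
	return nil
}

// serve exposes any Publisher over HTTP using a local test server.
func serve(p Publisher) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		if err := p.Publish(r.URL.Query().Get("topic"), body); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

// sendHello is application code: it depends only on the Publisher interface.
func sendHello(p Publisher) error {
	return p.Publish("/demo/1/chat/proto", []byte("hello"))
}

func main() {
	backend := &inProcessNode{received: map[string][][]byte{}}
	srv := serve(backend)
	defer srv.Close()

	// The same application call, once in-process and once over HTTP.
	for _, p := range []Publisher{backend, httpNode{baseURL: srv.URL}} {
		if err := sendHello(p); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	fmt.Println("messages stored:", backend.count("/demo/1/chat/proto"))
}
```

`sendHello` never learns which coupling it is using, and the process behind the HTTP endpoint could be written in any language.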

When interaction between components becomes so tight that the specific implementation language is of concern, it is perhaps a good time to pause and examine whether the problem architecturally is being approached the best way.

Coming back to waku and its go implementation, this is again a good example - it is quite possible to swap out go-waku for nim-waku in status-go, if desired - waku can be defined as an RPC interaction between an application and the network and this highlights the power of the RPC model - what usually stands in the way of such attempts is indeed tight coupling left unchecked while features keep getting added.

Here is a weekend experiment PR that exemplifies this - it requires several changes on the status-go side but very few on the nwaku side. The tight coupling of activities in status-go (chat protocols, database updates, community permissions and a plethora of other responsibilities in the same paragraph of code) leads to this outcome, while the development of waku as, first and foremost, an implementation of an RPC protocol for the consumer means that the waku implementation itself does not need to be changed much to integrate it in any application. The same applies to the other end: as long as the wire protocol of waku is well-documented, the effort to create a native implementation tailored to a particular need is low. The hard problems faced during the first round of development will by this time have been solved, and it becomes possible to focus on “second-implementation” problems like fine-tuning, optimisation and low-hanging UX fruit - this applies at this point to both existing and new implementations.

Tooling and technicalities

Worth remembering on the tooling front is that every tool out there (apart from the code formatter) developed for C also works with Nim: this includes profilers (vtune), debuggers (gdb/lldb), compiler static analysis (*Sanitizer), valgrind etc.

Re wasm, we use emscripten - https://eth-light.xyz/ is an example of the light client implementation from Nimbus, written in Nim, running in the browser. It is also an excellent example of how we conduct and document protocol work. At some point, we even had a wasm smart contract stack in the works, though priorities changed.

Collaboration and hunger

Opportunities to collaborate and cross-pollinate are there for anyone that has the hunger to go after them, and I’m happy to guide and help anyone looking for more concrete ways to do this, including spending some of their time productively in a core library / infrastructure team. Improving a library, writing a protocol document, providing a guideline for how a specific solution can be made generic, articulating a critical need by writing an issue, or making a PR across status for a project you’re not directly involved with are all highly encouraged, as a general principle. When we look for outstanding work, this is often where we find it: someone that went out of their way to connect the dots between the various efforts that we have and shared their experience. It is rewarding in many ways to look up and see how the seed you sowed grows, nurtured by those that you helped when planting it.

Wrap up

I foresee the need to float each other’s boats to remain across our research projects and as such, nwaku provides invaluable support for the other projects we have going outside of its own core development (similar to how nimbus has delivered utility outside of their mandate, in the form of libraries and code that is not a blockchain client specifically) - both for Nim itself but more broadly by making our varied products more useful. That support flows back to waku in a way consistent with what one would expect, ie collaborations, core protocol changes in areas that matter to us as an organisation, access to a broader pool of researchers that we give to and they in turn give back etc as well as the ability to influence where we pool the Nim resources available to us.

I recognise from this post the growing pains in this process - ie as we hunker down to deliver milestones, it becomes easy to miss the benefits that are brought by such collaborations, but also to forget to do the collaboration itself.

Just like nim-libp2p became a team independent of Nimbus as needs grew, so is “core library” support growing into its own effort, with the aim to address our growing needs in areas useful to the projects that use Nim, and this post in and of itself is a signal that this is something worth investing in. I think we can make it work.

P.S. the deleted post is a fat-fingered incomplete version of this one.

alrevuelta · November 22, 2023, 11:12am

Thanks @arnetheduck. Long in, short out in this case

Understand why nim and indeed “we can make it work”. But the issue raised here is not just about nwaku but having 3 implementations. Dropping nwaku was a suggestion, but other solutions can help us deliver without dropping nwaku.
That said, focusing on go-waku would help to iterate faster, prototype and answer some of the open questions we have: incentives, rln v2, store sync. But, once clear, nwaku can be continued with clearer requirements.
“it is quite possible to swap out go-waku for nim-waku in status-go”. If dropping nwaku is a no go, perhaps this swap should be a priority? go-waku would not be needed, and one implementation less would allow us to move faster, and ensure better quality in the existing one(s).

tldr: We could i) pause nwaku + use go-waku to iterate fast, then resume nwaku with clearer specs/mandate, and/or ii) since you mention it is possible, swap go-waku for nwaku to have 1 implementation less.

JB_pops · November 24, 2023, 10:05am

Chiming in as there are various “people” considerations.

It can, but as you say - that hasn’t been the case here. We should also be realistic that the pool of potential core contributors that have experience with Nim is much smaller when compared to other languages. As a project that looks for core contributors to be very mission/principles aligned, the potential pool of candidates becomes even smaller.

And so we need to be confident that once we apply all those filters (skillset/experience, competence and alignment) that not only do enough potential candidates exist but that - more importantly - we are able to attract them to join the project.

This seems like a significant problem to me - losing high quality engineers is very expensive (given the cost of hiring & onboarding) and only further reduces the available pool of talent.

So - regardless of what technical direction/decision you take here - it seems that we need an updated approach to hiring that ensures we can supply the skills needed long term.

fryorcraken · November 27, 2023, 5:41am

Thank you for your responses. I am currently digesting this information and intend to continue the discussion this week or next.

fryorcraken · January 8, 2024, 6:51am

Thank you @arnetheduck and @JB_pops for your insight. I organized my reply by doing light comments first on the reply and then addressing the meat of the subject at the end.

In terms of competence:

nim-libp2p: dropping nwaku does not mean dropping nim-libp2p as c-bindings could be used in nwaku.
Nim lang: as highlighted in the original post, Nim competences in the larger organisation are scarcely being leveraged in the Waku BU. Also, both Nomos and Status have already moved away from Nim.

Yes I acknowledge this and do agree it is a contributor to how easy it was to build go-waku.

This is a good point, similar to the existence of c-lightning, which was written in C to push for other implementations of Lightning Bitcoin, seeing C as an unattractive language.

I agree with such an approach, and this highlights a different path or narrative for the case of the Waku BU owning multiple implementations. It argues in favour of the Status organization owning the go-waku implementation, which was originally started in the Status team.

In an ideal scenario, similarly to libp2p, c-lightning and Ethereum history, the Waku BU should own one native implementation (browser is another story). Projects that use Waku would start with this native implementation using c-bindings/REST API (as you often said) and then move to building a native Waku implementation in their language of choice, possibly in collaboration with other projects, so that such new implementations of Waku do not rely on one “centralized” budget, organisation or team.

Long term, it would be better for Waku if the various implementations (Rust, Golang, Nim, etc) are owned and maintained by different organisations, teams, or communities.

Unfortunately, we are at a stage where Waku BU now owns two native Waku implementations due to the organisational proximity of Status and Waku.

Thank you for that. This is a good point.

Are you saying hiring has started? I have no news on this front and my efforts to push for the creation of such a team seem to have been in vain so far. This is one of the reasons why I started this thread.

One main driver of this is that Golang is a popular language, which means there is a dilemma between only making changes in go-waku for status-go vs leveraging the existence of go-waku to add Golang as part of the Waku offering.

This dilemma is influenced by the fact that creating c-bindings from go-waku is significantly cheaper than doing it from nwaku.
In terms of cost and effort, this means that there is more reason to generalize go-waku beyond status-go.

In short, considering the effort made to have go-waku, we can now leverage this to build Waku faster and for more languages, as demonstrated with the Rust bindings or even usage of Waku on mobile.

Absolutely, which is also a reason why we are having this discussion in the first place. We should not be maintaining 3 clients just because of historical reasons. There needs to be a conscious decision about what we should and should not do.

arnetheduck:

This highlights another key point - the Nimbus team has, as the outcome of their efforts, solved their most acute needs in terms of Nim infrastructure support. This no longer makes them the best candidates for driving further prioritisation in this area - this responsibility now falls, as evidenced by this post, in part on the waku team which is where growth and active new development is happening - it is thus not enough for waku developers to say “nim tooling is bad”, but rather it becomes both an opportunity and a responsibility of each developer to articulate their needs and/or take action, as can be seen in the writing of this post - the Nimbus team would not, at this stage, be in the right spot to articulate this as efficiently. We do not expect contributors to be passive consumers in this aspect, even if that’s a comfortable place to be - when something itches, we scratch the itch then go back to its root cause and fix that too.

I believe this is the core of the issue/misalignment here.

The mandate of the Waku team is currently to design and build Waku protocols and software; the shortcomings of the Nim language and ecosystem are perceived and experienced as obstacles to this endeavour.

None of the Waku/Vac JDs I have seen, even older ones from 2021-2022, include “enhancing the Nim ecosystem” as part of the job description.
Even in terms of interviews, my understanding is that while we make it clear to candidates that they would need to work with Nim, there is no expectation that enhancing the tooling is part of the job.

In terms of owning the stack, I am not seeing any benefits for Waku so far. Maybe it is a problem of visibility. Do note that libp2p expertise has been invaluable for Waku, but not specific to nim-libp2p. On the contrary, what is needed for Waku is already present in other libp2p implementations.

As you said, the Nimbus team has solved their most acute needs in terms of Nim infrastructure support, now Waku (and Codex?) have to take the mantle.
While nim-libp2p development has prioritized Waku needs, it seems that the tooling and libraries “handover” was just not done or discussed. This discussion needs to happen; from my limited knowledge and history, all I can see is that the itch has been ignored.

This unscratched itch led to the current situation: the Waku CCs who worked in Nim feel unsupported and encumbered by the usage of Nim. The expectations you are stating in terms of contribution back to the ecosystem were not made explicit.
This is made worse by the presence of go-waku: friction due to the Nim ecosystem could be resolved by making go-waku the only native Waku implementation.

This also ties back to JB’s point: we are increasing the “alignment to values” and “required skills” circles when we expect Waku engineers to do what Nimbus CCs did: take ownership of Nim libraries and build tooling such as IDE plugins.

There are a few ways forward: either Waku needs to change how recruitment is done and ensure that nwaku engineers are prepared to significantly contribute back to the Nim ecosystem, or Waku needs to hire engineers dedicated to Nim support, or organisational support needs to be set up, such as a Nim core team.

Another aspect of this discussion, and happy to be corrected, is that the mandates of the Waku and Nimbus teams differ in a core aspect.
It is my understanding that the choice of language for the Nimbus team is core to the mandate of the team. This is not the case for Waku.
To me this makes a difference in the culture. As a leader, I am willing and seeking to align the team’s culture with the organisation, hence this discussion.
However, I also see deviation from this culture/mandate to use Nim when looking across the org at the Status app, go-waku and Nomos.

As originally stated, there are two problems to address: Nim and dual native client ownership.

In your reply, I understand that the value in using Nim mainly comes from owning the full stack. But this has not been fruitful or relevant to Waku (yet?).
As experienced with go-waku so far, we could migrate all R&D from nwaku to go-waku and there would be no short or medium term downside to this.

The long term downside, as I understand it, would be related to pushing libp2p protocols in a specific direction, but this could still be done working with the nim-libp2p team: by having nim-libp2p integrated in go-waku, by pushing changes using nim-libp2p that are then replicated in go-libp2p, or by reverting to nwaku when reaching this point.

On the multiple clients subject: when looking at other examples in the ecosystem, it seems that we should strive for the first core team (Waku BU as it is) to own one native implementation, and encourage other groups/organizations/communities to build their own Waku implementation in a different language. Ethereum/EF seems to be a successful example of this model. Libp2p and rust-libp2p might serve as a lesson on the risks of such an approach.
In this model, go-waku should be owned by the instantiating project, which is the Status app. This goes full circle, as Waku also originated from the Status app and status-go needs. As you said, status-go needs refactoring, but I am guessing the monolithic aspect of this repo predates the creation of nwaku. This then questions the choice of Nim in the first place to integrate in status-go.

I have learned from your reply but it is still not clear to me what benefit using Nim for the reference implementation of Waku brings.
I understand the benefits of using nim-libp2p, but these two decisions (using Nim and using nim-libp2p) can be decoupled.

Moreover, while the usage of nim-libp2p has long term benefits, it could be shelved in favour of the short and medium term benefits of iterating and delivering Waku software faster with a more mature language, Golang, which we need to maintain for the Status app in any case.

vpavlin · January 8, 2024, 9:54am

This is a very interesting discussion to follow. My 2c to add is something I voiced first during my interview. With my ex-Red Hat hat on (i.e. someone who maintained packages and witnessed the creation of complex operating system components), I do not believe that having control of the (Nim) ecosystem is a bigger benefit than using hardened and widely adopted components/libraries. In other words, the argument that we are the biggest contributor to and user of the Nim ecosystem, hence we control many critical libraries, is not really a benefit if those libraries are not widely used by other parties. By comparison, libraries available in, for example, the Golang ecosystem are tested and used in various unexpected ways in many projects, and thus have a higher probability of having corner case bugs discovered and fixed before we’d hit them.

This would be a different story if Nim was on the rise and the adoption would be accelerating, but I am not sure that is the case.

That said, I understand that we (Status/Logos) have significant expertise in Nim, but I have a hard time using that as a justification for not leveraging a more popular and hardened ecosystem.