Making Waku Development Leaner

tl;dr:

  • Waku contributors: ensure you are familiar with the FURPS for your work - for every piece of code you write that you cannot link directly to a FURPS, talk to your lead or me; repeat until we get those FURPS right.
  • Waku leads: same as above, and reach out to me in case of doubt; be aware of, and highlight, any high-effort component - maybe we should not do it just yet - and share how you use FURPS to inform your work.
  • Customers and stakeholders (Status, Comms Hubs, IFT): FURPS are the contract; anything not in the Waku FURPS is not to be done. Raise any concerns to me.
  • Vac-QA: beyond the current support features, Waku FURPS define the new behaviour; testing work will come from this and the related specs.
  • Vac-DST: beyond the agreed baseline benchmark, the focus is on the FURPS.

I have recently noticed several occurrences of scope creep. They happened in different subteams, making me the common denominator of this issue. So let’s fix it :slight_smile:

Prioritization is not enough

Prioritizing features and tasks is an important aspect of planning work for a software engineering team.

However, once prioritization is done, cutting scope must follow to ensure efficacy. Indeed, doing unnecessary work efficiently is still inefficient.

MoSCoW and Shape Up are good frameworks for de-scoping less important work and pinpointing critical features.
An Impact/Effort matrix can also help cut out low-impact work.

I believe this is even more important when working with new technologies such as Waku. Indeed, when using mature and familiar frameworks, one may be able to get away with building, delivering, fixing a few bugs, and exiting.

But this is simply not possible when the technology and protocols are being invented as we build them.
A tight plan-build-deliver-use-observe feedback loop is needed to ensure that the newly built technology goes in a direction useful to its users, simply because the properties of a new protocol may not be foreseeable until a PoC is built and spec’d.

Feedback loop

Let’s dig deeper into the feedback loop:

  • plan: draft potential new protocols and features to implement
  • build: get coding
  • deliver: have working software, usable by users
  • use: get users to use the software
  • observe: get feedback; observe usage patterns and software behaviour, caveats, and limitations; run simulations
  • plan: from the previous observations, set the direction for the next features, planning high-impact (user), low-effort (dev) work first.

This neat loop does not take into account that building may not always go in the right direction.

When thinking about the appetite for a feature, or comparing impact to effort, there is always an assumption about the time spent to build.
This is, after all, why the Scrum, Shape Up, and Kanban frameworks apply sprint, appetite, story point, and time-boxing concepts: so that it is clear from the get-go that the impact/effort ratio for a feature is X (e.g. 80% of happy users for 2 weeks of dev work).

What if we spend more than that - e.g. more than 2 weeks - to build the feature?
Then one needs to pause and review, and possibly reduce the scope or drop the feature.
I am sure many of you have experience with Scrum and how incomplete stories usually just get moved to the next sprint… there is a world where we don’t have to do that.

Most project management frameworks have checkpoints, or other strategies, to mitigate under-estimated or complex issues. Those strategies usually involve good communication between the team and the product owner, whether it happens during a Scrum meeting or in between iterations.

Developers must be empowered, and reminded, to speak up when the effort is higher than expected.

Including out-of-scope sections in definitions of done is also a way to ensure that the scope of work remains clear and limited.

Leveraging FURPS

The FURPS framework has recently been introduced to help communicate commitments between the team and stakeholders.
I see it as a structured definition-of-done system. However, due to its team-wide scope, it has to be light on details so it can be digested by the stakeholders.
I’d see it as a one-liner, high-level user story that may not include the intricacies of the system and corner cases.

I do not believe there can be one tool used at every level, from stakeholder to engineer, simply because the level of detail depends on the audience.

Yet engineers should see FURPS as the source of truth and ensure that the FURPS requirements are covered - this part is mostly fine.

Lazy engineering

The problem is the scope added around FURPS: the assumption that, for work to be delivered, everything in its domain must be done.
FURPS does not provide an “exclusion” model (but I will be adding one to the Waku FURPS).

Eager and passionate Waku engineers go above and beyond to deliver feature-complete, kick-ass software… and that’s the problem.

I would like to welcome some lazy engineering: developers trying to “stick to the FURPS and no more”, while of course still assuming good practice in terms of specs, code quality, and test coverage.

This could mean practicing more Test-Driven Development: translating the FURPS into a suite of tests, then implementing the code to pass those tests, and no more.
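
To make this concrete, here is a minimal sketch in Go of what “FURPS as tests” could look like. The FURPS item, the topic name, and the in-memory bus are all hypothetical, for illustration only - a real test would wire up actual Waku nodes instead:

```go
package furps_test

import (
	"testing"
	"time"
)

// Hypothetical in-memory stand-in so the sketch compiles on its own;
// a real test would spin up actual Waku nodes here.
type memBus struct{ subs map[string][]chan []byte }

func newMemBus() *memBus { return &memBus{subs: map[string][]chan []byte{}} }

func (b *memBus) Subscribe(topic string) <-chan []byte {
	ch := make(chan []byte, 1)
	b.subs[topic] = append(b.subs[topic], ch)
	return ch
}

func (b *memBus) Publish(topic string, payload []byte) {
	for _, ch := range b.subs[topic] {
		ch <- payload
	}
}

// Hypothetical FURPS item (Functionality): “a message published on a
// topic reaches a subscribed peer”. The test asserts exactly this,
// and nothing more - no extra corner cases beyond the commitment.
func TestMessageReachesSubscriber(t *testing.T) {
	bus := newMemBus()
	msgs := bus.Subscribe("/waku/2/default-waku/proto")

	bus.Publish("/waku/2/default-waku/proto", []byte("hi"))

	select {
	case got := <-msgs:
		if string(got) != "hi" {
			t.Fatalf("got %q, want %q", got, "hi")
		}
	case <-time.After(time.Second):
		t.Fatal("message not delivered within 1s")
	}
}
```

The point is not the test framework: it is that the test suite derived from the FURPS, not the engineer’s ambition, defines when the feature is done.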

A good example, which hopefully does not single anyone out, is the Performance items.
We have tons of metrics across our software… but a set of FURPS will only require a commitment on one or two metrics.

When adding new metrics for a feature, sure, a couple more may be useful for debugging and investigation, but that’s it. Stick to demonstrating that the FURPS are fulfilled.
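
As a sketch, assuming a Go component instrumented with the Prometheus client library - the metric names and the FURPS target below are hypothetical:

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// The one metric a hypothetical FURPS item commits to:
// “Performance: p95 end-to-end delivery latency under 1 second”.
var deliveryLatency = prometheus.NewHistogram(prometheus.HistogramOpts{
	Name:    "waku_delivery_latency_seconds", // hypothetical name
	Help:    "End-to-end message delivery latency in seconds.",
	Buckets: prometheus.DefBuckets,
})

// One extra counter kept for debugging failed deliveries - and that’s it.
var deliveryFailures = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "waku_delivery_failures_total", // hypothetical name
	Help: "Messages that failed to be delivered.",
})

func init() {
	prometheus.MustRegister(deliveryLatency, deliveryFailures)
}
```

Everything else we might want to graph “just in case” stays out until a FURPS item asks for it.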

Similarly, when implementing a feature, define the narrowest possible scope that achieves those FURPS requirements.

This includes handling pull requests. I have dusted off my Pull Request Recommendation article.
Be clear in your feedback, and understand what is given to you: is it a blocker to merge, worth spending a few days on, or a nitpick that can be skipped this time around?

Of course, in case of doubt, raise it: talk to your team and leads. But be lean by being lazy - fulfill the FURPS and nothing beyond.

And if the end result is not enough, blame it on the FURPS writer (me :raised_hand: ) - but again, let’s talk first :slight_smile:

Knowing the Waku team, I know it will be a challenge to move away from perfection. Yet I think it might just improve the situation, thanks to the good faith of every single Waku CC.

Conclusion

The Waku team does not adopt a strict Scrum or Shape Up framework; what we practice can be most closely qualified as Kanban. Each subteam has the freedom to operate as it wishes, within the Waku team framework (roadmap, FURPS, completion dates, and weekly updates).

We do not use strong project management metrics, such as counting tasks delivered and bugs fixed; instead, we focus on the high-level delivery of features and protocols.

While introducing project metrics may be useful, I am more keen on narrowing work scope across the board, so we can focus on delivering fast and frequently, and on getting feedback on the software and features before deciding whether to invest more time.

There is also a cultural aspect to this: pushing every engineer to question every piece of work. While the team’s track record is great, I see room for improvement in moving from “why are we not doing this?” to “why are we doing this?”.

Moving forward

We will be discussing this subject more, especially as we adopt FURPS - they need to be useful to us.

I am not keen on adding more processes; I prefer to rely on a strong engineering culture.

Knowing individual Waku contributors, I think their will to deliver greatness gets in the way of being lazy and delivering “just enough” to fulfill the FURPS. A good problem to have, yet one to fix.

edit: moved tl;dr to top


Other questions for Waku CCs and leads:

  • How do you ensure that the work you are planning is aligned with FURPS?
  • Do you do some “grooming” together to get hints towards technical solution?
  • Do you highlight any unclarity coming from FURPS and previous discussions?
  • Do you capture any learnings from previous discussions somewhere?

I feel like mind maps might be an interesting structure for this - high-level FURPS closer to the “middle”, detailed information like Definitions of Done and Acceptance Criteria at the edges.

This belongs at the top as a TL;DR :)

  • How do you ensure that the work you are planning is aligned with FURPS?

We’ve mostly done it the other way around so far - making sure that the FURPS accurately reflect the work that should be done.

  • Do you do some “grooming” together to get hints towards technical solution?

Different use cases call for different approaches, but usually yes - we discuss everything in quite some detail, from possible technical stacks to e.g. the structure of proposed specs.

  • Do you highlight any unclarity coming from FURPS and previous discussions?
  • Do you capture any learnings from previous discussions somewhere?

Depending on the scope, learnings that need to be captured or discussed go to the appropriate forum (Discord, a GH issue, the forum, etc.). I also take notes during each 1:1 to keep track of open questions, learnings, etc.


Do you mean a mind map at a higher level than the FURPS?

Let’s dive into an example to help illustrate this.

A current goal is to replace go-waku with nwaku as the Waku implementation in Status apps.

This requires a multi-step process. We can approach this problem in two ways: a “fat” approach or a “lean” approach.

Fat Approach

A fat approach involves maintaining a feature branch of Status-go and proceeding with all necessary work for a cold turkey switch at a targeted version of the Status apps. However, this approach has several drawbacks:

  1. Long development time (6-12 months)
  2. Difficulty with dogfooding (can only dogfood from feature branch)
  3. Need to maintain a feature branch and rebase it regularly
  4. High risk when the switch happens, as it’s an all-in-one approach

Lean Approach

We opted for a gradual rollout of nwaku in Status apps, which breaks down into the following steps:

  1. Status Desktop only: relay mode only, Waku core protocols only, on Linux, Mac, and Windows.
  2. Add Light mode support.
  3. Add Mobile: build on iOS and Android.
  4. Move peer-to-peer reliability implementation from Go-Waku to Nwaku.

These steps are tracked under specific milestones:

[Screenshot: light mode vs relay mode in Status Desktop settings]

Lean vs Fat

Let’s review the decisions that differentiate a lean from a non-lean (fat) approach.

Integrate nwaku in Status Desktop

Fat approach:

  • Expose the API of all Waku core protocols in libwaku, C bindings, and Golang wrapper.
  • Rewrite p2p reliability protocol to fully remove Go-Waku dependency.
  • Do a hard switch when moving to Nwaku, doing it all at once for desktop builds.

Result: A lot of time is spent before a working binary can be handed over to CCs and users.

Lean approach:

  • Focus on “relay mode” only and expose only the bare minimum core protocols in the API: discv5, store, relay; peer exchange as client, filter as client, and light push as client are left for later.
  • Focus on core protocols and re-use Go-Waku’s p2p reliability implementation for now.
  • Add an abstraction layer so that it’s possible to compile Status-go with either Nwaku or Go-Waku; no need to maintain a feature branch (see the sketch below).

Result: A binary that is restricted in scope but usable, which can be handed over to CCs and interested users early on.
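
As a rough sketch, such an abstraction layer could look like the following in Go. The interface, the build tag, and the constructors are hypothetical names for illustration, not the actual Status-go code:

```go
// A compile-time backend switch, sketched as three files.

// file: node.go
package wakuv2

// WakuNode is a hypothetical interface the rest of Status-go codes
// against, hiding whether go-waku or nwaku sits underneath.
type WakuNode interface {
	Start() error
	Stop() error
	Publish(pubsubTopic string, payload []byte) error
}

// file: node_nwaku.go
//go:build use_nwaku

// Selected with `go build -tags use_nwaku`; newNwakuNode is a
// hypothetical constructor wrapping libwaku’s C bindings.
func New() (WakuNode, error) { return newNwakuNode() }

// file: node_gowaku.go
//go:build !use_nwaku

// The default build keeps the existing go-waku implementation.
func New() (WakuNode, error) { return newGoWakuNode() }
```

Both backends live on the main branch and are selected at compile time, which is what removes the need for a long-lived feature branch.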

Integrate Nwaku in Status Mobile

Fat approach:

  • Expose the remaining core Waku API and get Light mode working before trying to build on iOS and Android.
  • Implement p2p reliability protocols for resource-restricted devices in Nim.

Result: It takes more time to get mobile apps for dogfooding into the hands of CCs. The Go-Waku/Nwaku duality remains for longer, as the p2p reliability implementation needs to be moved from Go-Waku to Nwaku.

Lean approach:

Move in parallel:

  • Build Nwaku in Status-go for Android.
  • Build Nwaku in Status-go for iOS.
  • Expose Waku Core API for Light mode: peer exchange, light push, and filter as client.

Also, keep using go-waku’s p2p reliability protocol implementation for now.

Result: The official switch to Nwaku can be done faster, without rewriting the p2p reliability code from Go to Nim.
We don’t block the Android and iOS builds on the light protocol API work; we move in parallel and join at the end.

Summary

By aiming for a lean approach, where we descope some work (light protocols as client API, p2p reliability) and split the rest into shippable chunks (Desktop relay, then mobile, then p2p reliability in Nim), we can put Nwaku-based binaries in the hands of CCs and users earlier (Desktop in Q1/early Q2) and start dogfooding. This means that by the time we reach D-Day for the switch (H2), we will have had several months of hardening the binaries.

A fatter approach, where nothing is descoped and all the work is done as we go along, would delay dogfooding possibilities, increasing risk and decreasing confidence when the switch needs to happen.
