Welcome to our eth2 Insights Series, presented by Elias Simos, Protocol Specialist at Bison Trails. In this series, Elias reveals insights he uncovered in his joint research effort with Sid Shekhar (Blockchain Research lead at Coinbase), eth2 Medalla—A journey through the underbelly of eth2’s final call before takeoff.
In this first post, Elias looks at the role of aggregation in eth2 and the fact that over the course of Medalla, it appears somewhat… ineffective!
Aggregation in eth2 is the act of collecting valuable network information from the P2P layer and packaging it together in order to submit it on-chain in an efficient way. At a high level, it is work that some of the validators perform in each consensus round: collecting the individual BLS attestation signatures broadcast by other nodes, aggregating them into a single combined signature, and then submitting them as a group for inclusion on-chain.
It also happens that attestations are the consensus data that takes up the most chain space–especially in Phase 0, when no transactions will take place on-chain. It then follows that aggregation is not only helpful, but rather a vital protocol function in an eth2 chain that is designed for scalability.
For context, according to the protocol’s rules, every validator in the active set is either proposing a block or is allocated an attestation slot at every epoch. Attesters are further organized into committees, such that every committee represents a group of validators that have been assigned to attest in the same target slot (or duty slot) in a certain epoch. Within those committees, some attesters are chosen as aggregators.
Given the protocol rules described above, the number of attestations that one would expect to find included on-chain would be roughly equal to the number of validators in the active set per epoch. “Roughly” because some validators are chosen as proposers (32 per epoch in Phase 0), and some would likely not participate in consensus due to downtime (the protocol only needs ⅔ of the activated validators to attest in order to reach finality).
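As a back-of-envelope sketch of this accounting (the validator count below is hypothetical, chosen only for illustration, and proposers are subtracted here to mirror the rough arithmetic in the text rather than the exact protocol rules):

```python
def expected_attestations_per_epoch(active_validators: int,
                                    participation_rate: float = 1.0,
                                    proposers_per_epoch: int = 32) -> int:
    """Rough ceiling on distinct validator votes one would expect
    on-chain per epoch: every active, participating validator attests
    once, minus the slot proposers (32 per epoch in Phase 0)."""
    attesters = active_validators - proposers_per_epoch
    return int(attesters * participation_rate)

# With a hypothetical 25,000 active validators and full participation:
print(expected_attestations_per_epoch(25_000))                          # 24968
# With 2/3 participation (the finality threshold):
print(expected_attestations_per_epoch(25_000, participation_rate=2/3))  # 16645
```

Any on-chain attestation count materially above this ceiling implies that the same votes are landing on-chain more than once.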
What we found, however, was that over the 14,500 epochs that we surveyed (~2 months in human time), Medalla included between 50% and 100% more attestations than activated validators per epoch!
Before jumping to conclusions, I have to underline that there is a method to the proverbial “madness”—i.e. a reason why this might be the case.
An aggregator's job is first to collect "unaggregated" attestations from a broadcast domain on the P2P network, then to aggregate them into a single proof, and finally to publish that proof on another gossipsub broadcast domain. These aggregated attestations end up on-chain when the block proposer “sees” a pool of attestations, drawn either from the aggregated-attestations broadcast domain or from other sources (e.g. "unaggregated" attestations seen on the attestation subnets), and copies a selection of them into the block it proposes.
Given the above, a block producer will sometimes include duplicate votes in a block. Consider the following two grouped attestation examples, targeting the same block slot and checkpoints:
Due to the nature of BLS aggregation, these two attestations cannot be aggregated together as they are partially overlapping—and so they both will end up being included on-chain.
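To illustrate the constraint with a simplified sketch (this models only the bookkeeping, not the actual BLS cryptography): each aggregate carries an aggregation-bits field marking which committee members' signatures it contains, and two aggregates can only be merged when those bitfields are disjoint—otherwise an overlapping validator's signature would be summed twice and the combined proof would not verify.

```python
def can_merge(bits_a: list, bits_b: list) -> bool:
    """Two aggregates are mergeable only when no aggregation bit
    is set in both, i.e. no validator's signature appears twice."""
    return not any(a and b for a, b in zip(bits_a, bits_b))

# Committee of five: attestation X carries votes from validators 1-3,
# attestation Y from validators 3-5. They overlap on validator 3.
x = [1, 1, 1, 0, 0]
y = [0, 0, 1, 1, 1]
print(can_merge(x, y))   # False -> both must go on-chain separately

# A disjoint pair, by contrast, could be merged into one aggregate.
z = [0, 0, 0, 1, 1]
print(can_merge(x, z))   # True
```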
Through our research, we observed that attestations containing duplicate votes were often included in different blocks, i.e. some were included with a delay. To some degree, this is expected and benign to the network. However, given that the surplus inclusions we observed stand at between 50% and 100%, there is a case for thinking of these attestations containing duplicate votes as bloat. It all depends on *how* these votes are aggregated!
Consider the following simple example of a committee of five attesters with validator indices [1, 2, 3, 4, 5], all attesting to the same target slot. The same five votes can reach the chain through different combinations of aggregates that take up less or more chain space: at one extreme (A), a single tightly-packed aggregate covering all five votes; at the other (B), a mix of singles and partially overlapping aggregates in which the same votes are included more than once, pushing the surplus into the 60%–80% range.

What we observed in Medalla looks more like (B) and less like (A).
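As a hypothetical sketch of how the grouping alone drives the surplus, the function below counts duplicate votes across different combinations of aggregates over the same five-validator committee (the scenarios are illustrative, not drawn from the Medalla dataset):

```python
from itertools import chain

def surplus_votes(aggregates: list) -> float:
    """Fraction of on-chain votes that are duplicates: total votes
    included across all aggregates, relative to distinct validators."""
    all_votes = list(chain.from_iterable(aggregates))
    distinct = set(all_votes)
    return (len(all_votes) - len(distinct)) / len(distinct)

# (A) one tightly-packed, disjoint aggregate: no surplus.
scenario_a = [[1, 2, 3, 4, 5]]
print(surplus_votes(scenario_a))  # 0.0

# (B) three singles included first, then all five votes re-included
# in one big aggregate: 8 votes on-chain for 5 validators.
scenario_b = [[1], [2], [3], [1, 2, 3, 4, 5]]
print(surplus_votes(scenario_b))  # 0.6, i.e. a 60% surplus
```

The votes carried are identical in both scenarios; only the packaging differs, and with it the chain space consumed.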
To reduce the computational load, in this demonstration I only look at the span of epochs 0 to 1500 (the first week of the Medalla testnet). The sample includes roughly 2.3M aggregated attestations.
What one finds here is three distinct clusters in the number of votes packed per included attestation.
Most notably, 37% of all the attestations committed on-chain included only one validator vote. Let’s call those “singles” for simplicity!
Adding an extra dimension to help progress this game of “truth-seeking,” we now take a look at the average inclusion delay over included grouped attestations by the number of votes they packed together.
At a glance, this looks like a case of “If you want to go fast, go alone!” The singles were included with a delay of 10 slots on average, while the larger groups were included with a delay of approximately 12 slots. This delay in the inclusion of larger groups might imply that given the ~75% surplus in validator votes across Medalla, the singles—which appear to be included first—are getting included again in larger aggregates that are included later.
Grouping the singles by the slot they were targeting shows that, on average, it took 22 separate single attestations to get those votes included! For reference, in this exercise I managed to group about 39k separate target slots—80% of the slots in the sample.
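The grouping step can be sketched as follows, over hypothetical `(target_slot, validator_index)` records rather than the real Medalla dataset:

```python
from collections import defaultdict

# Hypothetical "single" (one-vote) attestation records:
# (target_slot, validator_index)
singles = [
    (100, 7), (100, 11), (100, 42),
    (101, 7), (101, 9),
    (102, 5),
]

# Group singles by the slot they were attesting to; each group's size
# is the number of separate one-vote attestations spent on that slot.
by_slot = defaultdict(list)
for slot, validator in singles:
    by_slot[slot].append(validator)

avg_singles_per_slot = sum(len(v) for v in by_slot.values()) / len(by_slot)
print(avg_singles_per_slot)  # 2.0 separate singles per target slot
```

In the Medalla sample, the same computation came out at roughly 22 singles per target slot.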
Furthermore, when looking at the distribution of inclusion delays among singles, 37% of the singles–14% of the total grouped attestations in the sample–were included with a delay of 1 slot!
Finally, when grouping those singles that were included with a delay of 1 slot by target slot, 40% (5% of all the attestations in the sample) were targeting the same block and getting included at the same time, yet were committed in three or more separate attestations!
Similarly, when extending the analysis of singles targeting the same block to an inclusion delay of between 2 and 5, I found that these were included in over two separate attestations on average—covering around 60% of the sample together with those with an inclusion delay of 1, and mapping to 10% of all the attestations included between epochs 0 and 1500.
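A minimal sketch of how such "aggregatable singles" can be flagged, over hypothetical records and ignoring committee boundaries for simplicity (in practice, only singles from the same committee share identical attestation data and could be aggregated):

```python
from collections import Counter

# Hypothetical single-vote attestation records:
# (target_slot, inclusion_slot, validator_index)
singles = [
    (200, 201, 1), (200, 201, 2), (200, 201, 3),  # same target, same inclusion
    (200, 203, 4),                                 # same target, later block
    (201, 202, 5),
]

# Singles sharing both a target slot and an inclusion slot were, in
# principle, aggregatable: disjoint votes over the same data, available
# at the same time, yet committed as separate attestations.
groups = Counter((target, incl) for target, incl, _ in singles)
aggregatable = sum(n for n in groups.values() if n > 1)
print(aggregatable)  # 3 of the 5 singles could have been one aggregate
```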
At this point, it’s probably a good idea to pack the whirlwind of stats and terminology back into an insight. The high-level finding is that 50% to 100% more attestations than activated validators were included per epoch over the course of Medalla.
When investigating further (in the sample of epochs 0 to 1500), over ⅓ of attestations included on-chain are “singles.” Within those singles, approximately 50% are attestations that, in theory, should have been aggregated–since they were targeting the same block slot and were included at the same time–but they were not!
While I feel inclined to withhold any final judgements until the analysis is expanded over the whole dataset, there are strong indications that aggregation in Medalla has not performed as intended.
Given the lifecycle of aggregations, the probable cause here links back to either the proposer (that is picking singles over aggregates) or the aggregator (that is packing singles as aggregates), with the inclusion delay here likely exaggerating the outcome further. It’s also worth underlining that neither of the two parties is explicitly incentivized by the protocol to be more “efficient” from an aggregations perspective.
A more effective aggregation process not only means less state bloat and more space for useful information to be stored, but also enhances protocol security and overall guarantees, by allowing whistleblowers to survey past history for protocol level violations faster (and more cheaply) due to the reduced volume of data to be surveyed.
In a world of limited resources, an economically “rational” client-side of the network will prioritize building the features that help validators keep good uptime and get votes included as fast as possible. These are the functions that help maximize rewards for operators. Rational operators will optimize similarly. This incentive structure may lead to a negative network-level externality that manifests as inefficient aggregation.
Given the stage eth2 is at in its lifecycle, the eth2 development team’s focus is on the implementation side of things–and not on protocol design and specifications. However, there is a world out there where the protocol *could* also incentivize efficient aggregation—even as this happens off-chain—by linking aggregators to the attestations they produce and issuing rewards retrospectively at some point in the future.
If aggregation remains a problem that network participants are prompted to solve solely out of good will, then it’s not unreasonable to expect it to be an issue for a while longer.
Thank you to Sid Shekhar, Lakshman Sankar, Paul Hauner and Jim McDonald for their thoughtful questions, prompts and the time they devoted to discussing the original findings of eth2data.github.io; these conversations were incredibly valuable, both in helping me gain a better understanding of protocol rules and in uncovering further research questions that motivated me to dig deeper for answers.
It’s not too late to be an eth2 Pioneer. Learn more about the eth2 Pioneer Program for enterprise. We want you to have early access to build on the Beacon Chain!
Bison Trails is a blockchain infrastructure company based in New York City. We built a platform for anyone who wants to participate in 19 new chains effortlessly. We also make it easy for anyone building Web 3.0 applications to connect to blockchain data from 27 protocols with QT. Our goal is for the entire blockchain ecosystem to flourish by providing robust infrastructure for the pioneers of tomorrow.