eth2 Insights: Aggregation Performance

The first post in our eth2 Insights Series reveals important insights into the performance, and importance, of validator attestation aggregation in the eth2 network.

By Elias Simos · Nov 2 2020

Welcome to our eth2 Insights Series, presented by Elias Simos–Protocol Specialist at Bison Trails. In this series, Elias reveals insights he uncovered in his joint research effort with Sid Shekhar (Blockchain Research lead at Coinbase), eth2 Medalla—A journey through the underbelly of eth2’s final call before takeoff.

In this first post, Elias looks at the role of aggregation in eth2 and the fact that over the course of Medalla, it appears somewhat… ineffective!


Intro

Aggregation in eth2 is the act of collecting valuable network information from the P2P layer and packaging it together in order to submit it on-chain in an efficient way. At a high level, it is the work that some of the validators perform in each consensus round: combining individual attestation signatures broadcast by other nodes and then submitting them as a group for inclusion on-chain.

It also happens that attestations are the consensus function that takes up the most chain space–especially in Phase 0 when no transactions will take place on-chain. It then follows that aggregation is not only helpful, but is rather a vital protocol function in an eth2 chain that is designed for scalability.

For context, according to the protocol’s rules, every validator in the active set is either proposing a block or is allocated an attestation slot at every epoch. Attesters are further organized into committees, such that every committee represents a group of validators that have been assigned to attest in the same target slot (or duty slot) in a certain epoch. Within those committees, some attesters are chosen as aggregators.
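As a rough illustration of how aggregators can be picked within a committee, the sketch below is loosely modeled on the selection rule in the Phase 0 validator specification: a validator hashes its signature over the slot and is selected when the result falls on a multiple of a committee-size-dependent modulus. The constant value, the function names, and the use of raw bytes as a stand-in for a real BLS slot signature are assumptions made for this example.

```python
import hashlib

# Assumption: target number of aggregators per committee (recalled from the
# Phase 0 spec constant TARGET_AGGREGATORS_PER_COMMITTEE, not from this post).
TARGET_AGGREGATORS_PER_COMMITTEE = 16


def aggregator_modulo(committee_size: int) -> int:
    """Modulus chosen so that roughly the target number of members is selected."""
    return max(1, committee_size // TARGET_AGGREGATORS_PER_COMMITTEE)


def is_aggregator(committee_size: int, slot_signature: bytes) -> bool:
    """Select a validator when the hash of its slot signature is 0 mod the modulus."""
    digest = hashlib.sha256(slot_signature).digest()
    value = int.from_bytes(digest[:8], "little")
    return value % aggregator_modulo(committee_size) == 0


# In a small committee the modulus is 1, so every member is an aggregator.
print(aggregator_modulo(10), is_aggregator(10, b"stand-in signature"))
# In a 128-member committee about one in eight members is selected.
print(aggregator_modulo(128))
```

Because selection depends only on each validator's own signature, nobody needs to coordinate who aggregates, but the number of aggregators per committee is only met on average.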


Something wicked this way comes

Given the protocol rules described above, the number of attestations that one would expect to find included on-chain would be roughly equal to the number of validators in the active set in each epoch. “Roughly” because some validators are chosen as proposers (32 per epoch in Phase 0), and some would likely not participate in consensus due to downtime (the protocol only needs ⅔ of the activated validators to attest in order to reach finality).

What we found, however, was that over the 14,500 epochs that we surveyed (~2 months in human time), Medalla included between 50% and 100% more attestations than activated validators per epoch!



Figure 1: Individual attestations included on-chain vs active validators, over time

Before jumping to conclusions, I have to underline that there is a method to the proverbial “madness”—i.e. a reason why this might be the case.

An aggregator's job is first to collect "unaggregated" attestations from a broadcast domain on the P2P network, then to aggregate those into a single proof and to then publish that proof on another gossipsub broadcast domain. The way these aggregated attestations then end up on-chain is that the block proposer will “see” a pool of attestations either from the aggregated attestations broadcast domain or some other sources (e.g. "unaggregated" attestations that were seen on the attestation subnets) and copy those which:

  • 1. Are valid for that block: include attestations that are targeting the slot proposed
  • 2. Add value to that block: include information that was previously not included in the block

Given the above, a block producer will sometimes include duplicate votes in a block. Consider the following two grouped attestation examples, targeting the same block slot and checkpoints:

  • A. signed by validators [1, 2]
  • B. signed by validators [2, 3]

Due to the nature of BLS aggregation, these two attestations cannot be aggregated together as they are partially overlapping—and so they both will end up being included on-chain.
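The overlap rule and the proposer's "adds value" filter can be sketched as follows. This is a simplified model, not client code: aggregates are represented as sets of validator indices rather than bitfields and signatures, the pool is assumed to hold aggregates for a single slot and checkpoint, and the names `Aggregate`, `can_merge`, and `select_for_block` are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Aggregate(NamedTuple):
    slot: int               # slot the votes refer to
    voters: FrozenSet[int]  # validator indices whose signatures are folded in


def can_merge(a: Aggregate, b: Aggregate) -> bool:
    """Aggregates combine only for the same slot and with disjoint signer sets."""
    return a.slot == b.slot and a.voters.isdisjoint(b.voters)


def select_for_block(pool: List[Aggregate]) -> List[Aggregate]:
    """Greedy proposer sketch: keep an aggregate if it adds at least one new vote.

    Assumes every aggregate in the pool targets the same slot and checkpoints.
    """
    seen: Set[int] = set()
    chosen: List[Aggregate] = []
    for agg in sorted(pool, key=lambda x: len(x.voters), reverse=True):
        if not agg.voters <= seen:  # contributes a vote not yet covered
            chosen.append(agg)
            seen |= agg.voters
    return chosen


a = Aggregate(slot=10, voters=frozenset({1, 2}))
b = Aggregate(slot=10, voters=frozenset({2, 3}))

print(can_merge(a, b))                # False: validator 2 appears in both
included = select_for_block([a, b])
print(len(included))                  # 2: both aggregates are included
print(sum(len(x.voters) for x in included))  # 4 votes on-chain for 3 validators
```

In this model validator 2's vote lands on-chain twice, which is the kind of duplicate the example with attestations A and B describes.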


Attestations bloat?!

Through our research, we observed that attestations containing duplicate votes were often included in different blocks, i.e. some were included with a delay. To some degree, this is expected and benign to the network. However, given that the number of surplus inclusions we observed stands at between 50% and 100%, there might be sense in thinking of these attestations containing duplicate votes as bloat. It all depends on *how* these votes are aggregated!

Consider the following simple example of a committee of five attesters with validator indices [1, 2, 3, 4, 5], attesting to the same target slot. For the surplus votes to stand at 60% to 80%, there are various combinations of aggregates that might take up less or more chain state space:

  • A. More lean: 2 attestations: [1, 2, 3, 4] and [2, 3, 4, 5]
  • B. More bloated: 5 attestations: [1], [1, 2], [2, 3], [3, 4], [4, 5]

What we observed in Medalla looks more like (B) and less like (A).
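The surplus in the two packings above can be checked with a few lines of arithmetic. Here "surplus" is taken to mean votes included beyond one per validator, divided by the number of distinct validators; that definition and the function name are assumptions made for this sketch.

```python
from typing import List


def surplus_ratio(aggregates: List[List[int]]) -> float:
    """Extra votes on-chain relative to the number of distinct validators."""
    total_votes = sum(len(a) for a in aggregates)
    unique_voters = len({v for a in aggregates for v in a})
    return (total_votes - unique_voters) / unique_voters


lean = [[1, 2, 3, 4], [2, 3, 4, 5]]
bloated = [[1], [1, 2], [2, 3], [3, 4], [4, 5]]

print(len(lean), surplus_ratio(lean))        # 2 attestations, surplus 0.6
print(len(bloated), surplus_ratio(bloated))  # 5 attestations, surplus 0.8
```

Both packings cover the same five validators with a similar surplus, but the bloated one needs more than twice as many separate attestations to do so.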

In order to reduce the computational load, in the demonstration here I only look at the span of epochs 0 to 1500 (first week of the Medalla testnet). The sample includes roughly 2.3M aggregated attestations.



Figure 2: Distribution of aggregated attestations by number of votes they included

What one finds here are three distinct clusters:

  • A. Attestations that included a single validator vote
  • B. Attestations that included between 2 and 15 validator votes
  • C. Attestations that included between 80 and 120 validator votes

Most notably, 37% of all the attestations committed on-chain included only one validator vote. Let’s call those “singles” for simplicity!
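A minimal sketch of this kind of bucketing is shown below. The cluster boundaries come from the list above; the input data is made up for illustration and does not reproduce the Medalla sample, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def cluster(vote_count: int) -> str:
    """Assign an aggregated attestation to a cluster by its number of votes."""
    if vote_count == 1:
        return "single"
    if 2 <= vote_count <= 15:
        return "small"
    if 80 <= vote_count <= 120:
        return "large"
    return "other"


def cluster_shares(vote_counts: Iterable[int]) -> Dict[str, float]:
    """Share of attestations falling into each cluster."""
    counts = Counter(cluster(n) for n in vote_counts)
    total = sum(counts.values())
    return {name: counts[name] / total for name in counts}


# Toy data: one entry per included attestation, giving its number of votes.
toy_vote_counts = [1, 1, 1, 3, 7, 95, 110, 40]
shares = cluster_shares(toy_vote_counts)
print(shares["single"])  # 0.375 in this toy sample
```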

Adding an extra dimension to help progress this game of “truth-seeking,” we now take a look at the average inclusion delay over included grouped attestations by the number of votes they packed together.



Figure 3: Inclusion delay vs number of votes each aggregated attestation packed

At a glance, this looks like a case of “If you want to go fast, go alone!” The singles were included with a delay of 10 slots on average, while the larger groups were included with a delay of approximately 12 slots. This delay in the inclusion of larger groups might imply that, given the ~75% surplus in validator votes across Medalla, the singles, which appear to be included first, are getting included again in larger aggregates that are included later.

Grouping the singles by the slot they were targeting shows that, on average, it took 22 separate single attestations to get those votes included! For reference, in this exercise I managed to group about 39k separate target slots—80% of the slots in the sample.
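The grouping step can be sketched as follows, assuming each included attestation is reduced to a (target slot, number of votes) pair. The records are invented for illustration and the function name is hypothetical.

```python
from collections import Counter
from statistics import mean
from typing import Iterable, Tuple


def singles_per_target_slot(attestations: Iterable[Tuple[int, int]]) -> Counter:
    """Count single-vote attestations for each target slot."""
    return Counter(slot for slot, votes in attestations if votes == 1)


# Toy records: (target_slot, vote_count)
toy = [(100, 1), (100, 1), (100, 1), (100, 90), (101, 1), (101, 4)]

per_slot = singles_per_target_slot(toy)
print(per_slot[100], per_slot[101])  # 3 singles for slot 100, 1 for slot 101
print(mean(per_slot.values()))       # average singles per target slot: 2
```

Slots with no singles at all do not appear in the counter, so the average here is taken over slots that had at least one single.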



Figure 4: Distribution of aggregated attestations including only 1 vote, targeting the same block.

Furthermore, when looking at the distribution of inclusion delays among singles, 37% of the singles–14% of the total grouped attestations in the sample–were included with a delay of 1!



Figure 5: Distribution of the inclusion delay of “singles.”

Finally, when grouping those singles that were included with a delay of 1 slot by target slot, 40% (5% of all the attestations in the sample) were targeting the same block and getting included at the same time, yet were committed in three or more separate attestations!
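The nested percentages quoted so far can be cross-checked by multiplying them through. The snippet below only uses the figures stated in the text; the variable names are chosen for this example.

```python
singles_share = 0.37            # singles as a share of all included attestations
delay1_share_of_singles = 0.37  # singles included with a delay of 1 slot
same_block_share = 0.40         # of those, same target and same inclusion time

delay1_of_total = singles_share * delay1_share_of_singles
unaggregated_of_total = delay1_of_total * same_block_share

print(round(delay1_of_total * 100))        # 14 (% of all attestations)
print(round(unaggregated_of_total * 100))  # 5 (% of all attestations)
```

The products round to the 14% and 5% figures quoted in the text.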



Figure 6: Distribution of the inclusion delay of “singles” targeting the same block.

Similarly, when extending the analysis of singles targeting the same block to an inclusion delay of between 2 and 5, I found that these were included in over two separate attestations on average—covering around 60% of the sample together with those with an inclusion delay of 1, and mapping to 10% of all the attestations included between epochs 0 and 1500.


Ineffective aggregation

At this point, it’s probably a good idea to pack the whirlwind of stats and terminology back into an insight. The high-level finding is that 50% to 100% more attestations than activated validators were included per epoch over the course of Medalla.

When investigating further (in the sample of epochs 0 to 1500), over ⅓ of attestations included on-chain are “singles.” Within those singles, approximately 50% are attestations that, in theory, should have been aggregated–since they were targeting the same block slot and were included at the same time–but they were not!

While I feel inclined to withhold any final judgements until the analysis is expanded over the whole dataset, there are strong indications that aggregation in Medalla has not performed as intended.

Given the lifecycle of aggregations, the probable cause here links back to either the proposer (that is picking singles over aggregates) or the aggregator (that is packing singles as aggregates), with the inclusion delay here likely exaggerating the outcome further. It’s also worth underlining that neither of the two parties is explicitly incentivized by the protocol to be more “efficient” from an aggregations perspective.


Conclusion

A more effective aggregation process not only means less state bloat and more space for useful information to be stored, but also enhances protocol security and overall guarantees, by allowing whistleblowers to survey past history for protocol level violations faster (and more cheaply) due to the reduced volume of data to be surveyed.

In a world of limited resources, an economically “rational” client-side of the network will prioritize building the features that help validators keep good uptime and get votes included as fast as possible. These are the functions that help maximize rewards for operators. Rational operators will optimize similarly. This incentive structure may lead to a negative network-level externality that manifests as inefficient aggregation.

Given the stage eth2 is at in its lifecycle, the eth2 development team’s focus is on the implementation side of things, and not on protocol design and specifications. However, there is a world out there where the protocol *could* also incentivize efficient aggregation, even as this happens off-chain, by linking aggregators to the attestations they produce and issuing rewards retroactively at some time in the future.

If aggregation remains a problem that network participants are prompted to solve solely out of good will, then it’s not unreasonable to expect it to be an issue for a while longer.


Acknowledgements

Thank you to Sid Shekhar, Lakshman Sankar, Paul Hauner and Jim McDonald for their thoughtful questions, prompts and the time they devoted to discussing the original findings of eth2data.github.io; these conversations were incredibly valuable, both in helping me get a better understanding of protocol rules and in uncovering further research questions and motivating me to dig deeper for answers.




THIS DOCUMENT IS FOR INFORMATIONAL PURPOSES ONLY. PLEASE DO NOT CONSTRUE ANY SUCH INFORMATION OR OTHER MATERIAL CONTAINED IN THIS DOCUMENT AS LEGAL, TAX, INVESTMENT, FINANCIAL, OR OTHER ADVICE. THIS DOCUMENT AND THE INFORMATION CONTAINED HEREIN IS NOT A RECOMMENDATION OR ENDORSEMENT OF ANY DIGITAL ASSET, PROTOCOL, NETWORK OR PROJECT. HOWEVER, BISON TRAILS (INCLUDING ITS AFFILIATES AND/OR EMPLOYEES) MAY HAVE, OR MAY IN THE FUTURE HAVE, A SIGNIFICANT FINANCIAL INTEREST IN, AND MAY RECEIVE COMPENSATION FOR SERVICES RELATED TO, ONE OR MORE OF THE DIGITAL ASSETS, PROTOCOLS, NETWORKS, ENTITIES, PROJECTS AND/OR VENTURES DISCUSSED HEREIN.

THE RISK OF LOSS IN CRYPTOCURRENCY, INCLUDING STAKING, CAN BE SUBSTANTIAL AND NOTHING HEREIN IS INTENDED TO BE A GUARANTEE AGAINST THE POSSIBILITY OF LOSS. THIS DOCUMENT AND THE CONTENT CONTAINED HEREIN ARE BASED ON INFORMATION WHICH IS BELIEVED TO BE RELIABLE AND HAS BEEN OBTAINED FROM SOURCES BELIEVED TO BE RELIABLE BUT BISON TRAILS MAKES NO REPRESENTATION OR WARRANTY, EXPRESS OR IMPLIED, AS TO THE FAIRNESS, ACCURACY, ADEQUACY, REASONABLENESS OR COMPLETENESS OF SUCH INFORMATION.

ANY USE OF BISON TRAILS’ SERVICES MAY BE CONTINGENT ON COMPLETION OF BISON TRAILS’ ONBOARDING PROCESS, INCLUDING ENTRANCE INTO APPLICABLE LEGAL DOCUMENTATION AND WILL BE, AT ALL TIMES, SUBJECT TO AND GOVERNED BY BISON TRAILS’ POLICIES, INCLUDING WITHOUT LIMITATION, ITS TERMS OF SERVICE AND PRIVACY POLICY, AS MAY BE AMENDED FROM TIME TO TIME.
