Welcome to Who’s who in eth2, presented by Elias Simos, Protocol Specialist at Coinbase Cloud. In this series, Elias interviews key contributors to the development and growth of Ethereum and eth2, exploring their involvement in eth2, visions for the future, and perspectives on what eth2 means for the world from people deeply embedded in its ecosystem.
The series includes interviews with notable builders, researchers, infrastructure experts, and leaders from the eth2 ecosystem, along with principal members of eth2’s four client teams—Prysmatic Labs, Sigma Prime, Nimbus, and Teku.
In this fifth post, Elias interviews Luke Youngblood, Senior Staff Software Engineer at Coinbase Cloud, on his expertise in staking infrastructure, the challenges facing eth2 validators, the balance between simplicity and complexity in infrastructure, and how Luke thinks about key considerations for Coinbase’s involvement in eth2.
I've worked in tech for the last couple of decades. I started my career working as a systems engineer and a network engineer. In the mid‐2000s, I built a very large private cloud for McKesson, which is a large healthcare company spread across the US, Canada, and Europe. It was a really great experience because I got a chance to build automation that thousands of developers use to deploy their applications every day.
Then in 2016 I got a job at Amazon Web Services. I wanted to work for the biggest cloud. At Amazon Web Services, I helped some of our large customers migrate applications to the cloud, and really focused on distributed systems.
Separately, I've always been into crypto as a hobby. I started mining Bitcoin back in 2010, in the very early days, and then started fiddling with Ethereum back in 2016‐2017. While I was working at Amazon my brother and I started a small side‐business helping build proof of stake infrastructure for the Tezos Foundation. At the time, back in early 2018, I knew nothing about proof of stake. It was this very new technology. So, early as we were, we ended up being (I believe) the first to build technology that didn't exist at the time—things like remote signing systems.
Fast forward a little bit in the future to late 2018‐early 2019, Coinbase wanted to get into staking rewards. So they were looking for experts in this field. They found me through the Tezos Foundation and acqui‐hired me to come help build out staking rewards at Coinbase. For the last couple of years I've been working at Coinbase on adding staking rewards products. We started with Tezos, Cosmos, Algorand, and, more recently, eth2, which is the biggest staking rewards product we've launched so far.
It's a really interesting question because many of the concepts that we learned at AWS, and the core principles of cloud computing, apply very well to proof of stake blockchain networks, and decentralized networks in general.
So, for example, one of the largest benefits to using cloud services is that you can go global in minutes. The idea is that I can deploy infrastructure to any region in the world. You know—Asia, Europe, United States, Canada, South America, even Africa—I can deploy infrastructure to any continent, any region, within minutes, without having to build data centers, without having to buy hardware.
Applying this concept to decentralized networks means we don't have to limit ourselves to a single data center. If we want to create decentralized networks that are very robust, we can actually provision validators across many different continents, and the networks we provision can be more robust because of it. And so applying those concepts was kind of a revelation for me because I realized that we didn't have to worry about 20% of the hash power of the network going offline if a single data center in China goes off the internet.
Yes, absolutely. I think in all of these decentralized networks the ideal state is that there are thousands of validators running across a wide variety of setups, including home setups, professional setups, and data centers. It would be perfect if there were, let's say, a hundred thousand Ethereum validators and each one was running at a home or on a different internet connection; that would be ideal.
But the reality is that there will be large concentrations of staking power in various locations. Those concentrations of hash power also exist in proof of work networks, including Bitcoin, mostly because the hardware required is very specialized. With proof of stake, of course, we can use generalized hardware. You don't have to buy ASICs, we don't need as much electricity, so we don't have to be right next to hydroelectric power or some cheap power source; we can provision those systems pretty much anywhere. So they can be much, much more decentralized if we do it properly.
I do think though, just as there are benefits to extreme decentralization (e.g. everyone running validators at home), there are also benefits to decentralization across a variety of data centers and cloud providers. You can imagine that network outcomes would end up being a lot more "random" in a network run mostly by hobbyists, at least with the state of technology as it exists today.
Besides, 32 ETH has ended up being worth quite a lot.
Well, it's a great challenge to solve. There's a couple of challenges there, right? The first is the tooling may not be easy to use to run a validator. You can run validators on Windows, Linux, Mac OS—Prysm’s clients run on all three. However, it does require command line skills, and not everyone will have the skills to easily spin up a validator. But the tooling is constantly improving. And I do expect in a year or so that there will be easy point and click tooling where anyone can provision a validator without having a lot of command line Linux knowledge.
Another aspect is just the cost. I think with staking pools this can actually be reduced in some ways. For example, there are staking pools like Rocket Pool, where you might only need 16 ETH to start a validator and you can receive ETH from others that just want to stake with the staking pool.
Also, a really interesting project that is being worked on—I believe Consensys is working on this right now—is called Secret Shared Validators (SSV). The idea is that we could use threshold signing schemes to actually spread keys across many thousands of validators, so that an individual validator doesn't have to worry about losing their private key material due to a hardware failure, or something like that.
If this proves itself to be a viable strategy in SSV we could probably apply the same concepts to decentralizing validators such that it doesn't require 32 ETH; you could stake much smaller amounts, which would be interesting.
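As a rough illustration of the threshold idea, here is a minimal Python sketch of Shamir secret sharing. Note that this is a simpler, swapped-in technique and not a threshold signing scheme: it reconstructs the secret in one place, whereas a threshold signing design would aim to produce signatures without ever reassembling the key. The prime, the function names, and the 3-of-5 parameters are all assumptions chosen for illustration.

```python
import secrets

# Assumed field modulus for illustration (a Mersenne prime); not taken from any real design.
PRIME = 2**127 - 1

def split_secret(secret, num_shares, threshold):
    """Split `secret` into `num_shares` points; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, num_shares + 1)]

def recover_secret(shares):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = split_secret(123456789, num_shares=5, threshold=3)
recovered = recover_secret(shares[:3])  # any three shares are enough
```

Losing up to two of the five shares in this sketch does not lose the secret, which is the property that makes hardware failure less of a worry.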
Yeah, it's definitely created some challenges for us. eth2 is pure proof of stake, not delegated. I think the design goals have a good intent behind them. The design goal of not having a single validator stake more than 32 ETH is intended to decentralize the network more, so we have thousands of validators, potentially tens or hundreds of thousands of validators, instead of just perhaps a hundred, like you see in some of the Tendermint‐based proof of stake networks. So it's a good design goal, but the reality is that there will be large staking pools regardless. Whether they are exchanges or decentralized pools, those large staking pools will need to concentrate staking power. So they will just have to adapt and run thousands of validators each.
That's what we've had to do at Coinbase to launch staking rewards; we're running thousands of validators. Now, the other challenge that pure proof of stake brings is regarding funds movement. From a security and risk standpoint, it's actually very easy to do delegated proof of stake because we can delegate funds that are in a cold wallet so they're completely offline and the private keys are not stored online. It's really nice from a security standpoint when we're able to do that. With pure proof of stake, like eth2, of course we have to move those funds into the deposit contract so they're now no longer cold.
I think these are always tough considerations when you're designing a proof of stake network, but, I think in general, if you can minimize funds movement that's always a benefit. And, in general, if you can support delegation, that's also a really nice characteristic to have.
But on the flipside, I also do think that having hundreds of thousands of validators is a great benefit to have. However, some of the same benefits could potentially be achieved in delegated networks if you have the right economic incentives.
I think you're bringing up a really good point in that we might have less decentralization than we appear to have on the surface. In delegated proof of stake networks this is more visible because the delegations are public and on the blockchain. Whereas the only way we can really determine how centralized or decentralized pure proof of stake is, is by trying to correlate things like deposit addresses.
So it's actually true that the network tends to centralize. However, I think for large entities that are staking, their goal is also to decentralize as much as possible and I'll explain the reasons for this.
For one, if I'm a large staker on the Ethereum network I probably don't want to have a third of the network by stake power. Because if I have a third of the network and there's some vulnerability in my setup, I could potentially have correlated slashing of 100% of my stake. So because the risk increases the closer you get to one third of the network, that dramatically reduces my incentive to want to have that much staking power on a single provider.
Likewise, we want to help promote the health of the network and the stability of the network, because of all the applications that exist on top of it. If a third of the network’s voting power goes offline due to maybe even a data center or an internet outage, finality can halt and the network can stop finalizing blocks which is also bad. So there's a variety of disincentives to discourage concentrating too much.
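The two thresholds described above can be sketched in a few lines of Python. This is an illustrative model, not the protocol's implementation: the multiplier of 3 is an assumption chosen so that a slashed share of one third maps to a 100% penalty, as in the description above, and the function names are hypothetical.

```python
from fractions import Fraction

# Assumed multiplier: with 3, slashing one third of all stake costs the full balance.
PROPORTIONAL_SLASHING_MULTIPLIER = 3

def correlated_penalty_fraction(slashed_stake, total_stake):
    """Fraction of a slashed validator's balance lost when `slashed_stake`
    out of `total_stake` is slashed in the same period (integer inputs)."""
    share = Fraction(slashed_stake, total_stake)
    return min(PROPORTIONAL_SLASHING_MULTIPLIER * share, Fraction(1))

def can_finalize(online_stake, total_stake):
    """Finality needs at least two thirds of the stake to be voting."""
    return Fraction(online_stake, total_stake) >= Fraction(2, 3)

small_operator = correlated_penalty_fraction(1, 100)  # 1% of stake slashed together
large_operator = correlated_penalty_fraction(1, 3)    # a third of stake slashed together
```

In this model the penalty grows linearly with the operator's share of the network, so a small, well-isolated failure is cheap while a failure touching a third of the stake is total.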
As large providers like Coinbase, we have to think about how we decentralize across as many different infrastructure providers and as many different hardware and software configurations as possible. Because the ideal state is that if there is a vulnerability or a software defect in one of our stacks it only affects a very small percentage of our total validator population and it doesn't affect our entire amount at stake.
This is where I think Bison Trails’ architecture is really brilliant. When I'm provisioning validators on the Bison Trails infrastructure I can choose from a variety of cloud regions and multiple cloud providers. When I launch what's called a cluster, that cluster is actually spread across two different cloud providers by default with only half of the validator clients on each. So we leverage this capability of Bison Trails as much as possible, and we actually spread our validators across as many cloud regions and cloud providers as possible.
By simply provisioning validators across Amazon in Ireland, Google in Frankfurt, Amazon in Tokyo, Google in Hong Kong, and Amazon in Singapore, where each region has several data centers (they call them availability zones), we can actually achieve really good decentralization across the entire infrastructure stack. So if there is a bad internet weather day that impacts one of those data centers, it affects only a very small percentage of our total validator population.
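A minimal sketch of this kind of spreading, assuming a simple round-robin assignment. The location labels mirror the examples above, but the function names and the validator count are hypothetical, and a real setup would also spread across availability zones within each region.

```python
from collections import Counter
from itertools import cycle

# Provider/region pairs taken from the examples in the text; labels are informal.
LOCATIONS = [
    ("amazon", "ireland"),
    ("google", "frankfurt"),
    ("amazon", "tokyo"),
    ("google", "hong-kong"),
    ("amazon", "singapore"),
]

def assign_round_robin(validator_ids, locations):
    """Assign each validator to the next location in turn."""
    return {vid: loc for vid, loc in zip(validator_ids, cycle(locations))}

def worst_case_share(assignment):
    """Largest fraction of validators that a single location outage would affect."""
    counts = Counter(assignment.values())
    return max(counts.values()) / len(assignment)

assignment = assign_round_robin(range(1000), LOCATIONS)
exposure = worst_case_share(assignment)
```

Adding more locations, or splitting each region by availability zone, lowers the worst-case share further.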
Likewise, on the client side, we support all three of the major client implementations—Prysm, Lighthouse, and Teku. This allows us to diversify our footprint on the client side as much as possible, as each client has strengths and weaknesses.
Last, we are leveraging more staking infrastructure providers than Bison Trails. Some of our providers run in the cloud and some run on bare metal, and this diversity gives us greater resilience and decentralization.
You bring up a good point. Based on the way that the network economics are configured, it is advantageous for us to spread across as many providers as possible. So the incentives are aligned for us to decentralize as much as possible so that we don't have single points of failure—and if we do, to minimize their potential impact.
Where I do get somewhat concerned is that I suspect not every staking provider operates in the same way that we do. When you're operating in the cloud you have the capability of provisioning all of your validators in a single region, or you have the capability of provisioning them all in a dozen regions, and it's basically no additional cost to do so. So our incentives are aligned to decentralize as much as possible.
On the other hand, if you're a validator service operating in a physical location, you might not have the capability of easily turning on another data center or easily putting half of your validators in Asia and the other half in Europe. In that sense, not all of the actors in the ecosystem will necessarily have the same easy path to decentralization.
Well, first of all, when I first discovered Bitcoin and I started mining back in 2010 on my gaming PC, it was a fascinating hobby to me. I never really thought it would be worth a lot of money. I just thought it was really cool that I could generate money on my computer when I wasn't working. And I thought about how quickly I could send it across borders. I remember sending money to an exchange, and I thought it was fascinating that I could deposit funds and have them almost immediately available for trading. So crypto in general just became this rabbit hole that I went down.
Then when Ethereum launched and all of a sudden we had the world computer and smart contracts, and I realized that we could automate the value transfer between companies or between individuals, and we could codify these things in smart contracts, it was just mind blowing to me to think about the implications that would have for all financial transactions. I was hooked!
When I first got into proof of stake the thing that really appealed to me about it was the energy efficiency. And when I dug in a bit more, and I realized that instead of using raw compute power and electricity to secure the network, you use operational expertise, security skills, and ultimately value at risk, my mind was blown. The votes that protect the ledger are essentially really inexpensive if you have the private key material.
That meant that engineers like myself who happened to have expertise in these areas of infrastructure and security could build a career for ourselves. I never thought my career would end up in crypto, but it's been a really fun journey to kind of end up here.
Sure. Indeed, I first got involved with Tezos—that dates back to early 2018. It was about three months before the betanet launch of Tezos. They had a very successful crowdsale two years before in 2016, and they had promised the community the network would launch in Q2 of 2018. The community, as you can imagine, was really anticipating the launch, and they only had about three months to go. At the time they had zero infrastructure.
Really what we had to build out was the global footprint for the Tezos Foundation bakers. Bakers are basically the same thing as validators on other proof of stake networks. These Tezos Foundation validators, or bakers, would be the only validators running on the network for the first seven cycles, or about three weeks, while other validators would delegate, or receive delegations, and come online in cycle seven.
It was really interesting because in just about three months we had to build the remote signing infrastructure, including hardware security modules (HSMs) running across four different cloud regions, and all the bakers and boot node infrastructure that was necessary for others to connect their nodes to the network. Of course, we did this all in cloud infrastructure because that was the only way we could actually do it in a three month timeframe with the security requirements we had.
So it was a really exciting time to be a part of the proof of stake movement. We were building new technology—and we were also working crazy hours, while still working our day jobs. At some point I remember I took a couple of weeks off work and flew to Paris and spent time with the Nomadic Labs developer team there in Paris ahead of the launch.
Then the funny thing was, at the time I remember thinking that we were going to build this infrastructure for them and then hand it over to some team (I didn't know who) that was going to operate the validators going forward. But what ended up happening is the Tezos Foundation is a charitable organization, and they don't necessarily have a team of DevOps and infrastructure people. So they asked, "Can you operate it for us?" So I ended up operating the Tezos Foundation bakers, and I still operate them to this day. Coinbase has generously allowed me to continue operating them while I work at Coinbase. It gave me a great experience and great background on how to operate proof of stake networks.
Going forward, later in 2018, I participated in Cosmos’ Game of Stakes. Cosmos had a really innovative approach to incentivizing validator participation. They had an incentivized testnet called Game of Stakes that took place in late 2018, and anyone could participate with testnet funds. There were prizes given to validators that remained online the longest, there were uptime leaders, prizes given to validators that never got jailed. People were encouraged to attack each other to try to find security vulnerabilities in the network. It was really great because you got firsthand experience running Cosmos and operating in an adversarial testnet. Then you received Cosmos ATOMs as a prize for participating.
There've been so many proof of stake networks after Cosmos that have copied this incentivized testnet model, because it was so successful. It accomplishes two things. First of all, many of these decentralized networks that are launching today are funded mostly by venture capital. So they have just a few large shareholders. By running an incentivized testnet and giving delegations to dozens of validators, you can very quickly decentralize your token supply and decentralize your network by delegating to validators that have proven competent at running incentivized testnets. So it allows the validator community to become more skilled at running these networks, and then allows individual validators to demonstrate proficiency in these networks.
I think in general delegation is just very nice from a not‐having‐to‐move‐funds perspective. I can keep my funds cold, I can delegate to a validator, and I don't have to worry about my funds that are in cold storage being lost. That's very appealing.
Another thing that would be really nice to see is perhaps allowing validators’ effective balance to grow over time. So on Ethereum 2 validators cannot go above a 32 ETH effective balance, which means that in order for validators to compound their rewards they're effectively going to have to exit at some point and then create more validators.
All of these exiting and depositing transactions are going to cause a lot of churn on the network. So perhaps it would be nice to come up with an economic model that allows validators to continue to compound the rewards without having to exit validating. That would be one nice improvement, but I'm sure there are a lot of economic implications that I haven't thought through completely.
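To make the compounding constraint concrete, here is a back-of-the-envelope sketch in Python. It assumes rewards accrue at a flat annual rate on a 32 ETH effective balance and that rewards above the cap earn nothing further; the 5% rate used in the example is a made-up figure, not a quoted yield.

```python
MAX_EFFECTIVE_BALANCE_ETH = 32  # cap described in the text

def years_to_fund_new_validator(num_validators, annual_rate):
    """Years until pooled rewards from `num_validators` add up to one more
    32 ETH deposit, assuming rewards do not themselves compound."""
    yearly_rewards = num_validators * MAX_EFFECTIVE_BALANCE_ETH * annual_rate
    return MAX_EFFECTIVE_BALANCE_ETH / yearly_rewards

solo = years_to_fund_new_validator(1, 0.05)    # hypothetical 5% annual rate
pool = years_to_fund_new_validator(100, 0.05)  # same rate across 100 validators
```

Under these assumptions a single validator needs decades to earn another full deposit, while a large operator gets there in months, which is why the cap pushes compounding toward exit-and-redeposit churn.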
The unique challenge with eth2 is just the number of validators we have to manage. With all the other networks that we've launched staking rewards on in the past, we've only had to operate a single validator and then delegate all funds to that validator. With eth2, of course, we have to be able to launch many thousands of validators, and we also need to orchestrate deposits into the Ethereum mainnet deposit contract.
As I thought through the problem space, what we settled on was the design for an orchestration tool that would allow us to securely create validators. We're using Bison Trails’ API for that. So kudos to the Bison Trails team for designing a really easy to use and user‐friendly API that we can integrate with. Essentially, we want to be able to use automation and allocate new validators on-demand because we realized we couldn't manually stand up a validator every time we need to deposit 32 ETH.
So it all starts with orchestration allocating the validators, and then once we've allocated the validators, we also need to allocate cold storage addresses—the withdrawal address. For example, once you're done staking and you want to exit the system, those funds have to go back somewhere.
Then we need to also orchestrate a deposit, and the deposit itself is a smart contract interaction on the Ethereum mainnet. So we built this tool that helps us to orchestrate the life cycle of thousands of validators’ deposit and withdrawal addresses. Then you need to monitor all the activity, so we built a monitoring system to have an overview of whether our validators are attesting or not.
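The life cycle described above (allocate a validator, assign a withdrawal address, then deposit) can be sketched as a small state machine. Everything here is hypothetical: the stage names, the record fields, and the helper functions are illustrative and do not represent the Bison Trails API or Coinbase's actual tool.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

DEPOSIT_SIZE_ETH = 32  # one validator per 32 ETH, as described in the text

class Stage(Enum):
    ALLOCATED = auto()       # validator created, no withdrawal address yet
    WITHDRAWAL_SET = auto()  # cold storage withdrawal address assigned
    DEPOSITED = auto()       # deposit sent to the deposit contract

@dataclass
class ValidatorRecord:
    pubkey: str
    stage: Stage = Stage.ALLOCATED
    withdrawal_address: Optional[str] = None

def validators_needed(pending_eth):
    """How many whole validators the pending ETH can fund."""
    return pending_eth // DEPOSIT_SIZE_ETH

def set_withdrawal_address(record, address):
    if record.stage is not Stage.ALLOCATED:
        raise ValueError("withdrawal address must be set right after allocation")
    record.withdrawal_address = address
    record.stage = Stage.WITHDRAWAL_SET

def mark_deposited(record):
    # Refuse to deposit unless exited funds would have somewhere to go back to.
    if record.stage is not Stage.WITHDRAWAL_SET:
        raise ValueError("cannot deposit before a withdrawal address is set")
    record.stage = Stage.DEPOSITED

record = ValidatorRecord(pubkey="example-pubkey")  # placeholder identifier
set_withdrawal_address(record, "example-cold-address")
mark_deposited(record)
```

Enforcing the order in code is the point of the sketch: a deposit is never attempted for a validator that lacks a withdrawal address.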
Then we also use the same data from our monitoring system that checks validator balances to determine how much in rewards to pay our customers. We know exactly how much ETH we've earned every day through attestations and block proposals; then we're able to use that to calculate what APR we should pay our customers minus fees.
That was an interesting way to monitor the health, as well as to monitor the financial health, of our staking rewards product through monitoring validator balances.
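A minimal sketch of how daily balance snapshots could feed a rewards calculation. The snapshot format (validator id mapped to balance in gwei), the 25% fee, and the function names are assumptions for illustration, not Coinbase's actual method or fee.

```python
GWEI_PER_ETH = 10**9

def daily_rewards_gwei(previous, current):
    """Sum balance changes for validators present in both daily snapshots.
    Newly deposited validators are skipped so deposits are not counted as rewards."""
    return sum(current[v] - previous[v] for v in previous.keys() & current.keys())

def gross_annual_rate(rewards_gwei, staked_gwei, days=1):
    """Simple (non-compounding) annualization of rewards earned over `days`."""
    return rewards_gwei / staked_gwei * 365 / days

def customer_annual_rate(gross_rate, fee_fraction):
    """Rate passed on to customers after the operator's fee."""
    return gross_rate * (1 - fee_fraction)

yesterday = {"a": 32 * GWEI_PER_ETH, "b": 32 * GWEI_PER_ETH}
today = {
    "a": 32 * GWEI_PER_ETH + 4_000_000,
    "b": 32 * GWEI_PER_ETH + 5_000_000,
    "c": 32 * GWEI_PER_ETH,  # deposited today, excluded from rewards
}
rewards = daily_rewards_gwei(yesterday, today)
gross = gross_annual_rate(rewards, sum(yesterday.values()))
net = customer_annual_rate(gross, fee_fraction=0.25)  # hypothetical fee
```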
Great question! We could monitor for just attestation inclusion distance, and all of those things individually, but what we really like to look for are simple metrics that can give us a strong signal of healthy or unhealthy, up or down—and the balance actually turned out to be one of the best signals for that.
But it’s also because we need to capture the data for financial reporting systems. We have to be able to audit every last penny of ETH and make sure we prove to our accountants that we've paid customers exactly the amount of rewards we've earned. It actually allows us to kill two birds with one stone and use that same monitoring data for financial reporting. We found balance changes were a really good proxy for validator health.
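The balance-as-health-signal idea can be sketched as a single comparison between two snapshots. The snapshot format and function name are hypothetical, and a real check would likely compare over a longer window to avoid flagging normal short-term variation.

```python
def unhealthy_validators(previous, current):
    """Return validators whose balance did not grow between two snapshots.
    A validator missing from `current` is treated as unhealthy."""
    return sorted(v for v in previous if current.get(v, 0) <= previous[v])

before = {"a": 100, "b": 100, "c": 100, "d": 100}
after = {"a": 105, "b": 100, "c": 98}
flagged = unhealthy_validators(before, after)
```

Here "a" is earning, "b" is flat, "c" is losing balance, and "d" has disappeared, so everything except "a" gets flagged.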
That's exactly it. Simplicity is something we really value quite a bit. When deconstructing the problem of staking we initially thought to take a systems approach of just monitoring the health of all the servers. Then what we realized really quickly was that the server can be completely healthy and we still might be offline for some unknown reason. So the approach we decided to take was to actually index every transaction in the blockchain that we care about—in the case of staking, it's just staking transactions, typically votes and block proposals—and use that index data as our source of truth for monitoring.
That's the approach we initially developed for Tezos when I was first building out the staking infrastructure for the Tezos Foundation. We just index the blockchain itself, and we use a separate node from the nodes our validators use. There have been a few incidents where we thought our validator was online, everything looked great, and the logs showed we were proposing blocks and voting, but our validator had actually lost consensus with the rest of the network. So it actually wasn't voting, and we detect those problems much more easily when we look at the blockchain data itself.
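A minimal sketch of using indexed chain data as the source of truth, assuming an index (built from an independent node) that maps each epoch to the set of validators whose votes actually landed on chain. The data shapes and names are hypothetical.

```python
def missed_attestations(our_validators, attesters_by_epoch):
    """For each indexed epoch, list our validators with no vote on chain.
    `attesters_by_epoch` comes from a node separate from the validators' own nodes."""
    missed = {}
    for epoch, attesters in sorted(attesters_by_epoch.items()):
        absent = our_validators - attesters
        if absent:
            missed[epoch] = sorted(absent)
    return missed

ours = {11, 12, 13}
indexed = {
    100: {11, 12, 13, 40, 41},  # all of ours voted
    101: {11, 40, 41},          # 12 and 13 are missing on chain
}
gaps = missed_attestations(ours, indexed)
```

Because the check reads only what the chain recorded, it catches the case where a validator's own logs look healthy but its votes never reached the rest of the network.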
I'm most excited about the upcoming merge. I think it's going to be fantastic if we're able to pull off the quick merge by early next year. Immediately, it's going to do a couple of things that are really powerful. First of all, it's going to dramatically reduce the cost of gas and fees on the Ethereum network, which has been a pain point for many of us that use DeFi today. It’s going to also increase the revenue for validators on the eth2 beacon chain, because they'll be able to earn fees from including Ethereum mainnet transactions in blocks.
From a staking perspective, it's going to increase the rewards that people can earn. And for just anyone using the Ethereum network, it should actually improve the performance and reduce fees for them. I think that'll be a huge benefit in the near term, hopefully in the next six to twelve months.
Then going forward I think there are some challenges, obviously. One of the more immediate concerns is that there are a number of EVM‐compatible L1s that are popping up, which don't have some of the same decentralization guarantees and censorship resistances that Ethereum has. If we don't pull off the quick merge in time, and fees continue to rise, and the gas price continues to rise on the Ethereum mainnet, there’s a real risk that a lot of applications could migrate to these other L1s that don't have the strong security and decentralization guarantees that Ethereum has.
Amazing question. Think along the following lines: You’ve just downloaded an app on your phone and now you're able to use decentralized lending protocols on Ethereum. You're able to take out a home mortgage from Compound or Aave in minutes, and as a user you're able to do all these amazing things with the DeFi ecosystem that you can't do today. In that future, all finance is on-chain such that you won’t really even need a bank.
From a network perspective, we have a thousand shards operating and there are tens of thousands of transactions per second, and we've exceeded the Visa Network in terms of transactions per second throughput. And from a finance perspective, crypto probably now has a $10 trillion plus market cap, and decentralized finance protocols are larger than banks in the United States.
Really fun chatting with you Elias, it's been great.
Interview by Elias Simos
Bison Trails is a blockchain infrastructure platform-as-a-service (PaaS) company based in New York City. We built a platform for anyone who wants to participate in 25 new chains effortlessly.
We also make it easy for anyone building Web 3.0 applications to connect to blockchain data from 33 protocols with Query & Transact (QT). Our goal is for the entire blockchain ecosystem to flourish by providing robust infrastructure for the pioneers of tomorrow.
In January, 2021, we announced Bison Trails joined Coinbase to accelerate our mission to provide easy-to-use blockchain infrastructure, now as a standalone product line. The Bison Trails platform will continue to support our customers. With Coinbase’s backing, we will enhance our infrastructure platform and make it even easier to participate in decentralized networks and build applications that connect to blockchain data.