Solana’s Network Performance Will Continue to be Choppy

Paul Fidika
May 14, 2022


Synopsis: Solana’s transaction-throughput capability has been overestimated, and it has no clear roadmap for scaling (yet). Furthermore, performance does not (yet) degrade gracefully when the peak-performance cap is reached. The validator network is brittle by design and can be halted by a powerful bad actor or a network of spam-bots. These problems are fundamental to Solana’s architecture and cannot be solved quickly or easily (i.e., we won’t see a “Solana 2.0” in production for a year or two at least).

Blockchain generations:

  • Gen 1: (Bitcoin, XRP, Dogecoin) Focuses on the transfer of tokens.
  • Gen 2: (Ethereum, Solana, Cardano) Adds smart-contract functionality, so that on-chain programs can do simple, non-intensive computations.
  • Gen 2.5: (Cosmos, Polkadot, Avalanche) Side-chains running independently of each other, with the ability to pass messages in some standardized way.
  • Gen 3: (Eth 2.0, NEAR) Maintains a single global state, but this state is sharded across machines. Programs execute asynchronously.

What makes Solana performant:

Solana is the most performant Gen 2 blockchain available. It accomplishes this feat using the following 4 methods:

  1. Execution optimization: Rather than using the standard EVM execution environment (Ethereum, Polygon), Solana created “Sealevel”, its own custom execution environment. This system allows transactions to be broken into groups and executed in parallel using multi-core CPUs and eventually (hopefully) GPUs. Code is written in Rust and compiled into a binary, which is more performant than programs written in Solidity.
  2. Storage optimization: Solana wrote their own custom database software, “Cloudbreak”, which allows for many more disk read / write operations than are possible with traditional database software. Cloudbreak can do around 1 million write operations per second on a single SSD (solid state drive).
  3. Leader schedule: on Ethereum and Bitcoin, pending transactions are placed inside of a ‘mempool’, which is gossiped between miners and takes on the order of log(n) gossip hops to reach the entire network (n being the number of nodes). Transactions from this mempool are used to construct the next block. It is not known in advance which machine will produce the next block. Solana instead uses a leader schedule; the next block producer (leader) is publicly known and determined one epoch (about 2 days) in advance. This means that pending transactions can be forwarded directly to the next leader to be included in the next block, without the need to gossip the transactions around the validator network and wait for them to be picked up.
  4. High validator requirements: Solana validator nodes cannot be run on consumer-grade hardware in your home. You need a 1 Gbit/s unmetered connection, high availability, 4+ SSDs, 256 GB of RAM, etc. This means that all Solana validator nodes must be run in datacenters and cost $800 or more per month (in hardware alone) to operate. Restricting validators to only use powerful machines allows for many more transactions per second than if weaker machines were allowed.
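
To make the "broken into groups and executed in parallel" idea from point 1 concrete, here is a minimal Python sketch. It is not Sealevel's actual scheduler. It assumes each transaction declares up front which accounts it reads and writes, and the names (Tx, conflicts, schedule_batches, the account labels) are all made up for illustration. Transactions that touch disjoint accounts land in the same batch and could run on separate cores; a transaction that conflicts with an earlier one is pushed to a later batch so ordering is preserved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    """Hypothetical transaction that declares the accounts it touches."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def conflicts(a, b):
    # Two transactions conflict if either writes an account the other touches.
    return bool(a.writes & (b.reads | b.writes)) or bool(b.writes & (a.reads | a.writes))


def schedule_batches(txs):
    """Group transactions into batches whose members never conflict.

    Each transaction goes into the batch right after the latest batch that
    holds a conflicting transaction, so conflicting work keeps its order.
    """
    batches = []
    for tx in txs:
        last_conflict = -1
        for i, batch in enumerate(batches):
            if any(conflicts(tx, other) for other in batch):
                last_conflict = i
        if last_conflict + 1 == len(batches):
            batches.append([])
        batches[last_conflict + 1].append(tx)
    return batches


EXAMPLE_TXS = [
    Tx("pay_alice_to_bob", writes=frozenset({"alice", "bob"})),
    Tx("pay_carol_to_dave", writes=frozenset({"carol", "dave"})),
    Tx("pay_bob_to_erin", writes=frozenset({"bob", "erin"})),
    Tx("read_price", reads=frozenset({"oracle"})),
]

for number, batch in enumerate(schedule_batches(EXAMPLE_TXS)):
    print(number, [tx.name for tx in batch])
# 0 ['pay_alice_to_bob', 'pay_carol_to_dave', 'read_price']
# 1 ['pay_bob_to_erin']
```

The two payments that both write to "bob" cannot share a batch, which is also why a hot account (say, a popular NFT mint) limits how much parallelism is actually available.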

Problems:

  • Leader schedules are fundamentally brittle: because the leader is known in advance, a bad actor can halt the network by targeting the leader machines with a DDoS attack, one after another, in the order of the leader schedule. Essentially, this is why the Solana network has halted multiple times: mint-bots, each trying to be first to mint an NFT, were spamming the leader with duplicate transactions, and these duplicates eventually overloaded the leader’s memory. On Ethereum this would be impossible; the next block producer is not known in advance. And even if you knew which machine was going to produce the next block, and took it offline before it could, another machine would simply produce the block instead. This makes leaderless blockchains like Ethereum and Avalanche much more robust to attack. Leader schedules are more performant, but unsafe in an adversarial network environment.
  • Fixed Fees / Low cost to spam: On Ethereum, you are charged a “gas fee” for transactions. These gas fees are based on the total amount of resources consumed by the transaction; i.e., more computation = higher fee. On Solana, fees are fixed at 0.000005 SOL per signer. These “low fees” are one of Solana’s main marketing points. However, this is an arbitrary number set by Solana itself, not based on any sort of resource utilization or network availability. This means it is cheap to spam transactions, and you are only charged once for a failed transaction, even if you submitted it 1,000 times. (Solana is currently moving to begin charging fees based loosely on compute units consumed to encourage optimized programs, and is looking at strategies such as ramping up fees exponentially for consecutive failed transactions to deter spam.)
  • No Gas Marketplace / Poor Network Degradation: Once a network has reached capacity, how do you ration that capacity? Think of it as a bus stop; the bus comes to the stop, picks everyone up, and leaves. But if the bus has 50 seats and 100 people are trying to get on, how do you decide who gets on and who has to wait for the next bus? On Ethereum, there is a “gas marketplace” where you bid against other participants to have your transactions included next. You are essentially bidding for the set of available seats. This is neat and orderly. On Solana… well, it would be more like a fist fight. Imagine that the 100 people at the bus stop start shoving and kicking each other to try to get on first, and because of the fight, only 20 people are able to get on the bus before it leaves. This is essentially Solana. Programs are now designed to spam Solana (submitting your transaction multiple times automatically) in order to get on the bus first. This leads to a better user experience, in that the user gets their transactions in sooner, but the more apps that are “clever” like this, the more the Solana network degrades. Solana wasn’t always like this; back in August 2021, retrying failed transactions wasn’t needed. But usage has picked up, the network has become more unreliable, and block times have slowed down considerably (going from 400ms to 700ms currently). Solana mainnet has been consistently maxing out at around 2.5k to 3k tps, although its testnet was able to do as much as 4k to 5k. Like the rowdy people at the bus stop, total max capacity on Solana goes down as usage goes up. Unfortunately, Solana so far has gone with a “we just need bigger buses” strategy to solve this problem, rather than addressing how the network handles bottlenecks.
  • No ability to scale: For most blockchains, such as Bitcoin, Solana or Ethereum, each block is produced on just ONE MACHINE. Yes, you read that correctly; at any given moment the entire Solana, Ethereum, or Bitcoin blockchain is just one physical computer, humming away, doing all the calculations by itself. Yes, the physical machine doing the calculation rotates, but still, it is fundamentally just one computer. I imagine you can see why this is a problem; what if there is more demand than any one machine can handle? Well… yeah we’re all fucked. This is why scaling to multiple simultaneous block-producing machines is important. But I’ve never heard anyone even suggest the idea of Solana scaling: layer-2s, sharding, side-chains… nothing. I believe this is due to a gross overestimate of Solana’s transaction capacity; Solana seems to believe their network can handle 50k transactions per second, but realistically it’s struggling to do 3k, and was pushed beyond its full capacity when OpenSea launched on Solana in April. Solana certainly cannot support millions of daily active users. Solana scaling is likely many years off. The best we can hope for, ironically, is that Solana and crypto in general will fade in popularity, making the network less used and hence more performant.
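
The bus-stop analogy can be sketched in a few lines of Python. This is a toy model under loud assumptions, not how either chain is implemented: auction_block stands in for a fee market, spammed_block for a fixed-fee leader that simply handles packets in arrival order, and the arrival order (every user sending 5 copies back to back) is invented for illustration. All function and variable names are hypothetical.

```python
import heapq


def auction_block(bids, capacity):
    """Fee market: include the `capacity` highest bids; everyone else waits."""
    included = heapq.nlargest(capacity, bids, key=lambda bid: bid[1])
    included_ids = {tx_id for tx_id, _ in included}
    waiting = [bid for bid in bids if bid[0] not in included_ids]
    return included, waiting


def spammed_block(packets, capacity):
    """Toy fixed-fee model: the leader handles the first `capacity` packets
    in arrival order, and every duplicate packet wastes one of those slots."""
    landed = []
    for tx_id in packets[:capacity]:
        if tx_id not in landed:
            landed.append(tx_id)
    return landed


SEATS = 50

# 100 riders bid 1..100 for 50 seats: the top 50 bids ride, the rest wait.
bids = [(f"user{i}", i + 1) for i in range(100)]
included, waiting = auction_block(bids, SEATS)
print(len(included), len(waiting), min(bid for _, bid in included))  # 50 50 51

# Same 100 riders, but each sends 5 identical copies of their transaction.
packets = [f"user{i}" for i in range(100) for _ in range(5)]
print(len(spammed_block(packets, SEATS)))  # 10
```

In the auction every seat is used and the losers know they simply have to bid more or wait. In the spam model the same 50 slots carry only 10 distinct transactions, which is the "capacity goes down as usage goes up" effect described above.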

Solana’s Future

Solana will continue to be used for a few applications, such as orderbooks for synthetic stocks (Serum) or trading NFTs (MagicEden), but Solana simply does not have the capacity or reliability to support all the applications developers want to build on it (i.e., a global settlement layer for payments, on-chain banking, or games that store their state on-chain). These projects will be disappointed and will eventually abandon their efforts in favor of other chains.

Fortunately, as the crypto bubble deflates, it’s likely that Solana’s usage will drop off, as users looking to get rich quick move on to other hobbies. This will alleviate some of Solana’s congestion temporarily.

Recommendations

Solana would be best advised to (1) come up with a potential scaling strategy, (2) consider abandoning leader-based block production*, (3) start computing gas usage for programs, and (4) have a gas marketplace to ration network capacity and prioritize transactions. This would make Solana much more similar to Ethereum, and less performant and cheap than it is currently, but ultimately having a scalable, reliable network is more important than fast transaction times and low fees. For a blockchain with billions of dollars locked into it, reliability and robustness to attack are far more important than being fast and inexpensive.

If this plan is executed well, when the next crypto-mania starts in 2 or 3 years, Solana could be well positioned to overtake Eth 2.0 as the dominant chain. If this plan is not executed well, it is likely Solana will be surpassed by newer chains.

Other Thoughts

Note from Justin Kookie (Solana Dev): some of these issues have been recognized and are being addressed. For example, the GenesysGo Shadow Protocol is a side chain used to keep track of account state. Strata Protocol is also working on bot prevention in candy machine mints, as well as NFT Dutch auctions.

For recommended improvements, I want to push for public Anchor IDLs in order to legitimize custom contracts. For example, a custom token-metadata contract can use the Metaplex standard as the base and then add functionality. Phantom wallet and Solana explorers could then decode the data, instead of the NFTs showing up as unknown.

*Dissenting Opinion (from a top Solana Validator): It might be possible to make a leader-based system work robustly, and there are techniques to mitigate DDoS attacks. More experimentation is needed prior to abandoning a leader-based system; leader-based systems have many performance advantages. Sharding is as-of-yet unproven in practice, and may not be practical in that it introduces more problems than it solves. The 50k tps figure was an estimate assuming Solana could utilize GPUs for massive parallelization, but so far this has not happened. If GPUs could be properly utilized, tps for a single machine could theoretically reach into the tens of thousands. Right now, GPUs do almost nothing on Solana; they just offload a tiny bit of PoH (Proof of History) verification, which is so unimportant compute-cost-wise as to be meaningless.

I would recommend that pending transactions be submitted to non-leader validators, and these validators selectively forward the transactions to the leader. There would need to be incentive mechanisms to encourage only forwarding good transactions, and not withholding transactions.
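
As a rough illustration of what "selectively forward" could mean, here is a small Python sketch of a filter a non-leader validator might run before passing transactions on. Everything in it is an assumption made for the example: the ForwardingFilter class, the idea of de-duplicating by signature, and the per-sender cap are hypothetical and are not taken from any Solana client. It also says nothing about the incentive mechanism, which is the hard part.

```python
from collections import defaultdict


class ForwardingFilter:
    """Hypothetical per-slot filter: drop duplicates and cap each sender."""

    def __init__(self, max_per_sender):
        self.max_per_sender = max_per_sender
        self.seen_signatures = set()
        self.forwarded_per_sender = defaultdict(int)

    def should_forward(self, signature, sender):
        if signature in self.seen_signatures:
            return False  # exact duplicate of something already forwarded
        if self.forwarded_per_sender[sender] >= self.max_per_sender:
            return False  # this sender has used up its budget for the slot
        self.seen_signatures.add(signature)
        self.forwarded_per_sender[sender] += 1
        return True


flt = ForwardingFilter(max_per_sender=2)
incoming = [
    ("sigA", "bot"), ("sigA", "bot"), ("sigB", "bot"),
    ("sigC", "bot"), ("sigD", "alice"),
]
forwarded = [sig for sig, sender in incoming if flt.should_forward(sig, sender)]
print(forwarded)  # ['sigA', 'sigB', 'sigD']
```

The leader then only ever sees the forwarded list, so the duplicate-transaction flood is absorbed by many validators instead of by the one machine everybody knows is next.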

I also think expired transactions should be charged a partial fee from the fee-payer’s account. Users wouldn’t like paying fees for failed transactions, but it would deter spam, since every failed transaction a spammer submits could potentially have a cost.
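
Some back-of-the-envelope arithmetic shows how much this would change the economics of spam. The sketch below uses the 0.000005 SOL fixed fee mentioned earlier (5,000 lamports, at 1 SOL = 1,000,000,000 lamports). The 50% partial fee is a number picked purely for illustration, and the function name is made up.

```python
LAMPORTS_PER_SOL = 1_000_000_000
BASE_FEE_LAMPORTS = 5_000  # 0.000005 SOL, the fixed fee per signer


def spam_cost_lamports(submissions, expired_fee_percent):
    """Cost of submitting one transaction `submissions` times, assuming one
    copy lands (full fee) and the rest expire (hypothetical partial fee)."""
    expired = submissions - 1
    return BASE_FEE_LAMPORTS + expired * BASE_FEE_LAMPORTS * expired_fee_percent // 100


# Today: only the copy that lands is charged.
print(spam_cost_lamports(1_000, expired_fee_percent=0))   # 5000
# With an assumed 50% fee on each expired copy.
print(spam_cost_lamports(1_000, expired_fee_percent=50))  # 2502500
```

Under that assumed 50% rate, sending 1,000 copies costs roughly 500 times more than it does today (2,502,500 lamports, or about 0.0025 SOL, versus 5,000 lamports). That is still small for one user, but it adds up quickly for a bot farm sending millions of packets.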

Sources:

https://medium.com/solana-labs/sealevel-parallel-processing-thousands-of-smart-contracts-d814b378192
https://docs.solana.com/running-validator/validator-reqs
https://shdw.genesysgo.com/
https://nitter.net/i/status/1497214981150056448
