Optimistic rollups are a method to off-chain computations from congested blockchains. Instead of recording transactions directly on the main chain (L1) with full replication across all nodes, state updates are processed by a separate network which reads messages from an “inbox” smart contract on L1 and publishes only a lossily compressed form (i.e. hash) of the resulting transition.

Since these computations are not directly verified by L1, a mechanism is needed to ensure that transitions are correct, that is, that the rollup does not confirm invalid blocks. In the optimistic model, transitions are assumed correct unless someone challenges a block within a certain timeframe after it was proposed. In this case, an interactive protocol establishes whether the block proposer can prove that its computation was correct (more precisely, a part of the computation specified by the challenger). This interactive protocol is called a fault proof (formerly “fraud proof”).1

Incentives for agents to validate blocks and issue challenges in the event of a fault fall into two categories:

• Invalid transitions at L2 could result in loss of funds locked up in the protocol (either directly on L2 balances or because of insolvency of the rollup itself). This factor is only an incentive if the agent believes that there is a chance that no other agent will validate.
• A validator that issues a successful challenge may be rewarded with a portion of the block proposer’s bond.2

On the other hand, agents incur costs (computational and opportunity) by computing state transitions, which in an expressive enough VM can be arbitrarily high.3 Moreover, these costs are always incurred by verifying, while the rewards are only available if the rollup actually has pending invalid transition proposals. That is, neither of the above incentives work if there are no faults at all. This tradeoff is part of the verifier’s dilemma, some of whose consequences have been observed in the wild.4

Definition. The fault rate (resp. faulty confirmation rate) of an optimistic rollup is the probability that a given proposed (resp. confirmed) state transition is faulty.

Given that transition faults can hide arbitrary behaviour, the average rollup user is not likely to find any non-negligible faulty confirmation rate acceptable. But as observed above, a negligible fault rate removes all incentives for verifiers. If verifiers drop out, disincentives to propose faulty blocks are removed, opening the door to intentional attacks.

In this article, I formulate a simple static game to model the interaction between verifiers and a block proposer who may wish to turn a profit by proposing invalid state transitions. I establish a quantitative proof of the above observation.

Theorem 1. Trustless optimistic rollups with Byzantine block proposers admit no static equilibrium with negligible faulty confirmation rate and non-negligible cost of computation.

In the idealised continuum model, the fault rate and faulty confirmation rates can be made arbitrarily small by increasing the reward for successful challenges.

Theorem 2. Suppose that the block proposer bond size $\delta$ is small compared to the amount that a single attacker could gain by getting a deliberately inserted fault confirmed. Then adjusting $\delta$ has negligible effect on the faulty confirmation rate.

This result doesn’t imply that it is unsafe to transact on optimistic rollups; the so-called any-trust assumption of at least one honest validator — may be quite easily satisfied in practice, even if it is not “rational” to do so from the perspective of the non-collaborative game. It is, however, important to acknowledge the trust model of a network one interacts with. I see three options:

1. Users trust at least one verifier, or that one verifier from a community of verifiers checks all transactions and will challenge in the event of a fault.
2. Extend the protocol with messages where validators somehow cryptographically prove that they are actually carrying out the computation. This is similar (perhaps identical) to the approach of zk rollups.
3. Develop risk-hedging third party services to shield ordinary users from the risk of a nontrivial rollup faulty confirmation rate.

The reality of optimistic rollups today is that all transactions are batched and committed by a centralised PoA sequencer. To interact with the rollup, users must either validate all state transitions themselves, trust that at least one third party is doing so and would enter fault proofs if faults were found (option one), or trust the sequencer to publish only correct transitions.

An optimistic rollup#

Let us suppose that the rollup works as follows:

1. Users submit message calls to the rollup “inbox” on L1. They enclose computation and space fees for L1 and L2.5

2. Aggregators/proposers submit block proposals, that is, messages containing:

• the hash of a state transition
• a target block

to the rollup state contract on L1. An honest aggregator will obtain its state transition by identifying the most recent proposed block they agree with, setting its block number as the target, then reading message calls from the rollup inbox and performing the requested computations, starting from the state at the end of the target block.

The compressed state transition is validated at L1 and either appended to a tree of proposed blocks as a child of the target block, or discarded (for example, if the target already has a confirmed child). In the former case, the block proposer also puts up a fixed amount of currency $\delta$ which is held in bond by the rollup.

3. Validators may stake a fixed amount $\delta$ and challenge any unconfirmed block. The challenger and the block proposer now enter an interactive proof session in which the block proposer is required to produce a partial trace of computation that exposes intermediate states. The session singles out an atomic state transition whose result the proposer and challenger dispute. The corresponding computation is then carried out on L1; if one of the participants' output disagrees with that of the L1 execution, or if they abandon the interactive session before it is complete, their bond is lost. A portion of the lost bond may then be awarded to the other participant.

4. After the deadline of a block passes, the block can be confirmed if its parent block is confirmed. In the event of more than one candidate for the next unconfirmed block, the behaviour is unspecified (we won’t analyse this part of the protocol here).

This protocol is abstracted from the current version of Arbitrum, as reported in its documentation.6

The fault inspection game#

Let’s fix some parameters:

• $\Delta>0$ is the expected payoff for a successful attack. A reasonable proxy for this might be some function of the TVL of the protocol or the median transaction size.
• $\delta>0$ is the bond required to commit a state update. We assume that a successful challenge results in the challenger being awarded half the producer’s bond.
• $\eta>0$ is the cost of computation to verify one commit. For simplicity, in this article I will assume that state transitions are segmented into units that have constant computational verification cost.

In realistic scenarios, we have $\Delta \gg \eta$; meanwhile $\delta$ can be set by the protocol designers to achieve desirable behaviour. We assume that nodes on the L1 network follow protocol, so that the rollup contracts always function as described in the previous section.

A static model#

A single block in the rollup game can be regarded as a finite game of two players: a verifier which may either verify ($V$) or not verify a given state transition, and a proposer, which may either propose a faulty block ($A$) or not. The payoff matrix is $$\begin{array}{r|cc} & A & \neg A \\ \hline V & \frac{1}{2}\delta-\eta & -\eta\\ \neg V & 0 & 0 \end{array}$$

for the verifier and $$\begin{array}{r|cc} & A & \neg A \\ \hline V & -\delta & 0 \\ \neg V & \Delta & 0 \end{array}$$ for the attacker. This game is qualitatively of the “matching pennies” or “inspection” type: each pure strategy profile is a best response for one player and a worst response for the other, and there are no pure strategy equilibria.

If the proposer pursues a mixed strategy, attacking with probability $p$, and the verifier verifies with probability $r$, the utilities are as follows: $$u_A(p) = p((1-r)\Delta-r\delta) \qquad u_V(r) = r(\frac{1}{2}p\delta -\eta)$$ Assuming that our rollup has only one Byzantine block proposer, and that strategies are independent (hence the Bernoulli RVs “attack” and “verify” are independent), the fault rate of the rollup is $p$ and the faulty confirmation rate is $$\frac{\text{unchecked fault rate}}{\text{confirmation rate}} = \frac{p(1-r)}{(1-pr)}$$

which is zero only if $p=0$ or $r=1$ (and undefined if $p=r=1$). The unique (uncorrelated) Nash equilibrium is $$r = \frac{\Delta}{\delta+\Delta} \qquad p =\frac{2\eta}{\delta}$$ provided $2\eta < \delta$. Both utilities are zero and both proposer and verifier are indifferent to changing their strategy. This proves Theorem 1.

This leads to a faulty confirmation rate of $$I(\delta) = \frac{2\eta}{\delta}\left(1- \frac{\Delta}{\delta+\Delta}\right)\left(1-\frac{2\eta \Delta}{\delta(\delta+\Delta)}\right)^{-1} = \frac{2\eta}{\delta+\Delta(1-2\eta\delta^{-1})} > 0.$$ The denominator is unbounded as $\delta\rightarrow\infty$, so the protocol designers can shrink $I(\delta)$ as much as they wish by increasing $\delta$. Interestingly, for small $\delta/\sqrt{\Delta}$ the fault rate actually increases with $\delta$ as $p$ increases faster than $r$.

Pause here to make two observations:

• If $\delta \gg 2\eta$ (which is realistic in practice), we get $p=2\eta\delta^{-1}\ll 1$, and a term can be discarded from the denominatorand to obtain the simplified expression $I(\delta)\approx 2\eta/(\delta+\Delta)$.
• If $\delta \ll \Delta$ (which may sometimes be realistic), then this expression is well approximated by $I(\delta)\approx 2\eta/\Delta$. Also, $r\approx 1$, so nearly every block is verified. In particular, adjusting the bond size has essentially no effect on the faulty confirmation rate. This proves Theorem 2.

On the other hand, if the cost $\eta$ of verification is negligible, then this game has equilibria with $p=0$. Indeed, in this case validators are indifferent as to whether they verify computations or not. For any $\epsilon\in(0,1]$, $\delta>0$ can be chosen such that $p=0,r=\epsilon$ is an equilibrium.

Threshold values#

Suppose that the protocol designers consider a certain faulty confirmation rate $\epsilon>0$ to be acceptable (say, equivalent to one faulty confirmation every 1000 years). Then with acceptable network parameters we have $2\eta < \epsilon(\delta+\Delta(1-2\eta\delta^{-1}))$. If we assume $\delta\gg 2\eta$ (a reasonable assumption in practice), this simplifies to $2\eta < \epsilon(\delta+\Delta)$, and we may solve for $\delta$: $$\delta > \epsilon^{-1}(2\eta-\epsilon\Delta)$$ Inspecting the terms, we find that if $\epsilon\Delta > 2\eta$ (that is, the the per-block expected value lost to attackers is greater than twice the cost of verification), then the RHS is negative, leaving us with a vacuous inequality. That is, we can safely set $\delta$ to any value significantly greater than $2\eta$ and get an acceptable faulty confirmation rate.

A sample calculation#

Let’s pull some figures out of a hat and see what a faulty confirmation rate looks like in practice. Some of the more extravagant DeFi attacks net returns in the tens or even hundreds of millions USD.7 Let’s suppose that faulty confirmations on our rollup could net our attacker $\Delta=$USD10M. Getting a reasonable figure for $\eta$ is a bit harder. Suppose for simplicity that $\eta$ measures just the raw electricity cost of powering the CPU; we grab some figures.

• An intel Core i7 4770K has a peak power draw of 84W and around 40-120 billion instructions per second, depending on which benchmark is used and who you ask.8 For simplicity, let’s suppose this means a performance of 1 billion instructions per Joule (Watt-second).

• The average price of electricity in the US in 2021 was roughly 10 cents per KWh, or $0.1/(3.6*10^6)$ USD per Joule.9

• For number of instructions per block, we make some (rather bold) assumptions: that our rollup blocks contain a similar amount of computation to Ethereum blocks, that a “generic” CPU instruction is similar in real cost to a “generic” EVM instruction, and that CPU instructions per Ethereum block are proportional to the amount of gas used. The typical Ethereum block uses 15M gas, and opcodes that do not affect the state cost from 1-10 gas each. Let us posit that a CPU instruction corresponds to about 5 gas, so there are 3M instructions per block.10

We now compute $\eta$ as: \begin{aligned} \eta &= [\text{CPU instructions per block}] \cdot [\text{Joules per CPU instruction}] \cdot [\text{USD per Joule}] \\ & = 3\cdot 10^6\cdot 10^{-9} \cdot (0.1/3.6) \cdot 10^{-6} \\ & = (0.3/3.6)\cdot 10^{-9} \\ &\approx \mathrm{USD}~8.3*10^{-11}. \end{aligned}

In the $\eta \ll \delta \ll \Delta$ regime — say, $\delta = 1$ USD, $p=2\eta/\mathrm{USD} = 1.7\cdot 10^{-10}$ — we find a faulty confirmation rate of around $1.7\cdot 10^{-10}/5\cdot 10^{8} = 3.4 \cdot 10^{-19}$, an absolutely insignificant figure. Even scaling the number of instructions per block by a factor of 100K — say our rollup has a TPS closer to that of Visa than Ethereum (1000x), and our guesswork conversion from Ethereum gas to power consumption was off by a factor of 100 — hardly puts a dent in this. Moreover, all but $\delta/(\delta+\Delta)\approx$ one in 50M blocks are verified. For larger values of $\delta$, of course, this figure is even smaller.

A real-life rollup with Turing-complete programming capability is likely to measure computation in terms of an internal unit of account much like Ethereum’s concept of “gas.” This provides a natural system of units for $\eta$ that is easier to work with than converting everything to external units: via a market that exchanges L2 gas for the native token of L1, $\eta$ could be compared directly to $\delta$ and $\Delta$, drastically simplifying the above estimation. With this in mind, our approach to estimating $\eta$ in terms of Ethereum’s gas per block may not be as dubious as it at first seems.

Discussion#

For the above analysis to apply in practice, we must assume that attackers are prepared to carry on inserting faults into blocks, getting caught every time, waiting for that once-in-a-billion-years block which slips by unverified. In reality, rational actors wouldn’t tolerate such variance in their payoffs, so the attack rate would actually more likely be exactly zero. Similarly, verifiers are unlikely to tolerate an extremely high variance in payoffs for verification, so I would argue that rational verifiers would treat a very low value of $p$ effectively as zero.

In other words, for regular verification to be incentivised, we need a high enough fault rate to provide regular rewards for verifiers (say, something like $p=0.01$ rather than $p=10^{-10}$). Since these faults are nearly always caught, there are no incentives for attackers. So the reward function for verifiers is dominated by unintentional faults, which we did not include in our analysis above. The fault rate should also not be too high, lest the additional overhead of carrying out the challenge protocol and waiting for L1 confirmation significantly slows down confirmations on the rollup.

Here is one way to ensure a consistently high fault rate without compromising funds locked up in the rollup: define an extension to state that is randomly deemed “faulty” at a fixed rate (Bernoulli process) but is not otherwise coupled to other state fields. For example, each intermediate step in the trace of computing a state transition could be hashed together with a quantity known only at the time of sequencing, such as the timestamp that the sequencer receives the block proposal message, and the state transition declared “faulty” if the result of this hash has a certain number of leading zeroes. If bond sizes are low compared to per-block fees, the loss of a bond for this type of “fault” could be regarded as a simple tax to pay verifiers and running cost of operating a node, rather than a punishment for mishbehaviour.

Extensions#

The real-world game of optimistic rollups has several other features which the simplified model presented above fails to capture. Some of these issues may undermine the conclusion that verifying is not incentivised.

Exposure#

In the introduction I identified two types of incentives for verifiers, but the game we analysed above only contained one: the fault challenge reward. An agent may have risk exposure to rollup faults in various ways: they may have funds locked up in the network whose balance could be directly affected by faults; faults could cause the rollup to become insolvent, preventing users from withdrawing funds to L1; block proposers may stake a bond on previous unconfirmed blocks which is at risk if these blocks turn out to be faulty. Simplifying, we could abstract all these risk types away and posit that the verifier has an exposure $E>0$ which is lost (incurs a negative utility) in the event of a faulty confirmed block. Then the verifier’s utility becomes: $$u_V(p,r)= r\left(\frac{1}{2}p\delta -\eta\right) - Ep(1-r)$$ where the term on the right is the probability that a given block is both faulty and passes without verification. This modification hardly affects the equilibrium analysis: rearranging, we find that in equilibrium $\frac{1}{2}p\delta - \eta + Ep = 0$ and the verifier is indifferent to changing his strategy. In this variant, the verifier’s utility is nonzero, but rather decreases if $p$ increases. The conclusions regarding equilibria with $p=0$ and the effect of scaling $\delta$ pass unscathed.

Multiple verifiers#

In a real-life decentralised rollup, many agents compete to identify faults and receive rewards. There are different ways one could structure this competition, leading to different distributions of verification rates among different nodes. Regardless, the block proposer chooses his strategy $p$ against the total verification rate, and the verifiers compete among themselves in the residual game at fault rate $p$, whatever that may be. Again, the basic conclusions (Theorem 1 and 2) are unmolested.

Timing and ordering#

Incentives in the verifier-proposer game may be subject to message ordering effects (including MEV). The ordering of transactions and protocol messages — block proposals and challenges — could all affect the values of network exposure, bond size, and attack payoff parameters. Some of these effects may be visible in a simple repeated version of the static game, but to see the full range of strategies a fully dynamic model is needed. Some aspects of the protocol itself can only be correctly formulated dynamically: blocks may be challenged at any time within the challenge window, and multiple blocks may be proposed on top of unconfirmed blocks. Network latency complicates the game between multiple verifiers. Advanced strategies such as SPV mining depend crucially on timing effects.

Incomplete information#

The rate of confirmed invalid blocks (or blocks with unavailable data) and successful challenges can always be established retroactively by processing the entire rollup chain from start to finish (the most recently confirmed block). Hence, with a lengthy computation one can recover the historical values of $pr$ and $p(1-r)$. Assuming that fault and verification rates are independent, one can therefore deduce $r$, even though verification of non-faulty blocks cannot be observed directly.

However, moves within the challenge window are not fully observable. In particular, block proposers cannot know for sure that the target block for their proposal is valid unless they verify it themselves. The verifier-proposer game could be approximated without going to the fully dynamic regime by a repeated game with moves that become observable after a fixed finality period.

Even if all agents in the game are “rational” — in particular, there are no altruistic verifiers and no unintended faults — in a game of incomplete information participants cannot be sure of this. A risk-averse client with positive exposure may base their strategy on an assumption that some invalid blocks are produced by irrational or malfunctioning agents, even if these are always succesfully challenged.

Irrationality and altruism#

In practice agents may behave altruistically or “irrationally” and verify the rollup even in the absence of incentives. They may also gain socialised utility from the increased network value of a rollup with a faulty confirmation rate of zero. These effects do affect the conclusion of Theorem 1: taking them into account, the system may remain in equilibrium with zero faults (though it will be impossible for an observer to verify this). The conclusion of Theorem 2 is unaffected.

A malfunctioning block proposer could also be regarded as “irrational” in the context of the block proposer game; as we saw, the existence of these is quite important for the incentives of verifiers.

Refined lightweight validation and fault strategies#

We have assumed that a verifier can choose only to verify a block in its entirety or not at all. In reality, the strategy space is more complex: incentives may also differ for partially verifying blocks. For example, certain faults may be detectable with little or even no computation (for example, if user transactions are failing because the rollup has become insolvent). If a fault is detected (and $p<0$), the expected payoff for verification increases and the verifier may then choose to go through the full computation needed to locate the fault.

Conversely, Byzantine nodes may deliberately restrict the faults they submit to evade this kind of lightweight verification. Hence, there is an enlarged strategy space for both (Byzantine) proposers and verifiers.

A refined classification of faults with separate rates also enlarges the space of acceptability thresholds for users: some kinds of faults may be more tolerable than others.

Protocol extensions#

Recent work in optimistic rollup design proposes a market of intermediaries for fast withdrawals.11 These intermediaries would also take on counterparty risk for misbehaviour of the rollup, receiving a fee in return. With such a system, a rollup with a non-negligible bad transition rate could be acceptable for non-expert users, since the costs thereof would be absorbed by intermediaries who model this risk and factor it into their business model.

Bibliography#

Al-Bassam, M., Sonnino, A. and Buterin, V., 2018. Fraud and data availability proofs: Maximising light client security and scaling blockchains with dishonest majorities. arXiv preprint arXiv:1809.09044. https://arxiv.org/abs/1809.09044

Luu, L., Teutsch, J., Kulkarni, R. and Saxena, P., 2015, October. Demystifying incentives in the consensus computer. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security (pp. 706-719).

1. Al-Bassam, Sonnino, Buterin (2018). ↩︎

2. This is the case for Arbitrum, currently the largest rollup by TVL (cf. https://l2beat.com). ↩︎

3. Luu et al (2015). ↩︎

4. For a discussion of the various kinds of fees involved, see https://barnabe.substack.com/p/understanding-rollup-economics-from↩︎

5. I looked at the Sandra Whetstone (FLOPS) and Dhrystone (integer IPS) metrics on https://www.cpu-world.com/benchmarks/browse/910_80,965_61,993_80,1035_96/; compare also https://setiathome.berkeley.edu/cpu_list.php (FLOPS only). ↩︎

6. There’s plenty wrong with these assumptions, but I don’t know how to improve them without a lot more work. For the block size target, see https://ethereum.org/en/developers/docs/gas/#block-size; for information about EVM opcode gas cost, see https://github.com/ethereum/yellowpaper. Note that some opcodes have much higher prices to offset the running cost of storing state, which is not paid for directly in Ethereum. ↩︎