By Po and Qi Zhou from the QuarkChain Team
Special thanks to Toni and Dragan for feedback and review!

Ethereum is scaling L1 by gradually raising the block gas limit. However, pushing the gas limit substantially higher (e.g., the 100× increase proposed by Dankrad) quickly hits hard limits: disk I/O and CPU execution speed. Prewarming and EIP-7928 block-level access lists (BAL) remove most I/O read stalls, shifting the primary bottleneck to execution itself. Meanwhile, current clients still execute transactions sequentially, fundamentally capping throughput.
BAL (an idea our team also explored two years earlier) unlocks perfectly parallel execution, yet its performance ceiling remains unclear. To answer this question, we built a pure-execution environment that strips away all non-execution components.
Using this environment, we benchmarked per-transaction parallel execution with BAL. Our results show pure-execution throughput exceeding 10 GigaGas/s on a modern 16-core commodity PC, whereas the current Reth client achieves only about 1.2 GigaGas/s under the same conditions. This indicates that EVM execution can scale an order of magnitude beyond current client baselines once the aforementioned bottlenecks are fully addressed.
Ethereum is increasing its gas limit from 45 M to 60 M in the Fusaka upgrade. If the gas limit were scaled by 100×, the resulting block would contain roughly 4.5 GGas. To keep validation time under three seconds, validators would therefore require at least 1.5 GGas/s of execution throughput. However, Base's public benchmarks show that modern clients on commodity hardware reach a maximum of only about 600 MGas/s. This limitation is primarily due to sequential execution: although multi-core CPUs are available, existing clients process transactions serially, leaving most cores underutilized.
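As a back-of-the-envelope check on where the 1.5 GGas/s requirement comes from:

```latex
\[
  \underbrace{45\,\mathrm{MGas} \times 100}_{\approx\,4.5\,\mathrm{GGas\ per\ block}}
  \;/\; 3\,\mathrm{s} \;=\; 1.5\,\mathrm{GGas/s}
\]
```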

The gap between current performance (~0.6 GGas/s) and what 100× scaling requires (~1.5 GGas/s) is still substantial, which motivates our push toward fully parallel EVM execution.
To study the ultimate parallel execution performance that BAL brings, we constructed a pure-execution environment by removing all unrelated non-execution parts, enabling us to measure the true upper bound of BAL-powered parallelism. Leveraging Rust's no-GC design, fine-grained control over multi-thread scheduling, and Reth's high performance, we modified the Reth client and used revm as the EVM execution engine for this experiment.
Benchmark suite available here:
https://github.com/dajuguan/evm-benchmark
Our evaluation began by establishing a sequential revm baseline and then progressively introducing parallel execution. Analysis of parallel scaling revealed that the latency of the longest-running transactions forms the critical path that limits overall speedup. To alleviate this constraint, we simulated larger block gas limits, which unlocked substantial parallelism with BAL. With 16 threads and a 1 GGas block gas limit, pure-execution throughput reached ~14 GGas/s.
We first attempted to reproduce Reth's benchmark results. In a sequential run on mainnet data with the KZG setup preloaded, pure execution reached 1,212 MGas/s.
This sequential result serves as our reference point for all following experiments.
To evaluate both the actual speedup and the effect of Amdahl's law on transaction-level parallelism, we conducted per-transaction parallel execution experiments to quantify the impact of the longest-running transactions on the achievable speedup.
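The driver for this experiment is conceptually simple. Below is a minimal, self-contained sketch of the idea, not our actual harness: `Tx` and `execute_tx` are illustrative stand-ins for revm invocations, and `rayon` stands in for our custom thread scheduling.

```rust
use rayon::prelude::*;
use std::time::Instant;

// With EIP-7928, every account/slot a transaction touches is declared in the
// block-level access list, so each transaction can run against its own
// pre-state snapshot with no read/write conflicts between threads.
struct Tx {
    gas_used: u64,
    // bal_pre_state: ...  // per-tx inputs taken from the BAL
}

// Stand-in for executing one transaction through the EVM (revm in our setup).
fn execute_tx(tx: &Tx) -> u64 {
    // Simulate CPU work proportional to gas used.
    std::hint::black_box((0..tx.gas_used / 100).sum::<u64>());
    tx.gas_used
}

fn main() {
    let block: Vec<Tx> = (0..200u64)
        .map(|i| Tx { gas_used: 21_000 + i * 10_000 })
        .collect();

    let start = Instant::now();
    // Conflict-free, so transactions can be scheduled on any core in any
    // order; the block's wall-clock time is bounded by its longest tx.
    let total_gas: u64 = block.par_iter().map(execute_tx).sum();
    println!("executed {total_gas} gas in {:?}", start.elapsed());
}
```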
Detailed results are shown below (where "longest txs latency" is the total execution time of the longest-running transactions in each block):

Overall, the scaling results align closely with Amdahl's law: although throughput increases with more threads, block execution time is constrained by the longest transaction, which accounts for about 70% of total execution time under 16 threads, capping the achievable speedup at roughly 5× instead of the ideal 16× for a 16-core machine. This indicates that scalability is determined by per-block critical paths rather than raw compute capacity.
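As a sanity check, these numbers are mutually consistent under Amdahl's law if the longest transaction accounts for roughly 15% of sequential execution time (a value back-solved from the reported ~5× speedup, not measured directly). With n = 16 threads:

```latex
\[
  S(16) = \frac{1}{s + \frac{1-s}{n}}
        = \frac{1}{0.15 + \frac{0.85}{16}}
        \approx 4.9\times,
  \qquad
  \frac{s}{s + \frac{1-s}{n}} = \frac{0.15}{0.203} \approx 74\%
\]
```

That is, the serial tail both caps the speedup near 5× and comes to dominate (~70%) the parallel wall-clock time.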
This critical-path limitation can be mitigated by reducing the dominance of the longest transaction, for example through EIP-7825 (the transaction gas limit cap) or by increasing the block gas limit, the approach explored in this article.
Since per-block critical paths limit concurrency, we experimented with higher-gas "mega blocks" to increase parallelism. To simulate this, we executed the transactions of multiple consecutive mainnet blocks (a "mega block", or batch) in parallel, and then committed the state (a no-op in the experiment) only after all transactions in the batch had completed. This effectively aggregates multiple blocks into a single large execution unit.
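A minimal sketch of the batching step, reusing the `Tx`/`execute_tx` stand-ins from the earlier sketch (again illustrative, not Reth's actual API):

```rust
use rayon::prelude::*;

// Flatten N consecutive blocks (e.g., 50 blocks, ~1,053 MGas in total) into a
// single "mega block" and execute every transaction in parallel. State is
// committed once, after the whole batch completes (a no-op in the benchmark).
fn execute_mega_block(blocks: Vec<Vec<Tx>>) -> u64 {
    let mega_block: Vec<Tx> = blocks.into_iter().flatten().collect();

    // With far more transactions in flight, the longest-running transaction
    // becomes a small fraction of total work, so the critical path shrinks.
    let total_gas: u64 = mega_block.par_iter().map(execute_tx).sum();

    // commit_state();  // deferred until here; omitted in pure-execution runs
    total_gas
}
```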
We first evaluated a batch of 50 blocks, simulating an average block gas usage of 1,053 MGas, across different thread counts. Full results are shown below:

With such large blocks, the longest-running transactions no longer dominate the critical path: they contribute less than 20% of total execution time under 16 threads. Throughput scales almost linearly with thread count: with 16 threads, we achieve 14 GGas/s, roughly a 10× speedup over sequential execution and close to ideal linear scaling. This is extremely encouraging. In our experiments, the one major remaining critical path is the point_evaluation precompile, which is not trivially parallelizable.
To evaluate how parallel execution scales with increasing block gas usage, we executed batches of consecutive blocks while varying the block batch size (the number of blocks grouped into a single mega block), thereby simulating different effective block gas usage.

As the block gas usage increases, throughput continues to rise, but the incremental parallelism gains shrink from ~30% down to ~3% for each doubling of block gas. Once the batch size exceeds ~50 blocks (≈1,053 MGas), further increases in block gas yield only marginal additional throughput.
Our experiments show that combining EIP-7928 with mega blocks enables transaction execution to scale exceptionally well, achieving 14 GigaGas/s of pure-execution throughput on a modern 16-core commodity processor. However, several open questions remain:
- We excluded sender recovery from the pure-execution benchmark. In our experiment, enabling it cuts throughput by roughly 2/3, dropping to about 5 GigaGas/s under the mega-block configuration (1,053 MGas). A possible mitigation is GPU-accelerated sender recovery.
- The point_evaluation precompile and sender recovery for EIP-7702 transactions exhibit low gas-per-time efficiency. Their gas pricing may need to be revisited in the EIP-7928 era.
- Higher block gas limits may require retaining the current transaction gas limit cap to maintain high parallelism.
- Builder performance is expected to become the dominant bottleneck. Improving BAL building is essential to keep up with pure-execution throughput.
- State commit is another major bottleneck. Speeding up state-root computation and optimizing trie commit are necessary to sustain high-throughput execution.
We also explored different task-scheduling strategies, e.g., prioritizing heavy-gas transactions by sorting them by gas used or gas limit, alongside the simple ordered-list scheduler (OLS), in which transactions stay in natural block order and each new transaction is assigned to the first available core. When applied to mainnet data, however, prioritizing heavy-gas transactions yielded only marginal improvements and did not significantly affect overall throughput.
To quantify the impact on overall throughput, we compared scheduling heavy-gas transactions first (by gas used or gas limit) against OLS; a toy model of the two policies is sketched below.
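The sketch below models both schedulers as greedy list scheduling: each transaction goes to the core that frees up first, and the only difference is the input order. The latencies and `makespan` helper are made-up illustrative values; the real comparison uses measured per-transaction execution times.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Greedy list scheduling: each tx is assigned to the core that frees up
/// first; the makespan is when the last core finishes.
fn makespan(tx_latencies_us: &[u64], cores: usize) -> u64 {
    // Min-heap of per-core "free at" times.
    let mut heap: BinaryHeap<Reverse<u64>> = (0..cores).map(|_| Reverse(0)).collect();
    for &lat in tx_latencies_us {
        let Reverse(free_at) = heap.pop().unwrap();
        heap.push(Reverse(free_at + lat));
    }
    heap.into_iter().map(|Reverse(t)| t).max().unwrap()
}

fn main() {
    // OLS: transactions stay in natural block order.
    let mut latencies: Vec<u64> = vec![900, 40, 60, 25, 300, 80, 55, 700, 30, 45];
    println!("OLS makespan: {} us", makespan(&latencies, 4));

    // Heavy-gas-first: sort descending before scheduling (LPT-style).
    latencies.sort_unstable_by(|a, b| b.cmp(a));
    println!("heavy-first makespan: {} us", makespan(&latencies, 4));
}
```

On mainnet-like latency distributions the two orderings produce nearly identical makespans, matching the observation below; the gap only opens up on adversarial distributions.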


Toni's analysis suggests that prioritizing heavy-gas transactions could outperform OLS by 20–80% in worst-case scenarios. In practice, however, using real mainnet data (representing the average case), the improvement is only around 10%, and scheduling by gas limit, gas used, or OLS shows minimal difference. On mega blocks, OLS performs nearly identically to gas-limit scheduling. These observations indicate that transaction scheduling is not the primary bottleneck; rather, the inherent distribution of transactions on mainnet forms the critical path.
If you have any questions, please visit our community channels for support and more information.
Discord | Twitter | Telegram | Website | Blog