Part 1: Can We Kill Moloch? ZK Basics and Virtual Machines
January 4, 2023
Written by: Cam
Inefficient outcomes are all around us. They plague our desires for optimization with the intolerable existence between the vastness of our expectations and the smallness of our reality. But what drives these situations? Why do they continue to occur?
For the unfamiliar, Moloch is a god-like figure representing sacrifice. Effectively, the force driving sub-optimal Nash equilibria in the many “games” around us. If you’ve ever read or heard about game theory before, you have probably heard of the prisoner’s dilemma. This is a simple problem where Moloch is at work and can be expressed by the notional format used to express game theoretic outcomes such as (-1,-1). This is the root of memes you may be familiar with such as (3,3). In particular, the prisoner’s dilemma goes as follows:
Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of speaking to or exchanging messages with the other. The police admit they don't have enough evidence to convict the pair on the principal charge. They plan to sentence both to a year in prison on a lesser charge. Simultaneously, the police offer each prisoner a Faustian bargain.
The possible outcomes are:
- If A and B each betray the other, each of them serves two years in prison
- If A or B rats on each other while the other stays silent, the snitch will be set free and the convicted member will serve three years in prison
- If A and B both remain silent, both of them will serve one year in prison (on the lesser charge).
The resultant nash equilibrium ends up being somebody snitching despite the optimal overall outcome being for both to remain silent. This is Moloch at work.
Why does this matter to crypto?
Moloch is the force pushing us towards self damaging outcomes. When I think about it in the context of crypto as we know it, I see a lot of discussion in trade-offs, particularly in the context of scaling the base layer technology to the needs we so fervently dream of (though is the demand actually there?). The most mimetic representation of these concessions is the scaling trilemma.
The warfare over such tradeoffs is extremely tribal - people love to get L1 weird - and to be honest, I see the current market Nash equilibrium as Moloch at work. Pick 2 of 3, it's a powerful meme. Most “next generation” Layer 1s seem to favour sacrificing decentralization over the other two, and why wouldn't they? Its like the silent killer, it doesn't matter when the times are good and you’re not really going to notice it missing….
Until it does.
And we’re seeing a lot of this today with the discussion around the merge, censorship of transactions, FTX, and so and so forth. But all of this is really just a fancy way of asking myself about how do we move away from the suboptimal equilibrium Moloch is forcing us into? The trilemma is a self-imposed limitation in some ways. It comes from us putting these three constraints into a system and needing to think outside the box (triangle) to escape the conditions.
What really started all of this Moloch talk? When I started this research topic, I was really hoping to take an approach of exploring how thought leaders in the zero knowledge space think about the future of the tech and what it looks like post mass adoption. What does it really mean to integrate ZK? In doing so I was having a conversation with one of the people I think is pushing the absolute limits of the tech - GuiltyGyoza. GuiltyGyoza is the founder of Zee Prime portfolio company Topology. Committing acts of on-chain wizardry, they are pushing the limits of the Starkware ecosystem to build games and more on-chain. When I asked him about what this future might look like, naturally he came back with a poetic response:
On the most abstract level, my conviction lies in using zero knowledge proof, verifiable encryption, and verifiable obfuscation to protect the provenance (original creator; inventorship) of knowledge. This knowledge needs to satisfy two properties:
(1) It takes the form of a pure function, meaning there should be no side effect; potential side effects are all encapsulated via monad for example. Function purity protects the value of the knowledge.
(2) It is composable, and composition does not break its obfuscation. this allows others to build upon the knowledge in trust minimized ways.
Zero knowledge proofs work with two parties - the prover and the verifier. Basically they allow a prover to demonstrate something is true while not disclosing some data to the verifier. In the context of a blockchain it effectively allows for compression of information. This leads to more information processing in the same set of constraints (such as node data throughput) - in other words scalability. We can think of this as improving the UX of blockchains. Currently, most are relatively slow (TPS) and expensive. Leveraging the qualities of zk proofs mentioned above allows us to ameliorate both of these.
ELI5: Can we have our cake and eat it too?
“Just watch me” - likely Thomas Cromwell
Zero Knowledge Proofs
We hear a lot about them and the potential they bring, but what does this look like, what does it mean? What matters in the ZK universe and what should we care about? Here we seek to address these questions. Our content is broken down into the following segments for this paper, while part two will focus on the other use cases such as privacy, verification, and more:
- Why ZKPs matters (computation, privacy, outsourcing trust)
- Types of ZKPs
- The ZKP pipeline (how they interact with blockchains, from coding language - > proof ->prover -> output (application/chain)
- Why does the hardware matter to ZKPs and what options are out there?
- What does the post ZK adoption world look like?
- The ZK Scaling Landscape
As always, we feel it is important to maintain a healthy degree of skepticism. One important thing to remember when talking about transformative technologies: their impact is almost always overestimated in the short term, and underestimated in the long term. Through this post I aim to deliver a handful of mental models which you can use to broadly assess the various aspects/projects using Zero Knowledge tech.
Why Do ZKPs Matter?
While there are three main use case categories currently, undoubtedly different uses of the tech will emerge as adoption continues. Below are the core three features:
Outsourced Verifiable Computation
This is effectively a trust shortcut. It allows one to run computation on a third party while a proof can be output to demonstrate computational integrity. Why does this matter to us? Well for one you can outsource computation to faster off chain alternatives while maintaining a similar level of trust (trust is generated in distributed systems via repeated computation). This is obviously good for scalability, and expanding use cases.
Doing computation but hiding certain parts of it. As you can imagine this is highly desirable. Many of the commonly discussed consumer side ZK use cases hinge around leveraging this. These include: private KYC, anonymous voting, and proof of ownership or activity (such as ZKPs on filecoin to prove correct data storage).
As mentioned earlier, this amounts to information compression. Sublinear verifier times allow for this compression and are what gives zero knowledge tech its scaling properties. Proofs typically have a fixed number of group elements (think transactions), while the actual proof size is much smaller (eg. many transactions in the size of one)
In the diagram above we can effectively see how this information compression works. Throughput increases along the x axis while computational time is the y axis. Note that we can increase throughput relative to a native non zk system for the same amount of computational time.
Type of ZKPs (STARKs and SNARKs) and How They Work
The two main kinds of blockchain related zero knowledge proofs come in the form of Zero Knowledge Succinct Non-interactive Argument of Knowledge (zkSNARK, SNARKS) and Zero Knowledge Scalable Transparent Arguments of Knowledge (zkSTARK, STARKS). Both can be considered within the family of research called Computational Integrity (CI). All systems within CI have two common characteristics: Arithmetization (information compression) and Low Degree Compliance (verifier solving prover’s statement). This section is the most technical, and while we have strived to deliver a manageable read, some of the concepts, namely elliptic curves, are fundamentally complex subjects.
Arithmetization is the process in which we transform computation statements into a polynomial. This works by first turning the statement of computational values (execution trace) into a set of rules (polynomials) that the execution trace satisfies. From here we can then transform the execution trace into a polynomial itself, then reduce that and the set of rules into a single polynomial that can be tested to see if the execution trace does in fact fit into the rules.
Low Degree Compliance
LDC requires usage of random sampling to ensure the prover is not being fraudulent. This is the process in which the polynomial schemes that are committed are tested via random inputs. In SNARKs this is usually done by using the reference string from the trusted setup (KZG polynomial commitment scheme), while in STARKs this is done via publicly random tokens (transparent setup) - the tradeoff here being proof sizes are larger to ensure the ability to catch fraudulent behaviour. LDC is done by testing that the arithmetized polynomial satisfies a certain condition (i.e. is this true).
There are different approaches to this based on the type of zk approach. STARKs use a method called FRI (Fast Reed-Solomon Interactive Oracle Proof of Proximity), while SNARKS often - not always - use the KZG (Kate, Zaverucha and Goldberg) commitment scheme. Commitment schemes are basically needed to set-up the circuit (computer) and have a baseline of results you know to be true, which the prover’s results can be tested against (preprocessing). There are also Bulletproofs which use the Halo scheme that trades smaller proof sizes, and no trusted setup for slightly slower proving times than SNARKs.
The differences between can effectively be summed up via the table above. Note that FRI based solutions are interactive, and the cost of being non-interactive however is the need for a trusted set-up to generate the set of points (pairings) the committed polynomials must satisfy.
SNARKs and STARKs use different types of cryptography to secure randomness of the proof. SNARKs use Elliptic Curves while STARKs use Collision Resistant Hashes. The ELI5 here is that collision resistance is important, it is a system that makes it very hard to find an input even if you have the output. This idea is foundational to all cryptography. It is a characteristic called trapdoor functions - easy to solve one way, hard to solve the other way. The gap between these can be thought of as the cryptographic security of the system. Prior to computers, a simple example of this difficulty was factoring (how do you know which factors were used to arrive at the output).
The original gigabrain - Eratosthenes
One can also see how this property with factoring weakens dramatically with the advancement of computational resources (brute force computation of all factors of a given number). For elliptic curves, current mathematics does not have an algorithm that can close the gap in the way something like General Number Field Sieve did for prime factoring.
⋆Caution Heavier Content⋆
In elliptic curve systems we have a form of cryptography based around a property of the curve. They look like the diagram below:
An Example of how an Elliptic Curve is used - sorta
You will notice they are symmetrical sideways (along the x axis). What this allows us to do is to have a set of values (input, output) that are related to each other in some way that is very difficult to know. Basically you can multiply the input by some number and get the output, but even if you have both the input and the output it is difficult to know what the input was multiplied by. If you want additional details on ECC check out these two resources: here and here.
When it comes to a zk system, we can use these curves to come up with a set of points (input A and output D values) that demonstrates the relationship that must be proven (satisfied). When a proof is submitted we see that the relationship is satisfied between the inputs and outputs thus proving the proof system to be true. For a bad actor, if they were submitting fraudulent proofs, the inputs and outputs would differ from expected, unless they knew the hidden values we used to multiply against our input. This all works if n is secret. If n is not secret, it is possible to submit fraudulent proofs that appear to satisfy the rules.
On the other hand, STARKs rely on hash functions. Hash functions basically turn some data of arbitrary size into a fixed length piece of data. The idea behind them is that it is very difficult to find two inputs that hash the same output. This creates the gap we described as cryptographic security before. Once provided we can verify the hash easily based on the inputs (doing the hash ourselves).
Going back to our generalized idea of ZK, it allows a party to prove a statement to another party (the verifier) without that party revealing any additional information in an efficient manner. In order to do this, both the prover and verifier require some common knowledge (the set of points from the elliptic curve) that is used every time the circuit is run. This is what enables a balance between privacy for the prover and their honesty. To do this, and generate what is known as the reference string (the set of points Ax/Dx, preprocessing), an honest party is required to create the secret n (the relationship between input/output) and subsequently destroy it. To get an idea of just how crazy this can get, I highly recommend this radiolab episode on the original Z cash ceremony.
As mentioned earlier this secret is effectively a backdoor into the proofs and hence the generating party needs to be “trusted” to destroy the secret afterwards. Some models allow for Updatable CRS/SRS, which allow for a continuous process in which additional parties can contribute to the string. While still a trusted set-up, this does improve security considerations.
Further to this, you may hear of different proving systems such as Groth16 or PLONK. These are basically different implementations of SNARK systems (note there are some for STARKS as well). While there are many out there, and new ones keep being developed (such as Halo and SuperSonic), most you will find will use Groth16 or a variant of PLONK as these have been around for some time and have the most resources/research around them. zkSNARK research has effectively followed two lines of research:
Groth 16 → GKM+18 → Sonic 19 → PLONK 19/MARLIN 19
Ligero 17 → Aurora 18 → Fractal 19
The family tree
The Ligero line is largely focused on post quantum security, while the Groth line is focused on elimination of the trusted setup for short efficient setups. In reality the lines are much more complicated given the different divergent/overlapping paths and this is not an exhaustive summary. Some steps and branches have been skipped for simplicity sake. Further details on this, and the specific systems can be found in this article by koala1992. One can really think of categorization here by Symmetric versus Asymmetric Cryptography as seen in the family tree diagram. The Starkware team provides a great deal of information on the subject as well in their medium post series starting here.
To expand on these ideas, basically the different types of implementations vary across a set of attributes such as the post quantum security previously mentioned, requirements of a trusted setup, proof size, verification time, etc.. Additional resources on proof systems can be found here: zkp science.
To simplify our thoughts on the proof system space, we can distill it down to a handful of mental models to pick apart the key aspects of what matter:
Trusted versus non trusted setups. Ideally we do not want trusted setups.
SNARK versus STARK. STARKs have post quantum security, no trusted setup, and better scalability, but have large proof sizes. S in SNARK is for Succinctness (small proof sizes), while S in STARK is for scalability. Source: Matter labs
STARK proof sizes are currently a killer for a lot of people. There are some solutions to this such as using a SNARK to prove the STARK (Polygon Hermez), but this approach sacrifices post quantum security.
New proving systems are coming out every couple of years. As is evidenced by the research paths above. This can make development frustrating as the space moves frequently. Some resources are seeking to create models that are proof system agnostic. The team at Manta Network have done a ton of work on this front with their openzl library.
In the future the lines may be blurred between the three. For instance, Plonky2 (SNARK) looks alot like a STARK but with preprocessing. In the end the general technology is extremely valuable for the use cases we mentioned above, and the differences within the technology are far smaller than the differences between the technology and not (i.e. zk versus non zk systems).
In the context of blockchain focus on aspects that can create centralization issues such as proof sizes (Are resource requirements reasonable?), trusted setups (what is the ceremony like? Are there any red flags in construction?), and legal bottlenecks. How does the cryptographic company building the prover/circuit intend to monetize?
SNARKy STARKs and STARKy SNARKs
How Are ZK Proofs Used With Blockchains?
ZK proofs are typically used within blockchain systems by compressing some logic into a format that is ZK friendly (such that it can be proved by a ZK proof). This process is what we are calling the ZK pipeline. It looks something like this:
Source: ZK Whiteboard Sessions - Module One, by Prof. Dan Boneh
Note that even before the Domain Specific Language (DSL), you may be rewriting from a classical coding language. This is one of the biggest hurdles a lot of ZK virtual machines face right now. Finding developer talent in the DSL is incredibly tough currently.
The other kind of DSL talent seems much more plentiful these days…
In the context of a blockchain, ZK based scaling solutions bundle transactions together and submit a validity proof to the layer 1 chain. This should not be a new idea to anyone at this point. For other ZK use cases on chain, such as Sismo, they require a user to reveal control or ownership or a given asset (wallet) in order to generate privacy creating abstractions somewhere else.
Why does Hardware Matter to ZKPs? (Hardware Acceleration)
Now that we understand how ZK proofs are used in the context of blockchains, we can begin to take a look at how that proving process happens. Once distilled into the primitives necessary for the proof, the process is dominated by two main types of computation or math: Multi-Scalar Multiplications (MSM) and Fast Fourier Transforms (FFT). These difficult computational tasks become the bottlenecks for the proof generation process. Amber Labs and Georgios Konstantopoulos both explain this in their work on this front. Effectively what you need to know here is that different proof systems will result in different workload ratios of MSMs to FFTs.
Why do these ratios matter? Well it ultimately impacts the hardware that is most preferable for such computation. While GPUs are widely used, the difficulty of MSMs and FFTs in particular lends themselves to usage of Field Programmable Gate Arrays (FPGAs) or Application Specific Integrated Circuits (ASICs) to be efficient. Both of these are essentially customizable computational hardware.
Your basic mental model for these and how they interact with ZK Proofs should be as follows:
- FPGAs are nimble and can be re-flashed while ASICs are static but more efficient
- ASICs as a result of higher degree specification are more efficient but are much more costly and time intensive to produce, as well as rigid. Further to this, the speed differential between new proving algorithms (remember Groth etc.) versus the development of these custom chips, means they will for the foreseeable future lag behind and be unsuitable for investment. Generalized approaches supporting multiple curves hoping to capture most current systems are what seem to be coming to market.
- GPUs can handle MSM reasonably well due to the parallelization/multicore nature of them
- More FFT requires more customized logic (such as FPGAs or ASICs)
- FPGA flexibility makes them the most preferable option at this time given the quickly changing landscape of ZK implementations.
What Does The World Look Like Post ZK Adoption?
A vision of the future post ZK adoption is actually where I have been most disappointed in writing this. Perhaps it is still too early, but generally speaking it feels creativity is relatively limited at this point. Scalability is first and foremost the main attraction here. Scaling blockchains to serve the next billion users.
Other than this, in these early days, I’ve found it's hard for builders to articulate more creative uses of the technology. Recently, I saw this blog post on using the technology that discusses a verification standard of changes to digital images. This is all in an effort to fight the ever-present disinformation issue. Little hints like this give us a strong indication that it is still early despite the fact we have been hearing about ZK for the last x number of years.
The ZKP Scaling Landscape
Although current uses of zero knowledge proofs range across base layer protocols, layer 2s/scaling solutions, and individual applications, most of the usage of the technology/mathematics has been focused around scaling transactional throughput via layer 2s. In this section we are ultimately seeking to map the ZK scaling ecosystem. A big shoutout here goes to ingopedia and Ventali’s awesome ZK. They are great resources if you seek to school yourself on all this ZK. Ventali’s is more focused on the ecosystem, while the ingopedia is a deeper resource for the technology itself - do with that what you will.
Our first category of protocols are zkEVMs. zkEVMs are a relatively recent phenomena as a result of breakthroughs in recursive proofs. The holy grail here is really EVM equivalence. Vitalik has written about this in the past and provided a broader overview of the types we see emerging:
The teams can generally be categorized across the different types (1 through 4).
Each of the different types effectively makes a tradeoff between the level of compatibility with the base layer, and trading off proving time. With time we expect much will exist in the type 2.5 range and provide enough compatibility to satisfy most applications. While some teams are pursuing non-EVM machines, at this point EVM dominance remains strong and is the key focus for most teams.
- Matter Labs zkEVM - also known as zkSync, they have taken the approach of doing a type 4 style implementation and using retracing to get to higher levels of compatibility. zkSync 1.0 uses a PLONK implementation which required trusted setup, however Matter Labs are working on Redshift which aims to create transparent SNARKs removing such a need.
- Hermez zkEVM - Polygon’s zkEVM. The overarching design principle here seems to be closer equivalence while balancing computational tradeoffs. This uses STARKs with SNARK proofs that improves on some of the drawbacks of STARKs (mainly proof size) at the expense of post quantum security. Designs here are quite public and we can see how participants can permissionlessly contribute (sequencers and aggregators). This is obviously very attractive as it is a clear path to decentralization as a roll-up.
- Polygon Zero (Mir Protocol) - Formerly known as Mir Protocol, Polygon Zero is a Plonky2 based zkEVM. Focused on extremely fast proofs. The differentiator between Zero and Hermez is that Zero allows for partial privacy. It does this by enabling blocks on inline code that can be proved and verified, while allowing the rest of the program to be fully transparent.
- Scroll - Striving for complete equivalence. This naturally comes at the expense of computing cost. Built in a fully open way and partly in coordination with Appliedzkp (Ethereum Foundation). Scroll uses Halo2 which is a plonkish implementation with a bulletproof commitment scheme. This means they are transparent and do not require a trusted setup while maintaining their succinctness. Scroll uses a Sequencer/Relayer/Coordinator set-up. This set-up looks fine and the remaining details to be answered are around how permissionless the system is.
- Appliedzkp - This is the Ethereum Foundation’s research into building a zkEVM.
- ConsenSys zkEVM (PDF warning) / gnark library - A Groth16/PLONK based zkevm library.
- TinyZKEVM - No Github activity in over a year.
- OlaVM - Opted for a register based VM (instead of stack based), Removal of FFT, and zkunfriendly opcodes.
- Sovereign Labs - A simple zkEVM implementation built on Risc0.
Overall Polygon’s Hermez and Scroll look like leaders here (both type 2.5). Perhaps most interestingly, one of the biggest differences between these two is the micro opcodes of Hermez. This matters because the cost of each cycle of the zk circuit is basically the sum of the opcodes. If you are natively implementing each EVM opcode into your zk circuit (full equivalence) it will be extremely expensive to compute as some opcodes are much more expensive than others (Push vs. Add). One can create environments where opcodes that are supported are optimized for the zk environment (type 4) however you will need to reimplement at compile time (retracing). Thus we can think about whether full equivalence is worth the trade off in the cost differential in computation. A type 4 implementation while trading off compatibility can be orders of magnitude cheaper than type 1 (think $0.1 vs. $0.0001 per txn). Anything below type 3 (eg. Type 1 / 2) is natively supporting more complex opcodes which explodes the cost of computation.
Going back to our idea of improving UX, the vast differential in cost of computation is important for achieving this. The use case limitations of transactions at that $0.1 versus $0.0001 vary a lot. Furthermore, while we think there are leaders currently (basically allowing for a better L1 style experience that is cheaper and faster while supporting most existing apps), longer term, a type 4 style zkevm will support use cases that are valuable and distinctly separate from those supported by higher types.
Trying to understand Polygon’s Product Suite
Also worth noting that anytime we do things such as retracing, there is a risk of translation not being perfect. Put differently, this means that something could be interpreted wrong, leading to potential exploit.
The remaining questions across all of the protocols are around value accrual. We have yet to see a strong model to support tokens of roll-ups. Polygon’s v1 design towards MATIC (Proof of Donation) was an attempt in this direction however the new direction of Proof of Efficiency will likely be more interesting to observe.
While zkEVMs are the immediate future of zk within the web 3 ecosystem, zkVMs are much more generalizable and extendible to far larger groups of developers.
It likely needs not be repeated, but the vast majority of developer resources in the world are not exactly Solidity devs. A universal zkVM opens a wider door to development of new applications and use cases. The biggest problem with this space at the moment (such as the Starkware ecosystem) is that development is still in a zk-optimized language such as Cairo.
The table above from Miden’s presentation at ETH Amsterdam effectively sums this up.
- Delphinus zkWASM - As the name suggests its a WASM based zkVM. This is still very early, but seeks to solve the developer issue via WASM.
- zkMove - Similar to Delphinus, pursuing a Move based zkVM. They are using Halo2 as their proving system just like Scroll.
- zkRiscV - zkVM based on the RISC-V RV32I instruction set. This is fairly niche and likely not particularly relevant.
- Tritron VM - Another Rust based solution, however it has focused on improving recursion verification times.
- Starknet - A clear leader in the space. While being Cairo based, it still has the largest ecosystem of projects and developers. Starknet also benefits from Nethermind’s Warp transpiler which enables solidity to cairo loading. This is not as clean as deploying to something like the low number type zkEVMs, but does certainly help adoption in the short to medium term. We are already seeing the benefits of the scaling attributes in these ecosystems, with projects like Topology and Briq.
- DistaffVM - a Rust based STARK vm. No activity on their github in quite some time.
- Polygon Miden - A STARK based VM that is based on Distaff. Their goal for Miden is to focus on supporting arbitrary logic and transactions. Their structure has transactions being sent to a network of execution nodes which bundle the transactions for the block from which the STARK proof is generated. There are limited details on what the permissionlessness of the node network looks like, as well as tokenomics for the roll-up.
- Uqbar - an extremely interesting project focused on bringing smart contract execution to the Urbit P2P ecosystem. It is built on Cairo as a roll-up for ethereum, although not EVM. Bringing this blockchain environment to the Urbit Nook environment will enable new kinds of applications, while using Cairo to make proving more performant.
We expect in the future to see interesting things from systems like Move, Triton and Uqbar, but in near term the clear ecosystem to watch is Starknet.
While there are many other zk layer 2s we will cover those in part 2, as many have a specific focus such as privacy or defi.
Through the explorations in zk technology, I hope that even the most general reader can enjoy a decent degree of understanding how zk works in the context of blockchains. The aim here was to provide simplified explanations and facilitate broader ideation leveraging the technology.
We can see that the information compression attributes of the technology will undoubtedly help us improve the throughput of state machines and scale to the next set of users. While this aspect is relatively clear, what is not, is what will drive those users over to web 3 in the first place.
In my discussions on this with Hocwyn Tipwex, I think they elegantly articulated what impact the technology could have:
Roughly speaking, when both the efficiency of scaling and the development environment’s seamlessness (ie, a middleware/language problem) are solved, any kind of computation can be done trustlessly, any kind of software can be integrated into on-chain applications, NFTs will become the substrate for much more complex computational objects, and all economic coordination will start to live on chain (as well as many forms of cultural coordination where the implications of one party’s contributions are computed locally and then broadcast).
It really restores and also goes far beyond the earliest hopes people had for crypto as basically “backend payment system for web2”. It can be truly revolutionary.
In a period of great skepticism towards blockchain in general, we can look toward such things as the real powers of revanchism that Su could have only dreamed of.
In our next part of the series, we will explore the variety of other use cases that are driving novel blockchain applications.