Preliminaries - ZK-relations

In the course of this documentation we often describe certain relations that are then proved using ZK-SNARKs. The purpose of this section is to explain how each such a description is structured and what each part of the description means.

Even though we typically don't reach such low-level details it is important to remember that every single input of a relation is a field element (think of the scalar field of the underlying pairing system, see Cryptography). In the code we denote the type of a scalar field element by Scalar. When designing constraints these low level details often ends up being crucial -- we want our constraints to be representable by small arithmetic circuits of the scalar field (in other words: to be "SNARK-friendly") for them to incur low complexity upon arithmetization. For more background we refer to any of the excellent resources on SNARKs that are available online, such as the MOOC https://zk-learning.org/

ZK-relations

We describe relations in boxes with structure as below:

relation R

inputs:
    - x_1
    - x_2
    - ...
    - x_n

witnesses:
    - w_1
    - w_2
    - ...
    - w_m 

constraints:
    1. C_1
    2. C_2
    3. ...
    4. C_r

If you are interested in the precise mathematical description of what this box means, we refer to Mathematical description of R below. If not, we will give some intuitions about what to expect to be listed in the 3 sections of the box:

Inputs

In this section we specify "public inputs" to the relation (we denote the vector of public inputs as x), i.e., inputs that the user does not have to keep secret and which are exposed on-chain. If this was a signature scheme than we would likely expose the public key in this section. Or we could expose an encrypted message in this section, or any piece of data that is necessary to input in the plain. In many cases hashes of some larger structures are exposed as inputs, this has two purposes:

  • Compression: the hash is smaller than the underlying, possibly large structure.

  • Privacy: the underlying data is often hashed with some random salt, in order for this to serve a role of a non-revealing commitment.

Witnesses

This section contains data w that is necessary for the prover to demonstrate that for a given input x the relation R(x) holds. Indeed, in typical implementation of the proving function would look like:

generate_proof(R, x, w)

thus would take as input a description of the relation R (perhaps as an arithmetic circuit), an input vector x and a "witness" w that allows the prover to generate all necessary values on the circuit that are required to generate the proof.

The reader might be tempted to think that w are inputs to the relation -- this intuition, while not wrong, might be confusing at times and lead to misconceptions in notation and in implementation. A better intuition to keep in mind is what's described in Mathematical description of R -- roughly speaking the only inputs to R are x but the remaining ones (that appear in the constraints) -- y can all be efficiently computed given the witness w.

Constraints

The constraints together define a binary predicate R(x) on public inputs (see below for technical details). The constraints are typically written as pseudocode in a high-level language but it's important to keep in mind that in the end they are supposed to be written as arithmetic circuits (or using a similar arithmetization). These constraints might often involve explicitly, or implicitly some other variables, not mentioned as inputs and/or witnesses. These should be thought of as bound by an existential quantifier \exists.

Example

relation R

inputs:
    - x_1
    - x_2
    - x_3

witnesses:
    - w_1

constraints:
    1. w_1 * w_1 = y_1
    2. y_1 <= x_1
    3. hash(x_2, y_1) = y_2
    4. hash(y_2) = x_3

Note that in the above:

  • All the variables y_1, y_2 can be computed efficiently given x and w . But without w it would not be possible to compute them.

  • The constraint y_1 <= x_1 is written in a high level language. On a low level to make this constraint representable as an arithmetic circuit new implicit variables must likely be introduced (the binary decomposition of y_1) and many arithmetic constraints, just to emulate this one high level constraint. In this case, the implicit variables don't show up as witnesses, but are still computable from y_1 and x, and thus from w and x.

Mathematical description of R

Mathematically the above box describes a relation R(x)R(x) with x=(x1,x2,...,xn)x=(x_1, x_2, ..., x_n). To define what R R is, let us denote by y=(y1,...,yn)y=(y_1, ..., y_{n'}) the vector of all variables besides xx that appear in the formulas C1,...,CrC_1, ..., C_r, thus the formulas are really C1(x,y),C2(x,y),...,Cr(x,y)C_1(x, y), C_2(x, y), ..., C_r(x, y) . Then RR is defined to be:

R(x)=y(C1(x,y)C2(x,y)Cr(x,y))R(x) = \exists_{y} (C_1(x, y) \wedge C_2(x, y) \wedge \ldots \wedge C_r(x, y))

Note that the witnesses ww are not explicitly mentioned in the above. In particular not necessarily y=wy=w -- typically this is not the case, it's only that ww is a subset of variables in yy. But more generally the intuition to keep in mind is that for a fixed x x given and a "suitable witness" ww it is possible to efficiently generate yy which satisfies the below

C1(x,y)Cr(x,y)C_1(x, y) \wedge \ldots \wedge C_r(x, y)

Last updated