Zero Knowledge Proof using Hamiltonian Cycles

The last post explored the idea of Zero Knowledge Proofs (ZKPs) using simple examples. Now it is time to take a deep dive into a fully functional ZKP based on a mathematical problem instead of hidden doors amongst neighboring houses.

The basic problem is very simple: A Hamiltonian path is a path through a given graph which visits every vertex exactly once. Hamiltonian paths are related¹ to the better known Eulerian paths, which visit every edge exactly once. As an example, this graph has a Hamiltonian Path, which is highlighted in blue:

Related to Hamiltonian paths are Hamiltonian cycles, which add the requirement that there must also be an edge between the start and end vertex, or, equivalently they are a cycle instead of a path that visits every node exactly once. While knowledge of any Hamiltonian cycle trivially allows us to give a Hamiltonian path by simply deleting any edge from it, the opposite does not necessarily hold true. In fact, the graph used in the example above does not even have a Hamiltonian cycle!

To prove that, let us assume a Hamiltonian cycle exists. Since all nodes must be part of the cycle, we can start with any node, say node $C$ . At some point, we must take the connection $\left(E,J\right)$ into the "right house", to visit the nodes in that "house". However, we can now never return to our starting point $A$ without visiting $E$ again. Contradiction found!

Since I, personally prefer cycles over paths (they are 7% more tasty²), take the following graph, which is simply augmented to allow a Hamiltonian cycle:

Take note how the path through the "right house" has to change so that we end up in $G$ , which is connected to the start point of the Hamiltonian path instead of $K$ . Although Hamiltonian cycles are tastier than Hamiltonian paths, finding either one is hard in general.

A ZKP for Hamiltonian Cycles

The core idea of the protocol is as follows: $G$ is a publically known graph, which contains a Hamiltonian cycle that is very hard to compute - and thus secret. This ZKP protocol is intended to prove to someone in possession of the public key ( $G$ ) that someone else is in possession of the private key (the embedded Hamiltonian cycle) without revealing anything other than that.

To bootstrap this protocol, let us start by generating the public graph $G$ and its secret Hamiltonian cycle. To do so, the prover creates a graph consisting of nothing more than a (Hamiltonian) cycle of $n$ nodes, where $n$ is the desired size of the target graph and then hides that cycle by adding a number of random edges and randomly permutating the graph. By construction, the prover knows a Hamiltonian cycle for this graph $G$ .

Since the construction allows any graph that has a Hamiltonian cycle to be generated, there is nothing which is special in $G$ beyond the fact that it has such a Hamiltonian Cycle. This means that we can publish $G$ as the public key. Similarly to current asymmetric cryptographic systems, an attacker with enough computational power can conceivably break the public key by finding a Hamiltonian path. This is not prevented by using a ZKP! It is however made extremely hard by basing the protocol on a problem that is (expected to be) harder than those on which RSA, DSA, etc are built upon³.

The ZKP we are about to craft is going to be a Σ-Protocol, which means it consists of three basic components: A Commitment by the prover, followed by a Challenge from the verifier which is answered in the Response from the prover.

The Commitment

The commitment is going to consist of a version of $G$ , which is hidden in such a way that it is possible to either show that the prover knows a Hamiltonian cycle in the hidden graph or to show that the graph is indeed $G$ . It is already obvious that the prover must never perform both steps on the same commitment.

To begin, we add "not-edges" to the graph (displayed red in the example below), or more formally we do something akin to adjacency matrices: We store a $1$ for each edge that is an edge in $G$ and $0$ for each edge that is not an edge in $G$ . This ensures that the resulting graph will be a clique, making it impossible to garner any information from the degree of individual nodes⁴.

The prover then generates an individual cryptographic commitment for each edge and non-edge (in the example below, the edges become black). Additionally, a single commitment is generated for all labels⁵ (in the example below, the labels fade out).

Finally, the graph is randomly permutated. The result now looks exactly the same for any graph with the same number of vertices as $G$ .

The Challenge

After receiving the Commitment, the verifier poses a simple, binary Challenge to the prover. With $50\%$ chance each, they request that the prover either prove that the Commitment is a commitment to $G$ (and not to some other graph $H$ ), or that the commitment is for any graph with a Hamiltonian cycle.

The Response

To prove that the Commitment is indeed a commitment to $G$ , the prover simply opens all cryptographic commitments. Obviously, this cannot give away any information, as this is nothing more than a random permutation of $G$ , which is nothing that the verifier could not have computed on their own.

Proving that the Commitment refers to a graph for which the prover knows a Hamiltonian cycle is just as trivial: To do so, the prover only opens the commitments to the edges along the Hamiltonian cycle. The only thing that is revealed is an anonymous Graph that consists of exactly one Hamiltonian cycle — something that can trivially be generated by a simulator, meaning that again no information is revealed.

Checking either response is trivial, but obviously may not be omitted nevertheless.

Committing to a Series of Edges

Although this protocol is great for making pretty pictures, it has the drawback that, for a graph $G$ with $n$ nodes, it needs to create $n^2$ commitments to $1$ bit each, plus one to $n\cdot\log_2\left(n\right)$ bits. Even without switching to a different problem or using any advanced techniques, we can do better.

Commitment

To do so, we begin by generating a random permutation $\pi$ to generate a graph $G'$ that is isomorphic to $G$ , just as we did in the previous protocol. For each edge $\left(\alpha, \beta\right)$ we generate a commitment $\text{commit}\left(\alpha, \beta\right)$ , where $\alpha$ and $\beta$ are indices referring to the nodes of $G'$ (and, most importantly, not the labels of the original $G$ ). The full Commitment then consists of these partial commitments in random order and a commitment to $\pi$ . By randomizing the order, we prevent knowledge leaking about which node a commitment belongs to, unless we specifically reveal this information.

Challenge

The Challenge need not be changed in any way, and will again either require a proof that the commitments are bound to $G$ or a proof that they are bound to any graph for which the prover knows a Hamiltonian cycle.

Response

The Response works the same way as before as well. To show that the commitments bind to $G$ , they are all opened. To show that they bind to a graph for which the prover knows a Hamiltonian cycle, only that cycle is shown. While the first response is trivially zero knowledge, the second requires some deliberation.

Remarks

For a graph $G$ with $n$ nodes and $m$ edges, this scheme improves upon the previous scheme by only requiring $m$ commitments to $2\cdot\log_2\left(n\right)$ bits each. This means that for $m < n^2$ (which is a given) we require less commitments, and that for $m\cdot 2\cdot \log_2\left(n\right) < n^2$ , we need to commit to less bits. However, in practice, this advantage grows even further since graphs with more than $2^{64}$ nodes are far too large to be used for authentication and most commitment schemes commit to at least $128 = 2\cdot \log_2\left(2^{64}\right)$ bit at once. Therefore, any single commitment can always hold a commitment to any edge in a graph of less-than-ridiculously-ludicrous-size. Therefore, the number of commitments is the more relevant measure than the number of bits we need to commit to.

While the previous technique is commonly used in literature treating this topic, the technique of committing to a series of edges seems to be mentioned rather rarely. Its relevance is considered to be marginal by the cryptography community , as Hamiltonicity is not usually used to construct practical ZKPs anyway.

A Simulator

To show that it is impossible to gain any knowledge from the execution of the protocol, let me give a small simulator that is able to create a transcript that is indistinguishable from one created by a correct execution of the protocol. This becomes possible by deciding what the "verifier" will ask for in advance, and creating a commitment to match it.

Assuming that the pseudo-challenge will request the opening of the commitments to show that they indeed refer to $G$ , generating a commitment requires simply behaving like a real prover and committing to $G$ , as we do not need to know the Hamiltonian cycle at all.

To create such a commitment that can reveal a Hamiltonian cycle, begin by generating a list of commitments for that cycle: $\text{commit}\left(1,2\right),$ $\text{commit}\left(2,3\right),$ $\ldots,$ $\text{commit}\left(n-1,n\right)$ . Additionally, create $m-n$ additional commitments to bogus edges (where $m$ is the number of edges in $G$ ). Then, randomly permute all of these commitments, and commit to that permutation.

Summary

After exploring the idea of Zero Knowledge Proofs (ZKPs) using simple examples in the previous post, we have explored a mathematical example of a ZKP. By basing it on the hard mathematical problems of finding Hamiltonian paths or cycles in arbitrary graphs, the resulting protocol is both zero knowledge and quantum resistant.

A Hamiltonian path/cycle is a Eulerian path/cycle in the line graph . ↩
Alright, in truth it is only $2\cdot\pi \%\approx 6.283185307179586476925\%$ . And of course only on average. ↩
Most importantly, no quantum algorithm for the Hamiltonian path and cycle problems are known to date. ↩
To show why this is necessary, consider that exactly two nodes in the example have degree 5 ( $C$ and $G$ ) and that those two are adjacent in the Hamiltonian cycle. We now have a tiny piece of the Hamiltonian cycle: $\left(C, G\right)$ . Additionally, the nodes of degree 5 are adjacent to nodes of degree 2 in the Hamiltonian cycle, of which there are only 2 as well ( $A$ and $F$ ). These can only be added in exactly one way to our piece of the Hamiltonian cycle: $\left(A, C, G, F\right)$ . Without going further, it should now be clear that this is hardly "zero knowledge"… ↩
This is only really necessary so that the verifier need not solve the graph isomorphy problem. While recent advances in graph isomorphy solving may make this step technically redundant, it simplifies the protocol significantly and, in practice, can also be expected to be faster to do. ↩