Zero Knowledge Proof using Hamiltonian Cycles

The last post explored the idea of Zero Knowledge Proofs (ZKPs) using simple examples. Now it is time to take a deep dive into a fully functional ZKP based on a mathematical problem instead of hidden doors amongst neighboring houses.

The basic problem is very simple: A Hamiltonian path is a path through a given graph which visits every vertex exactly once. Hamiltonian paths are related1 to the better known Eulerian paths, which visit every edge exactly once. As an example, this graph has a Hamiltonian Path, which is highlighted in blue:

ABCDEFGHJK

Related to Hamiltonian paths are Hamiltonian cycles, which add the requirement that there must also be an edge between the start and end vertex, or, equivalently they are a cycle instead of a path that visits every node exactly once. While knowledge of any Hamiltonian cycle trivially allows us to give a Hamiltonian path by simply deleting any edge from it, the opposite does not necessarily hold true. In fact, the graph used in the example above does not even have a Hamiltonian cycle!

To prove that, let us assume a Hamiltonian cycle exists. Since all nodes must be part of the cycle, we can start with any node, say node CC. At some point, we must take the connection (E,J)\left(E,J\right) into the "right house", to visit the nodes in that "house". However, we can now never return to our starting point AA without visiting EE again. Contradiction found!

Since I, personally prefer cycles over paths (they are 7% more tasty2), take the following graph, which is simply augmented to allow a Hamiltonian cycle:

ABCDEFGHJK

Take note how the path through the "right house" has to change so that we end up in GG, which is connected to the start point of the Hamiltonian path instead of KK. Although Hamiltonian cycles are tastier than Hamiltonian paths, finding either one is hard  in general.

A ZKP for Hamiltonian Cycles

The core idea of the protocol is as follows: GG is a publically known graph, which contains a Hamiltonian cycle that is very hard to compute - and thus secret. This ZKP protocol is intended to prove to someone in possession of the public key (GG) that someone else is in possession of the private key (the embedded Hamiltonian cycle) without revealing anything other than that.

To bootstrap this protocol, let us start by generating the public graph GG and its secret Hamiltonian cycle. To do so, the prover creates a graph consisting of nothing more than a (Hamiltonian) cycle of nn nodes, where nn is the desired size of the target graph and then hides that cycle by adding a number of random edges and randomly permutating the graph. By construction, the prover knows a Hamiltonian cycle for this graph GG.

Since the construction allows any graph that has a Hamiltonian cycle to be generated, there is nothing which is special in GG beyond the fact that it has such a Hamiltonian Cycle. This means that we can publish GG as the public key. Similarly to current asymmetric cryptographic systems, an attacker with enough computational power can conceivably break the public key by finding a Hamiltonian path. This is not prevented by using a ZKP! It is however made extremely hard by basing the protocol on a problem that is (expected to be) harder than those on which RSA, DSA, etc are built upon3.

The ZKP we are about to craft is going to be a Σ-Protocol, which means it consists of three basic components: A Commitment by the prover, followed by a Challenge from the verifier which is answered in the Response from the prover.

The Commitment

The commitment is going to consist of a version of GG, which is hidden in such a way that it is possible to either show that the prover knows a Hamiltonian cycle in the hidden graph or to show that the graph is indeed GG. It is already obvious that the prover must never perform both steps on the same commitment.

To begin, we add "not-edges" to the graph (displayed red in the example below), or more formally we do something akin to adjacency matrices: We store a 11 for each edge that is an edge in GG and 00 for each edge that is not an edge in GG. This ensures that the resulting graph will be a clique, making it impossible to garner any information from the degree of individual nodes4.

The prover then generates an individual cryptographic commitment  for each edge and non-edge (in the example below, the edges become black). Additionally, a single commitment is generated for all labels5 (in the example below, the labels fade out).

ABCDEFGHJK

Finally, the graph is randomly permutated. The result now looks exactly the same for any graph with the same number of vertices as GG.

The Challenge

After receiving the Commitment, the verifier poses a simple, binary Challenge to the prover. With 50%50\% chance each, they request that the prover either prove that the Commitment is a commitment to GG (and not to some other graph HH), or that the commitment is for any graph with a Hamiltonian cycle.

The Response

To prove that the Commitment is indeed a commitment to GG, the prover simply opens all cryptographic commitments. Obviously, this cannot give away any information, as this is nothing more than a random permutation of GG, which is nothing that the verifier could not have computed on their own.

Proving that the Commitment refers to a graph for which the prover knows a Hamiltonian cycle is just as trivial: To do so, the prover only opens the commitments to the edges along the Hamiltonian cycle. The only thing that is revealed is an anonymous Graph that consists of exactly one Hamiltonian cycle — something that can trivially be generated by a simulator, meaning that again no information is revealed.

Checking either response is trivial, but obviously may not be omitted nevertheless.

Committing to a Series of Edges

Although this protocol is great for making pretty pictures, it has the drawback that, for a graph GG with nn nodes, it needs to create n2n^2 commitments to 11 bit each, plus one to nlog2(n)n\cdot\log_2\left(n\right) bits. Even without switching to a different problem or using any advanced techniques, we can do better.

Commitment

To do so, we begin by generating a random permutation π\pi to generate a graph GG' that is isomorphic to GG, just as we did in the previous protocol. For each edge (α,β)\left(\alpha, \beta\right) we generate a commitment commit(α,β)\text{commit}\left(\alpha, \beta\right), where α\alpha and β\beta are indices referring to the nodes of GG' (and, most importantly, not the labels of the original GG). The full Commitment then consists of these partial commitments in random order and a commitment to π\pi. By randomizing the order, we prevent knowledge leaking about which node a commitment belongs to, unless we specifically reveal this information.

Challenge

The Challenge need not be changed in any way, and will again either require a proof that the commitments are bound to GG or a proof that they are bound to any graph for which the prover knows a Hamiltonian cycle.

Response

The Response works the same way as before as well. To show that the commitments bind to GG, they are all opened. To show that they bind to a graph for which the prover knows a Hamiltonian cycle, only that cycle is shown. While the first response is trivially zero knowledge, the second requires some deliberation.

Remarks

For a graph GG with nn nodes and mm edges, this scheme improves upon the previous scheme by only requiring mm commitments to 2log2(n)2\cdot\log_2\left(n\right) bits each. This means that for m<n2m < n^2 (which is a given) we require less commitments, and that for m2log2(n)<n2m\cdot 2\cdot \log_2\left(n\right) < n^2, we need to commit to less bits. However, in practice, this advantage grows even further since graphs with more than 2642^{64} nodes are far too large to be used for authentication and most commitment schemes commit to at least 128=2log2(264)128 = 2\cdot \log_2\left(2^{64}\right) bit at once. Therefore, any single commitment can always hold a commitment to any edge in a graph of less-than-ridiculously-ludicrous-size. Therefore, the number of commitments is the more relevant measure than the number of bits we need to commit to.

While the previous technique is commonly used in literature treating this topic, the technique of committing to a series of edges seems to be mentioned rather rarely. Its relevance is considered to be marginal by the cryptography community , as Hamiltonicity is not usually used to construct practical ZKPs anyway.

A Simulator

To show that it is impossible to gain any knowledge from the execution of the protocol, let me give a small simulator that is able to create a transcript that is indistinguishable from one created by a correct execution of the protocol. This becomes possible by deciding what the "verifier" will ask for in advance, and creating a commitment to match it.

Assuming that the pseudo-challenge will request the opening of the commitments to show that they indeed refer to GG, generating a commitment requires simply behaving like a real prover and committing to GG, as we do not need to know the Hamiltonian cycle at all.

To create such a commitment that can reveal a Hamiltonian cycle, begin by generating a list of commitments for that cycle: commit(1,2),\text{commit}\left(1,2\right), commit(2,3),\text{commit}\left(2,3\right), ,\ldots, commit(n1,n)\text{commit}\left(n-1,n\right). Additionally, create mnm-n additional commitments to bogus edges (where mm is the number of edges in GG). Then, randomly permute all of these commitments, and commit to that permutation.

Summary

After exploring the idea of Zero Knowledge Proofs (ZKPs) using simple examples in the previous post, we have explored a mathematical example of a ZKP. By basing it on the hard mathematical problems of finding Hamiltonian paths or cycles in arbitrary graphs, the resulting protocol is both zero knowledge and quantum resistant.

Footnotes

  1. A Hamiltonian path/cycle is a Eulerian path/cycle  in the line graph .

  2. Alright, in truth it is only 2π%6.283185307179586476925%2\cdot\pi \%\approx 6.283185307179586476925\%. And of course only on average.

  3. Most importantly, no quantum algorithm for the Hamiltonian path and cycle problems are known to date.

  4. To show why this is necessary, consider that exactly two nodes in the example have degree 5 (CC and GG) and that those two are adjacent in the Hamiltonian cycle. We now have a tiny piece of the Hamiltonian cycle: (C,G)\left(C, G\right). Additionally, the nodes of degree 5 are adjacent to nodes of degree 2 in the Hamiltonian cycle, of which there are only 2 as well (AA and FF). These can only be added in exactly one way to our piece of the Hamiltonian cycle: (A,C,G,F)\left(A, C, G, F\right). Without going further, it should now be clear that this is hardly "zero knowledge"…

  5. This is only really necessary so that the verifier need not solve the graph isomorphy problem. While recent advances in graph isomorphy solving  may make this step technically redundant, it simplifies the protocol significantly and, in practice, can also be expected to be faster to do.