Circuit complexity is a subfield of computational complexity theory that studies how simple or efficient a process can be when broken down into its most basic steps.
More specifically, if you have a function that takes a bunch of binary inputs and produces a binary output, how few logical steps (gates) do you need to compute it? Some functions are easy, and you can compute them with just a few gates; but others seem to require much more! The mystery is that, for many functions, we still don’t know how simple they could be, or if there’s a clever trick we’re missing.[1]
In this blog post I will try to introduce you to the field, and will walk you through the definitions, methods, main results, open problems, and implications. My goal is to convey the core ideas and intuition, rather than be perfectly rigorous. Happy reading!
The Boolean Function
A boolean function is a function $f\colon \{0,1\}^n \to \{0,1\}$, which takes $n$ bits as input and outputs a single bit. We usually denote the input vector as $x = (x_1, \dots, x_n)$, where $x_i$ is the $i$-th input bit. Examples of such functions are

$$\mathrm{PAR}(x) = x_1 \oplus x_2 \oplus \cdots \oplus x_n,$$

which outputs 1 if and only if the number of on input bits is odd (and is usually called the parity function),[2] and

$$\mathrm{Th}_k(x) = 1 \iff x_1 + x_2 + \cdots + x_n \ge k,$$

which outputs 1 if and only if the number of on input bits is at least some constant $k$ (the "Th" stands for Threshold).
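To make the definitions concrete, here is a minimal Python sketch of both functions (the names `parity` and `threshold` are my own, not standard notation):

```python
from typing import Sequence

def parity(x: Sequence[int]) -> int:
    """PAR(x): 1 iff the number of on bits is odd."""
    return sum(x) % 2

def threshold(x: Sequence[int], k: int) -> int:
    """Th_k(x): 1 iff at least k input bits are on."""
    return 1 if sum(x) >= k else 0

assert parity([1, 0, 1, 1]) == 1        # three on bits: odd
assert threshold([1, 0, 1, 1], 2) == 1  # three on bits >= 2
```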
The Boolean Circuit
Intuitively, a boolean circuit is a description of a computation of a large boolean function $f$ that meticulously combines the outputs of weaker boolean functions $g$, which we call gates. The number of input bits of a given gate is sometimes called the fanin or in-degree of the gate, and for the purpose of this post, all gates have fanin at most 2. It is easy to show that there are $2^{2^k}$ different gates of fanin $k$, and in particular, there are exactly $2^{2^2} = 16$ unique gates with two inputs. We denote by $B_n$ the set of all boolean functions on $n$ input variables.
| x₁ | x₂ | g(x₁, x₂) |
|---|---|---|
| False | False | ? |
| True | False | ? |
| False | True | ? |
| True | True | ? |

Each of the four output cells can be filled independently with True or False, giving the $2^4 = 16$ distinct gates of fanin 2.
Notice that out of the 16 gates in $B_2$, only 10 gates depend on both inputs, 2 gates depend only on $x_1$, another 2 depend only on $x_2$, and the final 2 gates depend on no input at all, so their outputs are constant. Hence, this family of gates actually encapsulates all gates of fanin at most 2.
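If you want to convince yourself of this classification, it can be checked by brute force; a small sketch (the helper `depends_on` is mine):

```python
from itertools import product

# A fanin-2 gate is just a choice of output for each of the 4 input rows.
rows = list(product([0, 1], repeat=2))  # (x1, x2) in order 00, 01, 10, 11

def depends_on(table, i):
    # The gate depends on input i if flipping only x_i can change the output.
    for x in rows:
        flipped = list(x); flipped[i] ^= 1
        if table[rows.index(x)] != table[rows.index(tuple(flipped))]:
            return True
    return False

counts = {"both": 0, "only x1": 0, "only x2": 0, "neither": 0}
for outputs in product([0, 1], repeat=4):  # all 2^4 = 16 gates
    d1, d2 = depends_on(outputs, 0), depends_on(outputs, 1)
    key = ("both" if d1 and d2 else "only x1" if d1
           else "only x2" if d2 else "neither")
    counts[key] += 1

print(counts)  # {'both': 10, 'only x1': 2, 'only x2': 2, 'neither': 2}
```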
We call a set $\Omega$ of boolean functions a basis. Then, a circuit over $\Omega$ is a directed acyclic graph where all nodes with in-degree 0 are labeled by a variable $x_i$, and are called the inputs of the circuit. Every other node with in-degree $d$ is labeled by a function (gate) from $\Omega$ on $d$ variables. In addition, nodes with out-degree 0 are called the outputs of the circuit and are labeled $y_1, \dots, y_m$. The standard definition allows only one output node; my definition in this post allows multiple output nodes for generality.[3]
Circuits Computing Functions
The value $\mathrm{val}(v, x)$ of a node $v$ on input $x$ is defined inductively:
- If $v$ is labeled by an input variable $x_i$, then simply $\mathrm{val}(v, x) = x_i$.
- Otherwise, let $g$ be the gate labeling $v$, and let $u_1, \dots, u_d$ be the nodes with incoming edges to $v$. Then, $\mathrm{val}(v, x) = g(\mathrm{val}(u_1, x), \dots, \mathrm{val}(u_d, x))$.
Intuitively, to compute the value of an internal node $v$, we first compute the values of all of its predecessors $u_1, \dots, u_d$, and then apply the corresponding gate function to the computed values of the predecessors. Since the graph is acyclic, this process always terminates. Finally, we say that the circuit computes a boolean function $f$ with a single output, if the circuit has a single output node $y$, and for all input vectors $x$, the value $\mathrm{val}(y, x)$ equals $f(x)$.
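Here is a minimal sketch of this inductive evaluation in Python; the dictionary-based circuit representation is an ad-hoc choice of mine, shown on a toy circuit for the 3-bit parity function:

```python
# A toy circuit computing PAR_3 = x1 XOR x2 XOR x3.
# Each node is either ("input", i) or (gate_function, predecessor_names).
xor = lambda a, b: a ^ b
circuit = {
    "x1": ("input", 0), "x2": ("input", 1), "x3": ("input", 2),
    "g1": (xor, ["x1", "x2"]),
    "y1": (xor, ["g1", "x3"]),   # output node
}

def value(node: str, x: tuple) -> int:
    kind, arg = circuit[node]
    if kind == "input":
        return x[arg]                          # val(v, x) = x_i
    # Recursively evaluate predecessors, then apply the gate.
    return kind(*(value(u, x) for u in arg))

assert value("y1", (1, 0, 1)) == 0  # even number of on bits
assert value("y1", (1, 1, 1)) == 1  # odd number of on bits
```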
Measuring Circuit Complexity
It is common to measure the complexity of a circuit using two different metrics.
- The first is the size of a circuit, measured by the number of gates in it. This measure is fairly straightforward: if we stick with the analogy of gates as very simple pieces of logic, then the more gates a circuit has, the more complex the functions it can represent. When talking about hardware circuits, this directly correlates with the cost and physical size of the corresponding hardware.
- The second one is called the depth of the circuit, and is the length of the longest path from an input to an output in the circuit. Intuitively, this measure indicates how parallelizable the computation is. With real hardware as an analogy, this correlates with how fast the computation can be performed.
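Under the same ad-hoc representation as the sketch above, both measures take just a few lines to compute:

```python
def size(circuit: dict) -> int:
    # Number of gates: every node that is not an input.
    return sum(1 for kind, _ in circuit.values() if kind != "input")

def depth(circuit: dict, node: str) -> int:
    # Length of the longest path from any input down to this node.
    kind, arg = circuit[node]
    if kind == "input":
        return 0
    return 1 + max(depth(circuit, u) for u in arg)

# For the PAR_3 circuit above: two gates arranged in a chain, so
# size(circuit) == 2 and depth(circuit, "y1") == 2.
```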
We denote by $\mathrm{size}_\Omega(f)$ and $\mathrm{depth}_\Omega(f)$ the minimal size and depth, respectively, across all circuits that compute $f$ over the basis $\Omega$. The main point of interest in circuit complexity is the behavior of these complexity measures on different functions and families of functions. Mainly, one can ask:
- Are there families of functions that require super-polynomial size or depth?
- Can we prove strong lower bounds on circuit size or depth for explicit functions?
- Are there functions which are small in size, but very sequential (i.e. require large depth)?
- Are there functions which are highly parallelizable (i.e. have low depth) but still require large size?
- How do restrictions on the circuit model affect complexity?
- And many more!
The Complete Basis
Recall that a basis is a collection of gates (boolean functions). We say that a basis $\Omega$ is complete if for every boolean function (with any number of variables) there exists a circuit that computes the function over the basis $\Omega$.
It is easy to see that the basis $\{\wedge, \vee, \neg\}$ (AND, OR, NOT) is complete: any boolean function can be written in conjunctive normal form (CNF), which is just an AND of ORs of variables and their negations.[4] Thus, given the truth table of any function, you can always translate it into a circuit made only of AND, OR, and NOT gates.
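As a sketch of this translation (all function names here are mine): for every input row where the function outputs 0, we add one clause that rules out exactly that row.

```python
from itertools import product

def truth_table_to_cnf(f, n):
    """Return a list of clauses; each clause is a list of (variable, negated?)."""
    clauses = []
    for x in product([0, 1], repeat=n):
        if f(*x) == 0:
            # This clause is False exactly on assignment x, True elsewhere.
            clauses.append([(i, bit == 1) for i, bit in enumerate(x)])
    return clauses

def eval_cnf(clauses, x):
    return all(any(x[i] != neg for i, neg in clause) for clause in clauses)

# Sanity check against PAR_2 (XOR):
xor2 = lambda a, b: a ^ b
cnf = truth_table_to_cnf(xor2, 2)
assert all(eval_cnf(cnf, x) == bool(xor2(*x))
           for x in product([0, 1], repeat=2))
```

Note that this construction produces at most $2^n$ clauses of at most $n$ literals each.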
From the construction above it is easy to see that each boolean function on $n$ variables can be computed by a circuit over $\{\wedge, \vee, \neg\}$ of size $O(n \cdot 2^n)$. In 1956, Muller improved this bound to $O(2^n / n)$ under any finite, complete basis.[5] We will soon get back to this result!
All Bases Are The Same
To prove that a basis $\Omega$ is complete, it is enough to show that all gates of an already known complete basis $\Omega'$ can be computed by circuits over $\Omega$. If such a translation is provided, given a circuit over $\Omega'$ one can convert each gate in it to the corresponding representation of the gate over $\Omega$, proving that $\Omega$ is complete.[6] Surprisingly, $\{\mathrm{NAND}\}$ and $\{\mathrm{NOR}\}$ are each complete on their own!
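For example, here are the classic constructions of $\neg$, $\wedge$, and $\vee$ out of NAND gates alone, verified by brute force (a sketch):

```python
from itertools import product

nand = lambda a, b: 1 - (a & b)

# Classic translations from {AND, OR, NOT} to {NAND}:
not_ = lambda a: nand(a, a)                       # 1 gate
and_ = lambda a, b: nand(nand(a, b), nand(a, b))  # 2 gates
or_  = lambda a, b: nand(nand(a, a), nand(b, b))  # 3 gates (De Morgan)

for a, b in product([0, 1], repeat=2):
    assert not_(a) == 1 - a
    assert and_(a, b) == (a & b)
    assert or_(a, b) == (a | b)
```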
Note that a corollary of the result above is that the choice of basis impacts the complexity of a circuit by a constant factor only. Formally, for any complete bases $\Omega_1, \Omega_2$ we have $\mathrm{size}_{\Omega_1}(f) = O(\mathrm{size}_{\Omega_2}(f))$, where the constant factor is precisely the size of the largest circuit that translates a gate from $\Omega_2$ to $\Omega_1$. Similarly, $\mathrm{depth}_{\Omega_1}(f) = O(\mathrm{depth}_{\Omega_2}(f))$. Thus, we usually omit the basis entirely when talking about asymptotic bounds in circuit complexity.[7]
Connections to Complexity Theory
Boolean functions can represent very complex properties. For example, given a graph with $n$ vertices, we can describe a boolean function with $\binom{n}{2}$ inputs, one input bit for each possible edge, where the output is 1 if and only if some property of the input graph holds.
One commonly mentioned example is the $\mathrm{CLIQUE}_{n,k}$ function, which outputs 1 if and only if the provided graph has a clique subgraph of size $k$. This problem is NP-complete, which in simple terms means that it is widely believed that finding the answer is computationally hard.[8] Since a circuit computation can be simulated by a Turing machine in polynomial time, providing a uniformly constructible family of polynomial-size circuits for $\mathrm{CLIQUE}$ would imply that P = NP.
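To make the encoding concrete, here is a brute-force sketch of $\mathrm{CLIQUE}_{n,k}$ as a boolean function over the $\binom{n}{2}$ edge bits (exponential time, naturally; the representation is mine):

```python
from itertools import combinations

def clique(n: int, k: int, edge_bits: dict) -> int:
    """edge_bits maps each pair (i, j), i < j, to 0 or 1."""
    for subset in combinations(range(n), k):
        if all(edge_bits[(i, j)] for i, j in combinations(subset, 2)):
            return 1
    return 0

# A triangle on vertices {0, 1, 2} inside a 4-vertex graph:
edges = {(i, j): 0 for i, j in combinations(range(4), 2)}
edges.update({(0, 1): 1, (0, 2): 1, (1, 2): 1})
assert clique(4, 3, edges) == 1
assert clique(4, 4, edges) == 0
```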
The Complexity Class P/poly
In classical complexity theory, we try to classify problems by their difficulty, and to analyze and understand the differences between different computational models. The most fundamental complexity class is probably P, which is defined to contain all decision problems that can be solved in polynomial time using a Turing machine.[9]
The equivalent class in circuit complexity is $\mathrm{P/poly}$. Since a specific circuit is designed to handle inputs of exactly $n$ bits (unlike a Turing machine, which can process inputs of all lengths), we define that a decision problem is in $\mathrm{P/poly}$ if there exists an infinite family of circuits $\{C_n\}_{n \in \mathbb{N}}$ such that $C_n$ is a circuit that solves the instances of the problem with exactly $n$ inputs, and the size of $C_n$ is $O(n^c)$ for some constant $c$. Then, we define

$$\mathrm{P/poly} = \bigcup_{c \in \mathbb{N}} \mathrm{SIZE}(n^c),$$

where $\mathrm{SIZE}(s(n))$ denotes the class of decision problems decidable by circuit families of size $O(s(n))$.
In words, $\mathrm{P/poly}$ is the collection of all decision problems such that for each input length $n$ there exists a circuit of size polynomial in $n$ that decides the problem for all inputs of length $n$.[10]
P vs P/poly
First, it is not hard to show that $\mathrm{P} \subseteq \mathrm{P/poly}$, by showing that any polynomial-time Turing machine can be simulated by a polynomial-size circuit family. I won't go into the details here, and you can take it as a nice exercise.
On the other hand, $\mathrm{P/poly}$ is strictly more powerful than $\mathrm{P}$ because it permits non-uniformity: it allows a different polynomial-size circuit for each input length, with no requirement that these circuits be generated by a single, efficiently computable process.
Suppose that $L$ is an undecidable language over the alphabet $\Sigma = \{0, 1\}$. Since $\Sigma^*$ is countable, let $w_1, w_2, w_3, \dots$ be an arbitrary enumeration of the words in $\Sigma^*$. We now define a new language

$$U = \{\, 1^n \mid w_n \in L \,\},$$
which simply includes the string of $n$ ones iff $w_n$ is in $L$. We can now easily construct a family of circuits in $\mathrm{P/poly}$ that decides $U$: the circuit $C_n$ simply yields True on all inputs if $w_n \in L$, or False on all inputs if $w_n \notin L$. The size of each circuit in the family is $O(1)$, and thus $U \in \mathrm{P/poly}$ by definition. On the other hand, $U \notin \mathrm{P}$: if there were a Turing machine deciding $U$, a simple reduction (map $w_n$ to $1^n$ and run the machine) would yield an algorithm deciding $L$, which is undecidable by its definition. We have shown that $U \in \mathrm{P/poly}$ but $U \notin \mathrm{P}$, and thus $\mathrm{P} \subsetneq \mathrm{P/poly}$.[11]
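To see the non-uniformity concretely, here is a sketch of the circuit family above (the advice-table values are made up purely for illustration):

```python
# Hypothetical hardwired advice: answers[n] is True iff w_n is in L.
# No algorithm produces this table; each bit is simply baked in,
# which is exactly the freedom that non-uniformity grants.
answers = {1: True, 2: False, 3: True}  # illustrative values only

def C(n: int):
    """The circuit C_n deciding U on inputs of length n: a constant gate."""
    return lambda bits: answers[n]

assert C(1)("1")  # decides whether 1^1 is in U, in size O(1)
```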
Connections to Cryptography
The boolean function model can be extended to functions that output multiple bits. Given a function $f\colon \{0,1\}^n \to \{0,1\}^m$, we can define $m$ different functions $f_1, \dots, f_m$, where each $f_i$ is defined such that $f_i(x) = 1$ if and only if the $i$-th bit of $f(x)$ is 1. I will use $y = (y_1, \dots, y_m)$ to denote the output vector of such functions, where $y_i$ is the $i$-th output bit.
Factorization
A particularly interesting example of a function with multiple output bits is $\mathrm{FACTOR}\colon \{0,1\}^n \to \{0,1\}^n$, which takes a binary vector $x$ that encodes a binary integer $N$, and outputs a similarly encoded integer $p$, where $p$ is the smallest prime factor of $N$ (or 0 if $N$ is itself prime).[12]
This function is well-defined, and factoring large numbers is widely believed to be hard in general: many cryptographic schemes rely on this hardness to ensure their security.[13]
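Here is a sketch of this function (the name `factor` and the fixed-width encoding convention are my own choices; naive trial division, so exponential in the bit length $n$):

```python
def smallest_prime_factor(N: int) -> int:
    # Trial division; assumes N >= 2.
    d = 2
    while d * d <= N:
        if N % d == 0:
            return d
        d += 1
    return 0  # N is prime, per the convention above

def factor(x: str) -> str:
    """Map an n-bit binary string to the n-bit encoding of the output."""
    n = len(x)
    N = int(x, 2)
    return format(smallest_prime_factor(N), f"0{n}b")

assert factor("1111") == "0011"  # 15 -> smallest prime factor 3
assert factor("0111") == "0000"  # 7 is prime -> 0, per our convention
```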
Hash Functions
Another interesting cryptographic primitive is the cryptographic hash function. For our purposes, a hash function is a boolean function $h\colon \{0,1\}^n \to \{0,1\}^m$, where $m \ll n$. Such functions deterministically map large inputs to compact outputs that appear uncorrelated with the input.[14] Hash functions have many applications:
- Password verification: Instead of storing your password as plain text in a database, services store a hash of it instead. When you log in, your input password gets hashed and compared to the stored hash in the database. This way, even if the database is leaked, the attacker can't recover your actual password from the hash.[15]
- Message Authentication: When you send data digitally (say, transferring funds through your bank's website), it is important to ensure that the data was not tampered with on the way. One way to ensure this is to append to your message $m$ the tag $h(m \parallel s)$, where $s$ is a shared secret that only you and the other party know, and $\parallel$ denotes concatenation.[16][17]
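Both applications are a few lines with Python's standard library. I use SHA-256 purely as an example: real password storage should use a dedicated slow KDF,[15] and the standard tag construction is HMAC[17] rather than naive concatenation.

```python
import hashlib
import hmac

# Password verification: store only the hash, compare hashes at login.
stored = hashlib.sha256(b"hunter2").hexdigest()
attempt = hashlib.sha256(b"hunter2").hexdigest()
assert hmac.compare_digest(stored, attempt)  # constant-time comparison

# Message authentication: tag derived from the message and a shared secret.
secret = b"shared-secret-key"
message = b"transfer 100 coins to Alice"
tag = hmac.new(secret, message, hashlib.sha256).hexdigest()
```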
But what makes a hash function secure? One essential property is called pre-image resistance: given an output $y$, finding a message $x$ such that $h(x) = y$ should be computationally infeasible. We can formalize this property using circuit complexity! Given a hash candidate $h$, we can define its inverse $h^{-1}$, which for each hashed value $y$ returns some $x$ such that $h(x) = y$ (or a fixed default value if no such input exists). Then, proving a large lower bound on the circuit complexity of $h^{-1}$ would imply the pre-image resistance of $h$!
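Pre-image resistance amounts to the claim that no attack does substantially better than this brute-force inverter, which tries all $2^n$ inputs (a sketch with a toy input length):

```python
import hashlib
from itertools import product

def invert(target_hex: str, n: int):
    """Brute-force h^{-1}: find any n-bit input hashing to the target."""
    for bits in product("01", repeat=n):
        candidate = "".join(bits).encode()
        if hashlib.sha256(candidate).hexdigest() == target_hex:
            return candidate
    return None  # no pre-image among n-bit inputs

target = hashlib.sha256(b"1011").hexdigest()
assert invert(target, 4) == b"1011"  # 2^4 tries; infeasible for n = 256
```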
It is important to note that the state-of-the-art cryptographic hash functions (mainly, the SHA family[18]) have no such rigorous proof: their security is based on empirical evidence only (decades of cryptanalysis trying to break them, without success). Providing such a proof for existing hash functions would be a major achievement!
Almost All Functions Are Complex
A primary result that ignited circuit complexity as a field of study came from Shannon in 1949,[19] and was later refined by Muller (1956).[5] The idea is to use a simple counting argument to show that almost all boolean functions require circuits of size $\Omega(2^n / n)$. The original proofs are fairly lengthy and technical (Muller put the proof in the article's appendix!), but since then, many slightly modified proofs have been presented.[20][21][22] Below I present a modified version based on the sources mentioned.
Denote by $N(n, t)$ the number of different circuits over $B_2$ with $n$ input variables and $t$ internal gates. Now, we want to give a relatively simple expression that bounds $N(n, t)$ from above. We construct a set $\mathcal{G}(n, t)$, which consists of all directed graphs with $n + t$ vertices, with the following restrictions:
- $n$ vertices have no incoming edges.
- The other $t$ vertices have an in-degree of exactly 2. Each of those vertices is also labeled with one of the 16 gate functions.
Clearly, some of these graphs do not represent a legal circuit (mainly, nothing requires a graph in $\mathcal{G}(n, t)$ to be acyclic). Despite that, every legal circuit with $n$ inputs and $t$ internal gates has a corresponding graph in $\mathcal{G}(n, t)$! Also, note that the output vertex is not explicitly labeled; in a legal circuit, it is the only gate with an out-degree of zero.
First, it is easy to see that there are exactly $16(n+t)^2$ different ways to characterize a single gate vertex: there are 16 options for the gate label, and another $n + t$ options for each of the 2 incoming edges. We have exactly $t$ such vertices, and hence,

$$N(n, t) \le |\mathcal{G}(n, t)| \le \left(16(n+t)^2\right)^t.$$

After plugging in $t = \frac{2^n}{10n}$ and simplifying the expression above, we get (for large enough $n$)

$$N\!\left(n, \frac{2^n}{10n}\right) \le 2^{2^n/5}.$$
Recall that there are only $2^{2^n}$ unique boolean functions on $n$ variables. But the number of different circuits with $\frac{2^n}{10n}$ gates is bounded above by a value that is $2^{(4/5) \cdot 2^n}$ times smaller than that! Hence, clearly most boolean functions require at least $\frac{2^n}{10n} = \Omega(2^n / n)$ gates to compute. Finally, recall that we already saw that $O(2^n / n)$ is an upper bound on the size of a minimal circuit of any function on $n$ variables! Hence, this bound is asymptotically tight.
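You can watch this gap open up numerically; a small sketch comparing the two counts on a $\log_2$ scale, with the gate budget $t = 2^n / (10n)$ used above:

```python
import math

def log2_circuit_bound(n: int, t: int) -> float:
    # log2 of (16 * (n + t)^2)^t, our upper bound on the number of circuits.
    return t * (4 + 2 * math.log2(n + t))

for n in [8, 10, 12, 14, 16]:
    t = max(1, (2 ** n) // (10 * n))
    print(f"n={n:2d}  t={t:5d}  "
          f"log2(#circuits) <= {log2_circuit_bound(n, t):8.0f}  "
          f"log2(#functions) = {2 ** n}")
```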
On Lower Bounds
We now show that certain functions with $n$ inputs require at least $n/2$ gates in any circuit computing them.
We say that a function $f$ depends on a variable $x_i$ if there exists an assignment that fixes all other variables so that changing only $x_i$ from 0 to 1 changes the output of $f$. It is easy to see that in any circuit that computes $f$, there must be a path from every variable that $f$ depends on to the output vertex. Hence, each such input vertex must have an out-degree of at least 1. Now, take a function that depends on all $n$ of its inputs (the parity function is one example). Since in our model the in-degree of each gate is at most 2, the gates can absorb at most 2 incoming edges each, so just connecting all $n$ input vertices to gates already requires at least $n/2$ gates, and hence the lower bound.[23]
No Super-Linear Explicit Lower Bounds
The bound above is not tight, and trying to improve it is a nice exercise! Surprisingly, however, there is no explicit lower bound that is asymptotically better! The current state of the art is a bound of $5n - o(n)$ gates, proved by Iwama and Morizumi (2002) for circuits over a slightly restricted basis.[24][25]
I cannot stress enough how absurd and mind-boggling that is: despite knowing that almost all boolean functions are very complex, requiring $\Omega(2^n / n)$ gates as shown above, we do not know how to point at a specific function and state that it requires more than a linear number of gates to compute! This gap is at the heart of the field, and over the last half-century many researchers have tried to tackle this problem without much success.
Final Notes
I hope I managed to spark your curiosity. To conclude, I’d like to emphasize just how rich this field of research is, and provide some possible follow-up research questions, that correspond to subfields or other fields that relate to it:
- Complexity of Formulas: What happens when we restrict circuits to trees, instead of DAGs?
- Unbounded Fanin Models: What happens when we allow the $\wedge$ and $\vee$ gates to accept an unlimited number of inputs?
- Monotone Complexity: What happens if we disallow the $\neg$ gate? What functions can't we compute, and how are minimal circuit sizes affected?
- Communication Complexity: Surprisingly, there is a strong connection between circuits and certain models in Game Theory!
References and Further Reading
My research relied extensively on the book Boolean Function Complexity by Stasys Jukna, and I can highly recommend it if you want to dive deeper into the topics I've discussed, and beyond. Proofreading was done by Almog Ben Chen. Thanks!
- Circuit Complexity Wikipedia ↩
- Parity function Wikipedia ↩
- Boolean Circuit Complexity: Scribe notes. Lecture 1, Section 1.1 Uri Zwick, Omer Shibolet ↩
- Conjunctive normal form Wikipedia ↩
- Complexity in Electronic Switching Circuits, David E. Muller ↩
- Boolean Circuit Complexity: Scribe notes. Lecture 1, Section 1.4 Uri Zwick, Omer Shibolet ↩
- Big O notation Wikipedia ↩
- Clique problem Wikipedia ↩
- P (complexity) Wikipedia ↩
- P/poly Wikipedia ↩
- Computational Complexity: A Modern Approach. Section 6.1. Sanjeev Arora, Boaz Barak ↩
- Integer factorization Wikipedia ↩
- RSA cryptosystem Wikipedia ↩
- Cryptographic hash function Wikipedia ↩
- Key derivation function: Password hashing Wikipedia ↩
- Message authentication code Wikipedia ↩
- HMAC Wikipedia ↩
- Secure Hash Algorithms Wikipedia ↩
- The synthesis of two-terminal switching circuits Claude Shannon ↩
- Algorithms and Complexity: Handbook of Theoretical Computer Science. Chapter 14, Theorem 2.4 Ravi B. Boppana, Michael Sipser ↩
- Boolean Circuit Complexity: Scribe notes. Lecture 1, Theorem 1.6 Uri Zwick, Omer Shibolet ↩
- Boolean Function Complexity: Advances and Frontiers. Chapter 1, Lemma 1.12 Stasys Jukna ↩
- Boolean Function Complexity: Advances and Frontiers. Section 1.6: A 3n Lower Bound for Circuits Stasys Jukna ↩
- An Explicit Lower Bound of 5n−o(n) for Boolean Circuits Kazuo Iwama, Hiroki Morizumi ↩
- Boolean Function Complexity: Advances and Frontiers. Section 1.5.2: Explicit Lower Bounds Stasys Jukna ↩