Finite field for developers
What is a finite field?
A finite field, also known as a Galois field, is a finite set on which addition, subtraction, multiplication, and division (by nonzero elements) are defined, with every result staying inside the set. Take a prime P = 7: any integer modulo P reduces to one of 0 to 6 ({0,1,2,3,4,5,6}). Let this set be S (derived from (mod P)). S can also be denoted GF(7), and the four operations are performed as follows. Take a=4 and b=5 as examples in GF(7), apply the four operations to them, and see how each one works.
First, addition is very straightforward: a+b = 4+5 (mod 7) = 2.
For subtraction, a-b = a+(-b), what does -b mean here? Let’s turn this subtraction into a form of addition. If b + g = 0, we can say g is -b (i.e., g is the additive inverse of b). In the GF(7) we’re considering now, 5 + 2 (mod 7) = 0, so -b is 2. Going back, a-b = a+(-b) = 4+2 (mod 7) = 6.
Multiplication is as easy as addition: a*b = 4*5 (mod 7) = 6.
Division might be the trickiest of the four: a/b = a*(b^-1). As we did for subtraction, first turn this into a form of multiplication. The aim here is to find the inverse of b. By definition, b * g = 1, where g is the inverse of b, so we have to find out what g is. Since 5*3 (mod 7) = 1, g (i.e., the inverse of b) is 3. Going back, a/b = a*(b^-1) = 4*3 (mod 7) = 5.
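The four operations above can be sketched in C. This is only a minimal illustration under a few assumptions of mine: the function names (gf_add, gf_inv, and so on) are made up for this post, the inverse is found by brute force exactly as in the text, and p is assumed to be a small prime so that intermediate products fit in int64_t.

```c
#include <assert.h>
#include <stdint.h>

/* Reduce a (possibly negative) value into the range 0..p-1. */
int64_t gf_mod(int64_t a, int64_t p) {
    assert(p > 1);
    int64_t r = a % p;
    return r < 0 ? r + p : r;
}

int64_t gf_add(int64_t a, int64_t b, int64_t p) {
    return gf_mod(gf_mod(a, p) + gf_mod(b, p), p);
}

/* Additive inverse: the g with b + g = 0 (mod p). */
int64_t gf_neg(int64_t b, int64_t p) {
    return gf_mod(-gf_mod(b, p), p);
}

/* a - b is computed as a + (-b). */
int64_t gf_sub(int64_t a, int64_t b, int64_t p) {
    return gf_add(a, gf_neg(b, p), p);
}

/* Assumes p is small enough that (p-1)*(p-1) fits in int64_t. */
int64_t gf_mul(int64_t a, int64_t b, int64_t p) {
    return gf_mod(gf_mod(a, p) * gf_mod(b, p), p);
}

/* Multiplicative inverse by brute force: the g with b * g = 1 (mod p).
 * Returns 0 when no inverse exists (for example, b = 0). */
int64_t gf_inv(int64_t b, int64_t p) {
    for (int64_t g = 1; g < p; g++) {
        if (gf_mul(b, g, p) == 1) return g;
    }
    return 0;
}

/* a / b is computed as a * (b^-1). */
int64_t gf_div(int64_t a, int64_t b, int64_t p) {
    int64_t inv = gf_inv(b, p);
    assert(inv != 0); /* division by an element without an inverse */
    return gf_mul(a, inv, p);
}
```

With a=4, b=5, and p=7, gf_add gives 2, gf_sub gives 6, gf_mul gives 6, and gf_div gives 5, matching the hand calculations.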
Multiplicative group
In number theory, ℤₙ denotes the integers modulo n, {0, 1, …, n-1}. When n is prime, ℤₙ is the same as GF(n), so ℤ\(_7\) = GF(7) = {0,1,2,3,4,5,6}. (When n is not prime, ℤₙ is not a field, because some nonzero elements have no multiplicative inverse.) ℤₙ\(*\) denotes the multiplicative group of ℤₙ. Then, what is a multiplicative group? Take ℤ\(_8*\) as an example (ℤ\(_8\) = {0,1,2,3,4,5,6,7}). Pick a subset \(S\) of this set; \(S\) can only be a multiplicative group if multiplying any two elements of \(S\) gives an element of \(S\) (i.e., \(S\) is closed under multiplication). Say \(S\) = {1,3,5,7} and check by hand whether it is closed.
1 * 3 mod 8 = 3, 1 * 5 mod 8 = 5, 1 * 7 mod 8 = 7
3 * 5 mod 8 = 7, 3 * 7 mod 8 = 5
7 * 5 mod 8 = 3 --> Yes! Every product stays in \(S\)!
By contrast, {2,3,5,7} is not a multiplicative group, as 2 * 3 mod 8 = 6, which is not in the set. It’s worth noting that each element in a multiplicative group must have an inverse element. That’s why 0 never gets into a multiplicative group: 0 has no inverse. In \(S\), interestingly, 3, 5, and 7 are each their own inverse, as 3*3 mod 8 = 1, 5*5 mod 8 = 1, and 7*7 mod 8 = 1. Since \(S\) is closed under multiplication and every element has an inverse, we can conclude \(S\) is a multiplicative group, namely ℤ\(_8*\). One more thing worth noting is that \(S\) contains no element that shares a factor with n (8, here). Since 8 = 2*2*2, every even number (2, 4, and 6) shares the factor 2 with 8, has no inverse modulo 8, and is excluded. So, one way to get the multiplicative group is to keep only the elements a with gcd(a, n) = 1.
This time, let’s take a prime number for n and see what happens. ℤ\(_7\) = GF(7) = {0,1,2,3,4,5,6}. Interestingly, since 7 is prime and has no factors other than 1 and itself, ℤ\(_7*\) is the same as ℤ\(_7\), except for 0 of course. So, ℤ\(_7*\) = {1,2,3,4,5,6}. This fact, that ℤₙ\(*\) contains every element from 1 to n-1 when n is a prime number, matters a lot and is used in many places.
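The "keep only what is coprime with n" rule can be sketched in C as below. This is a small illustration rather than a library routine; gcd_u and units_mod are names I made up, and the caller is assumed to supply the output buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Greatest common divisor by Euclid's algorithm. */
unsigned gcd_u(unsigned a, unsigned b) {
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Collects the elements of the multiplicative group modulo n, i.e. every
 * a in 1..n-1 with gcd(a, n) == 1. Writes at most cap elements into out
 * and returns how many elements the group has. */
size_t units_mod(unsigned n, unsigned *out, size_t cap) {
    assert(n >= 2);
    size_t count = 0;
    for (unsigned a = 1; a < n; a++) {
        if (gcd_u(a, n) == 1) {
            if (count < cap) out[count] = a;
            count++;
        }
    }
    return count;
}
```

For n = 8 this collects 1, 3, 5, 7, and for the prime n = 7 it collects all six values from 1 to 6.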
Generator
If every element \(a\) in ℤₙ\(*\) can be written as \(g^k=a\) (mod n) for some integer k, we call \(g\) a generator or primitive root of ℤₙ\(*\).
For example, 3 is a generator of ℤ\(_4*\) as 3^0 = 1, 3^1 = 3, 3^2 (mod 4) = 1, 3^3 (mod 4) = 3, 3^4 (mod 4) = 1, ....
And 3 is a generator of ℤ\(_7*\) as, modulo 7, 3^0 = 1, 3^1 = 3, 3^2 = 2, 3^3 = 6, 3^4 = 4, 3^5 = 5, 3^6 = 1, ....
But 3 is not a generator of ℤ\(_{11}*\), because its powers give only {3,9,5,4,1}, which is only half of ℤ\(_{11}*\).
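Checking whether g is a generator comes down to counting how many distinct powers it has and comparing that with the size of the group. Here is a rough C sketch of that idea; the names unit_count, element_order, and is_generator are my own, and it walks through the powers one by one, so it is only meant for small n.

```c
#include <assert.h>
#include <stdbool.h>

unsigned gcd_u(unsigned a, unsigned b) {
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Size of the multiplicative group modulo n. */
unsigned unit_count(unsigned n) {
    unsigned count = 0;
    for (unsigned a = 1; a < n; a++) {
        if (gcd_u(a, n) == 1) count++;
    }
    return count;
}

/* Smallest k > 0 with g^k = 1 (mod n), which is also the number of
 * distinct powers of g. Requires gcd(g, n) == 1, otherwise the powers
 * never come back to 1. */
unsigned element_order(unsigned g, unsigned n) {
    assert(n >= 2 && gcd_u(g, n) == 1);
    unsigned long long base = g % n;
    unsigned long long x = base;
    unsigned k = 1;
    while (x != 1) {
        x = (x * base) % n;
        k++;
    }
    return k;
}

/* g generates the group when its powers reach every element. */
bool is_generator(unsigned g, unsigned n) {
    return element_order(g, n) == unit_count(n);
}
```

For the examples above, is_generator(3, 4) and is_generator(3, 7) are true, while is_generator(3, 11) is false because element_order(3, 11) is 5 and the group has 10 elements.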
One thing you may have already noticed is that the sequence starts to cycle once a power comes back to 1, because \(g^0\) is always 1 no matter what \(g\) is. Another thing, more importantly, is the ordering of the elements produced by a generator. ℤ\(_7*\) with the generator 3 gives {1,3,2,6,4,5}, which contains every element but follows no obvious pattern in its ordering. As n gets bigger and bigger, the ordering looks increasingly unpredictable. Based on this characteristic, we can make a puzzle that is difficult to solve using a computer. That is, given \(a = g^k\ mod\ n\) where n is prime, how easy is it to find \(k\) when \(a,g,n\) are public? For example, what is k for 4 = 3^k mod 7? To be honest, this example is easy to solve by brute force (k = 4) because n and g are small. But what about very big numbers? Then it becomes extremely difficult; more precisely, brute force takes an infeasibly long time. This property of the generator makes it a good fit for crypto technologies.
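The brute-force strategy mentioned above can be sketched in a few lines of C. The function name dlog_bruteforce is hypothetical, and n is assumed to be below 2^32 so the products fit in 64 bits. The loop tries every exponent in turn, which is exactly why the approach stops being practical once n is large.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the smallest k in 0..n-2 with g^k = a (mod n),
 * or -1 if no such k exists. */
int64_t dlog_bruteforce(uint64_t g, uint64_t a, uint64_t n) {
    assert(n >= 2 && n < (1ULL << 32)); /* keeps x * g below 2^64 */
    g %= n;
    a %= n;
    uint64_t x = 1 % n; /* x holds g^k, starting from g^0 */
    for (uint64_t k = 0; k + 1 < n; k++) {
        if (x == a) return (int64_t)k;
        x = (x * g) % n;
    }
    return -1;
}
```

For the small puzzle in the text, dlog_bruteforce(3, 4, 7) returns 4. Asking for dlog_bruteforce(3, 2, 11) returns -1, because 2 is not among the powers of 3 modulo 11.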
Use-case: why use a finite field? — consistency
Let’s turn our focus to why finite fields matter in computer science. First of all, why and when should we use a finite field instead of ordinary numbers? Assume there is a simple function that runs on two different computers, CA and CB. The function takes a and b as input and does a division, a / b = c. Let’s put 10 and 3 for a and b, respectively; then 10 / 3 = 3.33.... When we want to represent an outcome (3.33…) whose expansion never ends, we need to cut it off at some point (e.g., 3.3 or 3.33), and both computers must cut at the same point to produce a consistent result. This means an extra protocol (agreeing on where to cut off) is required just for consistency.
What if we do the same thing in a finite field? Let’s do 10 / 3 in GF(11). First, find the inverse of 3 in GF(11). The inverse is 4, as 3 * 4 mod 11 = 1. Then, 10 / 3 = 10 * 4 mod 11 = 7 (check: 7 * 3 mod 11 = 10). As you can see, the outcome is always exactly one element of GF(11). In other words, we are guaranteed an exact outcome with nothing to round.
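Similarly, a finite field makes wrap-around behavior explicit instead of leaving it to the language or platform, where overflow can even be undefined behavior. Try writing the code below and running it.
#include <stdio.h>

int main(void) {
    unsigned char a = 255;
    a = a + 1; // 256 does not fit in an 8-bit unsigned char
    printf("%02x\n", a); // what comes out?
    return 0;
}
With an 8-bit unsigned char, this prints 00: C defines the conversion back to an unsigned type as reduction modulo 2^8, which is itself a form of modular arithmetic. The picture changes with signed types. Overflow in signed integer arithmetic is undefined behavior in C, and converting an out-of-range value to a signed type gives an implementation-defined result, so the outcome can depend on the environment (compiler, hardware, …) and lead to inconsistent results. With a finite field, every operation is defined to land back in the field, so we don’t have to worry about such inconsistencies.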
Use-case: why use a generator? — crypto systems
As I mentioned earlier, a generator can yield a sequence whose ordering looks unpredictable, for example {1,3,2,6,4,5} for ℤ\(_7*\). Many crypto schemes make use of this characteristic to make themselves computationally secure (i.e., no efficient attack is known, which leaves attackers with impractically slow searches). More specifically, no efficient algorithm is known for computing k from \(a = g^k\ mod\ n\) when a, g, and n are public and n is a suitably large prime. This is known as the discrete logarithm problem.
To see how this works in practice, take the real-world example of the ElGamal scheme, a public-key encryption scheme based on Diffie-Hellman key exchange. Say there are two actors, Alice and Bob, and Alice wants to send an encrypted message to Bob.
- (Bob) \(Y_B = a^X\ mod\ p\), where p is a prime number and a is a generator; both are public. Bob first chooses his private key (secret) \(X\) at random from the range 2 to p-2, computes his public key \(Y_B\), which he can distribute freely, and sends it to Alice.
- (Alice) \(K = Y_B\ ^r\ mod\ p\), where r is a random number between 1 and p-2 (r = 0 or r = p-1 would give K = 1 and leave M unencrypted). r is Alice’s secret.
- (Alice) \(C_1 = a^r\ mod\ p\), \(C_2 = K * M\ mod\ p\), \(C = C_1\ ||\ C_2\). M is the plain message Alice wants to encrypt, and C is the encrypted message. Alice sends C to Bob.
- (Bob) \(K = C_1\ ^X\ mod\ p\), \(M = C_2\ /\ K\ mod\ p\). Bob can decrypt C using his private key along with the given cipher text C.
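The four steps above can be sketched as a toy C program. This is for intuition only and is not secure: the parameters are tiny, the function and type names (mod_pow, elgamal_encrypt, elgamal_ct, …) are my own, p is assumed to be a prime below 2^32, and the division by K is done with Fermat's little theorem (K^(p-2) is the inverse of K modulo a prime p), which is my choice rather than part of the scheme.

```c
#include <assert.h>
#include <stdint.h>

/* Square-and-multiply: computes base^exp mod p. */
uint64_t mod_pow(uint64_t base, uint64_t exp, uint64_t p) {
    assert(p >= 2 && p < (1ULL << 32)); /* keeps products below 2^64 */
    uint64_t result = 1;
    base %= p;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % p;
        base = (base * base) % p;
        exp >>= 1;
    }
    return result;
}

/* Inverse of k modulo a prime p (k must not be a multiple of p). */
uint64_t mod_inv(uint64_t k, uint64_t p) {
    return mod_pow(k, p - 2, p);
}

typedef struct {
    uint64_t c1; /* a^r mod p */
    uint64_t c2; /* K * M mod p */
} elgamal_ct;

/* Bob: Y_B = a^X mod p. */
uint64_t elgamal_public_key(uint64_t a, uint64_t x, uint64_t p) {
    return mod_pow(a, x, p);
}

/* Alice: K = Y_B^r, C1 = a^r, C2 = K * M (all mod p). */
elgamal_ct elgamal_encrypt(uint64_t m, uint64_t r, uint64_t a,
                           uint64_t y_b, uint64_t p) {
    uint64_t k = mod_pow(y_b, r, p);
    elgamal_ct c;
    c.c1 = mod_pow(a, r, p);
    c.c2 = (k * (m % p)) % p;
    return c;
}

/* Bob: K = C1^X, M = C2 / K (all mod p). */
uint64_t elgamal_decrypt(elgamal_ct c, uint64_t x, uint64_t p) {
    uint64_t k = mod_pow(c.c1, x, p);
    return (c.c2 * mod_inv(k, p)) % p;
}
```

Take p = 7 and a = 3 from the earlier examples, with X = 4 and r = 2. Bob’s public key is 3^4 mod 7 = 4, the message M = 5 encrypts to C1 = 2 and C2 = 3, and decrypting with X = 4 gives back 5. A real deployment would use a very large prime and big-integer arithmetic.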
Even if attackers get the cipher text, they can’t decrypt it, as it’s infeasible to recover Bob’s private key from the public values (p, a, Bob’s public key, …). Of course, they can attempt brute force, that is, try every possible value of X until \(a^X\ mod\ p\) equals Bob’s public key \(Y_B\), which they already know. However, that takes an infeasibly long time when p is large: computing a^X mod p for a given X is fast, but no efficient method is known for going backwards from \(Y_B\) to X. This is what many crypto schemes rely on for security.
What we have discussed so far may be enough to grasp why generators matter in crypto systems. But we need to raise one more question: “why must we choose a prime number for p?”. To answer this question, recall the examples of ℤ\(_7*\) and ℤ\(_4*\). 3 is a generator for both of them. The former, in which n is 7 (a prime number), contains every element from 1 to 6, while the latter, in which n is 4 (not a prime number), contains 1 and 3 but not 2. This means that a non-prime modulus yields a smaller group, which reduces the search space for attackers to brute-force. That’s why most systems require a prime modulus.
Use-case: binary field (GF(2^k))
A binary field is a special form of finite field, denoted GF(2^k), whose elements are binary polynomials. For example, x^2+x corresponds to 110 while x^2+1 corresponds to 101. The coefficients of the polynomials must be either 0 or 1, just like bits in a binary representation. For instance, GF(2^3) = {0, 1, x, x+1, x^2, x^2+1, x^2+x, x^2+x+1}. In binary fields, addition and subtraction are the same operation, a simple XOR, because (1-1) = (1+1) = 0 on each coefficient.
(x+1) + (x^2+x+1) = 011 + 111 = 100 = x^2 // addition
(x+1) - (x^2+x+1) = 011 - 111 = 100 = x^2 // subtraction
For multiplication, one problem comes up. For example, (x^2 + 1) * (x^2 + x) = x^4 + x^3 + x^2 + x has degree 4, which goes beyond GF(2^3). So, we have to reduce this high degree by taking the result modulo an irreducible polynomial. In GF(2^3), x^3+x+1 can be chosen as the modulus because (1) the degree of this polynomial, 3, equals k (i.e., the 3 in 2^3) and (2) it cannot be factored into polynomials of lower degree (by contrast, x^4+1 is not an irreducible polynomial because (x^2+1)*(x^2+1) = (x^4+1) over GF(2)). With this modulus, the product above reduces to x+1. For division, as we did for other finite fields, we have to find the inverse and multiply by it.
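Here is a rough C sketch of GF(2^3) arithmetic with elements stored as 3-bit integers (bit i is the coefficient of x^i). It assumes the modulus x^3+x+1 discussed above, written as the bit pattern 1011. The names gf8_add, gf8_mul, and gf8_inv are made up for illustration, and the inverse is found by brute force.

```c
#include <assert.h>
#include <stdint.h>

#define GF8_MODULUS 0x0Bu /* x^3 + x + 1 = 1011 */

/* Addition and subtraction are both XOR. */
uint8_t gf8_add(uint8_t a, uint8_t b) {
    return (uint8_t)(a ^ b);
}

uint8_t gf8_mul(uint8_t a, uint8_t b) {
    assert(a < 8 && b < 8);
    unsigned product = 0;
    /* Carry-less multiplication: XOR a shifted copy of a for each set bit of b. */
    for (int i = 0; i < 3; i++) {
        if (b & (1u << i)) product ^= (unsigned)a << i;
    }
    /* Reduction: cancel the x^4 term, then the x^3 term, with the modulus. */
    for (int deg = 4; deg >= 3; deg--) {
        if (product & (1u << deg)) product ^= GF8_MODULUS << (deg - 3);
    }
    return (uint8_t)product;
}

/* Brute-force inverse: the g with a * g = 1. Returns 0 for a = 0. */
uint8_t gf8_inv(uint8_t a) {
    for (uint8_t g = 1; g < 8; g++) {
        if (gf8_mul(a, g) == 1) return g;
    }
    return 0;
}
```

With this encoding, (x+1) + (x^2+x+1) is gf8_add(3, 7), which gives 4 (x^2), and (x^2+1) * (x^2+x) is gf8_mul(5, 6), which gives 3 (x+1) after reduction.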
So far, we have seen what a binary field is, but not why it is needed. Let’s now get into why and when it is needed. Before starting, I want to make clear that this is MY own opinion, which I find plausible but which is not backed by any official documents (I’ve never found an answer on the internet as clear as I want it to be).
Say that we want to build a mixing function, which is a kind of simple encryption/decryption function. The function can involve many operations, from addition to multiplication to division (e.g., enc(x) = ax^n+bx^k+c-d...., dec(enc(x)) = .... = x; this is analogous to AES in concept), and “x” is assumed to be a 1-byte value. Building such functions, in this setting, requires the following.
- Requirement (R): Every outcome produced by the operations must fall between 0 and 255, so that “x” stays a 1-byte value.
To meet this requirement, first try a prime-based finite field. 256 is not a prime number, so we choose 257 instead. GF(257) produces {0,1,2,….,256}. However, 256 can’t fit in a 1-byte value, as it needs 2 bytes to store. Of course, we could simply subtract 1 to map results back into the range, for example, 2*4 mod 257 = 8 -1 = 7. But not only is this untidy, it also breaks for 0, since 0 - 1 falls outside the range. What happens if we use GF(2^8) instead? It yields {0, …, 255}, which is exactly what we want and meets the requirement.
Let’s take one more step further. As I mentioned earlier, we may need an unpredictable ordering of a finite field’s elements, and this can be done with the help of a generator (remember the role a generator plays in the ElGamal scheme). So, the nonzero elements of GF(2^3) can be represented as {g^1, g^2, …, g^7}, with g^7 = g^0 = 1, for a given generator g. Using the modulus x^3+x+1 and writing elements as integers, the generator 3 (x+1) produces {3,5,4,7,2,6,1}, while the generator 2 (x) yields {2,4,3,6,7,5,1}.
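The two orderings can be reproduced with a short C sketch. It assumes the modulus x^3+x+1 and the integer encoding of polynomials (bit i is the coefficient of x^i). The names gf8_mul and gf8_powers are my own, and the multiplication here reduces after every shift instead of at the end, which is just one of several equivalent ways to write it.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two elements of GF(2^3) modulo x^3 + x + 1 (0x0B). */
uint8_t gf8_mul(uint8_t a, uint8_t b) {
    assert(a < 8 && b < 8);
    uint8_t product = 0;
    while (b != 0) {
        if (b & 1) product ^= a;   /* add a if the current bit of b is set */
        a = (uint8_t)(a << 1);     /* multiply a by x */
        if (a & 0x08) a ^= 0x0B;   /* reduce as soon as the degree reaches 3 */
        b >>= 1;
    }
    return product;
}

/* Fills out[0..6] with g^1, g^2, ..., g^7. */
void gf8_powers(uint8_t g, uint8_t out[7]) {
    uint8_t x = 1;
    for (int k = 0; k < 7; k++) {
        x = gf8_mul(x, g);
        out[k] = x;
    }
}
```

Calling gf8_powers(3, out) fills out with 3, 5, 4, 7, 2, 6, 1, and gf8_powers(2, out) fills it with 2, 4, 3, 6, 7, 5, 1, the same two orderings listed above.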
Use-case: error detection code
Error detection and correction codes are one of the biggest applications of finite fields. For example, the most common form of Reed-Solomon code is based on binary fields. I’ll write about it in a separate post, as there is a lot to explain and it has much to do with STARK technology, which has been central to many zero-knowledge-based computations (e.g., zkVM).