Let denote the space of matrices with integer entries, and let be the group of invertible matrices with integer entries. The Smith normal form takes an arbitrary matrix and factorises it as , where , , and is a rectangular diagonal matrix, by which we mean that the principal minor is diagonal, with all other entries zero. Furthermore the diagonal entries of are for some (which is also the rank of ) with the numbers (known as the *invariant factors*) principal divisors with . The invariant factors are uniquely determined; but there can be some freedom to modify the invertible matrices . The Smith normal form can be computed easily; for instance, in SAGE, it can be computed calling the function from the matrix class. The Smith normal form is also available for other principal ideal domains than the integers, but we will only be focused on the integer case here. For the purposes of this post, we will view the Smith normal form as a primitive operation on matrices that can be invoked as a “black box”.

In this post I would like to record how to use the Smith normal form to computationally manipulate two closely related classes of objects:

- Subgroups of a standard lattice (or
*lattice subgroups*for short); - Closed subgroups of a standard torus (or
*closed torus subgroups*for short).

The above two classes of objects are isomorphic to each other by Pontryagin duality: if is a lattice subgroup, then the orthogonal complement

is a closed torus subgroup (with the usual Fourier pairing); conversely, if is a closed torus subgroup, then is a lattice subgroup. These two operations invert each other: and .

Example 1The orthogonal complement of the lattice subgroup is the closed torus subgroup and conversely.

Let us focus first on lattice subgroups . As all such subgroups are finitely generated abelian groups, one way to describe a lattice subgroup is to specify a set of generators of . Equivalently, we have

where is the matrix whose columns are . Applying the Smith normal form , we conclude that so in particular is isomorphic (with respect to the automorphism group of ) to . In particular, we see that is a free abelian group of rank , where is the rank of (or ). This representation also allows one to trim the representation down to , where is the matrix formed from the left columns of ; the columns of then give a basis for . Let us call this a*trimmed representation*of .

Example 2Let be the lattice subgroup generated by , , , thus with . A Smith normal form for is given by so is a rank two lattice with a basis of and (and the invariant factors are and ). The trimmed representation is There are other Smith normal forms for , giving slightly different representations here, but the rank and invariant factors will always be the same.

By the above discussion we can represent a lattice subgroup by a matrix for some ; this representation is not unique, but we will address this issue shortly. For now, we focus on the question of how to use such data representations of subgroups to perform basic operations on lattice subgroups. There are some operations that are very easy to perform using this data representation:

- (Applying a linear transformation) if , so that is also a linear transformation from to , then maps lattice subgroups to lattice subgroups, and clearly maps the lattice subgroup to for any .
- (Sum) Given two lattice subgroups for some , , the sum is equal to the lattice subgroup , where is the matrix formed by concatenating the columns of with the columns of .
- (Direct sum) Given two lattice subgroups , , the direct sum is equal to the lattice subgroup , where is the block matrix formed by taking the direct sum of and .

One can also use Smith normal form to detect when one lattice subgroup is a subgroup of another lattice subgroup . Using Smith normal form factorization , with invariant factors , the relation is equivalent after some manipulation to

The group is generated by the columns of , so this gives a test to determine whether : the row of must be divisible by for , and all other rows must vanish.

Example 3To test whether the lattice subgroup generated by and is contained in the lattice subgroup from Example 2, we write as with , and observe that The first row is of course divisible by , and the last row vanishes as required, but the second row is not divisible by , so is not contained in (but is); also a similar computation verifies that is conversely contained in .

One can now test whether by testing whether and simultaneously hold (there may be more efficient ways to do this, but this is already computationally manageable in many applications). This in principle addresses the issue of non-uniqueness of representation of a subgroup in the form .

Next, we consider the question of representing the intersection of two subgroups in the form for some and . We can write

where is the matrix formed by concatenating and , and similarly for (here we use the change of variable ). We apply the Smith normal form to to write where , , with of rank . We can then write (making the change of variables ). Thus we can write where consists of the right columns of .

Example 4With the lattice from Example 2, we shall compute the intersection of with the subgroup , which one can also write as with . We obtain a Smith normal form so . We have and so we can write where One can trim this representation if desired, for instance by deleting the first column of (and replacing with ). Thus the intersection of with is the rank one subgroup generated by .

A similar calculation allows one to represent the pullback of a subgroup via a linear transformation , since

where is the concatenation of the identity matrix and the zero matrix. Applying the Smith normal form to write with of rank , the same argument as before allows us to write where consists of the right columns of .Among other things, this allows one to describe lattices given by systems of linear equations and congruences in the format. Indeed, the set of lattice vectors that solve the system of congruences

for , some natural numbers , and some lattice vectors , together with an additional system of equations for and some lattice vectors , can be written as where is the matrix with rows , and is the diagonal matrix with diagonal entries . Conversely, any subgroup can be described in this form by first using the trimmed representation , at which point membership of a lattice vector in is seen to be equivalent to the congruences for (where is the rank, are the invariant factors, and is the standard basis of ) together with the equations for . Thus one can obtain a representation in the form (1), (2) with , and to be the rows of in order.

Example 5With the lattice subgroup from Example 2, we have , and so consists of those triples which obey the (redundant) congruence the congruence and the identity Conversely, one can use the above procedure to convert the above system of congruences and identities back into a form (though depending on which Smith normal form one chooses, the end result may be a different representation of the same lattice group ).

Now we apply Pontryagin duality. We claim the identity

for any (where induces a homomorphism from to in the obvious fashion). This can be verified by direct computation when is a (rectangular) diagonal matrix, and the general case then easily follows from a Smith normal form computation (one can presumably also derive it from the category-theoretic properties of Pontryagin duality, although I will not do so here). So closed torus subgroups that are defined by a system of linear equations (over , with integer coefficients) are represented in the form of an orthogonal complement of a lattice subgroup. Using the trimmed form , we see that giving an explicit representation “in coordinates” of such a closed torus subgroup. In particular we can read off the isomorphism class of a closed torus subgroup as the product of a finite number of cyclic groups and a torus:

Example 6The orthogonal complement of the lattice subgroup from Example 2 is the closed torus subgroup using the trimmed representation of , one can simplify this a little to and one can also write this as the image of the group under the torus isomorphism In other words, one can write so that is isomorphic to .

We can now dualize all of the previous computable operations on subgroups of to produce computable operations on closed subgroups of . For instance:

- To form the intersection or sum of two closed torus subgroups , use the identities and and then calculate the sum or intersection of the lattice subgroups by the previous methods. Similarly, the operation of direct sum of two closed torus subgroups dualises to the operation of direct sum of two lattice subgroups.
- To determine whether one closed torus subgroup is contained in (or equal to) another closed torus subgroup , simply use the preceding methods to check whether the lattice subgroup is contained in (or equal to) the lattice subgroup .
- To compute the pull back of a closed torus subgroup via a linear transformation , use the identity Similarly, to compute the image of a closed torus subgroup , use the identity

Example 7Suppose one wants to compute the sum of the closed torus subgroup from Example 6 with the closed torus subgroup . This latter group is the orthogonal complement of the lattice subgroup considered in Example 4. Thus we have where is the matrix from Example 6; discarding the zero column, we thus have

## 6 comments

Comments feed for this article

20 July, 2022 at 4:03 pm

Sam HopkinsDo you know how this compares to using Hermite normal form, which is the more “normal” way to understand subgroups of ?

21 July, 2022 at 7:41 am

Terence TaoGood question! It’s quite possible that the operations considered here (e.g., computing the intersection of one lattice subgroup with another, pulling back a lattice by a linear transformation, or determining whether one lattice is contained in another) could also be calculated using the Hermite normal form instead of the Smith normal form. I have theoretical reasons for preferring the Smith normal form – roughly speaking, the Smith normal form describes lattices by producing a basis, whereas the Hermite normal form instead describes lattices by producing a filtration – but I could imagine that the Hermite normal form is computationally faster for some of the operations considered here. For instance, if one is simply interested in the rank of the lattice and nothing else, I am sure that there would be much faster ways to compute this than to work out the entire Smith normal form.

From the perspective of someone who simply wants to calculate with these subgroups numerically, I would prefer an object-oriented interface in which there were classes of lattice subgroup or closed torus subgroup objects with the operations I actually need (sum, intersection, pullback, equality, etc.), with the precise implementation of these operations (using Smith normal form, Hermite normal form, or any other method) hidden to me (or perhaps set as an optional flag in case the choice could be used to optimise computational performance). It would be convenient for me if such classes already exist somewhere, but I have no experience in trying to search for them (some desultory Google searches did not turn up anything promising).

21 July, 2022 at 8:24 am

Terence TaoOK, now that I have the tip of using the Hermite normal form as an additional search term, I am now beginning to find some hits (e.g., this set of notes) – thanks! It does seem that the Hermite normal form is somewhat faster to compute than the Smith normal form while still being capable of computing most of the operations I actually need (except perhaps the invariant factors). I guess the situation is similar to that of choosing between the singular value decomposition and the LU factorization in linear algebra; the latter gives less information about a matrix, but it is generally faster and would be numerically preferable for certain tasks (such as inverting the matrix).

21 July, 2022 at 9:36 am

AnonymousDr. Tao:

Why do you call the entries of a matrix coefficients? In the first sentence, you said “let denote matrix with integer coefficients”. Shouldn’t one say “with integer entries”? Coefficients usually mean the number in front of an algebraic expression. Or is there a deeper meaning here?

[I am largely viewing these matrices as coefficient matrices for their associated linear transformations, but I have changed the wording to “entries” here to reduce the apparent confusion. -T]25 July, 2022 at 5:17 am

Matt BakerThere is also a correspondence between lattice subgroups and algebraic subgroups of the linear torus , where a lattice subgroup corresponds to the algebraic subgroup This is discussed in Section 3.2 of the book “Heights in Diophantine Geometry” by Bombieri and Gubler, for example. It might be interesting to extend your algorithmic considerations to this setting.

16 September, 2022 at 7:52 am

MarkIt is worth mentioning that these computations are standard in lattice-based cryptography. See these notes of Daniele Micciancio’s (section 4 in particular).

Click to access lec2.pdf