I always had the feeling that Java can play a big role in statistics and machine learning. I see no reason against it. And I really don’t like Python. There are multiple reasons, but this post is not about that.
Investigating and exploring data is a non-linear type of activity. Ideas pop up from nowhere and you want to be fast enough to follow them, to ask questions and receive answers without all the ceremony required when writing a proper piece of production software. Jupyter notebooks come in handy since they allow you to react fast. But Java is not well represented. There are some good kernels, but each lacks something. The one which I liked most, IJava by SpencerPark (https://github.com/SpencerPark/IJava), has looked dead for some years, with many open issues left unanswered.
I will make a confession: I don’t like this type of project. I would favor working on code for data manipulation or clever algorithms. But the interactivity provided by a notebook is too good not to pay the price. So I decided to write one. And I decided to pay a fair share of my free time to all the issues which will be found, and to write it as well as I can.
I have to thank Max Rydahl Andersen for doing a great job of collecting all those Java kernels in a single place and making them available through his amazing https://www.jbang.dev/.
Named parameters allow calling a function using parameter names with their corresponding values, instead of relying only on the position of the values. This feature is slightly different from language to language, and some languages lack it entirely. Among additional behaviors we emphasize the possibility of specifying default parameter values and the possibility of specifying named parameters in any order.
The Java language lacks the named parameters feature, for complicated reasons I will not describe here. I have to say that I do not support this feature in general, or as a default function call behavior. The main reason why I don’t like it is that I consider that code legibility will suffer if an entire code base is full of named parameters. A secondary reason is that the freedom and flexibility provided is easy to abuse, leading in the long term to methods which are longer, overly complicated and harder to maintain. I might be wrong, and my perspective could be shaped by the tools and languages I have used for so long. I can accept that.
Still, there are legitimate cases where the presence of named parameters can help a lot and can cut down the verbosity. Some cases I identified are utility methods for building graphical or textual outputs which allow plenty of customization. This story describes how I implemented named parameters for my pet library for building scientific graphs or textual output.
Named parameter features
Since Java does not provide this feature, the only way to have it is to implement it myself. And since I had to implement it, why not devise some nice to have features for my named parameters?
A named parameter has to have an expressive name with Java documentation attached
A method which accepts named parameters should allow only parameter types from a given set, verified at compile time
The same parameter should be allowed to receive multiple types of inputs, with different logic (for example a color could be expressed as a specific java.awt.Color instance, as a short color String name, or as an index in a color palette)
The logic behind a named parameter should have access to the values of the other parameters from the same set (a color parameter can be specified as a color index in a palette, and the palette itself can also be specified as a parameter)
A named parameter should have a default value, and null should be allowed as a default value
It should be easy to create hierarchies of named parameter sets, with the default values at the top, and to allow named parameter overrides at any level (suppose we have a plot with two figures, and in the left figure a histogram; specifying the alpha transparency at plot level makes that value apply to both subplots, while the histogram is still allowed to override it)
As an implementation goal, there should be some base classes which implement as much as possible of the previous features, letting the programmer focus entirely on the named parameter logic.
As a spoiler, just in case you are bored, the code looks like the following:
Frame df = Datasets.loadLifeScience().mapRows(Mapping.range(2000));
// just two random variables from a random data set, for illustration
Var x = df.rvar(0).name("x");
Var y = df.rvar(1).name("y");

// 3 x 3 grid cells
// 4 top-left cells will contain a scatter
// last row, 2 cells on left side filled with a histogram
// the others are filled as they come, from up to down, then from left to right
var fig = gridLayer(3, 3)
        .add(0, 0, 2, 2, points(x, y, sz(3), palette(Palette.tableau21()), pch(2), fill(2)))
        .add(2, 1, 2, 1, hist(y, bins(20)))
        .add(lines(x))
        .add(hist(x, bins(20), fill(9)))
        .add(hist2d(x, y, bins(30), fill(Color.BLUE)));
WS.draw(fig);
Please don’t judge my artistic skills, there is nothing to be found there. However, notice in the code sample some highlights from the features I wanted for named parameters. Among them we have:
the color can be specified directly or through a palette index, and the palette itself can also be chosen
named parameters can be specified in any order; the same holds for subplots, unless their cell index is unspecified (in which case they are placed in the order they are added)
default values exist and can be overridden at any level: the line width is specified for all subplots, but customized for one pink histogram
Java implementation details
The main objective is to require as little effort as possible from the programmer (me in this case, but it could also be you), in a safe way. Regarding safety, I have to admit, I did not achieve everything I wanted. I would have wished to avoid a specific cast and force the check at compilation time, but I did not know how to do it.
Anyway, in order to require less effort we need some building blocks to contain as much logic as possible. Abstract classes help with that. We need two of them: one to model the set of named parameters which are part of the same family, and another one to model the definition, default value and identity of a named parameter. Thus we have two classes: NamedParam and NamedParamSet.
First of all we have an abstract class, since anyone in need has to derive from this class to create named parameters for some set. The generic type S describes the named parameter set to which the defined named parameters belong.
The named parameter contains two pieces. The name is used to identify the parameter. As you will notice later, you cannot create two named parameters with the same name in the same set. The interesting part is the value, which in this case is a function. The value is obtained when needed by calling a function which produces the parameter value, using also information from the whole set of parameters. This is needed if we want to produce conditional parameter values.
The generic type V describes the type of the value held by this parameter.
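A minimal sketch of such a class, consistent with the description above, could look like this (the field and method names are illustrative assumptions, and the bound on S is left out to keep the sketch simple):

import java.util.function.Function;

// sketch only: names are assumptions, not the exact original listing
public abstract class NamedParam<S, V> {

    // the identity of the parameter inside its set
    private final String name;
    // lazy value producer which receives the whole parameter set
    private final Function<S, V> fun;

    protected NamedParam(String name, Function<S, V> fun) {
        this.name = name;
        this.fun = fun;
    }

    public String getName() {
        return name;
    }

    public V getValue(S set) {
        return fun.apply(set);
    }
}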
NamedParamSet type
public abstract class NamedParamSet<S extends NamedParamSet<S, P>, P extends NamedParam<S, ?>> {

    protected final Map<String, P> defaults;
    protected final Map<String, P> params = new HashMap<>();
    protected S parent;

    protected NamedParamSet(S parent) {
        this.parent = parent;
        if (parent != null) {
            defaults = null;
        } else {
            defaults = new HashMap<>();
            register(this.getClass());
        }
    }

    public void setParent(S parent) {
        this.parent = parent;
    }

    public final S apply(P... parameters) {
        for (var p : parameters) {
            params.put(p.getName(), p);
        }
        return (S) this;
    }

    private void register(Class<?> clazz) {
        Field[] fields = clazz.getDeclaredFields();
        for (var field : fields) {
            if (!Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            try {
                field.setAccessible(true);
                Object fieldValue = field.get(null);
                if (!(fieldValue instanceof NamedParam)) {
                    continue;
                }
                P np = (P) fieldValue;
                if (defaults.containsKey(np.getName())) {
                    throw new IllegalArgumentException("Duplicate keys in named values.");
                }
                defaults.put(np.getName(), np);
            } catch (IllegalAccessException e) {
                throw new RuntimeException(e);
            }
        }
    }

    protected Object getParamValue(P p) {
        S root = (S) this;
        S ref = (S) this;
        while (true) {
            if (ref.params.containsKey(p.getName())) {
                return ref.params.get(p.getName()).getValue(root);
            }
            if (ref.parent != null) {
                ref = ref.parent;
                continue;
            }
            return ref.defaults.get(p.getName()).getValue(root);
        }
    }

    private S getRoot() {
        S root = (S) this;
        while (root.parent != null) {
            root = root.parent;
        }
        return root;
    }

    public abstract S bind(P... parameters);

    @Override
    public String toString() {
        return getClass().getSimpleName() + getRoot().defaults.values().stream()
                .map(p -> {
                    Object value = getParamValue(p);
                    return p.getName() + ":" + (value == null ? "null" : value);
                })
                .collect(Collectors.joining(",", "{", "}"));
    }
}
This class is much more substantial. It encodes a set of named parameters. It does not make sense to mix parameters from various named parameter sets; for example, there is no reason to mix graphical plot parameters with printing parameters or with anything else. As such, we need a set to manage all the named parameters from the same family. This is that class.
We have two generic types in the class definition. The first type, S, is a recursive generic and describes the subclass implementation of the base abstract set class, a construct similar to the Self type from other languages. The second generic type, P, describes the named parameter class implementation managed by the S set class.
The named parameter set has three properties: the default parameter values map, the current parameter values map and a parent set. If an S instance does not have a parent (the parent reference is null), its values are the default values. If it has a parent, the current parameter values are stored in the params map.
An interesting thing happens in the constructor when an S instance does not have a parent. As we said, if it does not have a parent its values are the default values. But the default values are not present, so how do we know them? We inspect the S instance (this newly constructed instance, actually) and, through reflection, we search for all the static members which extend the P class and register them as default values. All this is done in the register method. That is also the place where we check at runtime that the parameter identity condition is met (no duplicate names).
The next method is setParent, which is used to set a parent for the current set. This exists to allow some flexibility when building hierarchies (we might not know the hierarchy at creation time; this lets us bind parents later).
Next comes the apply method. This is used to set parameter values from an array of values, usually received through a var args method.
The getParamValue method uses the hierarchy to compute parameter values. It searches the hierarchy, starting with the current set and going further through parent references, for the first set which contains a value for the requested parameter. Once this parameter instance is found, its value building function is called with the original base parameter set.
The other methods are self explanatory. I will not waste your time with their description.
A small example
We will now implement a first set of named parameters, starting with the named parameter type itself.
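A sketch of this named parameter type could look like the following (the GOpt and GOpts names appear later in the text; everything else here is an illustrative assumption):

import java.util.function.Function;

// sketch only: a named parameter dedicated to the GOpts set
public class GOpt<V> extends NamedParam<GOpts, V> {

    // convenience constructor: reuse the name of an already defined parameter
    public GOpt(GOpt<V> source, Function<GOpts, V> fun) {
        this(source.getName(), fun);
    }

    public GOpt(String name, Function<GOpts, V> fun) {
        super(name, fun);
    }
}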
Nothing fancy. We have two constructors, one of them created only to reduce some typing. A legitimate question would be whether we need a separate class at all. Well, we do, because we will use this class to receive parameters for this set, and only this kind of named parameters, and we express that through this type.
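Next comes the named parameter set. Again a sketch, reconstructed along the lines described below (the GOpts name is real, the concrete parameters are hypothetical): a boolean, a color, a parameter without a getter whose value is derived from another parameter, and a parameter which builds a repeated array.

import java.awt.Color;
import java.util.Arrays;

// sketch only: parameter names and defaults are illustrative assumptions
public class GOpts extends NamedParamSet<GOpts, GOpt<?>> {

    // section 1: definitions, one private static field per named parameter,
    // carrying the identity (name) and the default value function
    private static final GOpt<Boolean> _showLegend = new GOpt<>("showLegend", s -> true);
    private static final GOpt<Color> _color = new GOpt<>("color", s -> Color.BLACK);
    // no getter and no builder: its value is derived entirely from another parameter
    private static final GOpt<Integer> _arraySize = new GOpt<>("arraySize", s -> s.getShowLegend() ? 5 : 3);
    // builds a repeated array of the size given by the previous parameter,
    // filled with the value of the color parameter
    private static final GOpt<Color[]> _colorArray = new GOpt<>("colorArray", s -> {
        Color[] array = new Color[(Integer) s.getParamValue(_arraySize)];
        Arrays.fill(array, s.getColor());
        return array;
    });

    public GOpts() {
        this(null);
    }

    public GOpts(GOpts parent) {
        super(parent);
    }

    // section 2: builders, the public interface which produces named parameter values
    public static GOpt<Boolean> showLegend(boolean show) {
        return new GOpt<>(_showLegend, s -> show);
    }

    public static GOpt<Color> color(Color color) {
        return new GOpt<>(_color, s -> color);
    }

    @Override
    public GOpts bind(GOpt<?>... parameters) {
        return apply(parameters);
    }

    // section 3: getters, they delegate to the base class and cast the result
    public Boolean getShowLegend() {
        return (Boolean) getParamValue(_showLegend);
    }

    public Color getColor() {
        return (Color) getParamValue(_color);
    }

    public Color[] getColorArray() {
        return (Color[]) getParamValue(_colorArray);
    }
}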
Above is the named parameter set implementation. The GOpts type is the set and the GOpt type is the named parameter. There are three sections in this implementation: the definitions of the named parameters, the builders for named parameters and the getters for named parameters. Additionally we have constructors and an optional utility method.
The definitions section contains one private static field per named parameter. The definition of a named parameter contains its identity and its default value. We have defined four named parameters. The first two are as simple as possible: a boolean and a color. The third named parameter is interesting in that it will not have a getter; its value is computed entirely based on the value of another parameter, therefore we do not need to specify it. The last parameter simply builds a repeated array of the size given by another parameter, filled with the value of yet another parameter. I have to admit that my creativity suddenly dropped to negative levels, and it does not make much sense. It exists only for illustrative purposes.
The builder section is basically our interface. What we have in this section are static methods which produce named parameters. The name of each method serves as the parameter name, in generic named parameter terminology, so you have to choose it to be as expressive as possible. The reference used in the constructor makes the connection between the builder method and its underlying named parameter. The second constructor argument is the implementation function. With it we have the freedom to produce values for any parameter using any input we want. Since we build a function, the evaluation is lazy and happens when it is needed. And since we have access to the whole parameter set, this is the place where conditional computed values for parameters can be created.
The last section contains the getter methods. Their code lives in the parent class, but a cast is required, together with an identifier which is hidden from the outside.
That's all for the implementation part. How will we use it? Below is an example.
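A possible usage, based on the sketch above (again with hypothetical parameter names), showing defaults, overrides and a parent-child hierarchy:

// assuming static imports of the GOpts builders color(...) and showLegend(...)
GOpts root = new GOpts();
System.out.println(root.getColor());               // Color.BLACK, the default value

GOpts parent = new GOpts();
parent.bind(color(Color.RED));

GOpts child = new GOpts(parent);
child.bind(showLegend(false));                     // override only at the child level

System.out.println(child.getColor());              // RED, inherited from the parent
System.out.println(child.getShowLegend());         // false, the local override
System.out.println(child.getColorArray().length);  // 3, since showLegend is false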
We can create hierarchies of parameter sets with ease and inject them into our object hierarchies. A good example of that is the plotter tool I built in my pet library rapaio. A grid layer creates a grid of cells which splits a graphical view, and it holds a set of graphical options. Each cell or rectangular group of cells can be rendered by a plotter. The plotter creates a set of graphical options which has the grid layer options as a parent. Next, on each plotter an artist can start to draw things (histograms, points, lines, whatever is available). Each artist creates its own set of graphical parameters, bound to the parameter set of its plotter. Thus you can specify named parameter values for all objects at the grid layer level, or only for the objects of one plotter, or custom values for each artist.
Specifying named parameters exploits the var args syntax and the named parameter type you have defined. Thus, one can easily use them.
The described scenario for graphical parameters lives in some medium to large classes, so I will not paste it here and waste your space. For further reference you can check it here: GOptions.java. It contains about 450 lines, but I am sure it is very easy to read.
In the end I will show you another example of a graph built using the same syntax:
I do not see the need to implement stuff like this directly in the language. Besides the reasons that I described in the introduction, I can add one more: it is impossible to build a rich set of features for named parameters directly in the language. As such, I do not consider it worth the effort.
But some simplifications can be made. One idea would be to use ASM and annotations to generate the getters at runtime. Another straightforward idea is to generate the builders for annotated parameters. Maybe using an annotation to designate named parameter definitions, instead of obeying a convention as I did, would also improve type safety. Maybe that effort will be made, by me or by somebody else, who knows. But even if not, I hope this construct can be used in your projects by simply copying the code of the base classes.
From time to time I like to take a look at the Java JDK source code to learn more about how the contributors think about various topics. While I was looking into how enum is implemented, I noticed something very familiar in the declaration of the Enum class, the class which is responsible for the enum implementation in the Java language.
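The declaration in question looks roughly like this (recent JDK versions also add the Constable interface; older ones implement only Comparable and Serializable):

// from java.lang.Enum, body omitted
public abstract class Enum<E extends Enum<E>>
        implements Constable, Comparable<E>, Serializable {
    // ...
}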
At first glance one might notice nothing unusual. At a second look one can see that E is a generic parameter which extends Enum<E>, and it feels mind bending, or insane. However, I had used it in the past in my personal projects without knowing it has a name: recursive generics. In this story I will present what this construct is.
A standard example
We will illustrate a situation very similar to the enum construct. Suppose that we have a base class for numbers and we want to have positive integer and real number specializations. Additionally, we want to compare positive integers only with positive integers, and real numbers only with real numbers. A standard way to write those classes is the following:
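Here is a sketch of that standard version (class names and the chosen exception are assumptions): the comparison is declared against the base type, and each subclass has to defend itself at runtime.

// sketch only
abstract class Number implements Comparable<Number> {
}

class PositiveInteger extends Number {

    private final int value;

    public PositiveInteger(int value) {
        this.value = value;
    }

    @Override
    public int compareTo(Number o) {
        if (!(o instanceof PositiveInteger other)) {
            throw new IllegalArgumentException("Cannot compare with a different type of number.");
        }
        return Integer.compare(value, other.value);
    }
}

class Real extends Number {

    private final double value;

    public Real(double value) {
        this.value = value;
    }

    @Override
    public int compareTo(Number o) {
        if (!(o instanceof Real other)) {
            throw new IllegalArgumentException("Cannot compare with a different type of number.");
        }
        return Double.compare(value, other.value);
    }
}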
var n1 = new PositiveInteger(1);
var n2 = new PositiveInteger(2);
System.out.println(n1.compareTo(n2));
And a comparison between two different specialized classes will fail with an exception at runtime.
var n1 = new PositiveInteger(1);
var n2 = new Real(2);
System.out.println(n1.compareTo(n2));
When we declared in the abstract class the requirement to implement the comparison, the base class had no way to express that we want to compare only against the same specialization. Thus we have to handle the situation with instanceof and throw exceptions when needed.
The question is whether it could be done in a better way. Of course recursive generics are the answer, otherwise I would not have written this story. Let's go straight into the new code.
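Again a sketch, under the same naming assumptions, this time with the recursive generic parameter in place:

// sketch only
abstract class Number<N extends Number<N>> implements Comparable<N> {
}

class PositiveInteger extends Number<PositiveInteger> {

    private final int value;

    public PositiveInteger(int value) {
        this.value = value;
    }

    @Override
    public int compareTo(PositiveInteger o) {
        return Integer.compare(value, o.value);
    }
}

class Real extends Number<Real> {

    private final double value;

    public Real(double value) {
        this.value = value;
    }

    @Override
    public int compareTo(Real o) {
        return Double.compare(value, o.value);
    }
}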
What happens? In the declaration of Number we introduced a new generic type N, and we use this new type to give an additional directive to the Comparable contract.
Who is N? Well, N is a generic type which inherits from Number, but because Number has a generic type, we have to add a generic type parameter to it. Why the same type N? Simply because N is the generic type which inherits from Number. It looks like a circular argument, but it works.
What have we achieved? We have told the class Number that the inheritors will have to implement the Comparable interface with their own derived types (not with the base type). In other words, in the base class we described a behavior that the inheritors will have to have, and we described that behavior in terms of their own types.
Now the implementation is simplified. But we achieve more with that. We also allow the compiler to help us and to not allow other implementations. A code like the one below will not even compile.
var n1 = new PositiveInteger(1);
var n2 = new Real(2);
System.out.println(n1.compareTo(n2)); // <- this will not compile
Back to Java enum implementation
I prepared that artificial example for illustration purposes. The main reason was to explain what happens with the enum implementation in Java. Behind the keyword enum is that class, which also uses recursive generics. The reasoning should be clear now: this kind of declaration allows each enumeration class to perform comparisons only between elements of the same concrete enumeration class. It makes sense to be like that; in the end it is completely illogical to allow comparisons between elements of different enum types.
When you create an enum, under the hood you basically inherit this class. The recursive generics are used here only to limit the behavior of the comparison (and not only that). We can think of this usage as a bound on the allowed types.
Fluent API
Another usage of recursive generics is fluent APIs. This is often seen in builder patterns, but it is not restricted to them. A fluent API is a way to chain methods which return a reference to this. Let’s look at some form of builder.
class Person {

    private String name;
    private int age;

    public Person withName(String name) {
        this.name = name;
        return this;
    }

    public Person withAge(int age) {
        this.age = age;
        return this;
    }
}
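Now suppose we derive a class from Person; a sketch of it would be (the Cyclist name comes from the text below, its field is an assumption):

// sketch only
class Cyclist extends Person {

    private int speed;

    public Cyclist withSpeed(int speed) {
        this.speed = speed;
        return this;
    }
}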
This becomes problematic. Chaining methods from both classes is not possible: once you call a method from Person you will no longer see the methods from Cyclist. Even if one manages to chain all calls using a proper ordering of the method calls, the resulting object must be cast back to the derived class.
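The fix follows the same recursive generic recipe; here is a sketch of it (the cast discussed next is the one inside the fluent methods):

// sketch only
class Person<T extends Person<T>> {

    private String name;
    private int age;

    @SuppressWarnings("unchecked")
    public T withName(String name) {
        this.name = name;
        return (T) this;
    }

    @SuppressWarnings("unchecked")
    public T withAge(int age) {
        this.age = age;
        return (T) this;
    }
}

class Cyclist extends Person<Cyclist> {

    private int speed;

    public Cyclist withSpeed(int speed) {
        this.speed = speed;
        return this;
    }
}

// chaining now works in any order and keeps the concrete type
Cyclist cyclist = new Cyclist().withName("Eddy").withAge(30).withSpeed(40);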
You can notice the cast to the generic parameter. This is safe since we know that the this reference is of the derived class by construction.
An example from my pet project
Perhaps there are other use cases for recursive generic types I am not aware of. In fact, I was not aware that this construct had a name before reading about the enum implementation, but I recognized the pattern immediately because I had used it before. I will give here two examples from rapaio, the pet statistics and machine learning library I contribute to when I want to learn more about various data related topics.
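The declarations discussed below are heavily simplified here; treat the following as an approximate sketch, not the actual rapaio signatures:

// sketch only: placeholder names, not the real rapaio declarations

// simplified placeholders for the result and per-iteration information types
class ClassifierResult { /* prediction results */ }

class RunInfo<M> { /* information about each fitting iteration */ }

// parameter management with fluent setters which return (T) this
abstract class ParamSet<T extends ParamSet<T>> {
}

// common parameters and behavior for all classifiers, propagated to implementations
abstract class ClassifierModel<M extends ClassifierModel<M, R, H>,
        R extends ClassifierResult, H extends RunInfo<M>> extends ParamSet<M> {
}

// a concrete classifier implementation
class NaiveBayes
        extends ClassifierModel<NaiveBayes, ClassifierResult, RunInfo<NaiveBayes>> {
}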
The sample code above sketches the class declaration of an ML classifier model and the abstract class every classifier model should implement. It might look ugly at first sight, but it has a straightforward semantics.
Every classifier model is configured by a custom set of parameters, specific to each type. The parameter management is implemented in a base class called ParamSet, which, as you can see, uses the recursive generic construct in order to allow a fluent API. The classifier model abstract class provides some general parameters for all classifiers and also some generic behaviors. Those are further propagated, through recursive generics, to its final implementations, in this case the NaiveBayes class.
The classifier model abstract class also has additional generic types. Those types are the class which contains the results of the classification prediction and a class which provides information for each iteration (if the classifier is iterative). Those two components also reuse the recursive generic types to allow full access to the implementation methods.
In contrast with the complexity of the type declarations, the usage becomes elegant and simple.
Frame train = null;
Frame test = null;
CForest cf = CForest.newModel()
        .runs.set(10)                               // parameter for all classifiers
        .baggingMode.set(BaggingMode.SOFT_VOTE)     // parameter specific to CForest
        .rowSampler.set(RowSampler.bootstrap(0.8)); // parameter for all classifiers
var result = cf.fit(train, "y").predict(test);
result.printSummary();
Final thoughts
I think that everyone who is serious about their work has their own share of rediscoveries. It would be great if, from time to time, those were real discoveries, but even if not, there are some great things about re-inventing the wheel. It is a great feeling to do something which resonates with the work of other minds, most of the time greater minds than yours. This is reassuring, in the sense that it gives you a confidence boost.
Additionally, rediscovering things allows one to gain a deeper understanding compared to learning directly from external sources. When you learn from somebody, you also take with you some perspectives which help in the short term, but which can limit you, because those perspectives are too tightly connected with what you learn and often remain the only perspectives available to you.
One kind of optimization offered by some compilers is tail call optimization. This optimization does not bring much, since the programmer can always rewrite his code without recursion, especially in an imperative language. On the other side, recursive code is often more elegant, so why don't we let the compiler do the nasty stuff when it is possible? In this article I will present a neat way to implement tail call optimization in Java using byte code manipulation with ASM.
What is tail call recursion?
A tail call recursion is a special form of recursion where the last operation is the recursive call. It belongs to the more general class of tail calls, where the last operation is a method call. I will limit myself to the more restrictive case of tail recursion. Let’s illustrate with an example.
long factorial(int n, long f) {
    if (n < 2) {
        return f;
    }
    return factorial(n - 1, f * n);
}
As one can see, the last operation is a call to the same function, but with different parameters. The next example is not a tail recursion.
long factorial(int n) {
    if (n < 2) {
        return 1;
    }
    return n * factorial(n - 1);
}
The reason why the previous example is not a tail call recursion is that the last operation is not the recursive call, but the multiplication operation. Multiplication happens after the recursive call returns.
A tail recursion has a specific form which allows a faster execution by avoiding allocating a new stack frame since the execution can utilize the current stack.
Anatomy of a method call
If you don’t know much about how the Java Virtual Machine makes method calls, here is a brief overview. The idea is almost universal in programming, however the details presented are specific to the JVM.
In order for a method to be able to execute it needs a space called frame, where some specific things should be contained:
local variables space: a fixed array of entries with values of various types
operand stack: a stack where the current operands are stored
There is also an execution stack managed by the JVM. The JVM execution stack collects frames. When a method is called for execution, a new frame is created, initialized properly and pushed on the JVM execution stack. The eventual parameters of the method call are collected from the current stack and used for the initialization of the new frame. After the method ends its execution, the returned value (if any) is collected, the frame allocated for that method call is removed from the JVM stack, the previous frame becomes the current one and the collected return value is pushed on its stack.
The size of the local variables and operand stack parts depends on the method's code; it is computed at compile time and is stored along with the bytecode instructions in compiled classes. All the frames that correspond to the invocation of the same method are identical in size, but those that correspond to different methods can have different sizes.
When it is created, a frame is initialized with an empty stack, and its local variables are initialized with the target object this (for non static methods) and with the method’s arguments (in this order). For instance, calling the method a.equals(b) creates a frame with an empty stack and with the first two local variables initialized to a and b (other local variables are uninitialized).
As you can see, calling methods in chains will increase the space required by the JVM execution stack, since for each inner method call a new frame is pushed on the stack. Since the stack is limited, deeply nested method calls can fill the JVM execution stack to the point where it throws errors like StackOverflowError, about which you have surely heard.
Exhausting stack space is often encountered when one implements algorithms in recursive fashion.
A practical example of a compiled class
Let’s study the generated code for the following class.
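A sketch of such a class, consistent with the description that follows, could look like this (the class and interface names are my assumptions; the facade method and the TailRec suffix are described in the text):

// hypothetical interface, shared later by the optimized and non-optimized variants
interface FactorialFunction {
    long fact(int n);
}

public class TailRecFactorial implements FactorialFunction {

    // facade method, for a better user experience
    @Override
    public long fact(int n) {
        return factTailRec(n, 1L);
    }

    // the actual computation, written as a tail recursion
    public long factTailRec(int n, long f) {
        if (n < 1) {
            return f;
        }
        return factTailRec(n - 1, f * n);
    }
}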
As you can see we have a simple class which implements the factorial function in a tail recursive fashion. The first method is a facade for better user experience, while the second method implements the actual computations.
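The byte code of this class, in an ASM Textifier style and stripped of line numbers, looks roughly like the listing below (a reconstruction, so treat the exact details as approximate):

// reconstructed listing, approximate
public <init>()V
    ALOAD 0
    INVOKESPECIAL java/lang/Object.<init> ()V
    RETURN
    MAXSTACK = 1
    MAXLOCALS = 1

public fact(I)J
    ALOAD 0
    ILOAD 1
    LCONST_1
    INVOKEVIRTUAL TailRecFactorial.factTailRec (IJ)J
    LRETURN
    MAXSTACK = 4
    MAXLOCALS = 2

public factTailRec(IJ)J
   L0
    ILOAD 1
    ICONST_1
    IF_ICMPGE L1
    LLOAD 2
    LRETURN
   L1
   FRAME SAME
    ALOAD 0
    ILOAD 1
    ICONST_1
    ISUB
    LLOAD 2
    ILOAD 1
    I2L
    LMUL
    INVOKEVIRTUAL TailRecFactorial.factTailRec (IJ)J
    LRETURN
    MAXSTACK = 6
    MAXLOCALS = 4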
Let us explain the byte code. We have a class which implements an interface and we have a default constructor.
The default constructor is optional in the Java language, but it is not optional in JVM byte code. The default constructor has as its first instruction ALOAD 0, which loads the this pointer on the operand stack from the local variable with index 0 (remember how the local variables are initialized). The second instruction invokes the method <init> of class Object. This method takes one parameter, which is a pointer to an object; this pointer is taken from the operand stack (which is why we have the first instruction). It returns void, as signaled in its descriptor by V. After that it simply returns control to the previous call. Notice that the instructions are followed by the description of the frame: the number of entries allocated for the operand stack and for the local variable table.
Next follows the description of the method fact, which takes a single integer argument (described by I) and returns a long value (described by J). The instruction list starts with loading the this pointer (ALOAD 0), the value of the parameter (ILOAD 1) and pushing on the operand stack the constant long value 1 (LCONST_1). All three values loaded on the operand stack are used by the following instruction, which is a call to the method factTailRec; they are consumed in reverse order (it's a stack). After the method call ends, its returned value is pushed on the operand stack. That value is used by the next instruction, LRETURN, which returns the long value from the operand stack and resumes control to the calling method.
The description of the method factTailRec is a little more complex, but not overly complicated. It loads the value of the first parameter on the operand stack (ILOAD 1) and also the integer constant 1 (ICONST_1). The next instruction compares the value of the parameter with the constant (IF_ICMPGE L1): if the variable is greater than or equal to the constant, it jumps to label L1, otherwise it continues. If the jump is not taken, it loads the value of the second variable on the operand stack (LLOAD 2) and returns it (LRETURN).
At label L1 there are the instructions for the recursive call. First the this pointer is loaded on the stack in preparation for the recursive call (ALOAD 0). Then the first variable is loaded on the stack (ILOAD 1), followed by the constant 1 (ICONST_1). Those two are used by the subtract operation, which decreases the value of the variable by the constant (ISUB). The result is put back on the stack. Notice that at this point we have two values on the stack: the this pointer and the decreased value of the variable (the subtraction instruction pops two operands from the stack and puts back one, the result).
Next the second variable is put on the stack (LLOAD 2) and also the value of the first variable (ILOAD 1). The first variable still has its original value, the modified one being on the stack. The integer value is converted to the long type (I2L) and the two values are multiplied (LMUL). The multiplication uses the last two stack operands and pushes back the result.
The operand stack now holds three values, exactly the values required by the recursive call (INVOKEVIRTUAL). The three operands are consumed by the method call and the result is put on the stack. The final result is returned (LRETURN). The last part is the description of the frame, which contains three local variables (this, n and f) and has 6 entries allocated for the operand stack and 4 for the local variable table.
I hope you were not bored reading all of that. Maybe you know how the JVM stack machine works, but I have put that description for the case when you don’t.
The structure of a tail call recursion
In general a tail call recursion has a very simple structure, with three phases. The first phase is the stopping rules; those rules define when the recursion ends. The second phase contains the calculations, and the third phase is the recursive call whose result is returned. Often, when the computation is simple, the last two phases are merged into a single one, where the computation happens just before passing the parameter values. This is the case with our method.
Having this clear shape, some observations can be made which lead to a straightforward optimization.
The first observation is that since the same method is called, the shape of the frame for the recursive call is identical with the current frame. The reason is that the shape of a frame is determined at compile time and remains fixed. Since we call recursively the same method, we are sure that the current frame fits the needs of the recursive call. The possible optimization is to avoid creating a new frame which has to be pushed on JVM execution stack with each call.
The second observation is that in order to reuse the current frame we need to prepare the stack and the local variables in the same way as a proper frame initialization would. However, this is very easy to do, simply because the last call before the return is the recursive call. In order to make a recursive call, the stack has to be filled with the value of the this pointer and all parameter values, in order. The call would pop all those values from the current operand stack and initialize the next frame with them. This is exactly what we have to do: take those values and use them to initialize the local variables of the current frame. The values are already prepared for us.
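To make the effect concrete, at the source level the transformed method behaves like the loop below (a sketch for illustration only; the real transformation works on byte code, never on Java source):

// source-level equivalent of the optimized byte code (illustrative only)
long factTailRec(int n, long f) {
    while (true) {
        if (n < 1) {
            return f;
        }
        long acc = f * n; // compute the new argument values
        n = n - 1;        // store them back into the local variables
        f = acc;
        // then jump back to the start instead of calling the method again
    }
}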
Using ASM to transform byte code
ASM is a wonderful and neat library which allows one to analyze, transform and generate byte code at compile time or at runtime. It is used by many platforms and tools, including the OpenJDK compiler itself. I do not have enough words to describe the usefulness and elegance of this library, and I feel in great debt to its creator and contributors.
ASM library allows one to transform byte code using two approaches: event based and tree based. I will use the tree based API since the changes are not trivial and could not be performed in a single pass of the parser. This is the code used to optimize a method which is tail recursive:
class TailRecTransformer extends ClassNode {

    private static final String METHOD_SUFFIX = "TailRec";

    public TailRecTransformer(ClassVisitor cv) {
        super(ASM9);
        this.cv = cv;
    }

    @Override
    public void visitEnd() {
        // we optimize all methods which end with TailRec for simplicity
        methods.stream()
                .filter(mn -> mn.name.endsWith(METHOD_SUFFIX))
                .forEach(this::transformTailRec);
        accept(cv);
    }

    void transformTailRec(MethodNode methodNode) {
        // method argument types
        Type[] argumentTypes = Type.getArgumentTypes(methodNode.desc);
        // iterator over instructions
        var it = methodNode.instructions.iterator();
        LabelNode firstLabel = null;
        while (it.hasNext()) {
            var inode = it.next();
            // locate the first label
            // this label will be used to jump instead of recursive call
            if (firstLabel == null && inode instanceof LabelNode labelNode) {
                firstLabel = labelNode;
                continue;
            }
            if (inode instanceof FrameNode) {
                // remove all frames since we recompute them all at writing
                it.remove();
                continue;
            }
            if (inode instanceof MethodInsnNode methodInsnNode
                    && methodInsnNode.name.equals(methodNode.name)
                    && methodInsnNode.desc.equals(methodNode.desc)) {
                // find the recursive call which has to have
                // same signature and be followed by return
                // check if the next instruction is return of proper type
                var nextInstruction = it.next();
                Type returnType = Type.getReturnType(methodNode.desc);
                if (!(nextInstruction.getOpcode() == returnType.getOpcode(IRETURN))) {
                    continue;
                }
                // remove the return and recursive call from instructions
                it.previous();
                it.previous();
                it.remove();
                it.next();
                it.remove();
                // pop values from stack and store them in local
                // variables in reverse order
                for (int i = argumentTypes.length - 1; i >= 0; i--) {
                    Type type = argumentTypes[i];
                    it.add(new VarInsnNode(type.getOpcode(ISTORE), i + 1));
                }
                // add a new jump instruction to the first label
                it.add(new JumpInsnNode(GOTO, firstLabel));
                // finally remove the instruction which loaded 'this'
                // since it was required by the recursive call
                while (it.hasPrevious()) {
                    AbstractInsnNode node = it.previous();
                    if (node instanceof VarInsnNode varInsnNode) {
                        if (varInsnNode.getOpcode() == Opcodes.ALOAD && varInsnNode.var == 0) {
                            it.remove();
                            // we remove only the last instruction of this kind
                            // we don't touch other similar instructions
                            // to not break the existent code
                            break;
                        }
                    }
                }
            }
        }
    }
}
I really hope the code and its comments speak for themselves. Still, I will briefly present its logic.
In order to transform a method with the tree API of the ASM library, one needs to change the values in the MethodNode class, since this is the representation of a method's JVM byte code in the ASM library. For simplicity, I created a transformer which tries to optimize all the methods whose name ends with the suffix TailRec. This is for illustrative purposes; an annotation would be preferable, but that requires more code and building an agent.
The core of the optimization logic lies in the method transformTailRec. This method receives the corresponding byte code representation of any class method whose name ends with our suffix. The optimization has the following stages.
We identify the first code label. This is the start of the code of the recursive method. We will use this label when we replace the recursive call with a simple jump instruction. This jump instruction is goto. As a fun fact, this infamous instruction does not exist in the Java language, for good reason: this kind of uncontrolled jump would break all the accounting machinery of the JVM. However, the same instruction exists in the JVM, and because in the JVM we can jump only inside the instruction list of the same method, it is safe to use.
Instead of making the recursive method call, which would create a new frame, we reuse the current frame. The next stage is to remove the recursive call and the return instruction after it, together with preparing the local variables and the stack for the next iteration. In place of the recursive call we introduce a goto instruction which points to the first label. Basically, we implemented a while loop. The stopping conditions are already in the code, so we will not obtain an infinite loop because of the optimization.
We are done!
Testing the recursive tail optimization
A complete treatment of this would imply implementing a Java agent which optimizes the code before class loading. I avoided those complications because they are irrelevant to the subject. Maybe in the future I will create a tiny GitHub project with this annotation and optimization.
To keep things simple I wrote a custom class loader which creates classes with optimized code. Java allows one to have two classes with the same specification if those classes are loaded by different class loaders. To make them easy to use, I also created an interface.
In this way we have two classes, one optimized and the other not, both implementing the same interface, so we can use them in the same JVM instance and test them with JMH. For reference, the code for the class loader is listed below.
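A sketch of such a class loader, reusing the TailRecTransformer from above, could look like this (class and package names are my assumptions):

import java.io.IOException;
import java.io.InputStream;

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassWriter;

// sketch only: rewrites the byte code of a single target class with TailRecTransformer
class TailRecClassLoader extends ClassLoader {

    private final String targetClassName;

    TailRecClassLoader(String targetClassName) {
        this.targetClassName = targetClassName;
    }

    @Override
    public Class<?> loadClass(String name) throws ClassNotFoundException {
        if (!name.equals(targetClassName)) {
            // everything else is loaded in the usual way
            return super.loadClass(name);
        }
        try (InputStream is = getResourceAsStream(name.replace('.', '/') + ".class")) {
            ClassReader reader = new ClassReader(is);
            // frames were removed by the transformer, so let ASM recompute them
            ClassWriter writer = new ClassWriter(ClassWriter.COMPUTE_FRAMES);
            reader.accept(new TailRecTransformer(writer), 0);
            byte[] bytes = writer.toByteArray();
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

// usage sketch: the optimized class is loaded through the custom class loader and
// used through the shared interface (class names here are hypothetical)
// FactorialFunction optimized = (FactorialFunction) new TailRecClassLoader("bench.TailRecFactorial")
//         .loadClass("bench.TailRecFactorial").getDeclaredConstructor().newInstance();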
The difference is pretty clear: the optimized version is faster. The differences are not large, though. This is simply because of the small number of recursive calls, which has to be small in order not to produce overflow.
Sum JMH Benchmark
For illustrative purposes I also implemented a sum over the values of an array in a tail recursive manner. Of course, this is not the best option, but if the container were a linked list it would be an appealing implementation in a functional style. Below is the implementation of the sum method.
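An assumed form of it, following the same pattern as the factorial (the TailRec suffix makes it eligible for the transformer), is:

public long sum(long[] array) {
    return sumTailRec(array, 0, 0L);
}

// tail recursive sum: the accumulator carries the partial result
public long sumTailRec(long[] array, int pos, long acc) {
    if (pos == array.length) {
        return acc;
    }
    return sumTailRec(array, pos + 1, acc + array[pos]);
}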
We also notice improvements produced by tail call elimination.
Final remarks
I am not a huge fan of recursion in general, and I tend to prefer tight iterative implementations when possible. This is by no means an argument against tail call optimization, especially tail call recursion.
Java at this moment does not offer any kind of tail call optimization. Project Loom seems to take into consideration an even larger class of call optimizations, but those do not look like a priority now. The tail recursion optimization could instead be implemented in a library, in the style of Lombok, offering the proposed optimization when a given annotation is present.
Recently I was playing a little with prime numbers and I needed a prime number generator, for which I wrote a sieve of Eratosthenes in Java. Since I wanted all primes up to 10^9, I became a little impatient about the time taken to generate those primes. So I started to optimize it incrementally. I also found a fast version which uses parallelization. This is the story.
Sieve of Eratosthenes
The Sieve of Eratosthenes is an ancient algorithm which finds the prime numbers up to a given limit. It is an iterative algorithm: in each iteration it finds the next prime number in increasing order and eliminates all the numbers which are divisible by the current prime number found.
This is a well known algorithm and I expect that most people interested in computer science or math have played with it and admired its simplicity and elegance. Here is a direct, straightforward version, as it is taught in schools.
int[] sieveClassic(int n) {
int len = 0;
int[] numbers = new int[n + 1];
for (int i = 0; i < n; i++) {
numbers[i] = i;
}
for (int p = 2; p < n; p++) {
if (numbers[p] == 0) {
continue;
}
numbers[len++] = p;
for (int j = p; j < n; j += p) {
numbers[j] = 0;
}
}
return Arrays.copyOf(numbers, len);
}
There is nothing special about the implementation. It initializes an array of consecutive numbers and iterates through those numbers starting with 2. When a number is found to be prime (its value is different from \(0\)), the number is moved to the first part of the array, and all numbers divisible by it are marked as non-prime, i.e. their value is set to \(0\).
The performance for generating all prime numbers less than 10^9 is around 40 seconds. A lot of time. One can only imagine the eternity awaiting if one wants more than that. An immediate natural question is whether we can do something to improve the situation.
First improvement: eliminating even numbers
There are two simple observations which we can make from the beginning. The first one is that we know that, other than \(2\), all even numbers are not prime. And half of the candidate numbers are even. There is plenty of computation done for them and we can avoid it with ease.
A second observation is the following. Suppose we are at an iteration and we found a prime number \(p\). We want to mark as non-prime all numbers divisible by \(p\). Those numbers are \(\{2p, 3p, 4p, \dots, p^2, p(p+1), \dots\}\). The numbers up to \(p^2\) are already eliminated: in those numbers \(p\) is multiplied by a number smaller than \(p\), so they are multiples of primes smaller than \(p\) and therefore already marked as non-prime. Thus we can start the inner loop from \(p^2\). Another small trick to shave off some more computation is to step through the inner loop with \(2p\), because the even multiples are already eliminated.
Also, because we know we have to start from \(p^2\), all primes greater than the square root of \(n\) will mark nothing. Using that limit we can further eliminate some computation, namely the inner loop for those primes.
int[] sieveSkip2(int n) {
int[] numbers = new int[n + 1];
for (int i = 1; i < n; i += 2) {
numbers[i] = i;
}
numbers[0] = 2;
int len = 1;
int lim = (int) Math.floor(Math.sqrt(n));
for (int p = 3; p < n; p += 2) {
if (numbers[p] == 0) {
continue;
}
numbers[len++] = p;
if (p < lim) {
int step = 2 * p;
for (int j = p * p; j < n; j += step) {
numbers[j] = 0;
}
}
}
return Arrays.copyOf(numbers, len);
}
Those small adjustments brought the total running time for getting all primes smaller than 10^9 down towards 12 seconds. This is good progress, but other tricks can be applied.
Second improvement: working with booleans
We are working with an array of integers because we have to return an array of prime numbers anyway, and we reuse the array for that purpose. However, we know that large jumps in memory can also affect the computation. At the price of an additional boolean array, we can improve the computation thanks to memory caching, which will produce fewer cache misses.
The next improvement is also straightforward. We simply keep the marking computations in the boolean array. Remember that a boolean element takes the same amount of memory as a byte value. We could work with bytes as well, but I preferred boolean because it is much cleaner for conditional statements. When a number is marked as not prime, its flag (the value at its index position) is set to true.
int[] sieveSkip2Flags(int n) {
int[] numbers = new int[n + 1];
boolean[] flags = new boolean[n + 1];
numbers[0] = 2;
int len = 1;
int lim = (int) Math.floor(Math.sqrt(n));
for (int p = 3; p < n; p += 2) {
if (flags[p]) {
continue;
}
numbers[len++] = p;
if (p < lim) {
int step = 2 * p;
for (int j = p * p; j < n; j += step) {
flags[j] = true;
}
}
}
return Arrays.copyOf(numbers, len);
}
This improvement, although small, reduced the computation time towards 6.6 seconds. It’s a good lesson for anybody who ignores cache and memory characteristics in real life. And there are so many who do, unfortunately.
Going further: Eliminate numbers divisible by 2 and 3
The next improvement is similar in idea with removing odd numbers, but induces some complicated code. It is not hard, but we have to do some bookkeeping which is sometimes confusing.
It is clear that going further, by eliminating from the computation all the candidates which are divisible by \(2\) or \(3\), will reduce the work even more. The number of numbers divisible by \(2\) is \(n/2\). The number of candidates divisible by \(3\) is \(n/3\), but half of them are even, so they are already counted. Still, this gives us a \(n/2+n/6=2n/3\) reduction of the candidate numbers, which is quite good. In order to make the computation even faster we should also reduce the memory.
Let’s look at the numbers after we mark with angle brackets the numbers which were not eliminated (the prime candidates) by iterating with \(2\) and \(3\):
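An illustration of the pattern, starting from \(5\), looks like this:

<5>  6 <7>  8  9 10 <11> 12 <13> 14 15 16 <17> 18 <19> 20 21 22 <23> 24 <25> 26 27 28 <29> 30 <31> ...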
What you can notice is a pattern of length \(6\): within each group of \(6\) consecutive numbers, the positions where the candidates are found are fixed. This tells us that if we take the value modulo \(6\) we will know whether that number is a prime candidate or not. This happens because \(p \bmod 6 \in \{1,5\}\), or in other words \(p = 6k \pm 1\).
This suggests a new improvement. What if we stored only the values which remain prime candidates after the removal of all numbers divisible by \(2\) or \(3\)? In other words, we keep only the two columns which contain proper prime candidates, and we do not waste space and computation on the other numbers. For that purpose we add two helper functions to convert a position into a number and vice versa.
The code with this new improvement is listed below:
int[] sieveSkip23Flags(int n) {
int size = n / 3;
int[] numbers = new int[size];
boolean[] flags = new boolean[size];
numbers[0] = 2;
numbers[1] = 3;
int len = 2;
int lim = (int) Math.floor(Math.sqrt(n));
int pos = 0;
while (true) {
while (pos < size && flags[pos]) {
pos++;
}
if (pos == size) {
break;
}
int p = posToPrime(pos);
if (p >= n) {
break;
}
numbers[len++] = p;
if (p < lim) {
int step = 2 * p;
for (int j = p * p; j < n; j += step) {
int rest = j % 6;
if (rest == 1 || rest == 5) {
flags[primeToPos(j)] = true;
}
}
}
pos++;
}
return Arrays.copyOf(numbers, len);
}
private int posToPrime(int pos) {
return ( (pos & 1) == 0 ) ? 3 * pos + 5 : 3 * pos + 4;
}
private int primeToPos(int p) {
int pp = (p + 1) / 3 - 2;
return (3 * pp + 5 < p) ? pp + 1 : pp;
}
As one can see, we have some magical arithmetic in the helper functions. You can trust the code, it works. It looks ugly because I tried to put it in a form which involves fewer operations. It is possible that an even uglier (but faster) formula exists, but I was not able to find it.
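To make the mapping concrete, here is how the first few positions correspond to candidates through posToPrime (primeToPos goes the other way):

pos:        0   1   2   3   4   5   6   7
candidate:  5   7  11  13  17  19  23  25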
The time reduction is interesting, since it came down to a little above 4 seconds, a drastic reduction from the previous variant.
At this moment I was fed up with searching for better variants. I know we could go further by eliminating other candidates, but I expect the ugliness of the code would explode and the gains would not be significant. I spent some dozens of minutes trying for something better and was ready to give up.
Can we go in parallel?
Does the Sieve of Eratosthenes algorithm allow parallel computation? Honestly, I did not consider it a viable option, since I had not read about parallel methods. I am not saying that such versions do not exist, I am saying that I did not know about them (later edit: I read some papers after writing this; there are some parallel implementations, but not like the version I found. It looks like a new approach and I have to research that topic more).
The idea of parallelism in the sieve algorithm is based on the following observation. Suppose we are iterating through numbers and we found the prime number \(p\). After adding the prime number to the list, we start to eliminate multiples of \(p\) starting with \(p^2\). The point is that between \(p\) and \(p^2\) there are also other prime numbers. For example, between \(5\) and \(25=5^2\) there are also \(7,11,13,17,19,23\). If, instead of finding only the prime number \(p\), we search for all prime numbers greater than or equal to \(p\) and less than \(p^2\), we can add all of them to the prime list and set their flags in parallel.
Using parallel computation usually raises concerns regarding concurrent updates and related things. This is not a concern here, however. Reading flags is done only before finding the next chunk of prime numbers. Updating flags happens only in the parallel loops. Additionally, all those updates do a single thing: they put the value true instead of false. If this happens concurrently it really does not matter who wins, since the result is invariant to order. As such, this allows brutal concurrent processing with no thread safety mechanisms.
Additionally, we do a small optimization such that we do not launch new threads when there is nothing to do (when the squared prime number is greater than \(n\)).
In the code below I used a preview version of structured parallel constructs from Java 19. It can be implemented also in the classic way, using normal threads and a thread executor. But I like to take a look into the new features, and the code using StructuredTaskScope is much cleaner anyway.
int[] sieveSkip32FlagsParallel(int n) {
int size = n / 3;
int[] numbers = new int[size + 1];
boolean[] flags = new boolean[size + 1];
numbers[0] = 2;
numbers[1] = 3;
int len = 2;
int lim = (int) Math.floor(Math.sqrt(n));
int pos = 0;
try (StructuredTaskScope<Object> s = new StructuredTaskScope<>()) {
while (true) {
while (pos < size && flags[pos]) {
pos++;
}
if (pos == size) {
break;
}
NextPrimes ps = addNextPrimes(numbers, len, flags, pos, n, size);
if (ps.step == 0) {
break;
}
List<Callable<Object>> tasks = new LinkedList<>();
for (int i = len; i < ps.len; i++) {
int pp = numbers[i];
if (pp < lim) {
tasks.add(() -> {
int step = 2 * pp;
for (int j = pp * pp; j < n; j += step) {
int rest = j % 6;
if (rest == 1 || rest == 5) {
flags[primeToPos(j)] = true;
}
}
return null;
});
}
}
len = ps.len;
if (!tasks.isEmpty()) {
for (var task : tasks) {
s.fork(task);
}
s.join();
}
pos += ps.step;
}
s.shutdown();
} catch (InterruptedException ignored) {
}
return Arrays.copyOf(numbers, len);
}
record NextPrimes(int step, int len) {
}
private NextPrimes addNextPrimes(int[] numbers, int len, boolean[] flags, int pos, int n, int size) {
int p = posToPrime(pos);
int square = Math.min(p * p, n);
int i = pos;
while (i < size) {
if (flags[i]) {
i++;
continue;
}
int pp = posToPrime(i);
if (pp >= square) {
break;
}
numbers[len++] = pp;
i++;
}
return new NextPrimes(i - pos, len);
}
private int posToPrime(int pos) {
return ( (pos & 1) == 0 ) ? 3 * pos + 5 : 3 * pos + 4;
}
private int primeToPos(int p) {
int pp = (p + 1) / 3 - 2;
return (3 * pp + 5 < p) ? pp + 1 : pp;
}
There is a little more code than in the initial version, but it is still pretty clean. Honestly, I did not expect a large improvement, considering the work which has to be done in parallel. Despite my expectations, the implementation performed very well, pushing the execution time further down to 2.2 seconds for generating all prime numbers less than 10^9. It almost halved the previous time!
Benchmarks
To have a proper assessment of the performance I implemented some JMH (Java Microbenchmark Harness) tests and measured the performance. I used various sizes for the prime number limit, starting with \(100\) and incrementing by one order of magnitude up to 10^9. The average time for each variant is listed below.
Even if the Sieve of Eratosthenes is an ancient and well known algorithm, implemented by most of us back in school, it still offers opportunities for improvement. I am not aware of this way of exploiting parallel computation for this algorithm; it might be a new idea or not. However, the implementation is not hard or inaccessible, and implementing all the variants with testing took me about 4 hours. It can easily be adapted for long integer values, and the same idea can readily be applied to a segmented version as well.
I hope you enjoyed that small journey around the Sieve of Eratosthenes.
There are a lot of opinions regarding the for each construct and some of them are plain wrong. I am usually not bothered by wrong opinions, but this is a simple issue which can be clarified. I was triggered by two articles on Medium which appear first when you search Google for “java foreach performance” (Which is Faster For Loop or Foreach in Java and For-each loop vs “forEach” method in Java 11. Which one is faster?). This is a bit of a rant, but the main purpose is to put things in place.
Language Specification
Since we are talking about a language feature the first reference should be Java Language Specification, the part of interest can be found here: 14.14.2. The enhanced for statement.
I strongly encourage you to read the whole section, but here is a brief summary of it:
The type of the expression must be Iterable or an array type (§10.1), or a compile-time error occurs.
If we have an Iterable, the code emulates a for loop which uses hasNext() and next(); the type of the element is the generic type, or the raw type if none can be inferred
If we have an array, then a classical indexed for loop is produced, as sketched below
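For illustration, the two desugared forms look roughly like this (list and array are placeholder variables):

// for (String s : list) { ... } over an Iterable becomes roughly:
for (Iterator<String> it = list.iterator(); it.hasNext(); ) {
    String s = it.next();
    // ... loop body ...
}

// for (int x : array) { ... } over an array becomes roughly:
for (int i = 0; i < array.length; i++) {
    int x = array[i];
    // ... loop body ...
}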
Here we could also quote Joshua Bloch in his book Effective Java:
The for-each loop, introduced in release 1.5, gets rid of the clutter and the opportunity for error by hiding the iterator or index variable completely. The resulting idiom applies equally to collections and arrays:
// The preferred idiom for iterating over collections and arrays
for (Element e : elements) {
doSomething(e);
}
When you see the colon (:), read it as “in.” Thus, the loop above reads as “for each element e in elements.” Note that there is no performance penalty for using the for-each loop, even for arrays. In fact, it may offer a slight performance advantage over an ordinary for loop in some circumstances, as it computes the limit of the array index only once. While you can do this by hand (Item 45), programmers don’t always do so.
The note on the possible performance gain no longer holds. That is because the just-in-time compiler, which, as usual, does a wonderful job, optimizes the loop code either way. End of story. We could stop here, for what it's worth. Still, some misconceptions popped up out of nowhere and we should debunk them. In this matter, I have to point out that it does not help much that the for each loop is officially named the enhanced for.
Some well-known facts
Microbenchmarking is hard
A microbenchmark is a benchmark designed to assess the performance of a small or even tiny piece of functionality. Good examples are math functions, sorting numbers and, as in our case, iterating over the elements of a collection or array.
Microbenchmarks are easy to implement and hard to get right. The main reason is that to have a good grasp of the performance one has to eliminate all sources of bias, constant folding and dead code elimination, add warm up iterations and multiple tests, and ensure environment and context diversity. It's a deep topic in itself. The de facto standard for Java microbenchmarking is JMH (Java Microbenchmark Harness).
It is meaningless and totally misleading to perform only one measurement. If you don't create the conditions to eliminate outside factors, at least do as many measurements as possible to control the external factors. And even when multiple measurements are taken, a single number like the mean is not enough; you also have to provide information about the deviations from that central behavior (standard deviations or whatever you feel is informative).
Foreach loop is syntactic sugar
Because the foreach loop is syntactic sugar, it has the same effect as if you had written the generated code yourself. No more, no less. We don't count here wrong or different scenarios, like calling the iterator's hasNext() method multiple times or iterating over an array from end to beginning. Still, the following conclusion was presented in one of the linked articles:
The for loop method is faster when using ArrayList because the for-each is implemented by the iterator, and it needs to perform concurrent modification verification.
Plain wrong, in multiple ways. First of all, the internal iterator loop calls only hasNext() and next(). There is no other call, thus no modifications or updates. The test was also performed on an ArrayList, which is not a thread safe construct, thus no concurrency work is required. Nobody should expect anything else from the foreach loop other than what you would expect if you wrote the iterator or the usual for loop yourself. As a consequence, any difference in performance should be explained by external factors, and there are plenty to consider.
Don’t compare apples and oranges
In one of the cited articles the author compares a foreach loop and an indexed for loop over a LinkedList. Perhaps this comparison could be useful if one wants to illustrate what happens when you ignore the specific characteristics of the data structure used. Here, however, it is used as an argument in favor of the foreach loop:
When using LinkedList, the for-each is much faster than the for-loop, because LinkedList is implemented by using a two-way linked list. Each addressing needs to start from the header node. If we need to traverse LinkedList, we need to avoid using the for-loop.
I'll give you another example: binary search in a sorted array, but implemented with foreach loops. The O(log n) search immediately degrades into a linear scan. Could that be used as an argument against foreach loops?! Sadly, it would be just as good an example of ignorance.
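To make the apples-and-oranges point concrete, here is a minimal sketch (class name and list size are mine) contrasting the two ways of traversing a LinkedList: the indexed loop pays for a get(i) that walks the list on every call, while the foreach loop desugars to the iterator and follows a single link per element.
import java.util.LinkedList;
import java.util.List;

public class LinkedListTraversal {

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < 50_000; i++) {
            list.add(i);
        }

        // Indexed loop: each get(i) traverses the list from one of its ends,
        // so the whole traversal does O(n^2) node hops.
        long sum1 = 0;
        for (int i = 0; i < list.size(); i++) {
            sum1 += list.get(i);
        }

        // Foreach loop: desugars to the iterator, one link followed per element,
        // so the whole traversal does O(n) node hops.
        long sum2 = 0;
        for (int value : list) {
            sum2 += value;
        }

        // Same result, very different amount of work.
        System.out.println(sum1 + " " + sum2);
    }
}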
JMH Microbenchmark
I have put together a small JMH benchmark to illustrate the similar performance. The code is as simple as possible and tests the behavior of foreach and indexed for loops on arrays and on ArrayList.
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.results.format.ResultFormatType;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.TimeValue;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.concurrent.TimeUnit;

@BenchmarkMode({Mode.Throughput})
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class ForEachOrNot {

    @State(Scope.Benchmark)
    public static class MyState {

        @Param({"100", "1000", "10000", "100000", "1000000"})
        private int len;

        private int[] array;
        private ArrayList<Integer> arrayList;

        @Setup(Level.Invocation)
        public void setup() {
            // IntArrays.newSeq is a small helper from my own code (not JMH or JDK)
            // which builds the int array {0, 1, ..., len - 1}
            array = IntArrays.newSeq(0, len);
            arrayList = new ArrayList<>();
            for (int i = 0; i < len; i++) {
                arrayList.add(i);
            }
        }
    }

    // indexed for loop over the array
    @Benchmark
    public void arrayFori(MyState s, Blackhole bh) {
        long sum = 0;
        for (int i = 0; i < s.len; i++) {
            sum += s.array[i];
        }
        bh.consume(sum);
    }

    // foreach loop over the array
    @Benchmark
    public void arrayForeach(MyState s, Blackhole bh) {
        long sum = 0;
        for (int i : s.array) {
            sum += i;
        }
        bh.consume(sum);
    }

    // explicit iterator over the ArrayList (what the foreach loop desugars to)
    @Benchmark
    public void arrayListFori(MyState s, Blackhole bh) {
        long sum = 0;
        Iterator<Integer> it = s.arrayList.iterator();
        while (it.hasNext()) {
            sum += it.next();
        }
        bh.consume(sum);
    }

    // foreach loop over the ArrayList
    @Benchmark
    public void arrayListForeach(MyState s, Blackhole bh) {
        long sum = 0;
        for (int i : s.arrayList) {
            sum += i;
        }
        bh.consume(sum);
    }

    public static void main(String[] args) throws RunnerException {
        Options opt = new OptionsBuilder()
                .include(ForEachOrNot.class.getSimpleName())
                .warmupTime(TimeValue.seconds(1))
                .warmupIterations(3)
                .measurementTime(TimeValue.seconds(1))
                .measurementIterations(10)
                .forks(1)
                .resultFormat(ResultFormatType.CSV)
                // Utils.resultPath is another small helper of mine which builds the CSV output path
                .result(Utils.resultPath(ForEachOrNot.class))
                .build();
        new Runner(opt).run();
    }
}
Below are the results for iterating over arrays, and the second part displays the results for iterating over ArrayList. Notice that the iteration over arrays uses a normal indexed for loop, and the iteration over ArrayList uses the Iterator construct. Those are the corresponding code replacements produced by the compiler, before considering the possible optimizations realized by the just-in-time compiler.
Notice the score column, which displays throughput (number of operations per millisecond). The numbers are mixed and similar once the standard errors are taken into account. As expected.
Personal opinion on for each in Java
The for-each syntax sugar construct is a good thing, since it shortens some code and removes opportunities for mistakes. I think it was not a wise decision to name this construct the enhanced for loop. While I agree it is an enhancement in terms of code brevity, it is not a faster way to iterate, and the name invites people to believe it is faster when this is not the case.
The behavior of foreach differs when one uses the construct over iterable collections versus arrays. As pointed out, when iterating over arrays it uses a normal indexed loop, since this is the natural construct. I don't consider mixing those behaviors under the same syntax a good thing. Any change to how one wants to iterate over an array reverts the syntax back to the old one, and the old syntax for arrays is very different. I don't like this and I consider it an inconsistency when reading code. This is also why I find it annoying when IDEs keep suggesting the foreach construct over arrays. Of course, it is a minor issue, and overall the advantages may prevail.
The for-each loop is a shorter piece of syntactic sugar and can be used whenever you have the opportunity, but don't expect any performance improvements from it.
When I learned about the Cholesky decomposition, the elegance and simplicity of the solution struck me immediately. It was clear that the symmetry contained some sort of redundancy inside, but it was not clear at all how that symmetry could be exploited. Here I will try to pay my insignificant tribute to the man who discovered it and tell you his story.
We are talking about France
The story of Andre Louis Cholesky probably traces back to Poland, from where it is believed his ancestors moved to France with Napoleon's army [1]. He was born in Montguyon (October 15th, 1875), north of Bordeaux, where he finished the Lycée and then entered L'Ecole Polytechnique. There he had brilliant teachers, among them Camille Jordan (yep, the Jordan normal form was not invented by a basketball player), and after graduation he entered the Army's artillery branch as Second Lieutenant.
He was assigned to work in Tunisia and Algeria, and afterwards, in 1905, he moved to the Geodesic Section of the Army Geographic Service, where he was reported to have "… a sharp intelligence and a great facility for mathematical work, having an inquiring spirit and original ideas" [2]. He took part in various geographic works, such as the topographic cartography of several territories.
Following Gauss
The methods used at the time date back to Gauss and his calculations of planetary orbits. They consisted of the following ideas. The triangulation of the terrain starts with the measurement of a line segment and iterates from there by measuring only angles (horizontal and vertical), since angles are much easier to measure than lengths over long distances. After many observations one ends up with a lot of numbers. Using those numbers directly poses some problems, since they contain errors. The procedure then uses some nonlinear equations which are linearly approximated by the first terms of their Taylor expansion, leading to a linear system: $$C x = b$$
But this poses a problem, since \(C\) is not square: the system has more equations than unknowns. The situation is not improved by the fact that the values in \(C\) and \(b\) contain errors, as we mentioned. Here Gauss comes into play, and the solution is found as the one which minimizes the \( L_2 \) norm of the residual \( \|Cx - b\|_2 \).
One recognizes that this is a quadratic problem which has a minimum where the derivative is zero (since the Hessian is positive semidefinite). Thus, taking derivatives and setting them to 0, the well-known normal equations arise: $$C^T C x = C^T b$$
where \( C \) is the matrix of coefficients, \(x\) is the desired solution and \(b\) is the right-hand side vector. At that time the solution was found by Gaussian elimination, a process which is tedious when there are many unknowns and everything is computed by hand. Many wise people from the Geographic Service were searching for a better way to solve this.
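For completeness, here is the short derivation behind the normal equations; it is the standard least squares argument, not something specific to Cholesky's work: $$\min_x \|Cx-b\|_2^2 = (Cx-b)^T(Cx-b) = x^T C^T C x - 2 b^T C x + b^T b$$ $$\nabla_x \|Cx-b\|_2^2 = 2 C^T C x - 2 C^T b = 0 \quad\Rightarrow\quad C^T C x = C^T b$$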
Hidden in plain sight
A careful observer will notice that \(C^T C\) is symmetric. This is intriguing: this is not an ordinary linear system, it has some regularity inside, some redundancy given by the symmetry, which one feels can be exploited. There must be something more, but what could it be?
Among the many who perhaps saw the same facts, the young officer found the solution. Perhaps he did not see the non-negativity requirement (known today) besides symmetry, but he knew something: a triangular system of equations is much easier to solve than a regular one. This is provided by the Gaussian elimination process itself, but he pushed it much further. Let's see. A lower triangular system looks like this: $$\begin{aligned} l_{11}x_1 & & &= b_1 \\ l_{21}x_1 &+l_{22}x_2 & &= b_2 \\ l_{31}x_1 &+l_{32}x_2 &+l_{33}x_3 &= b_3 \end{aligned}$$
One can notice that the first unknown \(x_1\) is trivial to find. Knowing \(x_1\), computing the second unknown \(x_2\) is again trivial, and so on. The previous example has 3 unknowns for illustration purposes only; you should think about its generalization. This procedure is called forward substitution, and a similar one, named backward substitution, exists for upper triangular systems.
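As an illustration, here is a minimal Java sketch of forward substitution (the method name and signature are mine; backward substitution is the mirror image, working from the last row upwards):
// Solve L y = b for y, where L is a lower triangular matrix with non-zero diagonal.
static double[] forwardSubstitution(double[][] L, double[] b) {
    int n = b.length;
    double[] y = new double[n];
    for (int i = 0; i < n; i++) {
        // subtract the contribution of the already computed unknowns y[0..i-1]
        double sum = 0;
        for (int k = 0; k < i; k++) {
            sum += L[i][k] * y[k];
        }
        y[i] = (b[i] - sum) / L[i][i];
    }
    return y;
}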
Let's start afresh by replacing \(C^T C\) with \(A\). $$A x = b$$ The Cholesky method is to find a decomposition $$A = L L^T$$ where \(L\) is a lower triangular matrix. Having done that we can see that: $$ Ax = L L^T x = L ( L^T x) = L y = b$$
First we solve for \(y\) in \( L y = b \) by forward substitution. Once we have found \(y\), we solve for \(x\) in \(L^T x = y\) by backward substitution. Clever, isn't it? But how do we find \(L\)? And here comes the elegance, because once you see how it's done it is remarkably natural.
If it looks intimidating, that's fine. You are not alone. Imagine that you are Cholesky trying to figure it out, with nothing at your disposal but some pieces of paper and a pen. You will appreciate his efforts even more. After paying attention to the patterns for a while, you start to see the regularities.
$$l_{jj} = \sqrt{a_{jj} - \sum_{k=1}^{j-1}l_{jk}^2}$$ $$l_{ij}=\frac{1}{l_{jj}}\left(a_{ij}-\sum_{k=1}^{j-1}l_{ik}l_{jk}\right) \text{ for } i>j$$
Still, if it looks intimidating, look at the code for the Cholesky-Banachiewicz algorithm below:
// Cholesky-Banachiewicz decomposition: computes the lower triangular matrix L
// such that A = L * L^T, for a symmetric positive definite matrix A
static double[][] cholesky(double[][] A) {
    int n = A.length;
    double[][] L = new double[n][n];
    for (int i = 0; i < n; i++) {
        for (int j = 0; j <= i; j++) {
            double sum = 0;
            for (int k = 0; k < j; k++) {
                sum += L[i][k] * L[j][k];
            }
            if (i == j) {
                L[i][j] = Math.sqrt(A[i][i] - sum);
            } else {
                L[i][j] = (A[i][j] - sum) / L[j][j];
            }
        }
    }
    return L;
}
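And, to connect the pieces, a hypothetical solver for \(Ax = b\) built from the sketches above could look like this (again, all names are mine, not from Cholesky or any particular library):
// Solve L^T x = y by backward substitution (L is lower triangular, so L^T is upper triangular).
static double[] backwardSubstitution(double[][] L, double[] y) {
    int n = y.length;
    double[] x = new double[n];
    for (int i = n - 1; i >= 0; i--) {
        double sum = 0;
        for (int k = i + 1; k < n; k++) {
            sum += L[k][i] * x[k]; // L[k][i] is the (i, k) entry of L^T
        }
        x[i] = (y[i] - sum) / L[i][i];
    }
    return x;
}

// Solve A x = b for a symmetric positive definite A, exactly as described above.
static double[] solve(double[][] A, double[] b) {
    double[][] L = cholesky(A);               // A = L * L^T
    double[] y = forwardSubstitution(L, b);   // L y = b
    return backwardSubstitution(L, y);        // L^T x = y
}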
On today's computers this task is trivial; at that time it was a real challenge. Cholesky described in his paper [3] that he was able to solve a system of 10 equations with 5 decimal digits of precision in 4 to 5 hours. He reported that his method was also successfully used for several systems of dimension 30, and for one of dimension 56 (but he did not mention how long that took). This is top performance by any measure!
Legacy
The life of Andre Cholesky was a mix of duty and passion. Perhaps the first ideas about his method came to him in France, when he took part in the topographic survey of the country. He definitely enriched those ideas during his missions in Tunisia and Algeria.
But this is not all. He was sent to Crete, where he led the topographic efforts and managed to complete his mission despite the impediments of nature. He then served as an artillery officer. From 1916 until February 1918 he was posted to Romania, serving as the Head of the Romanian Geographical Service [4]. There, again, he impressed so much that he was promoted to the rank of Commander.
Back in France he served as an artillery commander, and on August 31st, 1918 he fell for his country [5]. A loss for his country, for geodesy and for the whole world. But his legacy, his method, remains with us, being the first choice for solving systems of equations when the coefficient matrix is symmetric and non-negative definite. This is very often the case when solving overdetermined systems with least squares.