Code Generation Synopsis
One of the areas I have dug into lately is code generation and class file manipulation. It’s quite amazing how much you learn when you begin to understand what the JVM is doing and how class files are constructed. Even though class generation is rarely used on a day-to-day basis, it gives you a great appreciation for the platform, and I strongly suggest digging into it, even if only a little. I now have a much better idea of how various code will affect the JVM in terms of performance and handling. There are several ways to generate byte code at runtime, and I will touch on only a few of them here.
There are essentially two main types of runtime code generation: class file generation and proxy class generation. Proxy class generation uses class file generation behind the scenes but lets the developer stay above the low-level byte code by using actual Java code to inject or intercept method calls. The two libraries I typically use are CGLib and Cojen. CGLib does proxy-based generation, whereas Cojen does byte code manipulation. There are various other libraries, including BCEL, ASM, and Javassist. There are many more than that, but those are the ones I have looked into briefly…if you know of others, feel free to drop a comment. I personally prefer Cojen only because it is most familiar to me. They all share the basic idea of allowing byte code to be generated and injected into dynamic class files.
Let’s start with a synopsis of what we are going to performance test. One of the more popular coding styles is AOP (aspect-oriented programming), which dynamically surrounds a given code block or method to perform some related (or somewhat unrelated) action and then delegates to the underlying instance. For example, you may have some method that requires transaction semantics. Rather than code the transaction handling directly into the business logic, delegation can be used to wrap the method instead. This is often referred to as separation of concerns: cross-cutting concerns are decoupled from the actual business logic. This is typically done through code generation, by dynamically creating the wrapper classes that perform the delegation. In my test scenario, I am going to recreate a simple version of transaction semantics so that I can begin and end a transaction, delegating to the underlying handler in the process. To test this out, I will use a pure Java solution, Cojen for dynamic class file generation, and CGLib for proxy or interception handling.
But first, let’s create our base handler class and method:
public class Handler {
    public long calculate(int count) {
        long value = 1;
        for (int i = 0; i < count; i++) {
            value *= i;
        }
        return value;
    }
}
Nothing really fancy here…just a simple calculate method that essentially multiplies continually up to the given count. The calculate method is what we will make transactional by starting and stopping a transaction per invocation:
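import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class Transaction {
    // each Transaction owns its own lock
    private Lock lock = new ReentrantLock();

    public Transaction() {
        super();
    }

    public void startTransaction() {
        lock.lock();
    }

    public void endTransaction() {
        lock.unlock();
    }
}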
Pretty straightforward transactional semantics that perform a lock and unlock per transaction. The point of this exercise is to compare and contrast the libraries, so the code here is deliberately simple.
Next, let’s set up a simple test harness to actually invoke the calculate method. The harness can take in any Handler instance and simply executes the calculate method over and over, averaging the results.
public class Harness {
    public static void test(Handler handler) {
        // dry run first to setup classloader
        handler.calculate(10);

        int runs = 1000;
        long total = 0L;
        for (int i = 0; i < runs; i++) {
            // run timer
            long start = System.nanoTime();
            for (int j = 0; j < 10000; j++) {
                handler.calculate(10);
            }
            long end = System.nanoTime();

            long duration = end - start;
            total += duration;
        }

        System.out.println("TOTAL: " + total / (double) runs + " ns");
    }
}
To give ourselves a baseline without the transaction semantics, let’s test the performance of purely invoking the calculate method.
public static void main(String[] args) { Harness.test(new Handler()); }
On my MacBook Pro, the harness reported an average of around 3,400 nanoseconds per timed run (each run being 10,000 calls).
With the baseline in place, let’s get to the actual test cases. First up, let’s look at a pure Java solution. This, in theory, should be the fastest, with no dynamic code generation and all the JVM compiler hooks in place. In practice, however, creating a custom delegate class per business class purely to perform transaction semantics can quickly become cumbersome and error prone (ie: copy-and-paste errors). Nonetheless, it is an interesting comparison to the real-world scenarios with code generation.
public class JavaTest extends Handler {
    private Handler delegate;

    public JavaTest(Handler delegate) {
        this.delegate = delegate;
    }

    public long calculate(int count) {
        Transaction trans = new Transaction();
        trans.startTransaction();
        long result = this.delegate.calculate(count);
        trans.endTransaction();
        return result;
    }
}
Basically, the JavaTest class extends the base handler and then overrides the calculate method at which point it begins the transaction, delegates to the underlying instance, and ends the transaction. Using our harness, we can quickly test that:
public static void main(String[] args) { JavaTest test = new JavaTest(new Handler()); Harness.test(test); }
In my testing that resulted in 450,000 nanoseconds. So, our transactional semantics add quite a bit of overhead, but that was to be expected with the lock/unlock behavior. This number is the reference point for the remaining dynamic generation scenarios: the real test is to see how those compare to 450,000 nanoseconds.
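One detail of the pure Java delegate may be worth a sketch: if delegate.calculate throws, endTransaction is never reached and the lock stays held. The variant below wraps the call in try/finally. It is only an illustration, not the code that produced the timings above. Handler and Transaction are re-declared as nested stand-ins so the block compiles on its own, and the negative-count check and the lastTransaction field are hypothetical additions that exist purely to make the behavior observable.

```java
import java.util.concurrent.locks.ReentrantLock;

public class SafeDelegateSketch {
    // Stand-in for the article's Handler. The negative-count check is a
    // hypothetical addition so that there is something that can throw.
    public static class Handler {
        public long calculate(int count) {
            if (count < 0) {
                throw new IllegalArgumentException("count must be >= 0");
            }
            long value = 1;
            for (int i = 0; i < count; i++) {
                value *= i;
            }
            return value;
        }
    }

    // Stand-in for the article's Transaction, with the lock exposed for inspection.
    public static class Transaction {
        public final ReentrantLock lock = new ReentrantLock();

        public void startTransaction() {
            lock.lock();
        }

        public void endTransaction() {
            lock.unlock();
        }
    }

    // Same shape as JavaTest, but the transaction always ends.
    public static class SafeJavaTest extends Handler {
        private final Handler delegate;
        // hypothetical field: remembers the last transaction so a caller can inspect it
        public Transaction lastTransaction;

        public SafeJavaTest(Handler delegate) {
            this.delegate = delegate;
        }

        @Override
        public long calculate(int count) {
            Transaction trans = new Transaction();
            lastTransaction = trans;
            trans.startTransaction();
            try {
                return delegate.calculate(count);
            } finally {
                // runs on normal return and when calculate throws
                trans.endTransaction();
            }
        }
    }

    public static void main(String[] args) {
        SafeJavaTest test = new SafeJavaTest(new Handler());
        try {
            test.calculate(-1);
        } catch (IllegalArgumentException expected) {
            System.out.println("lock still held: " + test.lastTransaction.lock.isLocked());
        }
    }
}
```

The same try/finally shape would need to be reproduced by hand in any generated byte code, which is one more reason to start from compiled Java source when building the generator.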
The easiest way to get something like transactional semantics is using CGLib. CGLib is used purely from Java and requires no understanding of JVM byte code or class files. Behind the scenes it generates byte code and class files, but as the developer you simply create a method interceptor that performs whatever logic you need. CGLib has the concept of Callback classes such as MethodInterceptor: whenever a method is invoked on the proxied instance, CGLib invokes the callback to perform the actual handling. However, the same callback is invoked for every method, so you have to perform method detection to determine the proper course of action. For our test, this is pretty simple. If the method is calculate, surround it with a transaction; otherwise, delegate accordingly.
import java.lang.reflect.Method;

import net.sf.cglib.proxy.Enhancer;
import net.sf.cglib.proxy.MethodInterceptor;
import net.sf.cglib.proxy.MethodProxy;

public class CGLibTest {
    public static Handler createHandler(final Handler handler) {
        // Enhancer is used to enhance an existing class with custom functionality
        return (Handler) Enhancer.create(Handler.class, new MethodInterceptor() {
            @Override
            public Object intercept(Object obj, Method method, Object[] params, MethodProxy proxy)
                    throws Throwable {
                // check if method being invoked is our calculate method
                if ("calculate".equals(method.getName())) {
                    // surround with transaction
                    Transaction trans = new Transaction();
                    trans.startTransaction();

                    // delegate to the underlying handler (final method param above)
                    long result = ((Long) proxy.invoke(handler, params)).longValue();

                    // end the transaction and return the result
                    trans.endTransaction();
                    return Long.valueOf(result);
                }
                // otherwise, just delegate accordingly
                else {
                    return proxy.invoke(handler, params);
                }
            }
        });
    }
}
That should be fairly trivial to understand. CGLib provides all the information necessary to perform whatever actions are required. The proxy object allows you to invoke the method on another instance (ie: delegation) or to invoke the super class. There are several other Callback-related classes that may be of interest that I encourage you to look into on your own. Getting back to the results, let’s create our harness and calculate.
public static void main(String[] args) { Harness.test(createHandler(new Handler())); }
For me, that resulted in 865,000 nanoseconds (a 92% increase over the pure Java test). Obviously, there is overhead in CGLib to create the proxy and generically invoke the callback. We also have to test for the proper method on every call, which adds computation time. In general though, not a bad result given how simple it is to create a generic transactional proxy for any business logic. Note that we could have easily substituted any class for Handler.class and passed it to Enhancer. We could also go a step further and make this truly generic by defining an interface that declares the calculate method or by using annotations (ie: @Transactional). The callback handling code can then check whether the object is an instance of that interface or check the method annotations. These scenarios may increase computation time, however.
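The annotation idea can be sketched without any third-party library. The block below swaps in the JDK's own java.lang.reflect.Proxy in place of CGLib, purely so that it runs with nothing but the JDK. Unlike Enhancer, the JDK proxy needs an interface to implement. The @Transactional annotation, the Calculator interface, the wrap helper, and the transaction counter are all hypothetical names introduced for illustration, and the stand-in calculate starts its loop at 1 so the result is non-zero.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class AnnotatedProxySketch {
    // Hypothetical marker annotation; must be retained at runtime to be visible via reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Transactional {
    }

    // Hypothetical interface: only calculate is marked as transactional.
    public interface Calculator {
        @Transactional
        long calculate(int count);

        String describe();
    }

    public static class Handler implements Calculator {
        public long calculate(int count) {
            long value = 1;
            for (int i = 1; i <= count; i++) { // starts at 1 so the product is non-zero
                value *= i;
            }
            return value;
        }

        public String describe() {
            return "handler";
        }
    }

    // Counts started transactions so the effect of the proxy can be observed.
    public static final AtomicInteger transactions = new AtomicInteger();

    @SuppressWarnings("unchecked")
    public static <T> T wrap(final T target, Class<T> iface) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                // 'method' is the interface method, so the annotation declared there is visible
                if (!method.isAnnotationPresent(Transactional.class)) {
                    return unwrap(method, target, args);
                }
                ReentrantLock lock = new ReentrantLock();
                lock.lock(); // start of "transaction"
                transactions.incrementAndGet();
                try {
                    return unwrap(method, target, args);
                } finally {
                    lock.unlock(); // end of "transaction"
                }
            }
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] { iface }, handler);
    }

    // Reflective call that rethrows the target's own exception instead of the wrapper.
    private static Object unwrap(Method method, Object target, Object[] args) throws Throwable {
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause();
        }
    }

    public static void main(String[] args) {
        Calculator calc = wrap(new Handler(), Calculator.class);
        System.out.println(calc.calculate(10) + " / " + calc.describe());
        System.out.println("transactions started: " + transactions.get());
    }
}
```

With CGLib, the same annotation check would sit inside the MethodInterceptor in place of the method-name comparison. The reflective lookup on every call is the kind of extra work that may increase computation time.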
Finally, let’s look at byte code generation using Cojen to perform the transactional semantics. This case is much more involved, as it means generating actual low-level byte code. However, I will again stress that having some understanding of byte code will really enlighten your JVM experience around performance. Even though we are dealing with class file generation, libraries make it easier by removing the need to manage concepts like the ConstantPool, strings, class file formats, etc. Instead, they allow you to focus on the actual method-level byte code.
Let’s get started. The first step is to define a class file instance with the given class name and optional super class (ie: Handler.class). Note that because we are creating a class file, we typically pass in class names as strings, since certain classes may not have been generated yet and Java only needs the class instance at runtime, not at the time the class file is created.
RuntimeClassFile classFile = new RuntimeClassFile("com.znet.codegen.Handler$Test", Handler.class.getName());
Next, we need to define a field for our delegate instance. If you remember from the base java test above, we did the same thing. This shows how to define that within bytecode. Again, note that we use class descriptions (ie: TypeDesc) rather than actual class instances. Also note the use of the Modifiers class to specify the privateness of the field.
TypeDesc handlerType = TypeDesc.forClass(Handler.class); classFile.addField(Modifiers.PRIVATE, "delegate", handlerType);
We now need a constructor that takes a Handler instance as a delegate and assigns that value to the field. Within Java, this is fairly trivial and only a single line of code as the JVM figures out all the other details. However, within byte code, this is slightly more difficult. To be class file compliant, the constructor must invoke a single constructor from the parent class and then return void from the method ensuring the operand stack is empty at that time.
// define the constructor specifying the parameter types
MethodInfo mi = classFile.addConstructor(Modifiers.PUBLIC, new TypeDesc[] { handlerType });

// create a code builder for the method in order to generate the byte code
CodeBuilder builder = new CodeBuilder(mi);

// invoke the default super constructor...make sure to load the this instance
builder.loadThis();
builder.invokeSuperConstructor(new TypeDesc[0]);

// load 'this' as the object of the field to store the value to
builder.loadThis();

// load the first parameter passed to the constructor
builder.loadLocal(builder.getParameter(0));

// store the value to the field named delegate with the specified type
// storeField expects the stack to have the object containing the field and the value to set
builder.storeField("delegate", handlerType);

// return from the constructor
builder.returnVoid();
Now comes the fun part…building the actual calculate method to perform transaction semantics. This will use the same code as the java example above but doing so in byte code. Note that if you are unfamiliar with byte code or need to look at an example of how something might be done, the best way is to just generate the same Java code and use a byte code analyzer to visualize the resulting class file. I typically do this by using the Byte Code Outline Plugin in Eclipse. This plugin comes with an Eclipse view that links to the active editor and disassembles the Java source into the byte code commands. It makes it easy to see how the Java compiler creates byte code for any source.
Below is the basic code to generate the needed transaction semantics:
// define: public long calculate(int)
MethodInfo mi = classFile.addMethod(Modifiers.PUBLIC, "calculate", TypeDesc.LONG,
                                    new TypeDesc[] { TypeDesc.INT });
CodeBuilder builder = new CodeBuilder(mi);

// define: Transaction trans = new Transaction()
// NOTE: you must first create a new instance of the object which is uninitialized
//       and then invoke the associated default constructor
LocalVariable trans = builder.createLocalVariable("trans", TypeDesc.forClass(Transaction.class));
builder.newObject(TypeDesc.forClass(Transaction.class));

// duplicate the resulting instance on the stack to 1. invoke ctor and 2. store variable
builder.dup();

// duplicate again on stack to invoke startTransaction method afterwards
builder.dup();

// invoke default ctor (expects instance on stack)
builder.invokeConstructor(Transaction.class.getName(), new TypeDesc[0]);

// store instance on to local variable (expects instance on stack)
builder.storeLocal(trans);

// invoke Transaction.startTransaction (expects instance on stack)
// NOTE: we use reflection to get method to simplify having to define the types and whether
//       the method is invokeVirtual or invokeInterface
builder.invoke(Transaction.class.getMethod("startTransaction"));

// define: long result
LocalVariable result = builder.createLocalVariable("result", TypeDesc.LONG);

// load delegate field (expects class instance of field on stack)
builder.loadThis();
builder.loadField("delegate", handlerType);

// load the first parameter of the method (the count)
builder.loadLocal(builder.getParameter(0));

// invoke delegate.calculate(count)
// NOTE: the first object on the stack must be the class instance to operate on
//       and the next object(s) must be the parameters to call the method
builder.invoke(Handler.class.getMethod("calculate", new Class[] { int.class }));

// store the result of the method to the result variable
builder.storeLocal(result);

// reload the transaction variable
builder.loadLocal(trans);

// invoke Transaction.endTransaction
builder.invoke(Transaction.class.getMethod("endTransaction"));

// load the result again
builder.loadLocal(result);

// return the result as a long
builder.returnValue(TypeDesc.LONG);
This is much more in depth and complicated than the CGLib example. However, if we run this through our test harness, we get much better results.
public static void main(String[] args) throws Exception {
    Class<? extends Handler> clazz = createHandlerClass();
    Handler handler = clazz.getConstructor(Handler.class).newInstance(new Handler());
    Harness.test(handler);
}
Running this on my MacBook Pro resulted in 450,000 ns. This is basically the same as the pure Java test because essentially it is a pure Java test. The only difference is that rather than writing Java source files for every business logic class, we automatically generate the class in a generic fashion without sacrificing performance. Further, we could use annotations or other methods to dynamically discover and inject classes. See my post on Building Annotation Driven Configuration for ideas on how to dynamically discover classes.
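To put the reported numbers on a per-call footing, here is a small arithmetic sketch. It assumes, as the harness code suggests, that each printed figure is the average for one timed run of 10,000 calls (the output label says TOTAL, but the value is divided by the number of runs). The class and method names are hypothetical; the input figures are the ones quoted in this post.

```java
public class PerCallSketch {
    // inner loop count in Harness.test
    static final int CALLS_PER_RUN = 10_000;

    // average nanoseconds for a single calculate(10) call
    static double perCall(double nsPerRun) {
        return nsPerRun / CALLS_PER_RUN;
    }

    // relative increase of 'other' over 'base', in percent
    static double overheadPercent(double base, double other) {
        return (other - base) / base * 100.0;
    }

    public static void main(String[] args) {
        System.out.println("baseline  ns/call: " + perCall(3_400));
        System.out.println("pure Java ns/call: " + perCall(450_000));
        System.out.println("Cojen     ns/call: " + perCall(450_000));
        System.out.println("CGLib     ns/call: " + perCall(865_000));
        System.out.println("CGLib over pure Java, %: " + overheadPercent(450_000, 865_000));
    }
}
```

That works out to roughly 0.34 ns per call for the baseline, 45 ns for the pure Java and Cojen delegates, and 86.5 ns for CGLib. A baseline that small may mean the JIT optimized much of the unused computation away, so the absolute numbers are best read as relative indicators.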
A big issue with generating dynamic classes in this manner is ensuring the byte code is correct. When you write code in Java, the compiler detects errors and tells you. When you write byte code, there is no compiler to protect you. Instead, Java does runtime checks of the class during class loading to ensure it is valid. If it is not, the JVM throws a lovely VerifyError with some description of the problem. More often than not, the message is not helpful beyond pointing out which method the error may have been in. One way to get around this is to output the class file to disk and then load that class file through Eclipse or another byte code visualizer.
ClassFile classFile = ...;
// try-with-resources closes the stream even if writeTo throws
try (FileOutputStream fos = new FileOutputStream(classFile.getClassName() + ".class")) {
    classFile.writeTo(fos);
}
At that point look through the byte code and even take a pencil and paper and write out the stack and operations one-by-one. Another option is to write the code in Java and then compare that generated class file. A final option is to use the method builder.mapLineNumber to inject line numbers into the class file to allow the JVM to output line numbers in stack traces.
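The pencil-and-paper exercise can also be mimicked in a few lines of Java. The sketch below replays the builder calls from the generated calculate method and tracks how many values sit on the operand stack after each one. The pop and push counts are my own hand-derived reading of those calls, not something Cojen reports, and the model counts values rather than JVM slots (a long really occupies two slots). All class and method names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class StackTraceSketch {
    // One builder call with the number of values it pops and pushes.
    static final class Op {
        final String name;
        final int pops;
        final int pushes;

        Op(String name, int pops, int pushes) {
            this.name = name;
            this.pops = pops;
            this.pushes = pushes;
        }
    }

    // The calculate method's builder calls, in order (counts derived by hand).
    static List<Op> calculateOps() {
        List<Op> ops = new ArrayList<Op>();
        ops.add(new Op("newObject Transaction", 0, 1));
        ops.add(new Op("dup", 1, 2));
        ops.add(new Op("dup", 1, 2));
        ops.add(new Op("invokeConstructor Transaction()", 1, 0));
        ops.add(new Op("storeLocal trans", 1, 0));
        ops.add(new Op("invoke startTransaction", 1, 0));
        ops.add(new Op("loadThis", 0, 1));
        ops.add(new Op("loadField delegate", 1, 1));
        ops.add(new Op("loadLocal count", 0, 1));
        ops.add(new Op("invoke calculate(int)", 2, 1));
        ops.add(new Op("storeLocal result", 1, 0));
        ops.add(new Op("loadLocal trans", 0, 1));
        ops.add(new Op("invoke endTransaction", 1, 0));
        ops.add(new Op("loadLocal result", 0, 1));
        ops.add(new Op("returnValue long", 1, 0));
        return ops;
    }

    // Returns the stack depth after each op; fails if an op needs more values than are present.
    static int[] trace(List<Op> ops) {
        int[] depths = new int[ops.size()];
        int depth = 0;
        for (int i = 0; i < ops.size(); i++) {
            Op op = ops.get(i);
            if (depth < op.pops) {
                throw new IllegalStateException("stack underflow at: " + op.name);
            }
            depth = depth - op.pops + op.pushes;
            depths[i] = depth;
        }
        return depths;
    }

    public static void main(String[] args) {
        List<Op> ops = calculateOps();
        int[] depths = trace(ops);
        for (int i = 0; i < depths.length; i++) {
            System.out.println(depths[i] + "  " + ops.get(i).name);
        }
    }
}
```

Dropping one of the dup calls from the list makes the trace fail at the startTransaction step, which is the kind of mistake that otherwise only surfaces as a VerifyError at class load time.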
So, we have seen how to generate code in two different ways and the pros and cons of each. To reiterate, CGLib is great for requirements that are not performance intensive, as you can easily write the concerns in a Callback class in Java and keep the protection of the compiler. However, where performance or scalability is a big concern (as it typically is in my world), pure byte code manipulation is a great way to go if you can get over the subtle annoyances of byte code, debugging issues, and verify errors. Those annoyances do tend to occur, but the end result is vastly more scalable.
I hope you enjoyed this article and that you look into byte code, as it will truly change your perspective.

Nothing important, but the calculate method of the Handler class always returns 0, since the for loop starts at 0.
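The observation above is easy to check with a short sketch. The first method copies the loop from Handler as written. The second is a hypothetical variant that starts at 1 and runs through count, on the assumption that a running product (a factorial) was the intent. The class and method names are mine.

```java
public class CalculateSketch {
    // Loop exactly as written in Handler: the first factor is i = 0.
    static long calculateAsWritten(int count) {
        long value = 1;
        for (int i = 0; i < count; i++) {
            value *= i;
        }
        return value;
    }

    // Hypothetical variant: multiply 1 * 2 * ... * count.
    static long calculateFromOne(int count) {
        long value = 1;
        for (int i = 1; i <= count; i++) {
            value *= i;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(calculateAsWritten(10)); // prints 0
        System.out.println(calculateFromOne(10));   // prints 3628800
    }
}
```

Strictly speaking, the original returns 0 for any count of 1 or more and returns 1 when count is 0, because the loop body never runs in that case. The timings in the post were taken with the original loop.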