Concurrent Asynchronous API Programming in Java
Many sites are moving towards an API program to make their data easier to access and to mash up. Working against an API is considerably different from typical approaches such as direct database access. The biggest difference is latency. The time spent in a REST lookup is mostly idle or I/O time, with very little CPU time. If a RESTful API call takes 200ms, the server or client making that call does nothing for those 200ms while it blocks, leaving the CPU idle. The easiest way to fix this is to do what the JavaScript and Node.js communities have been doing for years: asynchronous callbacks via closures. The problem is that Java has no direct and easy means to do this automatically. JavaScript does it out of necessity, as it has only one running thread, which would otherwise block the user interface.
So, how do we do this in Java as unobtrusively as possible? The answer is simple: threads. But how do we manage those threads? The biggest problem with threads is that they are expensive to create, especially for tasks that take less than a second. If we cannot create a new thread for each task, what do we do? We pool the threads and re-use them. This is the same concept as pooling and re-using connections in database processing. Setting up that type of processing may seem difficult, but it’s actually quite simple. It boils down to the standard interview question: What are the three ways to create threads in Java? Actually, the standard question asks for the two ways (implement Runnable, extend Thread). The problem is that very few people know about the third way, available since JDK 5 added the java.util.concurrent package.
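As a rough illustration of the three approaches, here is a minimal sketch that runs the same trivial piece of work each way. The class and method names (ThreadCreationDemo, CountingTask, CountingThread, runAllThreeWays) are made up for this example; only the JDK classes are real.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ThreadCreationDemo {

    // Way 1: implement Runnable and hand it to a Thread.
    static class CountingTask implements Runnable {
        private final AtomicInteger counter;
        CountingTask(AtomicInteger counter) { this.counter = counter; }
        public void run() { counter.incrementAndGet(); }
    }

    // Way 2: extend Thread and override run().
    static class CountingThread extends Thread {
        private final AtomicInteger counter;
        CountingThread(AtomicInteger counter) { this.counter = counter; }
        @Override public void run() { counter.incrementAndGet(); }
    }

    // Runs the same unit of work all three ways and returns how many completed.
    static int runAllThreeWays() throws InterruptedException {
        AtomicInteger counter = new AtomicInteger();

        Thread viaRunnable = new Thread(new CountingTask(counter));
        Thread viaSubclass = new CountingThread(counter);
        viaRunnable.start();
        viaSubclass.start();

        // Way 3: let an ExecutorService own and re-use the threads.
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(new CountingTask(counter));

        viaRunnable.join();
        viaSubclass.join();
        pool.shutdown(); // stop accepting work; already-submitted tasks still run
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return counter.get();
    }
}
```

In the first two cases we own the thread's whole lifecycle (start, join). In the third, we only describe the work and the pool decides which thread runs it.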
If you have never looked at this package, I highly encourage you to look at it in more depth or to check out the various other blogs/articles about it. The package contains a ton of utilities for developing concurrent programs in Java: concurrent maps, synchronization strategies, blocking queues, thread management, and many more. Thread management is the one we are concerned with here. Thread management in this package starts with the Executors helper class and the ExecutorService interface. The Executors helper class provides easy methods to create an ExecutorService as either a cached thread pool, which grows on demand as new threads are needed, or a fixed thread pool, which queues tasks once all of its threads are busy. As a developer, all you need to do is submit new tasks, and each task will be executed in the background once a thread is available. Submitting a task returns a Future. With the Future in hand, you can poll for completion of the task or simply wait for it. With only a few lines of code, we have created a re-usable, thread-managed system for querying APIs.
    import java.util.concurrent.*;

    public class ApiService {
        // One worker thread per available processor.
        private static final int THREADS = Runtime.getRuntime().availableProcessors();
        private final ExecutorService executor = Executors.newFixedThreadPool(THREADS);

        public void invokeApi(final String path)
                throws InterruptedException, ExecutionException, TimeoutException {
            Future<Object> future = executor.submit(new Callable<Object>() {
                public Object call() throws Exception {
                    // query the API (via HttpURLConnection, HTTP Client, commons http, etc)
                    // map the API to POJOs (jaxb, JSON mappers, etc)
                    Object result = null; // placeholder for the mapped response
                    return result;
                }
            });
            // Wait up to 5 seconds for the background task to finish.
            Object result = future.get(5, TimeUnit.SECONDS);
            // do stuff with result
        }
    }
With these few lines, we ensure that at most THREADS tasks run concurrently, while any additional tasks wait in the queue. Before the addition of these packages and classes, this type of programming required several libraries and classes.
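To make the two ways of consuming a Future concrete (polling versus waiting), here is a small sketch. The names FutureWaitDemo, slowCall, pollForResult and waitForResult are hypothetical, and Thread.sleep stands in for the remote call. Cancelling on timeout is my own assumption about what a caller would want, not something the code above does.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureWaitDemo {

    // Hypothetical stand-in for an API call that takes delayMillis to answer.
    static Future<String> slowCall(ExecutorService executor, final long delayMillis) {
        return executor.submit(new Callable<String>() {
            public String call() throws Exception {
                Thread.sleep(delayMillis);
                return "response";
            }
        });
    }

    // Option 1: poll with isDone(), doing other work between checks.
    static String pollForResult(Future<String> future)
            throws InterruptedException, ExecutionException {
        while (!future.isDone()) {
            Thread.sleep(10); // stand-in for other useful work
        }
        return future.get(); // the task has finished, so this returns at once
    }

    // Option 2: block with a timeout and give up on a slow call.
    static String waitForResult(Future<String> future, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so its thread is freed
            return null;         // the caller decides what a missing result means
        }
    }
}
```

Without the cancel, a timed-out task keeps occupying a pool thread until it finishes on its own, which matters when the pool is small.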
With Groovy, this becomes even simpler, as you can use closures directly without the new Callable<Object>() { public Object call() { ... } } boilerplate. Lambda expressions, originally planned for JDK 7 and ultimately shipped in Java 8, remove that boilerplate in plain Java as well.
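For comparison, here is roughly what the submission looks like with a Java 8 lambda. This is a sketch, not a drop-in replacement: LambdaApiService and its fetch method are hypothetical, and fetch only simulates the remote call with a short sleep.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class LambdaApiService {
    private final ExecutorService executor =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    // Returns the Future so the caller chooses when (and how long) to wait.
    Future<String> invokeApi(String path) {
        // The lambda replaces the anonymous Callable class entirely.
        Callable<String> task = () -> fetch(path);
        return executor.submit(task);
    }

    // Hypothetical stand-in for the HTTP call and response mapping.
    private String fetch(String path) throws InterruptedException {
        Thread.sleep(50);
        return "response for " + path;
    }

    void shutdown() {
        executor.shutdown();
    }
}
```

Assigning the lambda to a Callable<String> variable first is a deliberate choice here: it makes it obvious which submit overload is used, since submit also accepts a Runnable.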
Now that we have this code in place, we can do a few things to improve response times and processing. First, if you only need one API invocation per request, the only thing this snippet provides is thread queuing. If you have multiple invocations, you can submit all of the independent ones and then wait on them, so they run concurrently rather than serially. Three calls then take roughly as long as the slowest one (1x) rather than the sum of all three (3x). Second, if you are using the Servlet 3.0 API, you can pause a request while waiting on the API invocation to complete, which allows the servlet container to use the request thread to process other incoming requests. This provides better use of request threads and higher throughput for the entire system. I would not recommend this if the expected API processing times and latency are below 250-500ms, as the context switching between threads could become expensive if the servlet container is pausing, processing, and resuming requests over and over.
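The first point, submitting independent calls up front and only then waiting, can be sketched as follows. ConcurrentApiCalls, callApi and fetchAll are hypothetical names, and callApi fakes a 200ms remote call with a sleep.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ConcurrentApiCalls {

    // Hypothetical stand-in for one remote call taking about 200ms.
    static String callApi(String path) throws InterruptedException {
        Thread.sleep(200);
        return "response for " + path;
    }

    static List<String> fetchAll(ExecutorService executor, List<String> paths)
            throws InterruptedException, ExecutionException, TimeoutException {
        // Submit everything first so the calls overlap...
        List<Future<String>> futures = new ArrayList<>();
        for (String path : paths) {
            Callable<String> task = () -> callApi(path);
            futures.add(executor.submit(task));
        }
        // ...then collect the results in submission order.
        List<String> results = new ArrayList<>();
        for (Future<String> future : futures) {
            results.add(future.get(5, TimeUnit.SECONDS));
        }
        return results;
    }
}
```

With a pool of at least three threads, three 200ms calls should complete in roughly 200ms rather than 600ms. With fewer threads than tasks, the extra tasks queue and the total time grows accordingly. Calling get() inside the first loop instead would silently turn this back into serial execution.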
The most important takeaway from this blog post is to look into the java.util.concurrent package. Then, the next time an interviewer asks you for the two ways to create a thread, you can amaze them by telling them all three.

Great article. Actually I am using this approach for the ad server implementation. Currently, I am dealing with 5B searches per day with 50 boxes. I am thinking of using async io call via continuation b/c threads are considered to be expensive and in order to push the servers to the limit, i don’t want any thread being hung up on waiting responses from 3rd parties… (range from 100ms to 1s). However, I haven’t tried this approach and see if you have any input on this. I heard of scala provides lightweight thread via continuation.. never try it before, have you?