
Async I/O and Fork-Join

Parallelizing network I/O is a common need because it can reduce overall latency for user-facing tasks. But implementing an efficient and easy-to-use fork/join abstraction for network I/O is hard. This often leads to frameworks that try to simplify task scheduling, joining on task completion, and synchronization.

Fork-join I/O

Most solutions for this include one of the following patterns.

  • Describe the task flow and the data sources, typically in an XML-based domain-specific language, and then run an engine that schedules the work. The engine either notifies you when the tasks are complete or provides a blocking function to call for completion. The workflow description covers what to parallelize and when, what to do on task completion, and fallback and error handling.
  • Provide a programming-language-specific orchestration API to assemble tasks, and use that API to schedule them. The API either calls you back when the tasks are complete or provides a blocking function to call for results.

Most enterprise-targeted software products fall into the first bucket, while home-grown frameworks fall into the second. Unfortunately, given the complexity of the problem and how insanely inadequate commonly used programming languages are, such solutions end up being quite complex to understand, use, and debug.

Evented I/O does help simplify this problem to an extent, but still exposes some code complexity. Here is an example that schedules three HTTP requests and joins on their completion using node.js.
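A minimal sketch of this pattern might look like the following. It rests on a few assumptions: the names `httpGet` and `forkJoin` are hypothetical, the URIs in the usage comment are placeholders, and the `fetch` parameter is added only so that the join logic can be exercised without a network.

```javascript
const http = require('http');

// Default task (hypothetical helper): GET a URI and hand the body to the callback.
function httpGet(uri, callback) {
  http.get(uri, function (res) {
    var body = '';
    res.setEncoding('utf8');
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () { callback(null, body); });
  }).on('error', function (err) { callback(err); });
}

// Fork one task per URI, then join: call done once every task has reported.
function forkJoin(uris, fetch, done) {
  var results = [];
  var pending = uris.length;
  if (pending === 0) { return done(results); }
  for (var i = 0; i < uris.length; i++) {
    // The immediately invoked function pins `index` and `uri` for this task.
    // Without it, every callback would see the final value of `i`.
    (function (index, uri) {
      fetch(uri, function (err, body) {
        results[index] = err ? { uri: uri, error: err } : { uri: uri, body: body };
        pending -= 1;
        if (pending === 0) { done(results); }
      });
    })(i, uris[i]);
  }
}

// Usage (placeholder URIs, not from the original post):
// forkJoin(['http://example.com/a', 'http://example.com/b', 'http://example.com/c'],
//          httpGet, function (results) { console.log(results); });
```

The counter is the join: each callback decrements `pending`, and the last one to finish delivers the results in the same order as the input URIs.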

The tricky part here is wrapping each source URI inside an immediately invoked function; otherwise the loop would overwrite the input state for each task. Now imagine repeating this a few times in a complex workflow.

In reality, this code would get longer and more complex as you intermix application-specific code. node's callbacks often get blamed for the resulting complexity, but most languages lead to the same level of complexity, if not more. It is the problem itself that is non-trivial.

When I was tinkering with this problem nearly eight months ago, I tried a few options like this. It was trivial to come up with a new API to hide some of the complexity, but I was not pleased with the outcome, as it meant explicit decision-making code scattered all around. In addition, real production-ready code needs to deal with network errors, timeouts, slow servers, and HTTP errors, as well as fallback behavior in case of failures. This is the reason ql.io includes a domain-specific language and not a programming-language-specific API.
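To give a sense of the extra decision-making code involved, here is a rough sketch of a wrapper that adds a timeout and a fallback value to a single callback-style task. The name `withTimeoutAndFallback` and its parameters are hypothetical, and treating every error as "use the fallback" is a simplifying assumption.

```javascript
// Wrap a callback-style task so that it gives up after `ms` milliseconds
// and reports `fallback` instead of failing. (Hypothetical helper.)
function withTimeoutAndFallback(task, ms, fallback) {
  return function (input, callback) {
    var finished = false;

    function finish(value) {
      if (finished) { return; }  // ignore late or duplicate completions
      finished = true;
      clearTimeout(timer);
      callback(null, value);
    }

    // Start the clock first so a slow server cannot hold up the join.
    var timer = setTimeout(function () { finish(fallback); }, ms);

    task(input, function (err, value) {
      finish(err ? fallback : value);  // any error falls back as well
    });
  };
}
```

Every task in a workflow needs a decision like this one, and that is before distinguishing HTTP errors from network errors or choosing different fallbacks per source.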

With a DSL, we can express the same problem as below.

Fork-join I/O with a DSL

In this model, each statement is an expression of intent to perform an I/O task. Each expression might include dependencies on other tasks expressed through parameters. By interpreting these statements and parameters, it is trivial to come up with a simple orchestration algorithm to schedule these tasks.
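As an illustration of how simple such an algorithm can be, here is a sketch of a dependency-driven scheduler. It is not the ql.io implementation; the `orchestrate` function and the `deps`/`run` statement shape are hypothetical stand-ins for parsed statements and their parameters, and the sketch assumes the dependencies contain no cycles.

```javascript
// statements: { name: { deps: [names], run: function (args, callback) } }
// Starts every statement as soon as all of its dependencies have results.
function orchestrate(statements, done) {
  var names = Object.keys(statements);
  var results = {};
  var started = {};
  var remaining = names.length;
  var failed = false;

  function ready(name) {
    return statements[name].deps.every(function (dep) {
      return Object.prototype.hasOwnProperty.call(results, dep);
    });
  }

  function schedule() {
    names.forEach(function (name) {
      if (failed || started[name] || !ready(name)) { return; }
      started[name] = true;
      var args = statements[name].deps.map(function (dep) { return results[dep]; });
      statements[name].run(args, function (err, value) {
        if (failed) { return; }
        if (err) { failed = true; return done(err); }
        results[name] = value;
        remaining -= 1;
        if (remaining === 0) { return done(null, results); }
        schedule();  // a finished task may unblock its dependents
      });
    });
  }

  if (remaining === 0) { return done(null, results); }
  schedule();
}
```

Statements with no dependencies start together (the fork), and a statement that depends on several others starts only when all of them have completed (the join). The application never writes that logic itself.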

This approach greatly simplifies both the application problem and the framework problem. The application has an imperative way to express the intent, and the framework has a simple algorithm to orchestrate the tasks asynchronously. See engine.js if you are interested in the actual implementation.