In Node.js, I/O operations such as HTTP requests and file access are designed to be non-blocking by default. The functions for these operations usually take a callback function as their last argument, which is called once the operation has finished. Because of this asynchronous nature, coordinating several operations often requires special means of synchronization, as this post shows.
Let’s assume we want to download a set of files from a list of URLs:
    var urls = [
      'http://example.org/file1',
      'http://example.org/file2',
      'http://example.org/file3',
    ];
If we were to use a classic synchronous, blocking function for the download operation, here called downloadSync, the implementation might look like this:
    urls.forEach(function(url) {
      downloadSync(url);
      console.log('Downloaded ' + url);
    });
    console.log('Downloads complete.');
Output:
    Downloaded http://example.org/file1
    Downloaded http://example.org/file2
    Downloaded http://example.org/file3
    Downloads complete.
The program loops through the list of URLs and downloads one file after the other. Each subsequent download is only started after the previous download has finished. No two downloads are active at the same time. Finally, the program prints a message indicating that all downloads are complete.
Now we want to embrace the asynchronous programming style of Node.js. Suppose we have a non-blocking counterpart, downloadAsync. It follows the Node.js callback convention: it returns immediately and invokes the callback passed as its last argument once the download has finished:
    urls.forEach(function(url) {
      downloadAsync(url, function() {
        console.log('Downloaded ' + url);
      });
    });
    console.log('Downloads complete.');
Output:
    Downloads complete.
    Downloaded http://example.org/file2
    Downloaded http://example.org/file3
    Downloaded http://example.org/file1
The download operations are now started one right after another, without waiting for the previous one, so they run concurrently and may finish in a different order than they were started. The quickest download finishes first, the slowest last. The total time for all downloads to finish is probably shorter than before, because slow downloads no longer hold up the faster ones.
However, there is one problem with this code: the message “Downloads complete.” is shown before the downloads have actually completed, which is obviously not the intended behavior. This happens because downloadAsync returns immediately. The forEach loop only starts the downloads, and control then flows straight on to the final console.log call before any download callback has run.
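To see this ordering without any network access, the following sketch replaces downloadAsync with a timer-based stand-in. The stand-in, its delays, and the record helper are assumptions made for illustration; a real download function would take as long as the network needs.

```javascript
// Hypothetical stand-in for downloadAsync: it simulates a download with a
// timer instead of touching the network. The delays (in ms) are made up.
var delays = {
  'http://example.org/file1': 30,
  'http://example.org/file2': 10,
  'http://example.org/file3': 20
};

function downloadAsync(url, callback) {
  // Returns immediately; the callback fires later from the event loop.
  setTimeout(callback, delays[url]);
}

var urls = [
  'http://example.org/file1',
  'http://example.org/file2',
  'http://example.org/file3'
];

// Helper (an assumption of this sketch): print a message and remember it.
var log = [];
function record(message) {
  log.push(message);
  console.log(message);
}

urls.forEach(function(url) {
  downloadAsync(url, function() {
    record('Downloaded ' + url);
  });
});

// Runs before any timer has fired, so this message is logged first.
record('Downloads complete.');
```

Run with node, this prints “Downloads complete.” first, followed by file2, file3 and file1, because the last statement executes before the event loop gets a chance to fire any of the timers.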
The ‘async’ module
What we need to correct this behavior is a way to wait for all asynchronous download operations started by the loop to finish before we print the “Downloads complete” message.
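One way to do this by hand is to count the callbacks that are still outstanding and print the message from whichever one arrives last. The sketch below assumes a timer-based stand-in for downloadAsync and a record helper, both made up for illustration, and it deliberately ignores errors.

```javascript
// Hypothetical stand-in for downloadAsync, simulated with a timer.
function downloadAsync(url, callback) {
  setTimeout(callback, 10);
}

var urls = [
  'http://example.org/file1',
  'http://example.org/file2',
  'http://example.org/file3'
];

// Helper (an assumption of this sketch): print a message and remember it.
var log = [];
function record(message) {
  log.push(message);
  console.log(message);
}

// Number of downloads that have not reported back yet.
var pending = urls.length;

urls.forEach(function(url) {
  downloadAsync(url, function() {
    record('Downloaded ' + url);
    pending -= 1;
    if (pending === 0) {
      // The last callback to arrive triggers the final message.
      record('Downloads complete.');
    }
  });
});
```

This works for a small script, but the counter has to be repeated wherever the pattern occurs, and the sketch never prints the final message for an empty list. That bookkeeping is what a library can take off our hands.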
The async module for JavaScript provides various synchronization mechanisms for exactly these kinds of situations. In our case we’re going to use async.each:
    var async = require('async');

    async.each(urls, function(url, callback) {
      downloadAsync(url, function() {
        console.log('Downloaded ' + url);
        callback();
      });
    }, function done() {
      console.log('Downloads complete.');
    });
The async.each function is similar to JavaScript’s Array.prototype.forEach. However, the function called for each item receives an additional callback parameter, which it must call once its work for that item is complete. The third argument to async.each is also a callback function. It is called once all iterations have reported completion, or as soon as one of them passes an error to its callback, in which case that error is handed on as the first argument. This is where we print the “Downloads complete” message.
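To make this contract more concrete, here is a simplified stand-in written for this post. The name eachSketch, the failing downloadAsync, and the record helper are all hypothetical; this is not the library's actual implementation, only a sketch of the behavior described above with error reporting added.

```javascript
// Simplified stand-in for async.each, written only to illustrate the
// contract. It is NOT the library's actual implementation.
function eachSketch(items, iteratee, done) {
  var pending = items.length;
  var finished = false;

  if (pending === 0) {
    return done(null); // nothing to wait for
  }

  items.forEach(function(item) {
    iteratee(item, function(err) {
      if (finished) {
        return; // the final callback has already run
      }
      if (err) {
        finished = true;
        return done(err); // report the first error right away
      }
      pending -= 1;
      if (pending === 0) {
        finished = true;
        done(null); // every iteration reported completion
      }
    });
  });
}

// Hypothetical downloadAsync that fails for one URL.
function downloadAsync(url, callback) {
  setTimeout(function() {
    if (url === 'http://example.org/file2') {
      return callback(new Error('Failed to download ' + url));
    }
    callback(null);
  }, 10);
}

var urls = [
  'http://example.org/file1',
  'http://example.org/file2',
  'http://example.org/file3'
];

// Helper (an assumption of this sketch): print a message and remember it.
var log = [];
function record(message) {
  log.push(message);
  console.log(message);
}

eachSketch(urls, function(url, callback) {
  downloadAsync(url, function(err) {
    if (err) {
      return callback(err);
    }
    record('Downloaded ' + url);
    callback();
  });
}, function done(err) {
  if (err) {
    record('Download failed: ' + err.message);
  } else {
    record('Downloads complete.');
  }
});
```

In this sketch the final callback runs exactly once. Downloads that are already in flight when the error occurs are not cancelled, so file3 is still logged after the failure message.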
Conclusion
The async module provides a rich toolset for the synchronization of asynchronous operations in JavaScript. Go check it out!