Pierre Palatin's corner

Random posts about some stuff I’ve been doing.

Javascript Web Workers Startup

Posted at — Sep 4, 2011

Some explanation and code for initializing HTML5 JavaScript Web Workers without race conditions. As far as I know, this issue is hardly documented.

HTML5 Web Workers add parallelism to JavaScript, including on the browser side. Their designers did not fall into the trap of just adding threads to JavaScript; instead, workers run in fairly well isolated sandboxes. The only communication between workers and the main page is through messages, which are copied on send, and that prevents most threading issues: no variables changing under your feet without you knowing it, extremely clear interfaces between parallel branches of code, and so on.

There are plenty of articles covering the basic usage of web workers, for example on the Mozilla Developer Network.

However, details are often missing (or unclear) in all those pages covering only the basics. One of the first issues I ran into was how to start things up properly:

Race conditions on startup

There’s no apparent guarantee that, when starting a new worker, you will set onmessage in time on either side. The Worker() constructor loads and executes the script as soon as possible, so you have no guarantee about execution order.

Given that you have no control over load time or message transmission time, it might sound like there’s no way to do a proper initialization. That is, when posting a message, you have no guarantee that the other side is already configured to process it (messages are dropped when no onmessage handler is specified).

However, those who worked on the implementation of web workers are obviously not stupid, and there’s a solution for that, mentioned on the wonderful Stack Overflow, linking to this discussion.

The idea is that messages to the worker (from the main code) are queued until the worker code is executed. Those messages are then delivered to whatever handler the worker might have set.

That’s a fairly good idea, as it means that in most cases you do not need to do anything special. You only need to follow a few rules, which are probably not an issue:

If you have ever used threads in other languages, I’ll let you think about how much better message passing / web workers are :) They mostly just work, without big traps.
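The queuing behaviour described in this section can be pictured with a small toy model. This is only a sketch to illustrate the idea, not how browsers actually implement workers: ToyWorkerPort and runScript are made-up names, and the "worker script" is just a plain function so the example can run anywhere.

```javascript
// Toy model only: this is NOT how browsers implement workers.
// It mimics the behaviour described above: messages posted before the
// worker script has run are queued, then delivered once it has finished.
function ToyWorkerPort() {
  this.onmessage = null;   // handler set by the "worker script"
  this.started = false;    // becomes true once the script has run
  this.queue = [];         // messages waiting for the script to finish
}

ToyWorkerPort.prototype.postMessage = function (data) {
  if (!this.started) {
    this.queue.push(data);
    return;
  }
  this.deliver(data);
};

ToyWorkerPort.prototype.deliver = function (data) {
  // With no handler set, the message is simply dropped.
  if (typeof this.onmessage === "function") {
    this.onmessage({ data: data });
  }
};

// Run the "worker script", then flush whatever was queued in the meantime.
ToyWorkerPort.prototype.runScript = function (script) {
  script(this);
  this.started = true;
  var pending = this.queue;
  this.queue = [];
  for (var i = 0; i < pending.length; i++) {
    this.deliver(pending[i]);
  }
};

// Usage: the first message is posted before the script runs, yet is not lost.
var received = [];
var port = new ToyWorkerPort();
port.postMessage("early");
port.runScript(function (scope) {
  scope.onmessage = function (event) { received.push(event.data); };
});
port.postMessage("late");
// received is now ["early", "late"]
```

The point of the model is that the handler only has to be in place by the time the script finishes running; the sender does not need to know when that happens.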

Knowing when a worker is ready

Given that workers can be loaded from the network, and that messaging should be fairly fast once they are loaded, it can be interesting to know when a worker has actually been loaded and is ready to process messages, if only to change a “loading” icon to a “loaded” state.

To do that, we can have the following implementation for the main code:

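// Create a new worker and run onReady when it is correctly initialized.
function spawnWorker(workerurl, onReady) {
  var w = new Worker(workerurl);
  // The first message from the worker is its "ready" ping.
  w.onmessage = function(event) {
    onReady(w);
  };
  // postMessage() requires an argument; the content of the ping is ignored.
  w.postMessage("ping");
}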

// Usage:
spawnWorker("worker.js", function(worker) {
  // Worker is ready, do initialization stuff
  // ...
  // Don't forget to add our message handling too:
  worker.onmessage = function(event) {
    // Do the real stuff here
  }
});

And for the worker (worker.js):

// A small helper function to encapsulate the logic.
// It's fairly basic, so it's probably not worth having a library for that.
function startup(onMessage) {
  // The first message received is the ping from spawnWorker().
  self.onmessage = function(event) {
    // Set the onmessage handler to the real handler
    self.onmessage = onMessage;
    // Ping the main code to tell it we're ready
    // (postMessage() requires an argument; its content is ignored).
    self.postMessage("ready");
  };
}

startup(function(event) {
  // Our real message handling
});

This way, no messages will be lost as long as you send them from the onReady argument of spawnWorker or from the argument of startup() in the worker.
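If some part of the main code wants to send messages before onReady has been called, one option is to buffer them on the main side and flush them once the worker has pinged back. Here is a minimal sketch of that idea; createBufferedSender, send and markReady are hypothetical names of mine, and fakeWorker is a stand-in object so the example runs outside a browser. With a real worker, markReady() would be called from the onReady callback.

```javascript
// Hypothetical helper: holds outgoing messages until markReady() is called.
function createBufferedSender(worker) {
  var ready = false;
  var pending = [];
  return {
    // Send now if the worker is ready, otherwise keep the message for later.
    send: function (data) {
      if (ready) {
        worker.postMessage(data);
      } else {
        pending.push(data);
      }
    },
    // Call this once the worker has reported that it is ready.
    markReady: function () {
      ready = true;
      while (pending.length > 0) {
        worker.postMessage(pending.shift());
      }
    }
  };
}

// Stand-in for a real Worker (assumption: only postMessage is needed here).
var sent = [];
var fakeWorker = { postMessage: function (data) { sent.push(data); } };

var sender = createBufferedSender(fakeWorker);
sender.send("a");     // buffered, nothing reaches the worker yet
sender.markReady();   // flushes "a"
sender.send("b");     // goes straight through
// sent is now ["a", "b"]
```

Messages keep their order because the buffer is flushed before any later send() can run.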

I assume that JavaScript is not reentrant. If it were, an onmessage() handler could be started while it is still running for another message, which would break the previous code. However, I’m fairly certain that JavaScript is not reentrant, even if I have not seen anything about that (it’s probably in the ECMA standard). If it were actually reentrant, it would probably break basically every existing piece of JavaScript code :) (and start a whole new category of problems without any tools to solve them, like condition variables or mutexes).
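A quick way to convince yourself of this is to schedule a callback and then keep the current code busy. This small sketch uses a timer rather than a worker message so that it runs anywhere, on the assumption that both kinds of callback are only dispatched once the currently running code has finished.

```javascript
var log = [];

// Ask for the callback to run as soon as possible.
setTimeout(function () {
  log.push("timer");
}, 0);

// Keep the current code busy for about 20 ms. The timer is due long
// before this loop ends, but its callback cannot interrupt us.
var end = Date.now() + 20;
while (Date.now() < end) {
  // busy wait
}
log.push("sync done");

// At this point log is ["sync done"]; "timer" is only appended later,
// once the current code has returned to the event loop.
```

The same reasoning applies to the startup() helper: swapping self.onmessage inside the handler is safe because no other message can be delivered until the handler returns.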