JavaScript Promises and Causality

Let's say you're building a standard web application, where you have Clients, and Documents for those clients. You want to build a page that lets you browse documents by client. Something like the following:

   [ Client  A | V ]  <- Client Selector
    ---------------------
     - Document A        <--- List of Client A's documents
     - Document B

You decide to load in the list of documents only after a client is selected, and to do so through an AJAX request.

Your code could end up looking like this (if you're working in Angular):

$scope.$watch('client', function(){
   // when the chosen client changed
   // get document list URL for client
   var url = documentUrl($scope.client);
   $http.get(url).then(
      function(response) {
        // after getting the documents, update the document data
        $scope.documents = response.data;
        // after setting scope value, Angular template
        // automatically renders document list
      });
});

Pretty simple, right? You pick a different client, a new request goes out, and when it finishes you set the document list to the new docs.

The problem comes whenever you move a bit too fast. What if the following happened?

Pick Client A
HTTP request goes out for documents of Client A (1)
Pick Client B
HTTP request goes out for documents of Client B (2)
Request (2) finishes, setting documents to the documents of Client B
Request (1) finishes, setting documents to the documents of Client A

Because the documents are updated asynchronously, you can end up with a pretty nasty race condition! You'll have Client B picked, but Client A's documents displayed. In itself that's not too much of an issue, but it does break a pretty fundamental invariant of your program: the document list always belongs to the selected client. If you used this data for any subsequent operation, it could lead to a lot of confusion.
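To see the race without a browser or Angular, here is a minimal sketch using native promises. fakeGetDocuments and selectClient are made-up stand-ins for $http.get and the watch callback, and the delays are picked by hand so that Client A's response arrives last.

```javascript
// Hypothetical stand-in for $http.get: resolves with the client's documents
// after `delayMs` milliseconds.
function fakeGetDocuments(client, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve({ data: ["docs of " + client] });
    }, delayMs);
  });
}

// Plays the role of $scope in the example above.
var state = { client: null, documents: null };

function selectClient(client, delayMs) {
  state.client = client;
  return fakeGetDocuments(client, delayMs).then(function (response) {
    // The last response to arrive wins, whichever client is selected by then.
    state.documents = response.data;
  });
}

// Client A's request is slow (50 ms), Client B's is fast (10 ms).
var raceDone = Promise.all([selectClient("A", 50), selectClient("B", 10)]);

raceDone.then(function () {
  // Client B is selected, but Client A's documents are shown.
  console.log(state.client, state.documents);
});
```

Swapping the two delays makes the bug vanish, which is exactly why this kind of race is easy to miss during development.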

If you use promises without thinking, you can quickly run into this situation as well. I'm so used to having causality when writing blocking, synchronous programs that I do not have the right habits when I lose it.

Here are a couple of ways we can solve this bug. I do not have a more general solution to the problem, though, so more exploration is needed.

Be Closer to the Underlying Data

In the initial implementation our view defined a client and a list of documents. But really, you're choosing a client and then showing that client's documents. So one way of tackling this issue is putting the list of documents "inside" the client:

$scope.$watch('client', function(){
    // when the chosen client changed
    var url = documentUrl($scope.client); // get document list URL for client
    $http.get(url).then(
      function(response) {
          // after getting the documents
          //update the *client's* document data
          $scope.client.documents = response.data;
       });
});

Actually, that's wrong. If we do have a race, $scope.client could have changed to a different client by the time we handle the response! So what we need to do is capture the current client, and only use that. In Angular watches, the new value is passed as the first parameter to the watch function:

$scope.$watch('client', function(newClient){  // <-- reference to the client
    // when the chosen client changed
    var url = documentUrl(newClient); // get document list URL for client
    $http.get(url).then(
      function(response) {
          // after getting the documents, update the *client's* document data
          // assign to the (unchanging) client
          newClient.documents = response.data;
       });
});

In this new model, even if there is a race in the requests, Client B will still receive the documents for Client B, and similarly for Client A.

The issue with this solution shows up when you're doing business logic on your objects elsewhere. If you have a client c, what is c.documents? If you have pagination on your endpoint, is it the "current page"? If you haven't tried fetching the documents yet, it's simply undefined. You can build a consistent model, but without something like TypeScript it's easy to write an if(!c.documents) branch that is meant to check for a lack of documents, but that also fires when you simply haven't made the document request yet.
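One way to keep "not fetched yet" apart from "fetched, and there is nothing" is to store an explicit load status next to the documents. This is only a sketch with native promises: makeClient, loadDocuments, hasNoDocuments and the status strings are names I made up for illustration, and fakeGetDocuments stands in for the real HTTP call.

```javascript
// Hypothetical client factory: documents start in an explicit "not-loaded" state.
function makeClient(name) {
  return { name: name, documents: { status: "not-loaded", items: [] } };
}

// Hypothetical stand-in for $http.get(documentUrl(client)).
function fakeGetDocuments(client, delayMs, items) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve({ data: items }); }, delayMs);
  });
}

function loadDocuments(client, delayMs, items) {
  client.documents = { status: "loading", items: [] };
  return fakeGetDocuments(client, delayMs, items).then(function (response) {
    // `client` is the captured argument, so a late response still lands
    // on the client it was requested for.
    client.documents = { status: "loaded", items: response.data };
  });
}

function hasNoDocuments(client) {
  // True only when the fetch completed and returned nothing.
  return client.documents.status === "loaded" &&
         client.documents.items.length === 0;
}

var clientA = makeClient("A");
var clientB = makeClient("B");
var clientC = makeClient("C"); // never fetched

var loadsDone = Promise.all([
  loadDocuments(clientA, 30, ["Document A", "Document B"]), // slow
  loadDocuments(clientB, 5, [])                             // fast, empty
]);

loadsDone.then(function () {
  console.log(hasNoDocuments(clientA)); // false: loaded, two documents
  console.log(hasNoDocuments(clientB)); // true: loaded, genuinely empty
  console.log(hasNoDocuments(clientC)); // false: not fetched yet
});
```

The cost is that every consumer of c.documents has to go through the status, but at least the "haven't asked yet" case can no longer hide behind a falsy check.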

Short Circuit your Requests

There are certain sets of requests where you'll only want the results of at most one outstanding request. For example, once we make a request for the documents of Client B, we know that we no longer need the results of the outstanding request of Client A.

In that case, we could simply ensure that at most one request stays active at a time. If we stop any outstanding request before starting a new one, those requests will fail (and thus not overwrite the documents). So if a request succeeds, you "know" that the documents are for the last chosen client.

var currentRequestCanceller = undefined;
function fetchDocuments(client){
   // cancel any ongoing request
   if(currentRequestCanceller !== undefined){
     currentRequestCanceller.resolve();
     // resolving the canceller triggers a timeout in the request
     // (see below)
   }
   // create a new deferred promise to serve as a timeout for the request
   currentRequestCanceller = $q.defer();
   // start a new request
   $http.get(
     documentUrl(client),
     {timeout: currentRequestCanceller.promise}
     // if the timeout resolves, the request is cancelled
   ).then(function(response){
     // only runs if the request wasn't cancelled
     $scope.documents = response.data;
   });
}

$scope.$watch('client', function(){ fetchDocuments($scope.client)});

This solution is pretty interesting. It kills unneeded requests, and "debounces" the information requests. The main issue with this solution is that it can be hard to manage if you are chaining requests together. You're also forced to declare things cleanly, instead of ad-hocing it with a lot of anonymous functions (but that's more a feature than a bug).
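Outside Angular, the same "cancel the previous request first" idea can be sketched with the standard AbortController. Treat this as an illustration under assumptions, not a drop-in replacement: fakeGetDocuments is a made-up stand-in that honours an abort signal the way fetch(url, { signal }) does, and view plays the role of $scope.

```javascript
// Hypothetical stand-in for an HTTP call that honours an AbortSignal.
function fakeGetDocuments(client, delayMs, signal) {
  return new Promise(function (resolve, reject) {
    var timer = setTimeout(function () {
      resolve({ data: ["docs of " + client] });
    }, delayMs);
    signal.addEventListener("abort", function () {
      clearTimeout(timer);
      reject(new Error("request for " + client + " aborted"));
    });
  });
}

var view = { documents: null };
var currentController = null;

function fetchDocuments(client, delayMs) {
  // Abort the outstanding request, if any, before starting a new one.
  if (currentController !== null) {
    currentController.abort();
  }
  var controller = new AbortController();
  currentController = controller;
  return fakeGetDocuments(client, delayMs, controller.signal).then(
    function (response) {
      // Only runs for a request that was not aborted.
      view.documents = response.data;
      return "applied";
    },
    function () {
      // Swallow the rejection of an aborted (or failed) request.
      return "aborted";
    }
  );
}

// Client A is slow, Client B is fast, as in the race described earlier.
var outcomes = Promise.all([fetchDocuments("A", 50), fetchDocuments("B", 10)]);

outcomes.then(function (results) {
  console.log(results, view.documents);
});
```

Here the request for Client A is aborted the moment Client B is picked, so only Client B's response ever reaches view.documents.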

Forcing Causality

I'm going to start off by saying that this is not the recommended solution. But it's the fun solution, and isn't that what really matters in life?

Here we end up with races because our asynchronous requests can complete in a different order than the request order. So what if we just forced requests to resolve in the "right" order?

var firstRequest = $q.defer(),
    requestChain = firstRequest.promise;
firstRequest.resolve(); // kickoff the request chain

function synchronised(nextPromise){
  // first, build a promise to track the finalisation of your previous requests
  var prevRequestsFinished = $q.defer();
  // resolve the request on success or failure
  requestChain.then(
    function(){ prevRequestsFinished.resolve(); },
    function(){ prevRequestsFinished.resolve(); }
  );
  // update requestChain to a new promise that resolves
  // whenever all requests are finished
  // (and returns the result from the next promise)
  requestChain = $q.all(
      [prevRequestsFinished.promise, nextPromise]
  ).then(
    //unwrap the result on success
    function(result){ return result[1]; }
  );
  return requestChain;
}

//...

synchronised($http.get(urlA)).then(
  function(response){ console.log("request A finished!");}
);

synchronised($http.get(urlB)).then(
  function(response){ console.log("request B finished!");}
);
// "request A finished!" will always show before "request B finished!"
// even if request A is slower and finishes after request B

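Here, we've written a function to "chain" all our promises globally. Our synchronised function is actually putting our requests on a queue, where a promise cannot resolve until all previously inserted promises have settled. One caveat: $q.all rejects as soon as any of its inputs rejects, so a failed request reports its failure immediately instead of waiting its turn.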

Even though our HTTP requests are going out in parallel, and might finish out of order, the resolution of the promises will not happen out of order! Our race condition will disappear!

synchronised($http.get(documentUrl($scope.client))).then(
  /* stuff that will now run in the same order as other synchronised calls */
)
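If you want to play with the queue outside Angular, here is a rough translation to native promises. It assumes Promise.all is a fair substitute for $q.all, and delayed is a made-up helper that fakes requests of different durations.

```javascript
var requestChain = Promise.resolve(); // the empty queue

function synchronised(nextPromise) {
  // Settles once everything queued so far has succeeded or failed.
  var previousSettled = requestChain.then(
    function () {},
    function () {}
  );
  // Resolves with nextPromise's value, but only after the queue has drained.
  requestChain = Promise.all([previousSettled, nextPromise]).then(
    function (results) { return results[1]; }
  );
  return requestChain;
}

// Hypothetical stand-in for an HTTP request that takes `delayMs` to answer.
function delayed(label, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(label); }, delayMs);
  });
}

var order = [];
var slowA = synchronised(delayed("A", 50)).then(function (label) {
  order.push(label);
});
var fastB = synchronised(delayed("B", 10)).then(function (label) {
  order.push(label);
});

var chainDone = Promise.all([slowA, fastB]).then(function () {
  console.log(order); // [ 'A', 'B' ], even though B answered first
});
```

As with the Angular version, this only orders successful resolutions: Promise.all rejects as soon as one input rejects, so a failing request will not wait for the ones queued before it.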

But, of course, if your requests take different amounts of time, you could lose time waiting for requests to resolve. And you generally lose a lot of advantages of the asynchronous model.

Which is why this isn't recommended! But it's fun to think of a way to "force" the synchronous model back into an asynchronous-by-default one.

I'm not satisfied with any of these solutions, really. The second one is the cleanest, but relies heavily on the single-threaded nature of JavaScript to work. I would really like to find a way of thinking that could easily be applied to different execution models.