31st August 2019

A Javascript DSL for WASM-JS bindings

bpat is a cross-platform portability layer for multimedia applications that I am currently developing. It can run on web browsers via WebAssembly (WASM). However, WebAssembly alone can't access the Web Platform APIs that multimedia applications need, such as WebGL and WebAudio. Instead, it has to call Javascript functions, which in turn call the Web Platform APIs.

In this post, I'll talk a bit about the Javascript-like DSL developed to make writing WASM-to-JS bindings a pleasant experience in bpat. This DSL allows developers to write Javascript modules which can selectively export functions to WASM and import functions from WASM.

An example Javascript module written in the DSL is:

import { getValueFromWasm } from '@wasm-exports';
import { console } from '@js-globals';

export("wasm") function printValueToConsole() {
  console.log(getValueFromWasm());
}

Background

Let's say we have the following C program which aims to print 42.0 to the Javascript console:

foo.c

void printValueToConsole(void);

float getValueFromWasm(void) {
  return 42.0f;
}

int main(void) {
  printValueToConsole();
  return 0;
}

In the above source, printValueToConsole() is a JS function that we want to call from WASM-land, and getValueFromWasm() is a WASM function that JS-land will call.

If you have clang 8 or later, you can compile the C source directly into WASM with:

$ clang-9 --target=wasm32 -c -o foo.o foo.c
$ wasm-ld-9 --no-entry --allow-undefined --export-all -o foo.wasm foo.o

And then write the necessary HTML/JS that will provide foo.wasm with printValueToConsole():

foo.html

<script type="text/javascript">
var bpat = function (wasmData) {
  var getValueFromWasm;
  var env = {
    printValueToConsole() {
      console.log(getValueFromWasm());
    },
  };
  WebAssembly.instantiate(wasmData, { env }).then(program => {
    var exports = program.instance.exports;
    function getExport(name) {
      if (!exports[name])
        throw new Error('Export not found: ' + name);
      return exports[name];
    }
    getValueFromWasm = getExport('getValueFromWasm');
    getExport('main')();
  });
};
fetch('foo.wasm').then(resp => resp.arrayBuffer()).then(bpat);
</script>
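To try the same wiring without a C toolchain, the sketch below hand-assembles a tiny WASM module in Javascript that imports env.printValueToConsole and exports getValueFromWasm and main. The byte layout and the helper names (str, section) are for illustration only and are not what clang emits; main here returns nothing, unlike the C version. It uses the synchronous WebAssembly.Module and WebAssembly.Instance constructors, which are fine for a module this small (for example under Node.js).

```javascript
// Encode a name as a length-prefixed byte vector (ASCII only, length < 128).
const str = s => [s.length, ...Array.from(s, c => c.charCodeAt(0))];
// Wrap a section body with its id and size (body must be < 128 bytes).
const section = (id, body) => [id, body.length, ...body];

const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm", version 1
  // Types: type 0 = () -> (), type 1 = () -> f32
  ...section(1, [2, 0x60, 0, 0, 0x60, 0, 1, 0x7d]),
  // Imports: env.printValueToConsole becomes function 0 (type 0)
  ...section(2, [1, ...str('env'), ...str('printValueToConsole'), 0x00, 0]),
  // Functions: function 1 has type 1, function 2 has type 0
  ...section(3, [2, 1, 0]),
  // Exports: getValueFromWasm = function 1, main = function 2
  ...section(7, [2, ...str('getValueFromWasm'), 0x00, 1, ...str('main'), 0x00, 2]),
  // Code: getValueFromWasm returns 42.0f; main calls function 0
  ...section(10, [2,
    7, 0, 0x43, 0x00, 0x00, 0x28, 0x42, 0x0b, // f32.const 42.0; end
    4, 0, 0x10, 0, 0x0b]),                    // call 0; end
]);

const logged = [];      // records what the binding printed, for inspection
let getValueFromWasm;   // assigned once the instance exists

const env = {
  // JS function that WASM-land calls; it calls back into WASM for the value.
  printValueToConsole() {
    const value = getValueFromWasm();
    logged.push(value);
    console.log(value);
  },
};

const instance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes), { env });
getValueFromWasm = instance.exports.getValueFromWasm;
instance.exports.main();
```

The control flow is the same as in foo.html: JS hands the env table to the module, grabs the exports it needs, and only then calls main.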

And that's all! You can now build your own bpat competitor ;-) However, after adding many module imports, your code might start to look the way mine did:

var bpat = function(wasmData) {
  // WebGL context
  var gl = initWebGLContext();

  // Our exports
  var env = {
    // Console
    btLog(str, len) {
      console.log(utf8ToString(str, len));
    },

    // Math
    cosf: Math.cos,
    cos: Math.cos,
    sinf: Math.sin,
    sin: Math.sin,

    // WebGL
    btglActiveTexture(texture) {
      gl.activeTexture(texture);
    },

    btglAttachShader(program, shader) {
      ....
    }

    // ... many many lines more ...
  };

  WebAssembly.instantiate(wasmData, { env }).then(function(program) {
    var exports = program.instance.exports;
    function getExport(name) {
      if (!exports[name])
        throw new Error('Export not found: ' + name);
      return exports[name];
    }
    memory = getExport('memory');
    main = getExport('main');
    // ... many many lines more ...
  });
};
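The utf8ToString(str, len) helper used by btLog is not shown above. A minimal sketch of what such a helper could look like follows, assuming that str is a byte offset into the module's linear memory and len is a byte count. The memory object here is a stand-alone WebAssembly.Memory created just for the demonstration; in the real bindings it would be the memory export.

```javascript
// Stand-in for the module's exported memory (1 page = 64 KiB). This is an
// assumption for the demo; the real code would use getExport('memory').
const memory = new WebAssembly.Memory({ initial: 1 });
const decoder = new TextDecoder('utf-8');

// Decode `len` bytes of UTF-8 starting at byte offset `ptr` in WASM memory.
function utf8ToString(ptr, len) {
  // Build the view on every call: memory.buffer is replaced when memory grows.
  return decoder.decode(new Uint8Array(memory.buffer, ptr, len));
}

// Demonstration: write a string into memory the way WASM code might,
// then read it back from the JS side.
const bytes = new TextEncoder().encode('héllo');
new Uint8Array(memory.buffer).set(bytes, 16);
const roundTripped = utf8ToString(16, bytes.length);
console.log(roundTripped);
```

Passing the length explicitly, as btLog does, avoids scanning for a NUL terminator on the JS side.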

At this point, a few things became apparent to me:

A way is needed to split the binding code into different modules. As hinted in the code example above, the env module import is extremely large and unwieldy, containing a mishmash of functions with very distinct responsibilities. For example, we have the bindings for the console (console.log), the bindings for WebGL (which are very large), the bindings for WebAudio, and so on. Each of them manages its own private data. For example, the WebGL bindings keep the WebGL context and the mapping between WebGL objects and IDs, and the WebAudio bindings keep the WebAudio context and its associated object IDs. If all of these data structures criss-crossed each other in the same file, things would become very unmanageable. Having modules would allow us to split the code into multiple files and encapsulate the private data, allowing us to deal with the different subsystems in manageable chunks.
If this was a Javascript project, I would probably just reach for Webpack or some other module bundler. However, as bpat didn't (and still doesn't) require NodeJS to be installed for development, it would be nice if there was a way which didn't involve adding another build dependency.
Dead code elimination is required. bpat is meant to be a general platform that contains support for many different APIs (and different versions of the same API). For example, it supports both WebGL 1 and WebGL 2. A typical application will only be using a subset of these APIs, and thus it doesn't make sense to bring all of them along with every application. As such, it would make more sense for only the absolute minimal binding code to be pulled in as required by the application.
However, at the same time, this "pulling in" of dependencies should be automatic. Devs should be able to use these APIs as if they have always existed in the application, rather than needing to, for example, explicitly maintain a list of modules/functions that are used in the application.
To be able to do this, we need a compiler that is able to statically analyze the JS source code at a fine level. However, some JS functions, such as eval(), make it difficult to statically analyze JS programs, or to safely perform dead code elimination. In such a case, for bpat's purposes, we need to straight up refuse to allow devs to use those features, rather than allowing them (while silently disabling or handicapping DCE) for the sake of preserving compatibility with JS. This goal is different from other general module bundlers which aim to be compatible with all valid JS code (because they need to support existing JS libraries, whereas in my case the JS code is all newly written).
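The automatic "pulling in" described above boils down to a reachability walk. The sketch below is not bpat's compiler. It assumes a hypothetical bindings table that records, for each top-level declaration, which other declarations it references, and it takes as roots the import names that a WASM module asks for (which could be obtained with WebAssembly.Module.imports()).

```javascript
// Hypothetical dependency table: declaration name -> names it references.
// A real compiler would derive this from static analysis of the source.
const bindings = new Map([
  ['btLog', ['utf8ToString']],
  ['utf8ToString', []],
  ['btglActiveTexture', ['gl']],
  ['btglAttachShader', ['gl', 'glObjects']],
  ['gl', []],
  ['glObjects', []],
  ['cosf', []],
]);

// Return the set of declarations reachable from the names the WASM module imports.
function reachable(importNames, deps) {
  const keep = new Set();
  const stack = [...importNames];
  while (stack.length > 0) {
    const name = stack.pop();
    if (keep.has(name)) continue;
    if (!deps.has(name)) throw new Error('Unknown binding: ' + name);
    keep.add(name);
    stack.push(...deps.get(name));
  }
  return keep;
}

// An application that only logs and calls cosf pulls in no WebGL code.
const kept = reachable(['btLog', 'cosf'], bindings);
console.log([...kept].sort());
```

Such a walk is only trustworthy if the reference table is complete, which is why features like eval() have to be rejected outright rather than tolerated.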

The DSL

As such, bpat uses its own custom Javascript-like DSL for writing WASM-JS bindings. This DSL brings in all the nice modern JS features that make writing JS code pleasant, such as ES6 modules and let/const. At the same time, it also restricts JS code in some ways to ensure that certain transformations can be easily performed correctly on the code. For example, it only allows declarations at the top level. This is because entire modules can be dead code eliminated, so disallowing arbitrary top-level code preserves correctness when a module is not initialized. An example module written in the DSL is:

import { getValueFromWasm } from '@wasm-exports';
import { console } from '@js-globals';

export("wasm") function printValueToConsole() {
  console.log(getValueFromWasm());
}

As can be seen above, having our own language also allows us to add little syntactical enhancements which make writing WASM-JS bindings much more pleasant. For example, I have extended the export keyword to allow devs to specify whether the function should be exported only to WASM-land, only to JS-land, or to both. The DSL also cuts out the boilerplate for dealing with WASM, and provides the special @wasm-exports module, which lets devs use WASM functions through regular Javascript import syntax.

The DSL compiler will compile the above source file into:

let bpat=(function(){ let a0;let a1;let a2;let a3=function(){console.log(a1());};return{env:{printValueToConsole:a3},run:function(b0){let b1=b0.instance.exports;let b2=function(c0){if(!b1[c0])throw new Error("Export not found");return b1[c0];};a0=b2("main");a1=b2("getValueFromWasm");a2=b2("btDumpStack");try{a0();}catch(c0){a2();throw c0;}}};})()

As we can see, the compiler takes care of generating the necessary boilerplate for giving the WASM module the table of JS functions and for fetching WASM functions from the WASM module. Furthermore, it also performs minification so that the generated JS file is small (although it can certainly be smaller; more work is required in this area!).
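For readability, here is the minified output hand-expanded with descriptive names, followed by one way a host page might consume it. The expansion is a manual rewrite, not compiler output, and the fakeProgram object merely stands in for the result of WebAssembly.instantiate() so that the snippet runs without a .wasm file.

```javascript
let bpat = (function () {
  let main, getValueFromWasm, btDumpStack; // filled in by run()
  let printValueToConsole = function () {
    console.log(getValueFromWasm());
  };
  return {
    env: { printValueToConsole }, // table of JS functions handed to WASM
    run: function (program) {
      let exports = program.instance.exports;
      let getExport = function (name) {
        if (!exports[name]) throw new Error('Export not found');
        return exports[name];
      };
      main = getExport('main');
      getValueFromWasm = getExport('getValueFromWasm');
      btDumpStack = getExport('btDumpStack');
      try {
        main();
      } catch (err) {
        btDumpStack(); // judging by its name, dumps the WASM-side stack
        throw err;
      }
    },
  };
})();

// Hypothetical stand-in for what WebAssembly.instantiate() resolves to.
let valueRequests = 0;
const fakeProgram = {
  instance: {
    exports: {
      main: () => bpat.env.printValueToConsole(),
      getValueFromWasm: () => { valueRequests++; return 42; },
      btDumpStack: () => {},
    },
  },
};
bpat.run(fakeProgram);
```

With a real module, the equivalent call would presumably be WebAssembly.instantiate(wasmData, { env: bpat.env }).then(bpat.run).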

Closing Words

Having a module system has allowed bpat's JS binding code to "scale up" and handle the ever-increasing range of Web Platform APIs that bpat supports. At the same time, the various little enhancements have allowed the binding code to be written succinctly, and the result is much more readable. Dead code elimination has allowed the size of generated JS files to scale with how many APIs the application actually uses: a "Hello World" JS file is around 2.3K, while the JS file that powers Bouncy Boar 3 is around 17K (minified, uncompressed). Overall, this DSL is another valuable tool in bpat's toolbox that makes developing bpat an enjoyable, frictionless experience.
