AMD & Require.js

A standard approach to making large projects easier to manage.

AMD (a protocol) and Require.js (a library) can make it easier to manage large web applications.



Overview


AMD (Asynchronous Module Definition) is a protocol and Require.js is an implementation of this protocol. Its goal is to make large web projects easier to manage. Projects that involve many JavaScript files are hard to maintain because of the dependencies that exist between these files. These dependencies force developers to carefully consider the order in which JavaScript files are referenced in each page. The complexity grows quickly as more pages and files are added, and this can become a real stumbling block. Let's look at an example:

Say your project has 10 JavaScript files.

├── file1.js
├── file2.js  <- depends on?
├── file3.js
├── file4.js
├── file5.js
├── file6.js  <- utilities 
├── file7.js  <- helpers
├── file8.js  <- support
├── file9.js
├── file10.js

You can probably already guess what is happening. Suppose file3 uses utility functions that are available in file6. File6 in turn depends on file7 because it has some handy helper functions. File9 requires file8, which contains essential support code. And file10 depends on file2 and file5 because they have the necessary base classes. In turn, these last two also depend on the file6 utilities. You get the idea.
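To see what ordering script tags by hand amounts to, here is a minimal sketch in plain JavaScript (it does not use Require.js). The deps table encodes the relationships just described; files the text does not mention are assumed to have no dependencies, and loadOrder is a hypothetical helper name.

```javascript
// Each file lists the files it needs, as described in the text.
// Files that are not mentioned are assumed to have no dependencies.
const deps = {
  file1: [],
  file2: ["file6"],
  file3: ["file6"],
  file4: [],
  file5: ["file6"],
  file6: ["file7"],
  file7: [],
  file8: [],
  file9: ["file8"],
  file10: ["file2", "file5"]
};

// Depth-first walk: a file is emitted only after everything it depends on.
function loadOrder(graph) {
  const order = [];
  const state = {}; // undefined = unvisited, 1 = in progress, 2 = done

  function visit(name) {
    if (state[name] === 2) return;
    if (state[name] === 1) throw new Error("Circular dependency at " + name);
    state[name] = 1;
    (graph[name] || []).forEach(visit);
    state[name] = 2;
    order.push(name);
  }

  Object.keys(graph).forEach(visit);
  return order;
}

const scriptOrder = loadOrder(deps);
console.log(scriptOrder.join(", "));
// prints "file1, file7, file6, file2, file3, file4, file5, file8, file9, file10"
```

This is only one of several valid orders, and it would have to be recomputed for every page whenever a dependency changes.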

We only have 10 files and already it has gotten very complex. For each page that you build you have to ensure that the correct script files are included and are placed in the right order. It is easy to see that managing a large project with dozens of files/modules by hand is difficult, error prone, and not very scalable. This is where AMD comes in.

The AMD module format is a specification that allows you to define modules and their dependencies, which are then loaded asynchronously. AMD has been adopted by several popular JavaScript tools and libraries, including jQuery, Dojo, MooTools, and Firebug.

AMD offers a great deal of flexibility, first by being asynchronous, but also by introducing string IDs for dependencies. These IDs can be mapped to different paths, which is great for substituting mocks in unit tests. Furthermore, module definitions in AMD are encapsulated in their own scope, which avoids polluting the global namespace.
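As a rough illustration of the mock idea, the sketch below builds two ID-to-path maps in plain JavaScript. The dataService ID and both of its paths are hypothetical names invented for this example; with Require.js such a map would be supplied through the paths option of require.config, which is covered later in this section.

```javascript
// Map from module ID to file path (all IDs and paths here are hypothetical).
const productionPaths = {
  jquery: "jquery",
  dataService: "services/dataService"
};

// For a test run, override only the entries that need a mock.
const testPaths = Object.assign({}, productionPaths, {
  dataService: "mocks/dataService"
});

console.log(testPaths.dataService); // prints "mocks/dataService"
console.log(testPaths.jquery);      // prints "jquery"
```

Code that depends on "dataService" keeps asking for the same ID; only the map handed to the loader differs between production and tests.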

We will look at Require.js, which implements the AMD module API format. It is one of the most popular script loaders and dependency managers available today.

Require.js is one of those tools whose purpose you may not quite grasp at first. Then, after a while, something clicks in your brain, and once that happens you wonder how you ever managed without it.

Let's look at a simple example. Say we have the following project directory:

├── index.html
├── scripts/
│   ├── main.js
│   ├── require.js
│   ├── jquery.js
│   ├── modules/
│       ├── module1.js
│       ├── module2.js
│       ├── module3.js

The project has a single web page, index.html, and a /scripts directory with JavaScript files. The file main.js is our 'main' file, discussed shortly. Two third-party libraries are used: require.js and jquery.js. Finally, in the subdirectory /modules we have three modules that contain custom JavaScript code we wrote.

Here is what's in index.html.

<!doctype html>
<html>
    <head>
        <title>Title</title>
        <script src="scripts/require.js" data-main="scripts/main"></script>
    </head>
    <body style="background:linen;">
        <h2>AMD and Require.js</h2>
        <button id="clicker" >Click here</button>
    </body>
</html>

Notice that it has only a single script tag in the entire file, which loads require.js. This tag includes a special data-main attribute that references the 'main' JavaScript file, which will be loaded when require.js itself is done loading. As far as Require.js is concerned, the .js file extension is optional. So main refers to main.js, jquery refers to jquery.js, and so on. Let's open main.js, which has configuration information and startup code.

require.config({
    paths: {
        jquery: "jquery",
        module1: "modules/module1",
        module2: "modules/module2",
        module3: "modules/module3",
    }
});

require(["jquery", "module1"], function ($, mod1) {
    
    // hook button up with click event
    $("#clicker").on('click', function () {
        mod1.go();
    });
});

The top half is the configuration section. Require.js offers dozens of configuration options but the one that is almost always used is paths. The paths property is a hash that maps shorthand names (i.e. script IDs) to JavaScript file paths. This greatly simplifies using Require.js as these names are referenced throughout your files.

In our example the name jquery maps to jquery (which is jquery.js: remember that .js is optional). There's no directory mentioned which implies that jquery.js resides in the same directory as main.js. Next are our modules: module1 maps to modules/module1.js, module2 to modules/module2.js, and module3 to modules/module3.js. By the way, we could have used any name for these modules, for example: modA, modB, and modC; they are just handles that are referenced throughout the app.

As an alternative you can import jQuery from a CDN. The configuration section would look like this:

require.config({
    paths: {
        jquery: "http://ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min",
        module1: "modules/module1",
        module2: "modules/module2",
        module3: "modules/module3",
    }
});
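To make the mapping from names to files concrete, here is a simplified sketch of how a name could be turned into a URL. The resolveUrl function and the baseUrl value are assumptions made for illustration; this is not Require.js's actual resolution algorithm, which handles more cases.

```javascript
// Simplified, hypothetical resolver. Require.js's real rules handle more
// cases (path prefixes, IDs that already end in ".js", and so on).
function resolveUrl(id, config) {
  var paths = config.paths || {};
  // Use the mapped path if the ID is listed under paths, else the ID itself.
  var path = Object.prototype.hasOwnProperty.call(paths, id) ? paths[id] : id;
  // Absolute URLs such as a CDN address are not prefixed with the base directory.
  var isAbsolute = /^(?:[a-z]+:)?\/\//i.test(path);
  // The ".js" extension is appended because it is omitted in the configuration.
  return (isAbsolute ? path : config.baseUrl + "/" + path) + ".js";
}

var config = {
  baseUrl: "scripts", // assumed: the directory that contains main.js
  paths: {
    jquery: "http://ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min",
    module1: "modules/module1"
  }
};

console.log(resolveUrl("module1", config));  // prints "scripts/modules/module1.js"
console.log(resolveUrl("jquery", config));   // prints the CDN URL with ".js" appended
console.log(resolveUrl("unmapped", config)); // prints "scripts/unmapped.js"
```

The last call uses a made-up name that is not listed under paths, so it falls back to a file next to main.js.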

Before using the paths, let's review how Require.js loads in dependencies. It does this through the require function. To load up a script file you would write something like this:

require(["jsfile"], function (jsfile) {
    jsfile.doSomething();    
});

The first argument is an array of dependencies. Here we have just one: jsfile. If jsfile is not a mapped name (using paths in the configuration section), then it refers to jsfile.js in the same directory as main.js.

The second argument is called the factory function. Its parameters typically match the dependency array. Here we just have one. Require.js has loaded jsfile and passes it as an argument to the factory function. Within the function you can be sure that jsfile is loaded and ready to go. By the way: the name factory function is a reference to the Factory pattern as many of these functions return modules that are manufactured by the function.

With Require.js, the script files being loaded can be traditional script files, but more often they are modules defined by the Require.js define function (note: the Module pattern is discussed in the Modern Patterns section). Modules are defined like this:

define("moduleName", ["jquery", "myapp"], function ($, myapp) {
    // define your module... 
});

The optional moduleName is the name of the module. Usually it is not provided because Require.js will then automatically derive the module name from the file name. The second argument is an array of dependencies. The last argument is the factory function, which returns a newly created module. The parameters of this function typically match the dependencies; here there are two, jquery and myapp.
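To make the roles of the dependency array and the factory function more concrete, here is a toy, in-memory stand-in written in plain JavaScript. The names miniDefine, miniRequire, and instantiate are hypothetical, and so is the greeter module. Unlike Require.js, this sketch loads nothing from disk or network, runs synchronously, and does not handle circular dependencies.

```javascript
// Toy in-memory stand-ins for define and require (hypothetical names).
const registry = {}; // module ID -> { deps, factory }
const cache = {};    // module ID -> value returned by the factory

function miniDefine(id, deps, factory) {
  registry[id] = { deps: deps, factory: factory };
}

function instantiate(id) {
  if (!(id in cache)) {
    const entry = registry[id];
    if (!entry) throw new Error("Module not defined: " + id);
    // Resolve the dependencies first, then call the factory exactly once.
    const args = entry.deps.map(instantiate);
    cache[id] = entry.factory.apply(null, args);
  }
  return cache[id];
}

function miniRequire(ids, callback) {
  // Arguments are passed in the same order as the dependency array.
  return callback.apply(null, ids.map(instantiate));
}

miniDefine("myapp", [], function () {
  return { name: "demo" };
});

miniDefine("greeter", ["myapp"], function (myapp) {
  return {
    greet: function () { return "Hello from " + myapp.name; }
  };
});

miniRequire(["greeter"], function (greeter) {
  console.log(greeter.greet()); // prints "Hello from demo"
});
```

The point of the sketch is the order of events: a factory runs only after its dependencies have produced their values, and each value is created once and then reused.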

In summary, the two core functions in Require.js are require and define; require loads dependencies and define defines modules. Both are used in our example project. Indeed, it's time to go back to our example and complete the second half of the main.js file. We had looked at the configuration and now we will examine the second half which contains the startup code:

require(["jquery", "module1"], function ($, mod1) {
    
    // hook button up with click event
    $("#clicker").on("click", function () {
        mod1.go();
    });
});

With require we indicate there are two dependencies: jquery and module1. The factory function receives references to these, $ and mod1, which are then used inside the body. The jQuery library is used to traverse the DOM, find the button, and attach a click handler. Inside the click handler the go function on module1 is called.

Let's open the module definition of module1.js.

define(["module2"], function (mod2) {

    var go = function () {
        alert("- I am in Module1 \n" + mod2.go());
    };

    return { go: go };
});

The define method has no module name and therefore defaults to the filename which is module1. The dependencies array indicates that module1 has a dependency on a single module: module2. The second argument is the factory function that returns the module (this is the Revealing Module pattern). Inside the go function there is a nested go call but this time on mod2. Well, let's open module2:

define(["module3"], function (mod3) {

    var go = function () {
        return "-- I am in Module2 \n" + mod3.go();
    };

    return { go: go };
});

It's déjà vu all over again. This file is almost the same as module1. Module2 is dependent on Module3. The incoming mod3 argument is referenced inside the go function with another call to a go function on mod3. Finally, let's see module3:

define(function () {
    var go = function () {
        return "--- I am in Module3";
    };

    return { go: go };
});

This is the last module in our dependency chain; it has no dependencies of its own. The define method creates a Require.js module that has module3 as its name.
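All three modules share the same shape: a function body that returns an object exposing only what callers need. The standalone sketch below shows why that shape keeps internals private. The counterModule name and its members are hypothetical and do not appear in the example project.

```javascript
// Hypothetical counter module with the same shape as module1 to module3:
// the function body holds private state and returns only the public members.
var counterModule = (function () {
  var count = 0; // private: not reachable from outside the function

  var increment = function () {
    count += 1;
    return count;
  };

  var current = function () {
    return count;
  };

  // Reveal only the functions that callers should use.
  return { increment: increment, current: current };
})();

counterModule.increment();
counterModule.increment();
console.log(counterModule.current()); // prints 2
console.log(counterModule.count);     // prints undefined
```

In an AMD module the surrounding function is the factory passed to define, so the loader rather than the page's global scope holds on to the returned object.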

To summarize: require.config in main.js establishes the configuration settings for Require.js. The require function call in the same file declares the first dependencies, jquery and module1. Before the callback runs, Require.js loads module1; because module1 depends on module2, and module2 in turn depends on module3, those are loaded as well. Only then is the button's click handler attached, so by the time the button is clicked all three modules are already loaded and ready.

All these dependencies, and the loading of them, are handled automatically by Require.js. Remember also that Require.js does not pollute the global namespace with any of these modules. Quite remarkable if you consider that only a single <script> reference is included in the index.html page.

It's time to see our project in action.


First the index.html page displays. Clicking on the button shows the alert box with 3 messages, each coming from a different module. This confirms that Require.js has loaded all necessary dependencies for us.
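To check the text of that alert without a browser, the three factory functions can be wired together by hand. This is a sketch under two assumptions: the factories are copied into one file under hypothetical names (module1Factory and so on), and module1's go returns the text instead of calling alert.

```javascript
// The factory functions from module3.js, module2.js and module1.js,
// wired together by hand instead of by Require.js.
function module3Factory() {
  var go = function () {
    return "--- I am in Module3";
  };
  return { go: go };
}

function module2Factory(mod3) {
  var go = function () {
    return "-- I am in Module2 \n" + mod3.go();
  };
  return { go: go };
}

function module1Factory(mod2) {
  // Returns the text instead of calling alert so it runs outside a browser.
  var go = function () {
    return "- I am in Module1 \n" + mod2.go();
  };
  return { go: go };
}

// Dependencies are created innermost first, just as the loader would do.
var mod1 = module1Factory(module2Factory(module3Factory()));
var message = mod1.go();
console.log(message);
```

The printed message has three lines, one per module, in the order Module1, Module2, Module3.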

As indicated earlier, AMD and Require.js are a stopgap measure until native ES6 (ECMAScript 6) modules are available. Among other things, ES6 introduces the import and export keywords to address module loading and dependencies. Unfortunately, it will take some time before they are widely available.