Making web applications available offline

With the arrival of HTML 5 (which commonly means more than just HTML), web development has become even more of a joy than it used to be. Clean semantics in HTML, elegance through CSS, new scripting possibilities: it’s all there. This blog post describes two new features, offline applications and name/value storage, which allow developers to make web applications available offline.

Hey, it’s just like Java Webstart!


Running remote applications on a user’s computer is nothing new. Sun attempted Java Applets in 1995, but that failed miserably due to the long startup times Java had back then. Java Webstart (introduced in 2001) suffered much less from that handicap as the JVM got better, but it still required the user to install Java. Neither has been a large success so far.

The offline applications and name/value storage features from HTML 5 provide only part of what Java Webstart can, but they are used from the web browser. That means the full power of HTML 5, including a powerful scripting language, AJAX, and the ease with which web applications allow us to build aesthetically pleasing user interfaces.

So no, it’s not like Java Webstart. It’s much, much better.

Storing data locally


Before we can consider making our application available to offline users, we must be able to store data locally. The name/value storage feature allows this. This goes beyond creating a JavaScript object, whose properties already form a name/value store (due to the dynamic nature of the language). We need the data to be available beyond a page reload, even beyond a browser restart.

The new Storage interface provides just that. It has two implementations:

sessionStorage
A name/value store that stores values as strings. Data is available for all pages of the same origin (protocol+host+port). It is available beyond page loads, even if you load pages of another origin in between. The sessionStorage is not shared across browser windows/tabs. 

Note that the sessionStorage is generally not available beyond browser restarts. It can be, however, if a browser restores the session. This is usually a user action.

localStorage
A name/value store that stores values as strings. Data is available for all pages of the same origin (protocol+host+port). It is available beyond page reloads and browser restarts. It is also shared across all browser windows/tabs. 

The following JavaScript snippet demonstrates how easy it is to use. You can replace sessionStorage with localStorage to use the other store.
// Clear all values (for our origin only)
sessionStorage.clear();

// Set a value in different ways
sessionStorage.setItem('characterName', 'Zaphod');
sessionStorage['characterName'] = 'Zaphod';
sessionStorage.characterName = 'Zaphod';

// Get a value in different ways
console.log(sessionStorage.getItem('characterName'));
console.log(sessionStorage['characterName']);
console.log(sessionStorage.characterName);

// Log all stored name/value pairs
for (var i = 0; i < sessionStorage.length; i++) {
    var name = sessionStorage.key(i);
    var value = sessionStorage.getItem(name);
    console.log(name + "=" + value);
}

// Remove a single key
sessionStorage.removeItem('characterName');
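
Because both stores keep every value as a string, objects and numbers have to be converted before they are stored. The following is a minimal sketch of one way to do that, assuming JSON is an acceptable format. The helper names saveJson and loadJson are made up for this example and are not part of the Storage interface.

```javascript
// Hypothetical helpers. `storage` is sessionStorage, localStorage,
// or any object offering the same getItem/setItem methods.

// Serialize a value to JSON and store it under the given key.
function saveJson(storage, key, value) {
  storage.setItem(key, JSON.stringify(value));
}

// Read a value back. Returns `fallback` if the key is missing
// or the stored string is not valid JSON.
function loadJson(storage, key, fallback) {
  var raw = storage.getItem(key);
  if (raw == null) {
    return fallback;
  }
  try {
    return JSON.parse(raw);
  } catch (e) {
    return fallback;
  }
}
```

In a browser this could be used as saveJson(localStorage, 'character', { name: 'Zaphod', heads: 2 }), followed later by loadJson(localStorage, 'character', null).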

The amount of data you can store in the localStorage is limited to 5MB, or approximately half of that if you’re using a WebKit browser such as Chrome, Safari, iOS Safari or the Android Browser (WebKit stores strings as UTF-16 instead of the more common UTF-8). Internet Explorer has a limit of 4.75MB.

The sessionStorage usually has the same limit as the localStorage, but this varies per browser. On Firefox, Safari and Android, for example, the sessionStorage is unlimited.
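
Since the limits differ per browser, a write can fail: setItem throws an exception when the quota is exceeded. Below is a minimal sketch of guarding a write and of estimating how much is stored, assuming it is acceptable to simply skip a write that does not fit. The names trySetItem and countStoredCharacters are hypothetical, and the estimate counts characters, not bytes.

```javascript
// Hypothetical helper: attempt a write and report whether it succeeded.
// setItem throws when the storage quota would be exceeded.
function trySetItem(storage, key, value) {
  try {
    storage.setItem(key, value);
    return true;
  } catch (e) {
    return false;
  }
}

// Rough usage estimate: the number of characters in all keys and values.
function countStoredCharacters(storage) {
  var total = 0;
  for (var i = 0; i < storage.length; i++) {
    var name = storage.key(i);
    total += name.length + storage.getItem(name).length;
  }
  return total;
}
```

A caller could, for example, remove older entries and retry when trySetItem(localStorage, key, value) returns false.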

So now we can store data in the browser.

Cache manifest


When you can store data in the browser, the only thing missing to make a web application available offline is a cache. All files required for the application must be cached offline. For simple applications, like the "TODO list" example that is so popular to demonstrate JavaScript MVC frameworks, all files are already cached when you first load the web application. For most applications, however, you either need to manually load all pages or have the browser do that for you.

This is where the cache manifest comes in.

The cache manifest is referenced from the <html> tag in your application. Preferably from all pages, as then any page load can trigger a cache refresh. The manifest has a MIME type of text/cache-manifest.
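
Browsers that support the cache manifest also expose a scripting interface, window.applicationCache, which this post does not cover further. As a rough sketch of reacting to a cache refresh: the updateready event fires once a newer version of the cache has been downloaded, and swapCache() switches to it. The function name watchForCacheUpdate and the callback are assumptions made for this example.

```javascript
// Sketch: call `onUpdate` once a newer version of the cache is ready.
// `appCache` is expected to be window.applicationCache in a browser
// that supports the cache manifest.
function watchForCacheUpdate(appCache, onUpdate) {
  appCache.addEventListener('updateready', function () {
    // Switch to the freshly downloaded cache. Resources that are
    // already loaded stay as they are until the page is reloaded.
    appCache.swapCache();
    onUpdate();
  }, false);
}
```

A page might call watchForCacheUpdate(window.applicationCache, ...) with a callback that asks the user whether to reload.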

The cache manifest also has an advantage over the normal browser cache: you can designate resources as only available online, and you can specify fallback resources. This is best explained by an example. Below is the cache manifest of this simple demo.
CACHE MANIFEST
# The line above MUST be the first line in the manifest.

# Empty lines and lines starting with a hash (#) or
# whitespace + hash are ignored as comments.

# Cached resources are only reloaded (using the standard
# If-Modified-Since header) when the manifest content
# changes. Inserting an explicit version number or timestamp
# is an easy way to accomplish this.

# Version 2012-04-24T11:29

# All resources mentioned in this file are URLs, relative to
# the manifest file.

# We're still in the first, thus unlabeled, section. This is
# treated as a cache section (see below). Note that the file
# from which the manifest was loaded is always cached. But
# as we want to cache all pages, all are mentioned
# explicitly.
index.html
clock.html

#
# Cache section. All resources mentioned here are cached.
# Each non-comment line in this section must be a URL.
#
CACHE:
clock.css
clock.js
# Fallback resources must explicitly be cached, or they will
# not be available.
lamp_off.png

#
# Network section. These resources are never available when
# offline.
# Each non-comment line in this section must be a URL,
# except for the three exceptions mentioned here.
#
NETWORK:
# Chrome respects * (as per the specification)
*
# Firefox respects the following instead
http://*
https://*

#
# Fallback section.
# Each non-comment line in this section contains two URLs:
# a URL to match (as prefix) and the resource that should be
# used instead if the resource is not available in the
# normal browser cache.
#
FALLBACK:
lamp_on.png lamp_off.png
/images lamp_off.png

As you can see, caching the entire web application is the easy part. Also, the generic network section in the example allows us to use any (REST) web service anywhere on the web — within the bounds of the same origin policy of course.

More interesting is the fallback section. This allows us to specify different resources when offline. For example:

  • we can show a generic user avatar instead of a Gravatar,

  • we can return a report on the previous month instead of a dynamically generated report.


Note, however, that fallback resources are not used when the resources they’re a fallback for are available in the (normal) browser cache. So we cannot, for example, return a different script for offline use as opposed to online use. For that, use the boolean value window.navigator.onLine or the window.ononline and window.onoffline events.
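
A minimal sketch of using that value and those events together: report the current state once, then again on every change. The helper name trackConnectivity and its callback are made up for this example. The window object is passed in as a parameter so the snippet can also be tried outside a browser.

```javascript
// Hypothetical helper: report the connection state now and on each change.
// `win` is expected to be the browser's window object.
function trackConnectivity(win, onChange) {
  // Current state, as reported by the browser.
  onChange(win.navigator.onLine);
  // Later changes arrive as 'online' and 'offline' events.
  win.addEventListener('online', function () { onChange(true); }, false);
  win.addEventListener('offline', function () { onChange(false); }, false);
}
```

In a page, trackConnectivity(window, function (online) { ... }) could be used to choose between the online and the offline behaviour of a script.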

This concludes the building blocks to make a web application available offline.

Bringing it all together


Currently (April 2012), only Internet Explorer does not yet support the cache manifest (support is planned for version 10). As a fallback, you can use <link rel="prefetch" href="..."/> to rely on the normal browser cache. Everywhere else, from Firefox, Chrome and Safari to iOS and Android devices, offline applications are already supported.

Having said that, the easiest way to bring the two together is:

  1. Build a web application with HTML, CSS, JavaScript, and preferably a JavaScript MVC framework.

  2. Identify all static resources, and put them in a cache manifest.

  3. Identify resources that are static per week/month/…, and cache them too — as fallback.

  4. Identify dynamic resources that have a meaningful static fallback, and configure that.

  5. For all other dynamic resources, check their nature:

    • if they may be stale, store them in the localStorage and adjust the JavaScript accordingly

    • if they cannot be allowed to go stale, note them in the network section of the cache manifest.
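
For the dynamic resources that may be stale, it helps to know how old the stored copy is. The sketch below stores the data together with the time it was saved, so the script can decide what to do with it. This is one possible approach, not the only one. The helper names, the stored format and the choice to pass the current time in as a parameter are all assumptions made for this example.

```javascript
// Hypothetical helpers for data that is allowed to go stale.
// `now` is a timestamp in milliseconds, for example Date.now().

// Store the data together with the time it was saved.
function rememberResponse(storage, key, data, now) {
  storage.setItem(key, JSON.stringify({ savedAt: now, data: data }));
}

// Return { data, ageMs } for a remembered response,
// or null if nothing was stored under the key.
function recallResponse(storage, key, now) {
  var raw = storage.getItem(key);
  if (raw == null) {
    return null;
  }
  var entry = JSON.parse(raw);
  return { data: entry.data, ageMs: now - entry.savedAt };
}
```

The script can then, for instance, show the data with a note about its age when ageMs exceeds whatever threshold suits the application.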




This takes care of everything related to functionality. Proper use of the cache can only increase the performance, so there’s no issue there. Thus the only issue not yet addressed is security.

And for security, there’s not much to do: highly sensitive data should not be stored locally anyway (not in cookies, localStorage or the cache). For other data, you can define that while offline, the user identity is the same as the last time the user was online. The net result: simply store the data, but also store the account name. If the user logs in with a different account in the future, clear the storage.
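
The rule above could be sketched as follows. The function name claimStorage and the key name storageOwner are assumptions made for this example. The function would be called right after a successful login.

```javascript
// Assumed key under which the account name is kept.
var OWNER_KEY = 'storageOwner';

// Hypothetical helper: make sure the stored data belongs to `accountName`.
// Returns true if the existing data was kept, false if the storage was
// cleared (a different account, or no account recorded yet).
function claimStorage(storage, accountName) {
  var previousOwner = storage.getItem(OWNER_KEY);
  var sameOwner = previousOwner === accountName;
  if (!sameOwner) {
    storage.clear();
    storage.setItem(OWNER_KEY, accountName);
  }
  return sameOwner;
}
```

For example, claimStorage(localStorage, 'arthur') keeps the data if arthur was also the last user to log in, and otherwise starts with an empty store.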

Conclusion


So there you have it: all building blocks you need to make an application available to offline users.

When your application is first displayed, the browser loads all relevant resources. Your application now effectively works the same as if it were offline already. The only difference: when online, you can use a (REST) web service.

Your scripts will probably check if you’re online using the value of window.navigator.onLine or using the window.ononline and window.onoffline events. Thus, when offline, they can show data from the localStorage and/or sessionStorage (the latter is probably best used as a computation cache). And when online, your scripts can issue AJAX requests to get data and store it before displaying it to the user.
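
As a closing illustration, the flow described above might look like the sketch below. Everything passed in through deps is an assumption made for this example. In particular, fetchJson(url, callback) stands for whatever AJAX helper your application uses and is expected to call callback(error, data). The function name loadData is hypothetical as well.

```javascript
// Hypothetical sketch of the online/offline flow.
// deps.storage   - localStorage or sessionStorage
// deps.key       - name under which the data is stored
// deps.url       - address of the web service
// deps.online    - for example window.navigator.onLine
// deps.fetchJson - assumed AJAX helper: fetchJson(url, callback(error, data))
function loadData(deps, display) {
  // Read the stored copy, or null if there is none.
  function readStored() {
    var raw = deps.storage.getItem(deps.key);
    return raw == null ? null : JSON.parse(raw);
  }

  if (!deps.online) {
    // Offline: show what was stored earlier.
    display(readStored());
    return;
  }

  deps.fetchJson(deps.url, function (error, data) {
    if (error) {
      // The request failed: fall back to the stored copy.
      display(readStored());
      return;
    }
    // Store first, then display.
    deps.storage.setItem(deps.key, JSON.stringify(data));
    display(data);
  });
}
```

Note that display receives null when nothing has been stored yet, so the user interface needs a sensible empty state.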