JOSS architecture approaches for integrating with OpenStack Object Storage

As discussed earlier, JOSS allows you to interact with OpenStack Object Storage. Having seen how to get started with JOSS, you know the library firsthand. Now it is time to move up the chain and look at various architecture models for integrating the OpenStack Object Storage component into your Java application.

First, the models:

  • Download objects directly from the Object Store; publish your resources and allow clients to download them directly from the Object Store, or use temporary URLs to grant expirable access to confidential resources

  • Up- and download objects through an application; stream your confidential up- and downloads through your application, introducing various possible custom operations, such as fine-grained authorisation

  • Process uploaded objects; hold an object in your application first, either to upload it to the Object Store asynchronously or to convert it into another format before uploading

We will look at each of these models in depth.

Download objects directly from the Object Store


As shown in the image above, objects can be retrieved directly from the Object Store. This plays to the strengths of the Object Store and is the preferred usage. If objects are not confidential, you are advised to put them in a public container and have your users download the content directly from there. Clients benefit from the Object Store's lower latency, and your application server is not burdened with the downloads.

Confidential objects are another matter entirely. One option is to make use of temporary URLs, which work in the same way as public URLs, except that they expire after a set time, after which the object can no longer be accessed through that URL. From a user experience point of view, temporary URLs assume immediate consumption of the object, i.e. a direct download, showing a movie, or listening to a song. This may not work for you if you need URLs to stick around. The weakness of a temporary URL is also its strength: expiration.
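To make the expiry mechanism concrete, here is a minimal sketch of how a temporary URL can be signed using only the JDK. It assumes the classic TempURL scheme of OpenStack Swift: an HMAC-SHA1 over the HTTP method, the expiry timestamp and the object path, keyed with a secret set on the account (newer deployments may require a stronger digest). The class name, host, path and key are made-up placeholders, not part of JOSS or of any real account.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper, for illustration only.
class TempUrlSketch {

    /** Hex-encoded HMAC-SHA1 of "METHOD\nEXPIRES\nPATH", keyed with the temp URL key. */
    static String sign(String method, long expiresEpochSeconds, String path, String key) {
        String body = method + "\n" + expiresEpochSeconds + "\n" + path;
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            StringBuilder hex = new StringBuilder();
            for (byte b : mac.doFinal(body.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA1 is not available", e);
        }
    }

    /** Builds a GET URL that stops working ttlSeconds after nowEpochSeconds. */
    static String temporaryGetUrl(String host, String path, String key,
                                  long nowEpochSeconds, long ttlSeconds) {
        long expires = nowEpochSeconds + ttlSeconds;
        return host + path
                + "?temp_url_sig=" + sign("GET", expires, path, key)
                + "&temp_url_expires=" + expires;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000;
        // Placeholder host, path and key: replace with values from your own account.
        System.out.println(temporaryGetUrl("https://objectstore.example.com",
                "/v1/AUTH_demo/private/report.pdf", "secret", now, 3600));
    }
}
```

Because the expiry timestamp is part of the signed text, a client cannot extend the lifetime of a link by editing the query string; the signature would no longer match.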

Up- and download objects through an application


For uploads, this is the most logical way to set up your architecture. By streaming the object through the application, there is no need to first store the object in the application. This model offers the possibility to use the application layer to check the credentials and determine the storage container. Optionally, you might prefer to sniff the Content-Type of the object, instead of relying on file extension matching.
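The two ideas in this paragraph, streaming without storing and sniffing the Content-Type, can be sketched with plain JDK streams. This is an illustration under assumptions, not the tutorial project's code: the class and method names are hypothetical, the sniffer knows only a handful of formats, and a ByteArrayOutputStream stands in for the upload to the Object Store.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper, for illustration only.
class StreamingUploadSketch {

    /** Guesses a Content-Type from leading "magic" bytes; covers only a few formats. */
    static String sniffContentType(byte[] head) {
        if (startsWith(head, 0x89, 'P', 'N', 'G')) return "image/png";
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) return "image/jpeg";
        if (startsWith(head, 'G', 'I', 'F', '8')) return "image/gif";
        if (startsWith(head, '%', 'P', 'D', 'F')) return "application/pdf";
        return "application/octet-stream";
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    /** Reads up to 8 leading bytes and rewinds, so the later copy still sees them. */
    static byte[] peek(BufferedInputStream in) {
        try {
            in.mark(8);
            byte[] buf = new byte[8];
            int total = 0;
            int n;
            while (total < buf.length && (n = in.read(buf, total, buf.length - total)) > 0) {
                total += n;
            }
            in.reset();
            return Arrays.copyOf(buf, total);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Copies in fixed-size chunks, so memory use does not grow with the object size. */
    static long copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        byte[] upload = {(byte) 0x89, 'P', 'N', 'G', 13, 10, 26, 10, 0, 0, 0, 13};
        BufferedInputStream request = new BufferedInputStream(new ByteArrayInputStream(upload));
        String contentType = sniffContentType(peek(request));
        ByteArrayOutputStream objectStore = new ByteArrayOutputStream(); // stand-in for the real upload
        long bytes = copy(request, objectStore);
        System.out.println(contentType + ", " + bytes + " bytes streamed");
    }
}
```

In a real application the input would be the request body and the output the upload to the Object Store; the point is that only one buffer's worth of data is held in the application at any time.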

For downloads, it is possible to access confidential resources in private containers through your application layer as well. In this case, your application becomes an extra node in the download path, which puts more load on the application server. However, there is a gain as well: you can use the credentials of the downloader to verify that the user is authorized to download the object. Because the URL is not temporary, it remains usable, and it remains secure because you check for authorization every time the object is accessed.

The key to downloading through the application is that the resource behaves in the same way as if it were accessed as a static resource. That means you will want to send the right Content-Type, so that the browser knows how to handle the content, and you might want to check for modifications, possibly resulting in a Not Modified (304) response.
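The decisions described for downloads can be captured in a few lines. The sketch below is a hypothetical helper, not code from JOSS or the tutorial project: it assumes the application already knows whether the user is authorized and holds an ETag for the object, and it compares that ETag against the If-None-Match header sent by the browser to choose between 200, 304 and 403.

```java
// Hypothetical helper, for illustration only.
class DownloadDecisionSketch {

    static final int OK = 200;
    static final int NOT_MODIFIED = 304;
    static final int FORBIDDEN = 403;

    /**
     * Picks the HTTP status for a download request.
     * Authorization is checked first, so an unauthorized user learns nothing about the object.
     */
    static int decideStatus(boolean authorized, String ifNoneMatch, String currentEtag) {
        if (!authorized) {
            return FORBIDDEN;
        }
        if (ifNoneMatch != null && matches(ifNoneMatch, currentEtag)) {
            return NOT_MODIFIED; // the browser's cached copy is still current; send no body
        }
        return OK; // stream the object, with its Content-Type and ETag headers
    }

    /** Weak comparison of an If-None-Match header value against an unquoted ETag. */
    static boolean matches(String ifNoneMatch, String etag) {
        for (String candidate : ifNoneMatch.split(",")) {
            String c = candidate.trim();
            if (c.equals("*")) {
                return true;
            }
            if (c.startsWith("W/")) {
                c = c.substring(2); // a weak validator matches like a strong one here
            }
            if (c.equals("\"" + etag + "\"")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Placeholder ETag value; in practice it would come from the stored object's metadata.
        System.out.println(decideStatus(true, "\"abc123\"", "abc123"));
        System.out.println(decideStatus(true, null, "abc123"));
        System.out.println(decideStatus(false, "\"abc123\"", "abc123"));
    }
}
```

Answering 304 without a body is what makes a download through the application feel like a static resource: repeat visits cost one small round trip instead of a full transfer through the application server.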

Process uploaded objects


Streaming is all very nice, but it takes away an important feature: intermediate processing. There are various reasons why you might want this. For example, you may require an asynchronous upload, where the object is first stored in your own database before being uploaded to the Object Store. You might also want to create various formats of an uploaded image, for example thumbnails and small images.

The cost of intermediate processing is the introduction of an extra stop. That stop can be relatively expensive, when you persist to file or database (in which case a save/restore operation is included), or cheaper, when you make use of an in-memory byte array. What you choose depends largely on the actions you require: asynchronous processing requires a persisted state, image transformations can be done just as well on byte arrays, and large objects are better handled on file than in memory.
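As an illustration of the cheaper, in-memory variant, here is a minimal thumbnail sketch that works purely on byte arrays with the JDK's ImageIO. It is not the tutorial project's implementation: the class and method names are hypothetical, the output is always PNG, and the resulting bytes would still have to be uploaded to the Object Store by other code.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper, for illustration only.
class ThumbnailSketch {

    /** Scales an image to fit within maxSide pixels, keeping its aspect ratio; returns PNG bytes. */
    static byte[] thumbnail(byte[] original, int maxSide) {
        ImageIO.setUseCache(false); // keep ImageIO from spooling to temporary files
        try {
            BufferedImage source = ImageIO.read(new ByteArrayInputStream(original));
            if (source == null) {
                throw new IllegalArgumentException("Not a supported image format");
            }
            int longest = Math.max(source.getWidth(), source.getHeight());
            double scale = Math.min(1.0, (double) maxSide / longest); // never scale up
            int width = Math.max(1, (int) Math.round(source.getWidth() * scale));
            int height = Math.max(1, (int) Math.round(source.getHeight() * scale));

            BufferedImage target = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
            Graphics2D g = target.createGraphics();
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(source, 0, 0, width, height, null);
            g.dispose();

            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ImageIO.write(target, "png", out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns {width, height} of encoded image bytes. */
    static int[] dimensions(byte[] imageBytes) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(imageBytes));
            return new int[] {image.getWidth(), image.getHeight()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates an empty PNG of the given size, standing in for an uploaded image. */
    static byte[] blankPng(int width, int height) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ImageIO.write(new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB), "png", out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] size = dimensions(thumbnail(blankPng(200, 100), 50));
        System.out.println(size[0] + "x" + size[1]);
    }
}
```

Note that the decoded image occupies far more memory than its encoded bytes, which is one reason the in-memory route suits modest images better than large objects.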

Intermediate processing is expensive, have no doubt. Besides the extra node (i.e., your application) for up- and downloading, there is also the cost of intermediate storage. You should refrain from introducing an extra stop if there is no need for intermediate processing. Instead, use streaming or, if it is a public resource, direct download from the Object Store.

JOSS showcase

To demonstrate the various architecture models, we set up a tutorial project, which you can pull from GitHub and try out for yourself.

Note that you will need a working Object Storage account. Don’t have one? Try one out @CloudVPS. When you have received a working account, be sure to fill out the file in the project with the right credentials.

After you run mvn jetty:run, point your browser to http://localhost:8081. You should see the following screen:


Use case 1 demonstrates how you can stream your content through the application directly to your browser. Use case 2 demonstrates how you can stream to the Object Store, and presents a link that demonstrates the download of the same content. Note that you can upload various types of content, so please do try. Finally, use case 3 processes an image before uploading it to the Object Store. You will also be presented with links to retrieve the images directly in your browser.

You have the source code at your disposal, so you can freely try things out. We hope you have fun playing with the various architecture models and that you find the one that most suits your needs.
