NAME
    Image::Delivery - Efficient transformation and delivery of web images

INTRODUCTION
    Many web applications generate or otherwise deliver graphics as part of
    their interface. Getting the delivery of these images right is tricky,
    and developers usually need to make trade-offs in order to get a usable
    mechanism.

    Image::Delivery is an extremely sophisticated module for delivering
    these generated images. It is designed to be powerful, flexible,
    extensible, scalable, secure, stable and correct, and to use a minimum
    of resources.

DESIGN
    Because it can take a little bit of work to set up Image::Delivery, we
    will start with a quick once-over of the design of the API, and the
    reasons and use cases that drove it.

  Preventing Multiple Server Calls
  Use Case 1: CVS Monitor
    The initial idea for Image::Delivery came from some problems with the
    design of CVS Monitor (<http://ali.as/devel/cvsmonitor/>), an advanced
    but extremely resource-hungry MVC CGI application.

    Many of the CVS Monitor views have a single large graph on them, which
    involves a second call to the server that starts just before the
    previous call ends.

    Generating the graph takes minimal extra effort, but the overhead of
    starting another process and loading another 100 megabytes of data hits
    the server twice.

    The ideal would be to generate both the page and the graph at once, and
    have the browser get the image without a CGI hit.

    The solution to this problem, and the primary mechanism that
    Image::Delivery implements, could be called "Static Delivery via Cached
    Disk", but is best demonstrated with the diagram outlined in General
    Structure below.

  Use Case 2: Thumbnails
    One problem with thumbnailing is the vast number of thumbnails that need
    to be generated. When they are generated on demand by the image requests
    themselves, large numbers of processes end up working at once. The
    normal solution is to pre-generate the thumbnails, potentially polluting
    image directories.
    Image::Delivery stores all images in one central cache, so that the
    original images are unaffected.

  General Structure
                  Image Provider
                        |
                        |BLOB + TransformPath
                        |
                       \1/
                 Image::Delivery
                   |          \
                   |           |
                   |           |
                  \2/          |
                Hard Disk      |
                /5\   |        |URI
                 |    |        |
                 |    |        |
                 |   \6/       |
                Web Server     |
                /4\   |        /
                 |    |gzip   /
                  \   |      /
                   \ \7/   \3/
                  Web Browser

    1) Image Data pulled from Object/Provider
        An Object, or a Provider that accesses the data from outside the
        API, generates or obtains the image data and various metadata that
        describes the image data.

    2) Image Written to File-System
        Image::Delivery writes the image to the filesystem with a specific
        file name.

    3) URI sent to Browser in HTML
        Image::Delivery determines the matching URI that points to the
        location of the written file, and provides it to be used in an
        "img" tag in the generated HTML page.

    4) Web Browser Requests Image
        Having received the HTML, the browser requests the image from the
        web server.

    5) Web Server Finds Image File
        The web server receives the image request and finds the file that
        was written at step 2.

    6) Web Server Retrieves Image File
        The web server reads the file like any other plain file.

    7) Web Server Sends File to Browser
        The web server sends the file off to the browser.

  Digest::TransformPath
    Image::Delivery works around source objects. Each source object may
    want to work with more than one image, and each image may need to come
    in several different versions. In short, there can be lots of
    variations of images.

    To handle this, we utilise (or SHOULD utilise) Digest::TransformPath to
    help identify the images, with a 10-digit digest built into the
    filename.

  Might as Well Cache Them
    Since we went to all that effort to write the file, it's relatively
    easy to add caching. But the most important thing if we are going to
    cache is to have a good file naming scheme.

  Image::Delivery Naming Scheme
    In order to make this all work, the naming scheme is critical.
    The basic path format is:

      $ROOT/Object.id/checksum.type

    Object.id
        When an object is updated, it may have any number of Image fields,
        which may each have any number of scaled/rotated/morphed/derived
        images. When a source object is updated, some or all of these need
        to be cleared.

    checksum
        The checksum calculated from the TransformPath does not describe
        any of the data, only the data source and the modifications to it.
        This means that it is possible to cheaply test whether the image for
        a particular transform has already been created, without having to
        access any of the data in the actual images.

    type
        Because we accept image data in a variety of formats, it's not
        possible to know what image type any given image should be. So when
        testing we simply check the lot until we find one.

        Generally, rather than test 10-15 types, the Provider will inform
        us of the types to expect. :)

  Operation Profile
    All of this junk gives the module the following properties:

    - Intrinsically supports all major image types

    - No pre-generation of images, generates everything on-the-fly

    - Image names are secure and can't be predicted

    - All images for any page are processed in one process hit

    - Cache checking is extremely quick

    - Never touches image source data when not filling the cache

    - Handles many images. Storage extendable to support thousands to
      millions of individual images

    - Multiple hosts can work with the same Image cache

    - Images can be delivered by a different web server to the application

DESCRIPTION
    Image::Delivery is very powerful, but setting it up may take a little
    bit of work.

  Setting up the URI <-> path mapping
    First, you need to become acquainted with HTML::Location. This is used
    as the basis for the mapping between the disk and a URI. You should also
    make sure that whatever process will be running has write permissions to
    the appropriate directory.
    For starters, we would suggest creating the cache directory just under
    the root of a website, at "$ROOT/cache", which will be linked to
    "http://yourwebsite.com/cache/". This will let you create your
    HTML::Location.

      # Set up the location of the cache
      my $Location = HTML::Location->new(
          "$ROOT/cache",
          "http://yourwebsite.com/cache",
      );

    This gives you the absolute minimum Image::Delivery itself needs to get
    rolling. With a location to manage, you can then start to fire images at
    it, and it will store them and hand you back an HTML::Location for the
    actual file.

      # Create the Image::Delivery object
      my $Delivery = Image::Delivery->new(
          Location => $Location,
      );

    However, the tricky bit is probably setting up your Provider class.
    Although the abstract class implements many of the details and defaults
    for you, you are probably still going to need to do some work to tie the
    two together.

STATUS
    While the concept and design are fairly well understood and unlikely to
    change, there is an unfortunate situation with regard to the Cache::
    family of modules.

    Although this module was originally written to live at Cache::Web and to
    be a little more general, the maintainer felt that the name Cache::Web
    would represent the module as being a full member of the Cache:: family,
    which it is not.

    However, during the first few releases I hope to at least try to move
    the API of Image::Delivery as close to Cache:: as possible, possibly
    under a common Cache::Interface class, to gain some potential benefits
    from code written on top of it.

    Until these comments are updated, you should assume that the API may
    undergo some changes.

METHODS
  new %params
    The "new" constructor creates a new Image::Delivery object. It takes a
    number of required and optional parameters, provided as a set of
    key/value pairs.

    Location
        The required "Location" parameter is the HTML::Location for the
        cache directory.

  Location
    The "Location" method returns the HTML::Location that was used when
    creating the Image::Delivery.

  filename $TransformPath | $Provider
    The "filename" method determines, for a given $TransformPath or
    $Provider, the file name that the Image should be written to, excluding
    the file type.

    This is the method most likely to be overloaded, to enable a different
    naming scheme.

  exists $TransformPath | $Provider
    For a given Digest::TransformPath, or a ::Provider which contains one,
    check to see whether a file exists for it in the cache already.

    Returns the HTML::Location of the image if it exists, false if it does
    not exist, or "undef" on error.

  get $TransformPath | $Provider
    The "get" method gets the contents of a cached file from the cache, if
    it exists. You should generally check that the image "exists" first
    before trying to get it.

    Returns a reference to a SCALAR containing the image data if the image
    exists. Returns "undef" if the image does not exist, or if some other
    error occurs.

  set $Provider
    The "set" method stores an image in the cache, shortcutting if the image
    has already been stored.

    Returns the HTML::Location of the stored image on success, or "undef"
    on error.

  clear $TransformPath
    The "clear" method allows you to explicitly delete an image from the
    cache. This would generally be done for security purposes, as the cache
    cleaners will generally harvest files directly, rather than going via
    TransformPaths.

    Returns true if the image was removed, or did not exist. Returns
    "undef" on error.

TO DO
    - Add ability to mask indexes with empty HTML files

    - Add cache clearing capabilities

    - Add file locking to prevent race conditions in the cache

    - Add pluggable cache cleaners

SUPPORT
    All bugs should be filed via the bug tracker at

    <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Image-Delivery>

    For other issues, contact the author.

AUTHORS
    Adam Kennedy <adamk@cpan.org>

COPYRIGHT
    Copyright 2004 - 2007 Adam Kennedy.

    This program is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.