Archive for April, 2011

Code4Lib - Virtual Lightning Talk

I know I was late, and so I couldn’t join the group this afternoon: when I tried to log in, I got the message “Error: you cannot attend the meeting because its capacity has been reached.” :-(

Anyway, in case anyone is interested in the topic of File Viewers for DSpace Collections, here is a link to a PDF file with a basic outline of the talk. The key points I was going to talk about are:

Why viewers for DSpace?
DSpace is quite popular in the academic world; many institutions now use it for a variety of objects, e.g. manuscripts, maps, books, and videos.  In January 2011, there was a discussion on this topic on the DSpace tech listserv; several people jumped in and talked about the need for an out-of-the-box solution for DSpace.

How are others doing it?
- Texas A&M (the developers of the XMLUI framework): for a folio collection, they use thumbnails as links to individual pages; the page images are then displayed in a pop-up window using a Lightbox script.
- Another solution is provided by @mire, a commercial organization that develops modules for DSpace.  Their Document Streaming Module enables in-browser viewing of document files and is based on Scribd’s iPaper document viewer.
- There is also a test/prototype using PDFs and the Google Docs Viewer.

Local evaluation of alternatives (workarounds)
- In the summer of 2010 we implemented a Flash-based viewer that reads JP2 files and feeds a dynamic viewer based on OpenZoom (a front end for the IIPImage server); the viewer includes zoom and full-screen options, and it seems to work great for newspaper or magazine files.
- More recently, we tested an existing PHP script that takes a DjVu file and creates a JPG file for each page; in the same script, we customized a basic toolbar that lets users navigate from page to page or use a drop-down menu to jump to any page.
In both cases, we’re using some PHP scripts to generate the preview files and then using an identifier to embed them into DSpace.
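
To make the DjVu workaround a bit more concrete, here is a minimal sketch of the preview-generation step. The scripts mentioned above are in PHP; this is Python, purely for illustration, and it is not the actual script. It only builds the `ddjvu` command lines (the decoder that ships with DjVuLibre) that would render each page to TIFF. As far as I know `ddjvu` does not write JPG directly, so the conversion to JPG is assumed to be a separate step with an image tool and isn't shown. The function names, the `identifier_page` file naming and the output directory are all hypothetical.

```python
from typing import List


def page_image_name(identifier: str, page: int) -> str:
    """Return a predictable file name such as '1234_0007.tif' for one page.

    The naming scheme is an assumption made for this sketch.
    """
    return f"{identifier}_{page:04d}.tif"


def ddjvu_commands(djvu_file: str, identifier: str,
                   page_count: int, out_dir: str) -> List[List[str]]:
    """Build one ddjvu command per page (pages are numbered from 1)."""
    commands = []
    for page in range(1, page_count + 1):
        target = f"{out_dir}/{page_image_name(identifier, page)}"
        # -format and -page are standard ddjvu options; TIFF output
        # requires a ddjvu build with TIFF support.
        commands.append(
            ["ddjvu", "-format=tiff", f"-page={page}", djvu_file, target]
        )
    return commands


if __name__ == "__main__":
    for cmd in ddjvu_commands("book.djvu", "1234", 3, "/tmp/previews"):
        print(" ".join(cmd))
```

On a machine with DjVuLibre installed, each command could be run with `subprocess.run(cmd, check=True)`, and the page count could come from `djvused -e n book.djvu`. Using the record identifier in the file name is what would let a record page find its preview files later.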

What’s next?
Obviously, the goal in implementing a file/document viewer in DSpace is to improve how users interact with a DSpace record page.  With the current default, there is no preview: if users need to view one or more pages of a multi-page file, they have to download the whole file manually, and for a file with 100+ pages and/or 50+ MB, this can be an issue.  So I think the hope is that someday, with the help of others, we can have an out-of-the-box document viewer for DSpace, which ideally would include features like zooming, searching, and options for exporting to other formats, and would work on any device, regardless of its screen size, OS or browser.

Oh well, that’s it for now … have a great weekend!  :-)



Testing & Prototyping: File Viewers for DSpace

As more institutions work with large and diverse types of content in their digital repositories, there is definitely a “new” need to evaluate, prototype and implement file viewers that adequately display files from non-text collections.

When displaying a record in DSpace, by default users will only see the metadata and a box with a basic description of the file(s) associated with the record.  This method may be fine for IR projects where most files are in PDF format; however, for non-text collections users will most likely expect to get a “snapshot” of the file(s) in the record.  Displaying something on the first page of a record becomes particularly important for unique and/or complex files such as maps, manuscripts, books, or multimedia.  For these types of files, there is usually a need for features like zooming, magnifying, searching, and streaming.

The question/challenge, then, is how to build viewers on top of the DSpace files that can be embedded in the record’s page along with the metadata.  For image collections, there seem to be a good number of examples, many of them using some variant of a Lightbox script.  For multi-page documents, an afternoon test using a DjVu file seems to produce some decent results.  The trick is to create a JPG file for every page, using a local PHP script based on DjVuLibre, and embed the images in the item_view element.  This option (prototype) would allow users to view pages sequentially or jump to any page of the document, and of course they would still have the option to download a PDF version of the file if they need to.  In short, this workaround seems to work fine as an early prototype; an OCR feature would make it perfect.
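
As a rough illustration of the embedding step, here is a minimal Python sketch (not the PHP script used in the test) that builds the kind of HTML fragment one might drop into the record page: the current page image, previous/next links and a jump menu. The function names, the `page` query parameter, the CSS class and the `identifier_page.jpg` naming are all assumptions made for this example; how the fragment actually gets into item_view is not shown.

```python
from html import escape


def clamp_page(page: int, page_count: int) -> int:
    """Keep the requested page inside 1..page_count."""
    return max(1, min(page, page_count))


def viewer_html(base_url: str, identifier: str,
                page: int, page_count: int) -> str:
    """Build an HTML fragment: navigation links, jump menu, page image."""
    page = clamp_page(page, page_count)
    prev_page = clamp_page(page - 1, page_count)
    next_page = clamp_page(page + 1, page_count)
    # Hypothetical naming scheme: <identifier>_<4-digit page>.jpg
    img_src = escape(f"{base_url}/{identifier}_{page:04d}.jpg", quote=True)
    options = "".join(
        f'<option value="{n}"{" selected" if n == page else ""}>Page {n}</option>'
        for n in range(1, page_count + 1)
    )
    return (
        '<div class="page-viewer">'
        f'<a href="?page={prev_page}">Previous</a> '
        f'<select name="page">{options}</select> '
        f'<a href="?page={next_page}">Next</a>'
        f'<img src="{img_src}" alt="Page {page} of {page_count}">'
        "</div>"
    )


if __name__ == "__main__":
    print(viewer_html("/previews", "1234", 2, 5))
```

Clamping the page number means an out-of-range request simply lands on the first or last page, so the previous/next links never point outside the document.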

Below is a screenshot of today’s test:
Since the output files are JPGs, the viewer works perfectly fine on iPads and probably on most other tablets as well.

