Archive for 2008


Thursday, December 18th, 2008

I recently had a bit of fun at the National Library of Australia’s Christmas party playing Michael Jackson. It was so much fun.

New Zealand then and now

Tuesday, December 9th, 2008

On November 27 the National Library of New Zealand became the sixteenth institution to join The Commons. With perfect timing, Google launched its Street View service for New Zealand within days. Of course I’ve modified my then and now mashup to include the images on Flickr from the National Library of New Zealand. They’ve been busy geotagging their images, and it’s all starting to come together and provide some interesting looks at how New Zealand has changed over time. Start exploring New Zealand then and now.

Flickr commons in my neighbourhood

Monday, November 24th, 2008

Following on from my previous then and now demonstration, where Flickr Commons meets Street View, I started to think about how I could bring that experience to a user based upon their current location: take out the Street View and replace it with a real life view. Can you provide an immersive experience for a user, giving them only the historical items from a cultural institution’s collection that are relevant to their current location? Imagine being in a location with your laptop or mobile phone, and being able to see exactly what the place you are standing in looked like in the past.

For example – thanks to free wifi from the Apple Store in Sydney on George Street, I built a web application to show me my current location.  It’s pretty accurate, but not perfect.  In this case it’s showing my location as about 100m from where I actually am (on the corner of King St and George St).  I can then display all the images from Flickr Commons that relate to that area.  As you can see there are a variety of historic images available and I’ve selected an image showing Martin Place.

Flickr in my location

From where I am standing on George Street I can see the real life buildings that are shown in the historic photo.

Martin Place

If I then walk down George Street 100 metres I can be in the same environment where the photos were taken and can compare the historic image of my current location on my laptop to the environment I am standing in.

Flickr in my location

So how does this work?

Up until a few months ago, the only option available was to guess the location of the user based upon their IP address. This might have been able to give the application the city that a user was in, but it was unlikely to provide anything more accurate than that. Recently two plugins based around the W3C Geolocation specification have been developed: Geode from Mozilla Labs for Firefox 3, and geolocation functions added to Google Gears, which is available for a variety of browsers. By using these plugins I can determine a reasonably accurate latitude and longitude for a user’s location. If a user doesn’t have either of these plugins installed, or for privacy reasons decides not to allow them to broadcast their location, I can fall back to using the much less accurate IP address lookup.
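The fallback order described above can be sketched roughly as follows. This is only an illustration in Python, not the code behind the prototypes: the `locate` function and the two lookup functions are hypothetical stand-ins for the browser plugin and the IP address lookup.

```python
from typing import Callable, Optional, Sequence, Tuple

Position = Tuple[float, float]                # (latitude, longitude)
Provider = Callable[[], Optional[Position]]   # returns None when it has no answer


def locate(providers: Sequence[Tuple[str, Provider]]) -> Optional[Tuple[float, float, str]]:
    """Try each provider in order (most accurate first) and return the first hit."""
    for name, provider in providers:
        try:
            position = provider()
        except Exception:
            # Plugin not installed, or the user declined to share: try the next one.
            continue
        if position is not None:
            lat, lon = position
            return (lat, lon, name)
    return None


# Hypothetical stand-ins for the real lookups.
def plugin_lookup() -> Optional[Position]:
    raise PermissionError("user declined to share their location")


def ip_lookup() -> Optional[Position]:
    return (-33.87, 151.21)  # roughly central Sydney; city-level accuracy only


result = locate([("plugin", plugin_lookup), ("ip", ip_lookup)])
print(result)
```

Returning the name of the provider alongside the coordinates lets the page tell the user how much to trust the position it is showing.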

Once I have a user’s location, I can use the Flickr API to return all the images that are within a certain radius of the user. If I use Google Maps as the mapping application, I can also add Wikipedia articles about the user’s current location into the mix.
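The Flickr API does the radius filtering itself, but it may help to see what a radius search amounts to. The sketch below, in Python, keeps only the photos within a given distance of the user, using the haversine formula. The photo list and all coordinates are approximate values made up for illustration, and `photos_within` is a hypothetical helper, not part of any API.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def photos_within(photos, lat, lon, radius_km):
    """Return the photos inside the radius, nearest first."""
    measured = [(distance_km(lat, lon, p["lat"], p["lon"]), p) for p in photos]
    measured.sort(key=lambda pair: pair[0])
    return [p for d, p in measured if d <= radius_km]


# Illustrative data only: approximate coordinates, not real Flickr records.
photos = [
    {"title": "Martin Place", "lat": -33.8678, "lon": 151.2093},
    {"title": "Sydney Harbour Bridge", "lat": -33.8523, "lon": 151.2108},
    {"title": "Wellington waterfront", "lat": -41.2865, "lon": 174.7762},
]

# Roughly the corner of King St and George St.
nearby = photos_within(photos, -33.8688, 151.2070, radius_km=1.0)
print([p["title"] for p in nearby])
```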

Try it out

I’ve developed two prototypes. The first one returns historical images from Flickr Commons. Given that there aren’t a vast number of photos in the Commons yet, and even fewer have geolocation information added, your mileage may vary unless you are around the Sydney CBD area.

The second prototype returns images from Flickr that have a Creative Commons license. As there are a lot more images available in this category, the chances of getting a result are much greater. For either of these to work you’ll need to download one of the plugins.

Communicating University Research Identity

Thursday, October 23rd, 2008

These are notes from a talk by Simon Porter from the University of Melbourne at the Libraries Australia Forum 2008.

Simon Porter

All you needed to know about a University was a book. Number of pages increased over time from 16 pages in 1870 to 227 pages in 2004. Although the scope remains the same, the size increases. Now the University calendar is a brand for an online resource. From the webpage you get sent to a faculty home page. The information isn’t collated in the way it used to be and it is often stored in many places rather than a central repository.

Important contextual framework for history.  Structure a history.  Because we are now in the online space, we can do different things with it.  We can collect stories not from just one individual, but from many individuals and relate them together.

In 2003 the University had disparate systems, with the information replicated all over the place. By 2005 much of the information started to be in one place. By 2006, they could take this information that had been private and make it public, giving each academic their own web page showing their information, their publications, their awards and honours.

With the list of publications on their pages they can construct OpenURLs to try and source the publications online.  They can then also link to other academics that have worked on the same projects or grants.  This is required as part of Government reporting.
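As a rough illustration of constructing an OpenURL from a publication record, here is a sketch in Python using the key/encoded-value form of OpenURL 1.0 (Z39.88-2004). The resolver address and the publication details are made up, and `build_openurl` is a hypothetical helper; the talk did not describe how the University actually builds its links.

```python
from urllib.parse import urlencode


def build_openurl(resolver_base, article_title, journal_title, year, author_last):
    """Build an OpenURL 1.0 (KEV) link describing a journal article."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", article_title),
        ("rft.jtitle", journal_title),
        ("rft.date", str(year)),
        ("rft.aulast", author_last),
    ]
    return resolver_base + "?" + urlencode(params)


# Hypothetical resolver and publication, for illustration only.
url = build_openurl(
    "https://resolver.example.org/openurl",
    article_title="An Example Article",
    journal_title="Journal of Examples",
    year=2008,
    author_last="Citizen",
)
print(url)
```

The link resolver at the other end decides where a copy of the publication can be found, so the academic’s page only needs to describe the item, not know where it lives.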

Cornell University’s VIVO project. They don’t have the same reporting requirements that we have in Australia, but they’ve built it (using RDF). Expertise island in Ireland is a similar project.

The data has gone from being facts to being identities, not just representing the information that is there, but making an authority.  They have responsibilities to present the correct information now that the information is public rather than private.

What about privacy? At the University of Melbourne it was expected that part of your duty was to the public. There are some issues: academics have the option of hiding their contact details or making them available.

Next Generation Library Catalogues

Thursday, October 23rd, 2008

These are notes from a talk by Eric Lease Morgan from University of Notre Dame at the Libraries Australia Forum 2008.

Eric Lease Morgan

The environment is changing: cheap, globally connected computers have changed the way libraries work and what they are about.

When items are analogue it is important to create surrogates of our items. Libraries had to create a catalogue to describe their physical holdings, as there was no way to access them directly. Now that items are often born digital, it’s not as necessary to create surrogates as it used to be. Things like full text indexing can supplement a catalogue. Indexing was ignored by libraries for decades, then Google came along and proved it could be done. As items are born digital, a person no longer needs to come to a library and access an item in a specific physical space; it can be accessed from anywhere. Enormous amounts of information can be held on things like USB drives (all of WorldCat can be stored on an iPod) and storage is cheaper than in the past.

Librarianship consists of 4 processes:

  1. Collection: done by bibliographers and can be supplemented through the use of databases
  2. Preservation: done by archivists, most challenging in the current environment
  3. Organisation: done by cataloguers supplemented by databases and XML
  4. Re-distribution: done by reference librarians

These processes won’t be made obsolete by technology; it will just change the way they are done. If you think about books, you don’t have much of a future, but if you think about what is in books, then you have a future.

There are two services the user can interact with:

  1. query against the index
  2. query against the content

In the past, users could only do queries against an index.  Now users can do queries directly against the content, for example carrying out a full text search on a book or a newspaper. The real future is in the growth of services against the content. This means users can partake in things like:

  • Annotating
  • Creating tag clouds
  • Taking quotations and citing them
  • Saving items to ‘my favourites’
  • Working out how often words occur, or which words are unique across a collection
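The last service in the list is easy to sketch once you can query the content itself. The Python below counts how often words occur in a text and finds the words that appear in only one document of a collection. The two-document collection is made up, and the tokenising is deliberately naive (lower-cased runs of letters).

```python
import re
from collections import Counter


def words(text):
    """Split text into lower-cased words (a deliberately simple tokeniser)."""
    return re.findall(r"[a-z']+", text.lower())


def word_counts(text):
    """How often each word occurs in one text."""
    return Counter(words(text))


def unique_words(collection):
    """Map each title to the words that appear in no other document."""
    vocab = {title: set(words(text)) for title, text in collection.items()}
    unique = {}
    for title, own in vocab.items():
        others = set().union(*(v for t, v in vocab.items() if t != title))
        unique[title] = own - others
    return unique


# Made-up collection, for illustration only.
collection = {
    "a": "The library holds books and newspapers",
    "b": "The archive holds letters and maps",
}

counts = word_counts("the cat and the hat")
print(counts.most_common(1))
print(unique_words(collection))
```

Counts like these are also the raw material for a tag cloud, where the size of each word reflects how often it occurs.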

Libraries are always a part of a larger hosting community. Learn how to take advantage of this fact and put searches against the catalogue into the user’s context. This used to be done face to face: you built a relationship with the librarian. Why is it so impersonal on the web? You can replicate, but not replace, this with a computer.

You need to know your user. For example, if a user is searching for nuclear physics, the results you should return are different if the user is a physicist or a high school student.

Databases are great for organising and maintaining content, but they are lousy when it comes to search. You have to know the structure of the database in order to do a search. Indexes are the opposite. An index is a list of words with pointers to where the word can be found. You don’t need to know the structure of the database and you can do things like relevance ranking.
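To make the “list of words with pointers” idea concrete, here is a minimal inverted index in Python. It is a sketch only: the records are made up, and the ranking is a crude count of matching query words rather than anything a real indexer such as Lucene does.

```python
import re
from collections import defaultdict


def build_index(records):
    """Map every word in every field to the ids of the records containing it."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for value in record.values():
            for word in re.findall(r"\w+", str(value).lower()):
                index[word].add(record_id)
    return index


def search(index, query):
    """Return record ids ranked by how many query words they contain."""
    scores = defaultdict(int)
    for term in re.findall(r"\w+", query.lower()):
        for record_id in index.get(term, ()):
            scores[record_id] += 1
    # Highest score first; ties broken by record id so results are stable.
    return sorted(scores, key=lambda rid: (-scores[rid], rid))


# Made-up records with differing fields, for illustration only.
records = {
    1: {"title": "Sydney Harbour Bridge", "creator": "Unknown"},
    2: {"title": "Martin Place", "subject": "Sydney streets"},
    3: {"title": "Wellington Harbour", "place": "New Zealand"},
}

index = build_index(records)
print(search(index, "sydney harbour"))
```

Note that the search never mentions a field name: the records have different fields, yet one query covers all of them, which is the point being made about not needing to know the structure of the database.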

“Next-generation” catalogues such as VuFind, Evergreen, Primo and AquaBrowser are all very, very similar, with the exception of Evergreen, which is an integrated library system. Discovery systems deal with MARC records, EAD, XML – these systems normalise them to create an index, most of them using Lucene as the indexer. Open source with a layer on top.


The library catalogue isn’t really YOUR catalogue.  Include everything related to your audience in an index, not just stuff that you own.  Make sure everything in there is accessible via Google, Yahoo, MSN.  Put as much open access content in there as possible. Gather it and include it in your index, you can’t rely on others to do it and it’s easier to search and do things with the data when you have control of it. Apply a library eye to incoming queries (eg: munge the query into a phrase search to enrich the query).  We need to do less library standards and more W3C standards. Repurpose the system by exploiting SOA and RESTful computing techniques.
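As one possible reading of “munge the query into a phrase search”, the Python sketch below rewrites a plain multi-word query so that the exact phrase is tried alongside the individual words. The output uses Lucene-style query syntax (quoted phrases, `AND`, `OR`), which is an assumption about the indexer; `enrich_query` is a hypothetical helper and not something shown in the talk.

```python
def enrich_query(raw):
    """Rewrite a plain multi-word query to also match it as an exact phrase."""
    terms = raw.split()
    # Leave single words and queries the user has already quoted alone.
    if len(terms) < 2 or '"' in raw:
        return raw.strip()
    phrase = '"' + " ".join(terms) + '"'
    all_terms = " AND ".join(terms)
    return f"{phrase} OR ({all_terms})"


print(enrich_query("nuclear physics"))
```

A fuller version might boost the phrase match so that exact matches rank first, but that depends on the indexer in use.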


How we do things is changing, requiring retraining and a shift in attitudes to investigate ways to exploit the current environment.