Archive for the ‘Mashup’ Category

YQL mashups for libraries

Wednesday, December 9th, 2009

In October GovHack was held in Canberra. I went along as a participant, but also to advise teams on the use of the National Library of Australia's APIs. One of the things I spent my time doing there was making some YQL Open Data Tables for some of the Library's services. Why is this interesting? Let's go back a few steps.

YQL is a service from Yahoo that provides an SQL-like environment for querying, filtering and joining web services. Instead of having to write a complex URL to access data from a website, we can use YQL to write a statement similar to an SQL query we might run against a MySQL database, except that instead of querying a database, we are querying a web service. As an example, you can enter the following into the YQL console to extract photos of the Sydney Harbour Bridge from Flickr:

SELECT * FROM WHERE text="sydney harbour bridge";

When YQL was launched it initially had options to query only Yahoo's services; if you wanted to query a web service outside Yahoo you were out of luck. Since then Yahoo has allowed developers to build YQL Open Data Tables. An Open Data Table is an XML file that acts as a bridge between your API and the YQL language: it describes how your API is structured in terms that YQL can understand.
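As a rough sketch of what such a table looks like, an Open Data Table declares the inputs YQL may pass and the REST URL to call. The endpoint URL, author and field names below are placeholders, not the real Picture Australia service:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<table xmlns="">
  <meta>
    <author>Example author</author>
    <sampleQuery>SELECT * FROM {table} WHERE searchTerms="sydney harbour bridge"</sampleQuery>
  </meta>
  <bindings>
    <!-- paramType="query" tells YQL to append each input to the URL
         as a querystring parameter, e.g. ?searchTerms=...&startPage=... -->
    <select itemPath="rss.channel.item" produces="XML">
      <urls>
        <url></url>
      </urls>
      <inputs>
        <key id="searchTerms" type="xs:string" paramType="query" required="true"/>
        <key id="startPage" type="xs:string" paramType="query" required="false"/>
      </inputs>
    </select>
  </bindings>
</table>
```

The `itemPath` attribute tells YQL which repeating element in the response represents one row, so `SELECT` projections and `LIMIT` clauses work against it.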

If we wish to use an API to return data from one of the Library’s services, say Picture Australia, we can query it using the following URL:

As you can see, it quickly becomes a fairly complex URL, with a lot of querystring values describing where and how to extract the data.

Now let's create that same query using YQL. First I created an Open Data Table for Picture Australia. This is the key component that ties Picture Australia and YQL together. If you now enter the following into the YQL console, you'll get back an XML feed from Picture Australia for pictures of the Sydney Harbour Bridge.

USE "" AS pictureaustralia;
SELECT * FROM pictureaustralia WHERE searchTerms="sydney harbour bridge" AND startPage="1";

Alternatively you can query The National Library of Australia’s catalogue for pictures of the Sydney Harbour Bridge by using this Open Data Table and entering the following term into the YQL console:

USE "" AS nla;
SELECT * FROM nla WHERE lookfor="sydney harbour bridge {format:Online AND format:Picture}";

So how is this interesting? Can't all of this information already be gathered from our standard APIs? There are a couple of advantages to using YQL. One is being able to extract just portions of the data. Say you want to extract just the title, description and persistent URL of the records, and you only want to return the first 3 items; you can simply enter:

USE "" AS pictureaustralia;
SELECT title,description,link FROM pictureaustralia WHERE searchTerms="sydney harbour bridge" AND startPage="1" LIMIT 3;

or you could just extract a link to where the most relevant original item is stored.

USE "" AS pictureaustralia;
SELECT enclosure.url FROM pictureaustralia WHERE searchTerms="sydney harbour bridge" AND startPage="1" LIMIT 1;

This starts to give you a bit of flexibility in the fields and amount of data that are returned, and limits the amount of parsing that you have to do. All the hard work is being done by the servers at Yahoo.

But the really fun stuff starts when you try to create a little mashup by combining data from different services. Let’s use YQL to find the current number 1 artist at Yahoo’s music service:

SELECT name FROM music.artist.popular LIMIT 1;

We can now easily combine this search with a search for the top 5 items from or about that artist in the National Library’s catalogue:

USE "" AS nla;
SELECT * FROM nla WHERE lookfor IN (SELECT name FROM music.artist.popular LIMIT 1) LIMIT 5;

Once we have constructed this query, we can access it via a JSONP call and use a little bit of JavaScript to display the results within a web page (see example 1).

<div id="nla"></div>
<script type="text/javascript">
function nlabooks(o){
  var f = document.getElementById('nla');
  var out = '<ul>';
  var books = o.query.results.item;
  for(var i=0,j=books.length;i<j;i++){
    var cur = books[i];
    out += '<li><a href="' + cur.link + '">' + cur.title + '</a></li>';
  }
  out += '</ul>';
  f.innerHTML = out;
}
</script>
<script type="text/javascript" src="*%20FROM%20nla%20WHERE%20lookfor%20IN%20(SELECT%20name%20FROM%20music.artist.popular%20LIMIT%201)%20limit%205%3B&format=json&diagnostics=false&callback=nlabooks"></script>

We've now got a little widget that we can use inside any page to dynamically mash up two separate data sources.

If we were to do that in a traditional manner we would have to write two separate calls to the web services and possibly parse the results in different ways. By using YQL, all that hard work can be carried out in a minimal amount of code.

Building these tables was as much about learning a bit more about YQL and the possibilities it can offer. What I've shown here is a simple demonstration of the ease with which you can use services like YQL to expose your data to a wider audience.

Note: Please don't build any mission critical applications using these data tables – they are only there for demonstration purposes. I hope to make them more permanent and host them on the National Library's servers.

Building location aware websites

Friday, July 24th, 2009

On the 24th of July I gave a presentation to the Canberra Web Standards Group on Building location aware websites. Here are the slides and notes from my presentation.

Slides 1-2
Welcome. I'm Paul Hagon, a web developer at the National Library of Australia. This is my Twitter handle if you are twittering about my presentation while I talk.

Slide 3

Slides 4-5
Traditionally websites have required the user to make a choice about their location. This is stored in a cookie or within the user login.

Slides 6-9
There are applications where I don’t want to make the choice. I am travelling and in a different location and I want the information that is relevant to my current environment. A perfect example of this is the weather. I’m primarily interested in the weather and forecast for where I am.

Slide 10
The W3C geolocation group released their first working draft in late 2008. Their final recommendation is due to be released at the end of 2009. Their goal is to:

define a secure and privacy-sensitive interface for using client-side location information in location-aware Web applications

Slide 11
Location detection takes a variety of forms. The first is an IP address lookup. If you are lucky this might give you the user's location to the nearest town or state; it is generally fairly inaccurate. The next option is to determine the location of your wi-fi network router. If the user is on a cellular network, their location can be triangulated using the tower IDs. These methods can be very accurate (to within a couple of hundred metres). The final method is to use a dedicated GPS chip and obtain a satellite fix. This is accurate to within a few metres.

Slide 12-13
Mobile phones started to have built-in GPS chips, but it was really the iPhone that opened up the possibilities in this area. The problem was, the location sensors could only be accessed through dedicated iPhone applications written in Objective-C. We are web developers and like angle brackets rather than square brackets; it's a bit of a leap to go to a 'proper' programming language.

Slide 14-15
Recently two developments took place. Firstly, Firefox 3.5 was released. Amongst the newer JavaScript engine and native HTML5 audio and video support, it also featured native geolocation functions. The iPhone operating system was also upgraded to OS 3.0 and, with it, access to the iPhone location sensors was made available to mobile Safari. Both of these implementations followed the draft W3C guidelines. Native geolocation is also available within development builds of Opera and Fennec (mobile Firefox).

Slide 16
So where does this leave Internet Explorer (and the desktop version of Safari)? Users of these browsers can download Google Gears. This is typically used to provide offline access to things like Gmail, Google Docs and so on. It also makes some geolocation functions available, although they are slightly different to the W3C recommendations.

Slide 17
A user can also use a service such as Fire Eagle to update their location, and this web service has an API that allows the data to be shared between sites (for example automatically updating your twitter location).

Slide 18-20
Privacy is a major concern. A user has to opt in to sharing their location with a website. These services store IP addresses, access point information and a unique identifier (for a period of two weeks). No identifiable information is passed to or stored by these services. You probably already have something in your privacy policy to cover storing log files, and we tend to know a fair bit of general location information about our users anyway from things like Google Analytics reports.

Slide 21
Users are starting to broadcast their location through services like Google Latitude or Brightkite. This raises many more privacy issues, and these services have options to allow a user to decide just how much information they wish to share.

Slide 22-30
The code to make it happen: create a function that we can call from an event like a page load or a click. Make a location call. If the call is successful, extract the latitude and longitude. If it is unsuccessful (the device may not be able to get a signal, or the service may not be able to resolve your location), do something else.
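The steps above can be sketched like this, following the W3C draft API; the element id and messages are illustrative, not from the slides:

```javascript
// Small helper kept separate so the formatting logic is testable
function formatPosition(lat, lon) {
  return 'Latitude: ' + lat + ', Longitude: ' + lon;
}

// Call this from a page load or click event
function findLocation() {
  var status = document.getElementById('status');
  if (!navigator.geolocation) {
    status.innerHTML = 'Geolocation is not supported by this browser';
    return;
  }
  navigator.geolocation.getCurrentPosition(
    function (position) {
      // Success: extract the latitude and longitude
      status.innerHTML = formatPosition(position.coords.latitude,
                                        position.coords.longitude);
    },
    function (error) {
      // Failure: no signal, permission denied, or the lookup timed out
      status.innerHTML = 'Unable to determine your location';
    }
  );
}
```

The browser prompts the user for permission before the success callback ever fires, which is the opt-in behaviour discussed under privacy below.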

Slide 31
Reverse geocoding is the process of turning a latitude and longitude into a human-readable form.

Slide 32-37
An example of a location aware application. It's a mashup searching for photos in a particular location. Firstly, in Safari (a browser without native geolocation) the user has to pick the location. In Firefox 3.5 (a browser with native geolocation) the user can ask to be taken directly to their location; the browser asks for their permission before making the call. The location is accurate to a few hundred metres. Some of the results aren't totally accurate: the search is by name, as there is very little location data in the records.

Slide 38
There are three instances of Parkes on the page – Parkes ACT, Parkes NSW and a name, Henry Parkes. It can't differentiate between them.

Slide 39
There is a service called Yahoo Placemaker where you can pass in data and it will return the geographic information for that data.

Slide 40
Passing in "Parkes Australia", we get the relevant geographic information for both locations of Parkes.

Slide 41-43
Placemaker also accepts a URL as input. Let's pass some information from OpenAustralia into it. OpenAustralia is an application that allows users to see what their members of parliament have been doing. We could add location aware services to this to instantly select the senator for the area we are currently in, or to find all the references to the area we are in, to see what decisions have been made that may affect us.

Slide 44
Placemaker extracts the location names from the text of the page and returns any associated location data.

Slides 45-48
Is this usable or is it still too cutting edge? iPhone usage is small (in overall website usage), but those users update quickly and have the capability to use location aware services. Firefox usage is also small, but as 3.5 has only just been released it will take a little time to build up a user base, and Firefox users tend to update rapidly. More than 95% of our visitors use browsers that would have the capability to use location based services if they installed Google Gears.

Slides 49-50
I expect to see many more location based websites in the future. This presentation is available on SlideShare, and the references I've used are up on Delicious. Thank you.

DigitalNZ location search

Thursday, June 18th, 2009

Over the past couple of months I've been building a little application using the APIs from the DigitalNZ project. DigitalNZ is a collaboration between government departments, publicly funded organisations, the private sector, and community groups to expose and share their combined digital content. Part of their plan to expose their data is to provide a publicly available API for developers to present their content in ways they may not have thought about.

Typically, a large dataset has a search box as its main interface. I wanted to get right away from that approach and create an engaging interface. This one uses a map to allow the user to freely explore the content.

It currently uses a combination of APIs from Google and Flickr to convert a latitude and longitude from the map into a place name. It then displays a shapefile from Flickr to approximate the area being searched, and returns a list of relevant results from DigitalNZ. Since I started work on this, the data returned from both of these APIs has been released under a Creative Commons license (Yahoo have released their GeoPlanet data and Flickr have released their shapefile data). I'll end up incorporating these releases into the application rather than relying on the APIs for the functionality.
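The flow described above can be sketched as a chain of callbacks. All the function names here are hypothetical stand-ins for the real Google, Flickr and DigitalNZ calls, passed in as parameters rather than hard-coded:

```javascript
// Hypothetical sketch of the map-driven search pipeline. The
// reverseGeocode, searchDigitalNZ and render arguments stand in for
// the real API calls and are injected so the flow is easy to test.
function searchByMapPoint(lat, lon, reverseGeocode, searchDigitalNZ, render) {
  // 1. Turn the map point into a place name (Google/Flickr in the real app)
  reverseGeocode(lat, lon, function (placeName) {
    // 2. The real app also fetches a Flickr shapefile here to shade
    //    the approximate area being searched (omitted in this sketch)
    // 3. Query DigitalNZ for records matching the place name
    searchDigitalNZ(placeName, function (results) {
      render(placeName, results);
    });
  });
}
```

Structuring it this way means the geocoding and search providers can be swapped out, which matters given the plan to replace the live API calls with the released GeoPlanet and shapefile data.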

Explore the contents of DigitalNZ.