Thursday, November 20, 2008

OpenRef and bioGUID

One of the judges for the Elsevier Article 2.0 Contest is Andrew Perry, whose blog has some posts on Noel O'Boyle's OpenRef idea (see DOI or DOH? Proposal for a RESTful unique identifier for papers). Andrew discusses some implementations he has come up with, and compares OpenRef with OpenURL. This prompted me to add OpenRef-style identifiers to bioGUID's OpenURL resolver.

Basically, OpenRef is a human-readable identifier for an article, based on concatenating the journal name, year of publication, volume number, and starting page, for example:

openref://BMC Bioinformatics/2007/8/487

The equivalent OpenURL link would be

http://bioguid.info/openurl/?genre=article&title=BMC%20Bioinformatics&date=2007&volume=8&spage=487
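The mapping between the two forms is mechanical, so here is a minimal Python sketch of it. The function name `openref_to_openurl` and the `RESOLVER` constant are my own for illustration (this is not the resolver's actual code), and it assumes the OpenRef always ends in year/volume/starting page.

```python
from urllib.parse import quote, urlencode

# Resolver base URL, taken from the example link above.
RESOLVER = "http://bioguid.info/openurl/"


def openref_to_openurl(openref: str) -> str:
    """Turn openref://journal/year/volume/spage into an OpenURL query string."""
    prefix = "openref://"
    if not openref.startswith(prefix):
        raise ValueError("not an OpenRef: " + openref)
    # Split from the right so a journal name containing "/" stays intact.
    parts = openref[len(prefix):].rsplit("/", 3)
    if len(parts) != 4:
        raise ValueError("expected journal/year/volume/page: " + openref)
    journal, year, volume, spage = parts
    params = [
        ("genre", "article"),
        ("title", journal),
        ("date", year),
        ("volume", volume),
        ("spage", spage),
    ]
    # quote_via=quote encodes spaces as %20 rather than "+".
    return RESOLVER + "?" + urlencode(params, quote_via=quote)


print(openref_to_openurl("openref://BMC Bioinformatics/2007/8/487"))
```

Splitting from the right is a deliberate choice: the last three fields are well defined, whereas the journal name is free text.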

Andrew notes:
A key cosmetic (and philosophical) difference between OpenURL and OpenRef/ResolveRef URLs is that OpenURL uses HTTP GET fields, eg ?title=bla&issn=12345, while OpenRef/ResolveRef uses the URL path itself eg, somejournalname/2008/4/1996. It’s a bit like one scheme was designed in the age of CGI scripts, while the other was designed for web applications capable of more RESTful behaviour. In my mind OpenURL is more versatile but much uglier, while OpenRef is cleaner and simpler but can only reference journal articles.
Of course, it is straightforward to add openref-style URLs to an OpenURL resolver by using URL rewriting, for example:


RewriteRule ^openref/(.*)/([0-9]{4})/(.*)/(.*) openurl.php?title=$1&date=$2&volume=$3&spage=$4&genre=article [NC,L]

I've done this for my resolver. One limitation of OpenRef is that there are many different ways to write a journal's name, so you can't determine whether two OpenRefs refer to the same article by simple string matching (as you can with a DOI, for example: if the DOIs are different, the articles are different). For example, I might write BMC Bioinformatics and you might write BMC Bioinf.. One way around this is to have unique identifiers for journals, which of course is the approach Robert Cameron advocated with Universal Serial Item Names and JACCs. The obvious candidate for a journal identifier is the ISSN. I guess the problem here is that it's easier to use the journal name than to require the user to know the ISSN. OpenRefs are certainly easier to write. Hence, I think they are great as a simple way for people to construct a resolvable URL for an article, but not so great as an identifier.
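To make the string-matching problem concrete, here is a small Python sketch. The `JOURNAL_ISSN` table and the `openref_key` helper are hypothetical: a real resolver would need a much larger source of journal-name variants, and this is not how bioGUID does it.

```python
# Hypothetical, hand-maintained table of journal-name variants and their ISSN.
JOURNAL_ISSN = {
    "bmc bioinformatics": "1471-2105",
    "bmc bioinf.": "1471-2105",
}


def openref_key(openref):
    """Return (issn, year, volume, spage), or None if the journal name is unknown."""
    journal, year, volume, spage = openref[len("openref://"):].rsplit("/", 3)
    issn = JOURNAL_ISSN.get(journal.strip().lower())
    if issn is None:
        return None
    return (issn, year, volume, spage)


a = "openref://BMC Bioinformatics/2007/8/487"
b = "openref://BMC Bioinf./2007/8/487"

print(a == b)                            # the raw strings differ
print(openref_key(a) == openref_key(b))  # the ISSN-based keys agree
```

The comparison only works for names that are already in the table, which is really the point: the burden moves from the person writing the OpenRef to whoever maintains the list of name variants.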
