<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Circulatable: a Librarian's Group &#187; Technology</title>
	<atom:link href="http://circulatable.org/category/technology/feed/" rel="self" type="application/rss+xml" />
	<link>http://circulatable.org</link>
	<description>Because sometimes you need to trammel the editor and exorcise the rules of grammar...</description>
	<lastBuildDate>Wed, 12 May 2010 02:36:27 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Woe is MARC (&amp; URLs)</title>
		<link>http://circulatable.org/2010/04/20/woe-is-marc-urls/</link>
		<comments>http://circulatable.org/2010/04/20/woe-is-marc-urls/#comments</comments>
		<pubDate>Wed, 21 Apr 2010 03:38:06 +0000</pubDate>
		<dc:creator>Steve</dc:creator>
				<category><![CDATA[Cataloging/Classification]]></category>
		<category><![CDATA[Code]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[bibliographic despondency]]></category>
		<category><![CDATA[cataloging]]></category>
		<category><![CDATA[marc]]></category>
		<category><![CDATA[urls]]></category>

		<guid isPermaLink="false">http://circulatable.org/?p=180</guid>
		<description><![CDATA[Background
At present I am working on a project that we have given the code name UW · Forward. Forward is a union index over the library collections within the UW System. Currently the UW System Libraries has 14 different ILS Catalogs, one for each of the 13 four year campuses and another one for all [...]]]></description>
			<content:encoded><![CDATA[<h3>Background</h3>
<p>At present I am working on a project that we have given the code name <a href="http://forward.library.wisconsin.edu/">UW · <em>Forward</em></a>. Forward is a union index over the library collections within the UW System. Currently the UW System Libraries has 14 different ILS Catalogs, one for each of the 13 four year campuses and another one for all of the two year campuses. Forward searches the data across all libraries.</p>
<p>We deduplicate all of our MARC records so that we can have a <a href="http://forward.library.wisconsin.edu/catalog/ocm30780581">single record</a> for items held at multiple locations. As part of the process of deduplication, for certain MARC fields, we add all instances of a given field into the combined record. We do this for holdings represented in 852 fields and for URLs in 856 fields.</p>
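<p>A minimal sketch of the merge step described above, assuming each record has already been reduced to a plain dictionary. The record shape, the <code>key</code> used for matching, and the function name <code>combine_records</code> are all hypothetical; the actual Forward deduplication code is not shown in this post.</p>

```python
from collections import defaultdict

# Fields whose every instance is carried into the combined record
# (holdings in 852, URLs in 856), as described above.
MERGED_TAGS = ("852", "856")


def combine_records(records):
    """Merge records that share a match key into one combined record.

    Each record is a hypothetical dict:
    {"key": str, "campus": str, "fields": [(tag, value), ...]}.
    Real MARC parsing and match-key generation are out of scope here.
    """
    combined = {}
    for record in records:
        key = record["key"]
        if key not in combined:
            # The first record seen supplies the descriptive fields.
            base = [f for f in record["fields"] if f[0] not in MERGED_TAGS]
            combined[key] = {"fields": base, "merged": defaultdict(list)}
        for tag, value in record["fields"]:
            if tag in MERGED_TAGS:
                # Keep the contributing campus alongside each instance.
                combined[key]["merged"][tag].append((record["campus"], value))
    return combined


# Placeholder data, not real holdings.
records = [
    {"key": "example-1", "campus": "Campus A",
     "fields": [("245", "Example title"), ("852", "Main stacks")]},
    {"key": "example-1", "campus": "Campus B",
     "fields": [("245", "Example title"), ("852", "Main stacks")]},
]
result = combine_records(records)
```

<p>Keeping the campus next to each 852 and 856 value is a design choice in this sketch: it preserves the information needed later to say for whom a holding or URL is valid.</p>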
<h3>The Problem</h3>
<p>MARC capture of URL description has proven to be inadequate in our context. We simply do not have the kinds of information coded in <a href="http://www.loc.gov/marc/bibliographic/bd856.html">856 fields</a> to clearly present access to online representations of library items.</p>
<h4>Licensed Content</h4>
<p>When a particular campus has access to a licensed online resource, the local version of the URL used is incredibly important. The library in question usually needs to proxy the URL through a campus authentication &amp; authorization mechanism to provide off campus access to databases and e-journals. These are arguably the most important online resources in our catalogs &#8211; they are not only carefully selected, like all online resources, they are also some of our patrons&#8217; preferred library resources and the ones taking up greater and greater percentages of the library budget.</p>
<p>This is where the first problem lies: the MARC 856 field does not provide any indication that a given URL is for a licensed resource that has restricted access. This problem is complicated by the fact that sometimes a UW System campus catalogs a URL in its proxied form and other campuses insert a proxy string at runtime. In this case the cataloged form of the URL is not proxied but the OPAC software inserts the proxy prefix when it outputs a web page. A &#8220;restricted access&#8221; or &#8220;deep web&#8221; indicator would be great here. Alas, the indicators are already reserved.</p>
<p>What makes this more complex here at the UW System is the fact that the Forward project is a union index and provides a single view for all campuses. So we have the added problem of not just telling the end user what the correct URL is, but for whom that URL is valid. Traditionally this kind of information is coded into the free text public note subfields (usually <code>$z</code>). As a free text field, the way any given library indicates that a URL is available only to certain users is all over the place. There is nothing consistent enough in our MARC records that come from multiple sources that I can address in code.</p>
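<p>The runtime proxying described above can be sketched roughly as follows. The proxy prefixes, campus names, and the function <code>proxied_url</code> are assumptions made for illustration; each campus has its own configuration, and nothing here is taken from an actual OPAC.</p>

```python
# Hypothetical per-campus proxy prefixes; real values would come from
# each library's own configuration.
PROXY_PREFIXES = {
    "campus-a": "https://proxy.campus-a.example.edu/login?url=",
    "campus-b": "https://proxy.campus-b.example.edu/login?url=",
}


def proxied_url(url, campus):
    """Return the URL a patron of `campus` should follow for a licensed resource."""
    prefix = PROXY_PREFIXES.get(campus)
    if prefix is None:
        # No proxy known for this campus: leave the cataloged URL alone.
        return url
    if url.startswith(prefix):
        # Some campuses catalog the URL already in its proxied form.
        return url
    return prefix + url


plain = "http://journals.example.com/issue/1"
at_runtime = proxied_url(plain, "campus-a")
already_done = proxied_url(at_runtime, "campus-a")
```

<p>Note what the sketch cannot do: it has to be told that the URL is licensed, and it does not recognize a URL that was cataloged with a different campus&#8217;s proxy prefix. Those are exactly the gaps the 856 field leaves open.</p>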
<h4>Free Online Resources</h4>
<p>The free online resources that have been cataloged in our OPACs are also extremely problematic when introduced into a consortial/union context. The following scenario represents a huge missed opportunity. There are certain resources that have been cataloged by one or a few of the campuses within the UW System. A great example would be our own UW Digital Collection Center&#8217;s site for <a title="Aldo Leopold Archives" href="http://digital.library.wisc.edu/1711.dl/AldoLeopold">The Aldo Leopold Archives</a>. Forward currently has a <a title="Forward record for the Aldo Leopold Archives" href="http://forward.library.wisconsin.edu/catalog/ocn229109923">record for this archival collection</a>, for which the data comes from a few schools within the UW System. A close inspection of the MARC data in question in the staff view raises a few problems:</p>
<p>There are two URLs, each cataloged 6 times by different schools. However the digital collection that has been cataloged should clearly indicate that it is available to the entire patron base of faculty, staff &amp; students from the entire UW System, not just the six schools who contributed records. We are missing an opportunity to let the other schools who did not catalog the resource themselves benefit from the cataloging provided by the 6 schools that did catalog this digital collection. Again, it would be nice to have an indicator which states this is an &#8220;unrestricted free online resource&#8221;. This would be the same indicator as the one needed for licensed online resources, set to the opposite value.</p>
<p>Furthermore, there is no agreement among the schools who did catalog the resource as to whether the URLs in question are for the thing itself or a related resource. Notice the way that the second MARC indicator is applied inconsistently in the <a title="MARC 856 field specification for Electronic Location and Access" href="http://www.loc.gov/marc/bibliographic/bd856.html">856 fields</a>. Some campuses consider both URLs to be related to this resource, while one school thinks one of the URLs is for the resource itself and another thinks that both URLs are the resource itself. The inconsistencies here make it impossible to write code that efficiently displays to an end user how to use the resource in question, which is to say that the descriptive function of the cataloging in question is failing in our unified context.</p>
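<p>To make the inconsistency concrete, here is a small sketch that flags URLs on which contributing campuses disagree. The second-indicator values are those listed in the 856 specification linked above; the input shape, the placeholder URLs, and the function name <code>relationship_conflicts</code> are hypothetical.</p>

```python
from collections import defaultdict

# Second-indicator values of the 856 field (relationship of the URL
# to the item described by the record).
RELATIONSHIP = {
    " ": "No information provided",
    "0": "Resource",
    "1": "Version of resource",
    "2": "Related resource",
    "8": "No display constant generated",
}


def relationship_conflicts(fields):
    """Map each URL to the set of relationships claimed for it, keeping
    only URLs on which the contributing campuses disagree.

    `fields` is a hypothetical list of (campus, second_indicator, url).
    """
    claims = defaultdict(set)
    for _campus, indicator, url in fields:
        claims[url].add(RELATIONSHIP.get(indicator, "Unknown"))
    return {url: labels for url, labels in claims.items() if len(labels) > 1}


# Placeholder data, not the actual Aldo Leopold records.
fields = [
    ("Campus A", "2", "http://example.org/collection"),
    ("Campus B", "0", "http://example.org/collection"),
    ("Campus A", "2", "http://example.org/finding-aid"),
    ("Campus B", "2", "http://example.org/finding-aid"),
]
conflicts = relationship_conflicts(fields)
```

<p>Detecting the disagreement is the easy part. Deciding which campus is right still takes a cataloger, which is why display code cannot resolve it.</p>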
<h3>Steel cage grudge match!</h3>
<p>In this corner we have Licensed Resources, weighing in at a hulking 60% of your library budget. In the opposite corner are Free Resources, nimble, agile and known to fell a giant or two. Get your tickets. The fight will be phenomenal and sensational.</p>
<p>The solutions that make sense for free online resources are proving to be the opposite of the solutions that work for licensed content. For example, for free content, the best thing to do is drop any affiliation with the school that cataloged the resource in question, deduplicate the set of URLs and display a small set of links to everyone. However, for licensed content the affiliation between school and URL is important so that the end user can authenticate and prove she has the credentials to use the resource in question. And all the while there is nothing in the MARC data that tells me a given URL is restricted to a particular subset of the population we intend to serve with our union catalog.</p>
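<p>The two opposing strategies can be sketched side by side. The <code>restricted</code> flag below is precisely the piece of information the MARC data does not supply, so in this illustration it is passed in by hand; the function name and data shapes are likewise assumptions.</p>

```python
def links_to_display(urls, restricted):
    """Choose what to show for one combined record.

    `urls` is a list of (campus, url) pairs. `restricted` stands in for
    the indicator the 856 field lacks; it is supplied by the caller.
    """
    if restricted:
        # Licensed: the campus matters, so keep each (campus, url) pair.
        return sorted(set(urls))
    # Free: drop the campus, deduplicate, and show one short list to everyone.
    seen = []
    for _campus, url in urls:
        if url not in seen:
            seen.append(url)
    return [("everyone", url) for url in seen]


urls = [
    ("Campus A", "http://example.org/a"),
    ("Campus B", "http://example.org/a"),
    ("Campus B", "http://example.org/b"),
]
free_links = links_to_display(urls, restricted=False)
licensed_links = links_to_display(urls, restricted=True)
```

<p>With the same input, the free branch yields two links for everyone while the licensed branch keeps all three campus-specific entries.</p>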
]]></content:encoded>
			<wfw:commentRss>http://circulatable.org/2010/04/20/woe-is-marc-urls/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Library Vendors &#8211; API me!</title>
		<link>http://circulatable.org/2010/03/27/library-vendors-api-me/</link>
		<comments>http://circulatable.org/2010/03/27/library-vendors-api-me/#comments</comments>
		<pubDate>Sat, 27 Mar 2010 15:22:24 +0000</pubDate>
		<dc:creator>Steve</dc:creator>
				<category><![CDATA[Code]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[ils]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[voyager]]></category>
		<category><![CDATA[web services]]></category>

		<guid isPermaLink="false">http://circulatable.org/?p=165</guid>
		<description><![CDATA[I am spending most of my time at work these days on the UW · Forward project. One of the big features we are trying to bake into the application is the ability to place requests from one library to another or from campus to campus in the UW System. To accomplish this we are [...]]]></description>
			<content:encoded><![CDATA[<p>I am spending most of my time at work these days on the <a href="http://forward.library.wisconsin.edu/">UW · Forward</a> project. One of the big features we are trying to bake into the application is the ability to place requests from one library to another or from campus to campus in the UW System. To accomplish this we are using the <a href="http://www.exlibrisgroup.com/?catid={8E2F087A-6266-4E8D-B784-FF321DFADE27}">Voyager XML Over HTTP Web Services API</a>. Using these web services feels right, like a nice sweet spot in the relationship between libraries and their ILS vendors.</p>
<p>The API works well, though my institution is only using Voyager 7.1 and so we are using one of the early versions of the APIs and it has parts that are not documented well or that are a bit sloppy. There are seeming mismatches between some of the XML values returned and the data the element names represent. For example, when placing a request there is an expiration date that is returned. However, the timestamp that is returned corresponds to the time the request was placed.</p>
<p>Additionally, it is difficult to find complete documentation from Ex Libris on all the different cases for which data is returned. When staff process a request made by a patron, there are many stages that the request steps through. The documentation is a little light on what numeric codes correspond to meaningful stage names. Or certain elements appear and disappear as a request is put through different stages, but it is difficult to know this without nearly reverse engineering the process.</p>
<p>These are places that the API could be improved (and yes, documentation is just as important a piece of an API as the API code and requests/responses themselves). But overall, I am quite glad that our ILS vendor is putting these services in place. It enables us to embed ILS functionality in places for which it would be unreasonable to expect the vendors themselves to put them. The work we are doing on our Forward project can also be used in places like the campus portal, where people manage their other accounts associated with university life.</p>
<p>We are writing a Ruby plugin for the Voyager API. We are starting with the basic functions we need:</p>
<ul>
<li>authenticate a patron</li>
<li> return his/her ILS account</li>
<li> renew items</li>
<li> place a request for an item</li>
<li> cancel requests for an item</li>
</ul>
<p>Our code is not fully baked itself and not documented yet, so there is nothing to share at this time. We do intend to share it more broadly in the future.</p>
]]></content:encoded>
			<wfw:commentRss>http://circulatable.org/2010/03/27/library-vendors-api-me/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Lifelike, messy</title>
		<link>http://circulatable.org/2009/07/06/lifelike-messy/</link>
		<comments>http://circulatable.org/2009/07/06/lifelike-messy/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 03:19:37 +0000</pubDate>
		<dc:creator>Steve</dc:creator>
				<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://circulatable.org/?p=117</guid>
		<description><![CDATA[I wrote an article for the journal code4lib &#8220;Using a Web Services Architecture with Me, Myself and I&#8221; and I keep realizing all of the things it is missing. But that is what a blog is good for, right?
There is something that just feels right about creating three applications all working in concert to do [...]]]></description>
			<content:encoded><![CDATA[<p>I wrote an article for the journal code4lib &#8220;<a href="http://journal.code4lib.org/articles/1771">Using a Web Services Architecture with Me, Myself and I</a>&#8221; and I keep realizing all of the things it is missing. But that is what a blog is good for, right?</p>
<p>There is something that just feels right about creating three applications all working in concert to do the job of a single application: it feels a little bit messy, but good messy. It is not that the code is sloppy or carelessly composed. And while I wouldn&#8217;t necessarily go so far as to use cliches about the whole being greater than the sum of the parts, the messiness is what makes the application lifelike. In other words, it is like a library. It is as if each individual application comprises a different department making a contribution to the entire teaching and research mission of the library.</p>
<p>Part of this line of thinking is influenced by an excellent article my colleague Allan forwarded, &#8220;<a href="http://www.use8.net/magazine.php?ArticleId=101">Design in the Age of Biology</a>.&#8221; In it, the author discusses what he calls the rise of service design. He characterizes service in the following way:</p>
<blockquote><p>Robert Lusch [14] wrote about changes in marketing, describing a service-dominant logic in which &#8220;value is defined by and co-created with the consumer rather than embedded in output.&#8221; The &#8220;make-and-sell&#8221; strategy of linear value chains gives way to the &#8220;sense-and-respond&#8221; strategy of self-reinforcing &#8220;value cycles.&#8221; Lusch described traditional goods-centered dominant logic as focused on &#8220;operand resources,&#8221; tangible assets with inherent value. He contrasted that logic with emerging service-centered dominant logic focused on &#8220;operant resources,&#8221; intangible assets, which create value in their use, such as skills, technologies, and knowledge.</p></blockquote>
<p>In our case looking at the way in which our applications operate, the value is derived from continuing to further develop their service orientation. Their value is initially based on the service providing behavior: they expose data that is reused and repurposed by other applications. But now I am finding that there is a self reinforcing cycle that is beginning to emerge as we discover other ways to put that data to work.</p>
<p>Which is to say, these applications are beginning to take on a life of their own.</p>
]]></content:encoded>
			<wfw:commentRss>http://circulatable.org/2009/07/06/lifelike-messy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>No, It&#8217;s the Network Effect, Stupid</title>
		<link>http://circulatable.org/2008/06/03/no-its-the-network-effect-stupid/</link>
		<comments>http://circulatable.org/2008/06/03/no-its-the-network-effect-stupid/#comments</comments>
		<pubDate>Tue, 03 Jun 2008 15:39:44 +0000</pubDate>
		<dc:creator>Steve</dc:creator>
				<category><![CDATA[Culture]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://circulatable.org/?p=116</guid>
		<description><![CDATA[My former boss and colleague Andrew Pace recently commented on the nature of the network and how he was rebuffed by a colleague for overlooking the fact that people make up the network and are its most significant piece. I would like to respectfully disagree with his post. Andrew used to [...]]]></description>
			<content:encoded><![CDATA[<p>My former boss and colleague Andrew Pace <a href="http://community.oclc.org/hecticpace/archive/2008/05/no-its-the-network-stupid.html">recently commented on the nature of the network</a> and how he was rebuffed by a colleague for overlooking the fact that people make up the network and are its most significant piece. I would like to <strong>respectfully</strong> disagree with his post. Andrew used to boast that he is 100% right 50% of the time and in this case I believe he was right during the initial part of his musings on this topic.</p>
<p>What is the significance of the network in the 21st century? What we understand as the network is a contemporary realization, or maybe the automated reality, of the old adage that the total is greater than the sum of its parts. And quite frankly this realization was made possible by the amazing things that computers are doing with data.</p>
<p><a href="http://www.google.com/technology/">Page Rank</a> is arguably the shot heard throughout the Web. With their Page Rank algorithm Google was able to solve a problem that was plaguing relevancy in Internet search results: we&#8217;re all a bunch of dirty rotten liars. Back in the Yahoo/Alta Vista early days of search engines people were figuring out ways to game the system by lying through their metadata. In order to have their crappy cover band&#8217;s web page show up when a user searches for the Rolling Stones the cover band simply needed to put &#8216;rolling stones&#8217; into its metadata.</p>
<p>Page Rank came along and solved the problem by saying, ok, we will let the network sort out the relevancy and if the network can prove that your website is a good one, you will be rewarded in search results rankings. This is the significance of the network. For better or for worse, the network can prove whether or not the data byproduct of the people is in fact worth what those people claim it is worth. </p>
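<p>For readers who want to see the idea in motion, here is a simplified, textbook-style power iteration over a toy link graph. It is a sketch of the general technique, not Google&#8217;s production algorithm; the damping value of 0.85, the iteration count, and the four made-up pages are all assumptions.</p>

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to. Every page
    must appear as a key. A page with no outgoing links spreads its rank
    evenly over all pages.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# A toy web: the cover band links to the Stones, but nobody links to
# the cover band, so the network does not vouch for it.
web = {
    "stones-official": ["fan-site"],
    "fan-site": ["stones-official"],
    "music-blog": ["stones-official", "fan-site"],
    "cover-band": ["stones-official"],
}
ranks = pagerank(web)
best = max(ranks, key=ranks.get)
```

<p>Whatever the cover band writes in its own metadata has no effect here. Only links from other pages move the score, which is the point of letting the network sort out relevancy.</p>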
<p>As Ian Ayres points out in his book <a href="http://www.randomhouse.com/bantamdell/supercrunchers/">Super Crunchers</a>, the world is now using data to make better predictions than traditional experts. What is more, the statistical models being used by doctors, corporations, governments and non-profits are able to leverage the network effects of large data sets to verify how well those predictions are performing and improve those predictions instantly as new data becomes available. </p>
<p>I believe that my issue here is all semantics and I may simply be quibbling over something petty. However, I am splitting hairs over this point because this is a troubling area for libraries in my view. If we get caught up in the mushy people narrative over one of the most significant cultural shifts that is occurring right now, we will miss the point and consequently we will miss the opportunities to maintain the cultural relevancy of libraries in the future. The danger, in my opinion, is similar to the paralogism that because I know the structure of a MARC record I understand how it is stored in a modern <acronym title="Relational Database Management System">RDBMS</acronym>.</p>
<p>It is imperative that we know how <a href="http://lucene.apache.org/solr/">Lucene/Solr</a> works so that we can make better resource discovery systems. It is similarly imperative that we understand how to get in the super crunching game. As Andrew and his colleague Lorcan Dempsey have noted on numerous occasions, we need to do much more with our data, because it&#8217;s the network effect, stupid.</p>
<p>(For the record, I do not intend to call either Andrew or his colleagues stupid, I am just leveraging a theme that he and I have been riffing on for a couple of years.)</p>
]]></content:encoded>
			<wfw:commentRss>http://circulatable.org/2008/06/03/no-its-the-network-effect-stupid/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Is search != search</title>
		<link>http://circulatable.org/2008/05/12/is-search-search/</link>
		<comments>http://circulatable.org/2008/05/12/is-search-search/#comments</comments>
		<pubDate>Mon, 12 May 2008 21:41:25 +0000</pubDate>
		<dc:creator>Steve</dc:creator>
				<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://circulatable.org/?p=115</guid>
		<description><![CDATA[Here is a simple question with profound implications: is library search the same thing as the &#8220;search&#8221; in the way the population at large understands search or Googling? 
The question is very simple and one that I think has been in the back of my mind for quite some time, but I just read an [...]]]></description>
			<content:encoded><![CDATA[<p>Here is a simple question with profound implications: is library search the same thing as the &#8220;search&#8221; in the way the population at large understands search or Googling? </p>
<p>The question is very simple and one that I think has been in the back of my mind for quite some time, but I just read an excerpt on statelessness on the Web from <a href="http://www.oreilly.com/catalog/9780596529260/"><em>RESTful Web Services</em></a> that provided me with a new way to frame the question. Richardson and Ruby write:</p>
<blockquote><p>When you ask for a directory of resources about mice or jellyfish, you don&#8217;t get the whole directory. <strong>You get a single page of the directory: a list of the 10 or so items the search engine considers the best matches for your query.</strong> To get more of the directory you must make more HTTP requests. The second and subsequent pages are distinct states of the application, and they need to have their own URIs: something like http://www.google.com/search?q=jellyfish&#038;start=10. As with any addressable resource, you can transmit that state of the application to someone else, cache it, or bookmark it and come back to it later. (emphasis added)</p></blockquote>
<p>Here the user behavior seems to be: &#8220;Hey, Google, show me whatcha got for jellyfish.&#8221;</p>
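<p>The quotation&#8217;s point that each results page is its own addressable state can be illustrated with a tiny URI builder. The page size of 10 and the <code>start</code> parameter come from the quoted example; the function name <code>page_uri</code> is made up for this sketch.</p>

```python
from urllib.parse import urlencode

BASE = "http://www.google.com/search"
PAGE_SIZE = 10  # "10 or so items" per page, as in the quotation


def page_uri(query, page):
    """Return the URI for the given 1-based page of results for `query`."""
    params = {"q": query}
    offset = (page - 1) * PAGE_SIZE
    if offset:
        # Later pages are distinct states and carry their own offset.
        params["start"] = offset
    return BASE + "?" + urlencode(params)


first_page = page_uri("jellyfish", 1)
second_page = page_uri("jellyfish", 2)
```

<p>Because everything needed to reproduce a page lives in the URI, either value can be bookmarked, cached, or sent to someone else.</p>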
<p>When I go to my library&#8217;s catalog and search for the word jellyfish I think my behavior is different because my expectations are different. I am not expecting the top 10 items on the topic. I am instead doing two different things:</p>
<ol>
<li>First, determining whether anything exists on the topic at my library</li>
<li>Second, retrieving and evaluating a list of these items if they do in fact exist</li>
</ol>
<p>The difference is that of course Google will have information on a topic because Google aggregates everything (or so it goes in the popular consciousness). The library on the other hand should have something on your topic if your topic serves one of the known collection areas of the library. Understanding the stateless nature of the Web seems to bring this out. The following URIs do not reveal the same state:</p>
<ol>
<li><a href="http://www.google.com/search?q=jellyfish">http://www.google.com/search?q=jellyfish</a>: what are the ten best resources about jellyfish according to Google</li>
<li><a href="http://madcat.library.wisc.edu/cgi-bin/Pwebrecon.cgi?DB=local&#038;HIST=1&#038;CNT=50&#038;Search_Arg=jellyfish&#038;Search_Code=GKEY%5E">madcat.library.wisc.edu&#8230;Search_Arg=jellyfish&#8230;</a>:  how many, if any, resources about jellyfish does my library have</li>
</ol>
<p>In designing the interfaces for a library catalog front-end, it would be important to be mindful of this distinction since you are answering two very different questions.</p>
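<p>One way to picture the two questions is as two different functions over the same hypothetical list of scored matches. The names <code>top_matches</code> and <code>holdings_answer</code>, and the sample titles and scores, are invented for illustration.</p>

```python
def top_matches(scored_items, limit=10):
    """Web-search question: what are the best few matches?"""
    ranked = sorted(scored_items, key=lambda item: item[1], reverse=True)
    return [title for title, _score in ranked[:limit]]


def holdings_answer(scored_items):
    """Catalog question: is there anything at all, and if so, what?"""
    titles = sorted(title for title, _score in scored_items)
    return {"count": len(titles), "titles": titles}


# Made-up matches for a search on jellyfish: (title, relevance score).
matches = [
    ("Jellyfish blooms", 0.4),
    ("Biology of jellyfish", 0.9),
    ("Field guide to marine life", 0.2),
]
best_two = top_matches(matches, limit=2)
holdings = holdings_answer(matches)
nothing = holdings_answer([])
```

<p>An empty result is a meaningful answer to the second question (the library holds nothing on the topic), whereas for the first it would simply look like a failed search.</p>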
]]></content:encoded>
			<wfw:commentRss>http://circulatable.org/2008/05/12/is-search-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Know Yourself First</title>
		<link>http://circulatable.org/2008/02/03/know-yourself-first/</link>
		<comments>http://circulatable.org/2008/02/03/know-yourself-first/#comments</comments>
		<pubDate>Sun, 03 Feb 2008 19:04:56 +0000</pubDate>
		<dc:creator>Steve</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://circulatable.org/?p=114</guid>
		<description><![CDATA[It does not matter that Microsoft may buy Yahoo&#8211;the acquisition is based on a flawed premise. Technology companies cannot operate like the GEs and General Motors of the world and serve as the be-all-end-all of technology. The New York Times today put the acquisition in the right context. Describing the business culture of Silicon Valley, [...]]]></description>
			<content:encoded><![CDATA[<p>It does not matter that Microsoft may buy Yahoo&#8211;the acquisition is based on a flawed premise. Technology companies cannot operate like the GEs and General Motors of the world and serve as the be-all-end-all of technology. The New York Times today put the acquisition in the right context. <a href="http://www.nytimes.com/2008/02/03/technology/03valley.html">Describing the business culture of Silicon Valley</a>, they write:</p>
<blockquote><p>
The economist Joseph Alois Schumpeter had a name for this principle of capitalism: creative destruction. Perhaps nowhere does it play out more dramatically &#8211; and more rapidly &#8211; than in Silicon Valley, where innovation unleashes a force that creates and destroys, over and over.
</p></blockquote>
<p>Technology companies are susceptible to creatively destructive forces when they try to expand too far beyond their original mission. Technologies like computer programming can only be successful if they break problems into smaller pieces that individually solve only a single component of the larger goal. At the time of writing, a <a href="http://en.wikipedia.org/wiki/Subroutine">computer programming function</a> is defined by the masses (Wikipedia) as &#8220;a portion of code within a larger program, which performs a <em>specific task and can be relatively independent of the remaining code</em>&#8221; (my emphasis). This principle of modularization at the most basic level of contemporary information technology is important to a technology organization&#8217;s business model.</p>
<p>Microsoft and Yahoo both fail so horribly at the world of search and Internet advertising because those problem domains lie at the heart of neither company&#8217;s core service: the operating system/desktop platform and the Internet portal. The reason Google so thoroughly dominates the world of search and Internet advertising is that this is its only core. Everything it does revolves around this core service and all of its activities support this model. The moral of the story is that you must choose your core, your identity and your raison d&#8217;&#234;tre and you must choose it wisely because trying to be all things to all people is a futile exercise.</p>
<p>What does this mean for libraries? In the techie realm of libraries, an institution needs to determine what its core mission is and decide how it will define itself in a world of creative destruction. It will need to be able to clearly and succinctly articulate what those goals are to its affiliate institutions: universities or local governments. The library must not try to do everything; as the current computing paradigm of APIs and web services demonstrates, technology works when it is implemented singularly and exceptionally, but in a manner that is open and unafraid of sharing its data and services.</p>
<p>And finally, the modern library must not be afraid to get in the game and take a turn at trying to creatively destroy the old guard, lest it fall prey to the fate of the Yahoos of the world.</p>
]]></content:encoded>
			<wfw:commentRss>http://circulatable.org/2008/02/03/know-yourself-first/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Google influenced by librarians?</title>
		<link>http://circulatable.org/2007/12/17/google-influenced-by-librarians/</link>
		<comments>http://circulatable.org/2007/12/17/google-influenced-by-librarians/#comments</comments>
		<pubDate>Mon, 17 Dec 2007 04:35:35 +0000</pubDate>
		<dc:creator>Steve</dc:creator>
				<category><![CDATA[Culture]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://circulatable.org/?p=113</guid>
		<description><![CDATA[The New York Times has a short piece on a new Google service called Knol that sounds like it could have been conceived by librarians:
&#8220;We believe that knowing who wrote what will significantly help users make better use of web content,&#8221; wrote Udi Manber, vice president of engineering, on the official Google blog.
The service appears [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.nytimes.com/idg/IDG_002570DE00740E18C12573B10046E582.html">New York Times has a short piece on a new Google service called Knol</a> that sounds like it could have been conceived by librarians:</p>
<blockquote><p>&#8220;We believe that knowing who wrote what will significantly help users make better use of web content,&#8221; wrote Udi Manber, vice president of engineering, on the official Google blog.</p></blockquote>
<p>The service appears to be a wiki-style hosting service that puts a premium on identifying authorship.</p>
]]></content:encoded>
			<wfw:commentRss>http://circulatable.org/2007/12/17/google-influenced-by-librarians/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Modeling Things or Revealing Things</title>
		<link>http://circulatable.org/2007/11/23/modeling-things-or-revealing-things/</link>
		<comments>http://circulatable.org/2007/11/23/modeling-things-or-revealing-things/#comments</comments>
		<pubDate>Fri, 23 Nov 2007 19:52:18 +0000</pubDate>
		<dc:creator>Steve</dc:creator>
				<category><![CDATA[Cataloging/Classification]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://circulatable.org/?p=112</guid>
		<description><![CDATA[Karen Coyle has a great piece on Hierarchies vs. Relationships in bibliographic modeling. She points out that the point of the FRBR model is not so much the hierarchy that you get to model, but the relationships that you can reveal among things.
This is a keen insight in my view since it really begins to [...]]]></description>
			<content:encoded><![CDATA[<p>Karen Coyle has a great piece on <a href="http://kcoyle.blogspot.com/2007/11/use-of-hierarchy-as-organizing.html">Hierarchies vs. Relationships</a> in bibliographic modeling. She points out that the point of the FRBR model is not so much the hierarchy that you get to model, but the relationships that you can reveal among things.</p>
<p>This is a keen insight in my view since it really begins to get at the fun stuff that the Googles, Amazons, etc are doing with data that libraries long to do with bibliographic data. Coyle starts to articulate something here that I have not been able to put my finger on: the way that FRBR is a huge step forward but still only has an eye toward an implementation rooted in the way libraries have traditionally done things.</p>
<p>My library right now has been in discussions about subject guides and how to best build and provide access to them. I have felt for some time now that it would be great to get out of a next-generation catalog a system that imparts the kind of knowledge our librarians and subject liaisons put into these projects. Coyle&#8217;s post renewed this thought by framing the new catalog model in terms of a &#8220;Knowledge Management system,&#8221; which to my mind is the true aim of a discovery system.</p>
<p>In the past when I have tried to express a hybrid of a next-generation catalog and a subject discovery tool, I have always framed it in terms of applying graph theory to bibliographic data. I think Coyle&#8217;s post helps me to understand this. It seems obvious to use subject terms and call number ranges as one type of edge between nodes, where the nodes are bibliographic items. However, her discussion raises the possibility of a new set of edge types: translations, abridgements, extensions, etc.</p>
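<p>To make the idea of typed edges a little more concrete, here is a minimal sketch, not a real catalog model. The item identifiers, the relationship names, and the related() helper are all hypothetical and exist only for illustration.</p>

```php
<?php
// Hypothetical bibliographic items and relationship types, for illustration only.
$edges = array(
    array('from' => 'item-1', 'to' => 'item-2', 'type' => 'translation'),
    array('from' => 'item-1', 'to' => 'item-3', 'type' => 'abridgement'),
    array('from' => 'item-2', 'to' => 'item-4', 'type' => 'shared-subject'),
);

// Return the items reachable from $node over edges of one relationship type.
function related($edges, $node, $type)
{
    $found = array();
    foreach ($edges as $edge) {
        if ($edge['from'] === $node && $edge['type'] === $type) {
            $found[] = $edge['to'];
        }
    }
    return $found;
}
```

<p>Asking for related($edges, 'item-1', 'translation') follows only the translation edges, which is the kind of question a flat hierarchy makes awkward to ask.</p>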
<p>More on this later&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://circulatable.org/2007/11/23/modeling-things-or-revealing-things/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>I&#8217;ve been busted!</title>
		<link>http://circulatable.org/2007/10/17/ive-been-busted/</link>
		<comments>http://circulatable.org/2007/10/17/ive-been-busted/#comments</comments>
		<pubDate>Wed, 17 Oct 2007 22:22:45 +0000</pubDate>
		<dc:creator>nate</dc:creator>
				<category><![CDATA[Culture]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://circulatable.org/?p=111</guid>
		<description><![CDATA[ Unless Karen Coombs is writing about some other reference statistics tracking package that has an (until recently) undocumented dependency on Pear::DB, her blog post calls out one of the (numerous) failings of Libstats: Installation is difficult for a lot of people. I get a lot of questions from people who have trouble with mod_rewrite [...]]]></description>
			<content:encoded><![CDATA[<p> Unless Karen Coombs is writing about some <em>other</em> reference statistics tracking package that has an (until recently) undocumented dependency on Pear::DB, <a href="http://www.librarywebchic.net/wordpress/2007/10/12/open-source-software-pet-peeve/">her blog post</a> calls out one of the (numerous) failings of <a href="http://code.google.com/p/libstats/">Libstats</a>: Installation is difficult for a lot of people. I get a lot of questions from people who have trouble with mod_rewrite or don&#8217;t know DB is required or various other things.</p>
<p>I&#8217;ve had similar negative experiences with open-source software, and actually releasing something gave me a much better understanding of why things wind up like this.</p>
<p>A few years ago, our library decided to write a reference tracking system and pilot it at a few libraries across campus. Since I was, then, the only developer at our library, the task fell to me. Once the system had proven successful at Madison, I thought, &#8220;Hey, maybe other people would like this, too.&#8221; I got the OK from my boss to release the code under an open-source license.</p>
<p>This, it turns out, is trickier than it might seem. All of those steps I&#8217;d fumbled through to make the software run, I had to eliminate, or at least explain, to people installing this software on the servers they have on hand. Databases need to be created and populated with initial data. Web servers need to be configured. Did I want to provide a demo? Screenshots? Big software projects provide installation wizards, but writing those is a bunch of work, and from my boss&#8217;s perspective, the software was written and done, and I had other projects to work on.</p>
<p>Then, there were concerns over the quality of the code. There&#8217;s some ugly shit in there. Did I really want people looking at that, and pointing and laughing? What if there&#8217;s a <a href="http://www.frsirt.com/english/advisories/2007/1880">security bug</a> in the code that could compromise someone&#8217;s server? Even if it relies on server misconfiguration, I&#8217;d feel pretty lousy if my code got someone hacked. How will people find out about, obtain, and install patches? Seriously, I wondered, is it even worth the work it&#8217;s gonna take to release this code?</p>
<p>Finally, I decided that it <em>was</em> worth the work, and that I&#8217;d release it, warts and all, in the hopes that it would be useful to some people. In the time since then, I&#8217;ve realized that the motivations of an open-source developer are different from those of a commercial project manager. I don&#8217;t get any reward from wide adoption, except a warm fuzzy feeling inside and possibly bragging rights if I make something exceptionally neat.</p>
<p>The bottom line: There&#8217;s a large cost and a limited benefit to making an open-source <em>project</em> into an open-source <em>product</em>, and that work will <em>never ever happen</em> as long as the project is only used internally &#8212; it&#8217;s not needed.</p>
<p>Here&#8217;s the question, then: Is it better to release something half-baked, in the hopes that it will be useful, or to keep it purely internal and let someone else solve the problem?</p>
<p>(On the particular topic of not documenting the Pear::DB requirement: when Libstats was released, DB was part of the standard PHP install, so this wasn&#8217;t a common issue. Reworking the code to use Pear::MDB is the right option, but that&#8217;s nontrivial.)</p>
]]></content:encoded>
			<wfw:commentRss>http://circulatable.org/2007/10/17/ive-been-busted/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to implement OpenID with Pubcookie</title>
		<link>http://circulatable.org/2007/08/07/how-to-implement-openid-with-pubcookie/</link>
		<comments>http://circulatable.org/2007/08/07/how-to-implement-openid-with-pubcookie/#comments</comments>
		<pubDate>Tue, 07 Aug 2007 17:50:03 +0000</pubDate>
		<dc:creator>nate</dc:creator>
				<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://circulatable.org/?p=110</guid>
		<description><![CDATA[Pubcookie is pretty neat. It lets you authenticate against a login server without ever personally seeing the user&#8217;s password &#8212; it&#8217;s all handled via clever web server modules, redirects, and the REMOTE_USER variable. But, when you go to build a web app with it, you&#8217;ll likely find yourself pining for session-based logins. Fortunately, it&#8217;s easy [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.pubcookie.org/">Pubcookie</a> is pretty neat. It lets you authenticate against a login server without ever personally seeing the user&#8217;s password &#8212; it&#8217;s all handled via clever web server modules, redirects, and the REMOTE_USER variable. But, when you go to build a web app with it, you&#8217;ll likely find yourself pining for session-based logins. Fortunately, it&#8217;s easy to build an <a href="http://openid.net/">OpenID</a> service that&#8217;s backed by Pubcookie. Here&#8217;s how:</p>
<h3>What you need</h3>
<ol>
<li> A web server with working Pubcookie authentication.</li>
<li>An OpenID server. I had good luck with <a href="http://www.openidenabled.com/openid/libraries/php">PHP-OpenID</a>, and I&#8217;ll be using their example server in this post.</li>
</ol>
<h3>Set up your identity URLs</h3>
<p>OpenID identity URLs are what people enter in OpenID login boxes around the net. The pages they point to aren&#8217;t anything special &#8212; in the simplest case, they just need to have a link to your OpenID server (also called a &#8216;provider&#8217;). It&#8217;ll look like:</p>
<p><code>&lt;link rel="openid.server" href="http://example.edu/op/server.php" /&gt;</code></p>
<p>I used Apache&#8217;s mod_rewrite so that all URLs of the format:</p>
<p>http://example.edu/id/&lt;username&gt;</p>
<p>would be valid identity URLs, each linking to the identity provider service.</p>
<p><strong>Note</strong>:  Your identity URLs don&#8217;t need to be served over HTTPS, and they must not be protected behind Pubcookie.</p>
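<p>As a rough sketch of what an identity page might emit: the identity_page() helper and the <code>user</code> query parameter are hypothetical, and the sketch assumes your rewrite rule hands the username to a script this way. Only the openid.server link itself comes from the setup described above.</p>

```php
<?php
// Hypothetical helper: builds the HTML for one identity URL.
// $provider is the OpenID server URL used in the link tag above.
function identity_page($username, $provider = 'http://example.edu/op/server.php')
{
    // Escape anything user-supplied before placing it in HTML.
    $safe_user = htmlspecialchars($username, ENT_QUOTES, 'UTF-8');
    $safe_provider = htmlspecialchars($provider, ENT_QUOTES, 'UTF-8');

    return "<html><head>\n"
         . "<title>Identity page for $safe_user</title>\n"
         . "<link rel=\"openid.server\" href=\"$safe_provider\" />\n"
         . "</head><body><p>OpenID identity page for $safe_user.</p></body></html>\n";
}

// Assumed wiring: a rewrite rule maps /id/<username> to this script as ?user=<username>.
if (isset($_GET['user'])) {
    echo identity_page($_GET['user']);
}
```

<p>The page body is irrelevant to OpenID; the consumer only looks for the link element in the head.</p>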
<h3>Set up the OpenID provider</h3>
<p>Follow your package&#8217;s installation notes, and get one statically-defined identity URL working. Also test to make sure the other OpenID identity URLs you&#8217;re providing <em>don&#8217;t</em> work.</p>
<p>If you&#8217;re looking for a place to test URLs, try this <a href="http://www.openidenabled.com/resources/openid-test/diagnose-server">OpenID test service</a>. Your provider URL can&#8217;t be behind a firewall or protected by Pubcookie &#8212; other web servers need to talk to it.</p>
<p>Make note of the name of the session key your OpenID library is using. By default, PHP-OpenID uses <code>openid_server</code>. You&#8217;ll need it in the next step.</p>
<h3>Make Pubcookie set a session variable</h3>
<p>Here&#8217;s the magic step. You need a script, protected by Pubcookie, that puts the value of REMOTE_USER into your session (remember, your provider can&#8217;t be behind Pubcookie) and redirects you to your OpenID provider&#8217;s login URL. Since no one can view this script without authenticating via Pubcookie, and this script is the <em>only</em> place this session variable can be set, you need to go through Pubcookie to set this variable.</p>
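<p>I put this script in http://example.edu/op/pubcookie/index.php:<br />
<code><br />
// Use the same session name as the OpenID provider so both share one session.<br />
session_name('openid_server');<br />
session_start();<br />
$_SESSION['pubcookie_user'] = $_SERVER['REMOTE_USER'];<br />
header("Location: http://example.edu/op/server.php/login");<br />
exit; // stop here so nothing else runs after the redirect</code></p>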
<h3>Hack your OpenID provider to respect the session</h3>
<p>Here, you want to find the code in which authentication is checked, and replace it with a check for the session variable you set above. In this example, I replaced action_login() in actions.php with:</p>
<p><code><br />
function action_login() {<br />
if (isset($_SESSION['pubcookie_user'])) {<br />
$info = getRequestInfo();<br />
$openid_url = "http://example.edu/id/".$_SESSION['pubcookie_user'];<br />
setLoggedInUser($openid_url);<br />
return doAuth($info);<br />
}<br />
else {<br />
return login_pubcookie_render();<br />
}<br />
}<br />
</code></p>
<p>I also added login_pubcookie_render() to render/login.php &#8212; it simply uses redirect_render() to send visitors to the pubcookie-protected page.  Anywhere else in the code you&#8217;re showing the login page, use login_pubcookie_render() instead.</p>
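<p>A minimal sketch of what login_pubcookie_render() could look like. The redirect_render() defined here is only a stand-in so the snippet runs on its own; the real one ships with the example server and may return something different. The PUBCOOKIE_LOGIN_URL constant is a hypothetical name for the protected script&#8217;s location.</p>

```php
<?php
// Stand-in for the example server's redirect_render(); the packaged version
// may differ. It is defined here only so this sketch is self-contained.
if (!function_exists('redirect_render')) {
    function redirect_render($url)
    {
        return array(array('HTTP/1.1 302 Found', 'Location: ' . $url), '');
    }
}

// Assumed location of the Pubcookie-protected script from the previous step.
define('PUBCOOKIE_LOGIN_URL', 'http://example.edu/op/pubcookie/index.php');

// Instead of showing a login form, bounce the visitor through Pubcookie.
function login_pubcookie_render()
{
    return redirect_render(PUBCOOKIE_LOGIN_URL);
}
```

<p>Keeping the URL in one constant means every place that used to show the login form can call the same function.</p>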
<p>Finally, you&#8217;ll want to do a check in the method that actually does the authentication to make sure the identity URL matches with the Pubcookie username &#8212; you don&#8217;t want people to use their own credentials to log in as someone else. In common.php, I added a check to the start of doAuth():</p>
<p><code>if ($req_url != $user) {<br />
return login_pubcookie_mismatch($user, $req_url);<br />
}</code></p>
<p>I also added a login_pubcookie_mismatch() function to login.php, which warns visitors that their username and identity URL don&#8217;t match, and that they should fix that situation.</p>
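<p>The comparison itself can be spelled out as a small helper. This is a sketch under assumptions: identity_matches() is a hypothetical name, and it assumes identity URLs are always the /id/ prefix followed by the Pubcookie username, as in the examples above.</p>

```php
<?php
// Hypothetical helper: true only when the requested identity URL belongs to
// the user Pubcookie authenticated. The prefix matches the identity URLs above.
function identity_matches($req_url, $pubcookie_user, $prefix = 'http://example.edu/id/')
{
    // No authenticated user means nothing can match.
    if ($pubcookie_user === null || $pubcookie_user === '') {
        return false;
    }
    // Strict comparison, so no type juggling can make two different URLs equal.
    return $req_url === $prefix . $pubcookie_user;
}
```

<p>A false result is the case where you would show the mismatch warning rather than continuing with authentication.</p>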
<p>Log out of everything and give the OpenID test a try. It should redirect you to your Pubcookie login system, and from there, to a working ID.</p>
]]></content:encoded>
			<wfw:commentRss>http://circulatable.org/2007/08/07/how-to-implement-openid-with-pubcookie/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
