Stop Making Sense

Last night I attended a talk at Princeton titled Stop Making Sense: On Collecting, Sorting and Presenting Data, presented by Rudolf Frieling, Curator of Media Arts at SFMOMA, San Francisco. I have to start by saying that the artsy parts lost me! Frieling would show an art piece and say – of course you’ve seen this or – you know this – and I’d be thinking “huh? should I?”

Other than that – this was an interesting talk about how we organize our data and how technology is changing so fast and so much that today’s delivery and storage methods are not going to be the delivery and storage methods of the future – so how does one successfully archive media materials? When Frieling was introduced, the professor mentioned a few stories that were a bit funny – but also very sad if you think about it. The first was that when presenting in a newly built theater, he found that he could not play his VHS tape because the people who designed the theater had decided that VHS was no longer a valid storage format. The second was about a store here in town that actually sold its entire collection of VHS tapes to an artist so that he could make a sculpture out of them – this store no longer sells VHS tapes. The final story was about the library at the university no longer storing VHS tapes. He had approached them to ask for a space in the high-density storage unit for his tapes and the library said they were no longer keeping tapes and that anyone who had contributed to the VHS collection at the library could come pick up their items or they would be given away first come, first served.

Along those lines, my husband and I donated all of our VHS tapes to the local public library a couple of years ago – the plan being to replace them with DVDs – a media type that takes up less space on our shelves and that we found ourselves using more than VHS.

Frieling provided some keywords for his talk (I didn’t catch them all): collecting, linking, presenting, all in terms of data. The fact of the matter is (and we librarians know this already) not everything is available online and if it is – it’s possible that it’s not accessible because of hardware, software, or firewall reasons. He spoke of a tool that he and others had developed for CD-ROM that no longer worked on current systems due to hardware and software changes in these systems. He spoke of websites developed at the beginning of the web that no longer work as they were intended because they were developed with system limitations in mind. The long and short of it is that systems change, so as archivists and curators, how are we going to preserve information for future generations?

Frieling mentioned a TV show collector by the name of (excuse mis-spellings – the font was small and I was in the back of the room) Pentti Pajukallio. This man has spent most of his life recording TV shows and collecting these VHS tapes. He only stopped to have open heart surgery and even then his wife recorded what she could for him. The question is: what value does this collection have to anyone but Pentti? And if it does have value for others, how will we access it?

One of the best slides (for me) was the one of a pile of 3×5 index cards that Frieling had put together as his first database. These cards contained bibliographic references that were of use to him. He keeps this “database” today because it has nostalgic value for him – but most of the references are probably inaccessible or unavailable – or even outdated. This collection only has value to him or those studying him. Another great point that he brought up in reference to his note cards – information, like technology, is always changing, so databases like this are not always going to be valuable – so are they worth archiving and making accessible? I don’t know – that was the question of the night.

One great quote was when Frieling mentioned that now that we have search engines and the world wide web it’s even harder to find the “pearl among the rubbish” when we’re browsing through collections. Books are a strong model to provide content. They can be browsed, you can jump back and forth, or you can read cover to cover. This 2D model (sounds a bit like Weinberger’s first order of order) allows the user to read the text as is or randomly, but it’s physical – it’s the pearl and it’s easy (in theory) to find because it’s not (in theory) surrounded by rubbish.

When it comes to webpages we may think of the “home” page as the entry point into our site but in reality people are entering our sites from every which way because search engines are indexing all (once again – in theory) of our pages and providing them piecemeal to searchers. Frieling described this as users coming at our sites diagonally instead of straight on like they do with books. This means they only get parts of the information we’re providing and don’t get the whole picture.

One way to look at information or media is that each item has two stories. One story is that of the artist or the collector and is usually personal in nature. The other story is that of the viewer. This story gives us the perspective of the outsider. This is the perspective that we’re giving in our catalogs – the perspective of the cataloger when viewing the item – so why not let the other “viewers” (our patrons) add their perspectives as well? This isn’t something that Frieling said exactly – just something I thought when he started talking about the two stories. What he did show us was Steve and how allowing others to add tags to art gave the piece a whole new perspective and a whole new value.

He ended by showing us the Way Maker (if you have a link please share it with me). This program is downloaded to your phone and then you attach your phone to your body and you record your life through your eyes. Does this hold value for anyone but you? Maybe not – but it allows you to see your life from another perspective. It shows you things that you maybe weren’t paying attention to throughout the day – and maybe even makes you more aware of your surroundings. Would a series of videos like this be worth archiving? Who knows – maybe it would be educational for future generations or other cultures to see what a day in the life of Nicole is like. Would I do it? Nope! I don’t need to go to that level of sharing my life – I have this blog and my personal networks – that’s enough for me :)

It was a great talk, and while the art aspects were over my head, I’m glad I attended – I just wish that there were more links provided or that the slides were available as I’d like to link you to more information and I don’t have the time just now to do the research on Pentti or the Way Maker.

Metadata Tools

I just read a few quotes from the report of the RLG Programs metadata practice survey on Lorcan Dempsey’s blog (I haven’t read the whole report yet) and wanted to add to his comments. The report says:

… RLG Programs surveyed 18 Partner institutions in July and August 2007 to obtain a baseline understanding of their current descriptive metadata practices. Although we saw some expected variations in practice across libraries, archives and museums, we were struck by the high levels of customization and local tool development, the limited extent to which tools and practices are, or can be, shared (both within and across institutions), the lack of confidence institutions have in the effectiveness of their tools, and the disconnect between their interest in creating metadata to serve their primary audiences and the inability to serve that audience within the most commonly used discovery systems (such as Google, Yahoo, etc.).

I have heard this many times. At our library we use a combination of metadata standards and the MarkLogic XML Content Server to deliver the information to our patrons.

That said – while our delivery system is awesome, creating a METS document is one of the most cumbersome things I’ve ever had to do! This standard is amazing – it has such power and I can’t think how to make it less stressful to create documents – but it just seems like someone created this standard to torture librarians. This is probably why so many librarians are unsure of their tools and their metadata.

I also find that there are many choices – somewhat too many choices on how we can format our data. There is Dublin Core, MODS, MARCXML, etc. As a cataloger I say we need to use MARCXML – it holds the most data and stays in line with our print collections. As a programmer I say MODS is the easiest to read and retrieve data from. And as a lazy person (yes I too can be lazy) I say Dublin Core because I only need to enter minimal information.
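To make the comparison a little more concrete, here is a rough sketch of what pulling the title out of each format looks like in Python. The three sample records are hand-written and heavily simplified by me (they are not from our system, and real records carry far more detail and punctuation); only the namespace URIs and element names come from the standards themselves.

```python
import xml.etree.ElementTree as ET

# Namespace URIs published for each standard.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "mods": "http://www.loc.gov/mods/v3",
    "marc": "http://www.loc.gov/MARC21/slim",
}

# Hypothetical, simplified sample records describing the same book.
DUBLIN_CORE = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Everything is Miscellaneous</dc:title>
  <dc:creator>Weinberger, David</dc:creator>
</record>
"""

MODS = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Everything is Miscellaneous</title></titleInfo>
  <name type="personal"><namePart>Weinberger, David</namePart></name>
</mods>
"""

MARCXML = """
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Weinberger, David</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Everything is Miscellaneous</subfield>
  </datafield>
</record>
"""


def dc_title(xml_text):
    """Dublin Core: the title is a single flat element."""
    root = ET.fromstring(xml_text.strip())
    return root.findtext("dc:title", namespaces=NS)


def mods_title(xml_text):
    """MODS: the title is nested inside a readable titleInfo wrapper."""
    root = ET.fromstring(xml_text.strip())
    return root.findtext("mods:titleInfo/mods:title", namespaces=NS)


def marc_title(xml_text):
    """MARCXML: the title lives in field 245, subfield a, found by number and code."""
    root = ET.fromstring(xml_text.strip())
    path = "marc:datafield[@tag='245']/marc:subfield[@code='a']"
    return root.findtext(path, namespaces=NS)


if __name__ == "__main__":
    for label, title in [
        ("Dublin Core", dc_title(DUBLIN_CORE)),
        ("MODS", mods_title(MODS)),
        ("MARCXML", marc_title(MARCXML)),
    ]:
        print(f"{label}: {title}")
```

All three lookups return the same title, but the MARCXML one only makes sense if you already know what 245 and subfield a mean, which is the cataloger-versus-programmer split in a nutshell.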

But how do you make these decisions? And have I gotten totally off track? I don’t have any hard and fast answers for you – all I know is that I sympathize with librarians who are unsure and think I should go and read the entire report before adding anything else.

Genius of Cataloging

Via AutoCat (by J. McRee (Mac) Elrod) & Cataloging Futures:

Brian Campbell called my attention to this recorded lecture by Francis Miksa. It’s well worth the hour and a half it takes to listen to it (and I probably kept Hal awake doing so).

After a fascinating detailed history (which establishes that nothing is new under the sun in terms of predicting the end of cataloguing), he turns to the advocacy of classification as an approach to the organization of knowledge, which Google type searches with its granularity can’t do. He calls for multiple class numbers (as well as multiple classification systems), but does not mention one of the major advantages of the classed catalogue – the possibility of indexing in a variety of languages.

While giving lip service to international usage, RDA is becoming much more parochial than AACR2. Rather than fiddling around with description (which isn’t doing too badly), developing alternate modes of organizing and searching informational sources might be a better way to go.

I need to find time this weekend to listen – maybe on the 3 hour drive to West Virginia (or the 1.5 hour drive to Maryland) where we might find a sibling for Coda.

The Return of Everything is Miscellaneous

Last week I wrote about my impressions of David Weinberger’s Everything is Miscellaneous. Well, this morning (around 2am) I finished the book and am so impressed! I love books that make me think – and Weinberger really left my head reeling.

In my role as Metadata Librarian I not only have to work with metadata, but think about ways in which we can manipulate it to provide a better product for our patrons and that’s just what the third order of order is all about – well, not exactly, the third order of order allows the patrons to add value and I hope down the road to be able to open up our metadata to allow for user input.

But, back to the book. David mentions something I’ve heard in several presentations lately. The simple fact that the more “mess” you have the more valuable the data becomes. Basically if you have a tool like Flickr that keeps data from every picture we upload, results can be clustered in ways that are impossible in the first order world (the physical world). This is why LibraryThing is so amazing and the fact that they’re sharing their data with libraries is so great. By using data from LibraryThing, libraries have access to a much wider mess than they would ever be able to compile with their own patron base.

Throughout the book, Weinberger uses Wikipedia as an amazing example of how the third order of order has been successful. On page 208 he makes a great point:

The Britannica includes references at the end of articles to remind us that topics are related to other topics, literally afterthoughts. Wikipedia, on the other hand, is besotted with links…These links are not even bread crumbs, for with two clicks we well may be going down a path no one has trod before and that no one anticipated…In the miscellaneous order, a topic is anything someone somewhere is interested in. Anyone can pull a topic together by contributing to Wikipedia, writing a blog post, creating a playlist, or starting a discussion thread.

While librarians and researchers question the accuracy of Wikipedia (and rightly so) it cannot be dismissed as a powerful research tool. I like looking at Wikipedia and following the links to find additional information. As a librarian, I then go and research the topic further using additional tools to confirm accuracy – but if I hadn’t used Wikipedia in the first place I may not have ended up down the path I did.

Along similar lines, the value of tools like Wikipedia and the blogosphere is that they share information in the words of the users – these sites include language that matches how the average person thinks and speaks. Weinberger used the example of the blogosphere’s reaction to Bush’s speech on immigration on May 15, 2006. After the talk the blogosphere exploded in comments and interpretations. Weinberger explains the speech as “Simple arguments, simple ideas, simple language.” and goes on to say, “That’s how politicians talk. But it’s not how we, their constituents talk.” (p.209).

Next, as I mentioned yesterday, Weinberger touches on the future of the ebook. He talked about how we could collect data from how people read books, the passages they highlight, where people read books and so much more using wireless enabled ebook readers (p.222) – and while it sounds like science fiction – we’re almost there. Kindle has the power of wireless technology – meaning that in theory, Amazon could connect to our readers and collect data. While this sounds scary and like a huge invasion of privacy – imagine the power that this data could provide. Among Weinberger’s examples: you could create a list of books that people most often read at the beach or a list of books people stopped reading 1/2 way through – how cool would that be?

So, like I said at the beginning – my head is reeling with information and I’ll probably have to read this book again to get a real hold on some of the theory involved, but I loved the book! I think it’s a great read for all librarians – but if I have to be specific – Metadata Librarians in particular.

PS. In this article I linked you out to 9 other resources on the topics I was covering – what print product can do that??

Everything is Miscellaneous

This is not a review – so much as it is a review of points that have stuck with me from my reading of Everything is Miscellaneous by David Weinberger. I’m not done yet – but I can’t hold it in anymore – and my husband is tired of listening to me rant about library-type stuff :)

Point one: Allowing users to write reviews:

When I was at the NFAIS Humanities Roundtable, I faced this very question. “Why would we want to let amateurs write reviews?” and “Publishers will pull their content if we let them do that!” It was for this reason that I found page 59 so funny!

[Greg] Hart remarks. “Publishers said you’re allowing users to say that they hate a book.” The response from Jeff Bezos, Amazon’s founder, as Hart recalls it, was: “It will sell more books…just not ones customers don’t like.”

This was in response to Amazon allowing users to review books in their store – and it’s perfect! My answer at the conference was another question. What’s to stop a professional reviewer from saying they hate the book? The fact of the matter is that the average reader cares more about what other readers think than what professional book reviewers think – at least I do!

Point two: Library catalog limitations:

Weinberger points out (on page 119) that when looking at a record in a library card catalog:

Generally you will not find how well the book sold, if it’s been banned in any countries, a list of the books it cites, the college the author attended, what the reviewers said about it, the full index from the back of the book, or how many times it’s been checked out of the library…

Now, while we aren’t using cards to store our data anymore (well most of us aren’t) we’re still following the same rules – and more importantly, we’re still thinking about how much time it would take for us to add that extra metadata.

This is the beauty of LibraryThing’s new Common Knowledge – while it doesn’t have all of these things it does have some and they’re adding new fields all of the time! I love it! One day I spent hours just filling in all of the info I could find on my favorite authors – not a great use of time – but so useful to someone searching for that book!

Point three: Knowledge is social:

Starting on page 144, Weinberger discusses our education system here in the U.S. and how we’re taught to work in silos. Students are made to sit and take tests to measure what they’ve learned:

The implicit lesson is unmistakable: Knowing is something done by individuals. It is something that happens inside your brain. The mark of knowing is being able to fill in a paper with the right answers. Knowledge could not get any less social. In fact, in those circumstances when knowledge is social we call it cheating.

When I was in college, I lived with my husband (boyfriend at that time) and we took many of the same classes – since we had the same degree. We would sit and do our homework together and yes, come up with the same answers. Most of the professors were okay with this as long as we could fill out those test papers on our own come exam time – all except one – but we won’t go there. Now, Weinberger guarantees that students are on IM, chatting while doing homework – which probably ends up with the same result – shared knowledge. This – in my eyes – is the way of the world! You learn so much more by sharing with others than you do sitting alone at your desk. This is part of the reason why I started this blog – I wanted to share what I was learning so that others could learn too.

Two more quotes from Weinberger in this section that made me interrupt my husband as he tried to read his book last night …

Memorizing facts is often now a skill more relevant to quiz shows than to life … One thing is for sure: When our kids become teachers, they’re not going to be administering tests to students sitting in a neat grid of separated desks with the shades down.

So true!! And:

One of the lessons of Wikipedia is that conversation improves expertise by exposing weaknesses, introducing new viewpoints, and pushing ideas into accessible form.

Long story short – knowledge should be shared! And in doing so learning will be more valuable.

More points to come:

I’m only 1/2 way through with the book – and I’m sure I’ll have more to share with you as I finish – if you haven’t read the book – I highly recommend it just based on the first 150 pages and the conversations that I’ve seen spring up from it!

Open Source MODS-generating software

Via Metadatalibrarians:

The University of Tennessee Digital Library Center is proud to announce the release of the DLC-MODS Workbook, version 1.2 under the GNU General Public License version 3.

The DLC-MODS Workbook provides a series of web pages that enable users to easily generate complex, valid MODS metadata records that meet the 1-4 levels of specification outlined in the Digital Library Federation Implementation Guidelines for Shareable MODS Records, (DLF Aquifer Guidelines November 2006).

Developed by programmer Christine Haygood Deane under the direction of metadata librarian Melanie Feltner-Reichert, this open source client-side software provides control of date formats and other problematic fields at the point of creation, while shielding creators from the need to work in XML. Metadata records can be partially created, saved to the desktop, reloaded and completed at a later date.

Final versions can be downloaded or cut-and-pasted into text editors for use elsewhere.

Developed in support of our state-wide digitization project, Volunteer Voices, we hope this system will also assist others in their efforts to create valuable digital libraries. The software can be viewed here and downloaded here.

Please address comments and questions to Melanie Feltner-Reichert and Cricket Deane.

OCLC Connexion Tips?

I think there is a need for a blog/website/mailing list/general list of OCLC Connexion tips! I’ve been attending training at PALINET and keep learning new little tips that will make my life easier – plus I keep finding that I sometimes know a thing or two that the others in the class didn’t. I think there is a need for a way for us to share this info so we can work more productively. Does such a site/tool exist? Do you want to create it? Just an idea :)

For now, here are the 2 tips I learned this week:

  • Typing ALT E G 6 P will get you the form to fill in for the 006 field
  • There is a macro for the 007 field: User Tools > Manage > Macros > Choose the 007 Macro and then assign it to the user tools and add to the toolbar.

New Mark Twain Digital Collection

I just got this via a few of my mailing lists and thought I should share with you all.

I'm happy to announce that today the University of California launched the beta version of Mark Twain Project Online, a digital critical edition of the writings of Mark Twain, providing access to more than twenty-three hundred letters written between 1853 and 1880, including nearly 100 facsimiles of originals. The site is driven by metadata captured in METS records, the content was encoded in TEI P4, and the search, browse and display functionality was built using the XTF (the eXtensible Text Framework).

Read the full press release here.

Book Jacket Brainstorm

While watching a demo of Primo at the EMA conference today I had a brainstorm.

Most academic libraries remove the dust jackets from books before putting them on the shelf. This means that adding images of book covers isn’t quite as valuable to us as it might be to a public library. So – what if, while we were cataloging, we were given a choice? We could include the dust jacket pulled from whatever source, or we could choose from a color chooser so that the image shows a generic book in that color. I know that there is a catalog out there where the catalogers include the book covers – but this would be a little different – it would not only have the color coded into the record, but it would show the right colored book in the search results instead of a “no image available” image. You could even use a bit of scripting to have the book’s title printed on the generic image – like LibraryThing is doing now if your book doesn’t have a cover associated with it.
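Here is a quick sketch of how the scripting part might work, just to show it isn’t much code. Everything in it is my own assumption: the record is a plain Python dictionary, the field names jacket_url and cover_color are made up for illustration, and the “image” is a small SVG string rather than anything a real catalog or Primo produces.

```python
from xml.sax.saxutils import escape


def placeholder_cover(title, color="#808080", width=120, height=180):
    """Return an SVG string showing a generic book in the given color with its title."""
    safe_title = escape(title)
    # Also escape double quotes, since the color is written into an attribute.
    safe_color = escape(color, {'"': "&quot;"})
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<rect width="{width}" height="{height}" fill="{safe_color}"/>'
        f'<text x="{width // 2}" y="{height // 2}" fill="white" font-size="10" '
        f'text-anchor="middle">{safe_title}</text>'
        "</svg>"
    )


def cover_for(record):
    """Prefer a real jacket image; otherwise fall back to the color-coded placeholder.

    `jacket_url` and `cover_color` are hypothetical field names, not part of any
    cataloging standard.
    """
    if record.get("jacket_url"):
        return record["jacket_url"]
    return placeholder_cover(record["title"], record.get("cover_color", "#808080"))


if __name__ == "__main__":
    book = {"title": "Everything is Miscellaneous", "cover_color": "#1f4e79"}
    print(cover_for(book))
```

The search results page would then show either the real jacket or a book-shaped block in the color the cataloger picked, and never the “no image available” picture.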

Just a little brainstorm of mine ;)

So many rules!

The Daily News: This just in, a volunteer at the Crocker Art Museum Library was crushed to death by the AACR2 (Anglo-American Rules for Cataloguing 2002 edition).

Well, not quite, but that’s what I felt like. I am now volunteering at the Hansen Library at the Crocker Art Museum in downtown Sacramento on Saturdays. I met with the Librarian yesterday to see what he wanted me to do, and he ended up giving me a 2 hour crash course on cataloguing. I had no idea it was so complex. MARC (MAchine Readable Catalogue Record) format is a long list of numbered fields that you fill in with everything under the sun so the computer can read the record and show the user what we normally see, a normal bibliographic reference. The librarian loaned me his edition of the AACR2. It is a binder full of about 300-400 pages of cataloguing rules.

This from Laura Francabandera.

I find this very ironic!! I just finished taking a class on Subject Cataloging – that’s right – not all cataloging, just 2 whole days on subjects! The funny part was the list of books we were given to help us decide how to assign subjects to items we catalog. There is the 6 volume set of “big red books” that helps us find valid subject headings, then there’s a set of several “big red binders” that tell us the rules for assigning headings, and lastly there is a manual just for free-floating subdivisions (sub headings that can be applied almost anywhere).

As someone who loves to learn and who enjoys cataloging and finding the perfect heading – I now want to read these insane manuals – but why???? Why are there so many rules? Why can’t it all be simplified?

Right now I’ll have to forego reading those manuals because I’m giving my reading energy to Weinberger’s Everything is Miscellaneous.