The third panel for today was titled Digital Natives and Professional Searching: Improving the User Experience.
Chris Lamb from Thomson Reuters took the podium first with his talk, Desperately Seeking Paris, in which he talked about Calais, a free web service and open API (there are already plugins for Drupal, OpenOffice, WordPress and others) that makes obvious to computers what is obvious to humans: the difference, when searching for ‘Paris’, between ‘Paris, France’, ‘Paris, Texas’ and ‘Paris Hilton’. Basically, it uses metadata to tell those results apart.
It takes unstructured documents (text, HTML, XML), extracts named entities, facts, events and categories from them, and makes connections between the entities in your content and related data in DBpedia, GeoNames, the CIA World Factbook and more. In short, it’s a realization of the semantic web.
Some sample extraction applications for Calais include:
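To get a feel for the ‘Paris’ problem, here is a toy sketch in Python. It is not the Calais API and says nothing about how Calais works internally; the `CANDIDATES` table, the cue words and the `disambiguate` function are all made up for illustration. It simply picks the candidate whose cue words show up most often in the surrounding text.

```python
import re

# Hypothetical mini-gazetteer, invented for this example: the candidate
# entities for one ambiguous word, each with a type and a few cue words.
CANDIDATES = {
    "paris": [
        {"name": "Paris, France", "type": "City",
         "cues": {"france", "seine", "louvre", "eiffel"}},
        {"name": "Paris, Texas", "type": "City",
         "cues": {"texas", "lamar", "county", "dallas"}},
        {"name": "Paris Hilton", "type": "Person",
         "cues": {"hilton", "heiress", "celebrity", "hotel"}},
    ]
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def disambiguate(surface, text):
    """Return the candidate sharing the most cue words with the text.

    Returns None when no candidate has any cue word in the text.
    """
    words = tokenize(text)
    best, best_score = None, 0
    for candidate in CANDIDATES.get(surface.lower(), []):
        score = len(candidate["cues"] & words)
        if score > best_score:
            best, best_score = candidate, score
    return best


match = disambiguate("Paris", "We toured the Louvre and walked along the Seine in Paris.")
print(match["name"], "-", match["type"])
```

A real service would presumably use far richer signals than a handful of cue words, but the shape is the same: text goes in, typed and disambiguated entities come out.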
- Indexing and Abstracting (sorting and collating)
- Investigative Reporting (tag background documents and reveal hidden connections)
- Media Monitoring (competitive intelligence and blogosphere monitoring)
- Online publishing
How are people using Calais in search? Calais is a platform, not an application – and so it’s not a search engine. People are using Calais to supplement indexing engines like FAST. The content that comes back from Calais is semantically enhanced, which lets the index engine support semantically enabled results and links. There are people working on projects like this – but they have yet to be released or even announced, so there is no way to see it in action online yet.
That said, there are over 9,000 developers using it already and there are over 1 million daily submissions – so if you want to play with Calais you won’t be a guinea pig.
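Since there was nothing live to look at, here is my own rough sketch of what ‘supplementing an index engine’ could mean. Everything in it is an assumption: the `build_index` function, the document layout and the entity tags are invented, and the tags are typed in by hand where a real setup would get them back from an extraction service. It has nothing to do with how FAST actually works.

```python
from collections import defaultdict


def build_index(docs):
    """Build a plain keyword index and a separate entity index.

    docs maps a document id to {"text": str, "entities": [str, ...]}.
    The entity list stands in for tags returned by an extraction service.
    """
    keyword_index = defaultdict(set)
    entity_index = defaultdict(set)
    for doc_id, doc in docs.items():
        for word in doc["text"].lower().split():
            keyword_index[word.strip(".,!?")].add(doc_id)
        for entity in doc["entities"]:
            entity_index[entity].add(doc_id)
    return keyword_index, entity_index


# Hand-tagged sample documents (hypothetical data).
docs = {
    "a": {"text": "Flights to Paris are cheap in spring.",
          "entities": ["Paris, France"]},
    "b": {"text": "Paris hosted the county rodeo.",
          "entities": ["Paris, Texas"]},
}

keyword_index, entity_index = build_index(docs)
print(sorted(keyword_index["paris"]))          # keyword search: ['a', 'b']
print(sorted(entity_index["Paris, France"]))   # entity search: ['a']
```

The keyword lookup can’t tell the two documents apart, while the entity lookup can, which is the whole point of feeding semantically enhanced content into the index.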
This talk made me curious enough to check out the plugins for apps that I already use to see what kind of value it will add.
Rudy Potenzone from Microsoft came next with his talk, And the Barbarians have Phasers: Authors and Their Tools Come of Age.
This is the most information-aware group (the digital natives) that has ever come into our offices, our libraries and our universities, and they come equipped – already knowing how to use Twitter and blogs. They expect these tools when they come into the workforce – so how do we deal with this and prepare for them?
Microsoft is envisioning a new era of research reporting. The author of today is the reader of tomorrow – so how do we capture enough information to make the content interesting to the reader? Office 2007 and SharePoint are the ways they’re opening up to these new ideals. SharePoint is the most popular product that Microsoft has, with the highest sales of any product they have. This is a sign of the times – of how people want to work in their offices.
Rudy talked about Microsoft’s efforts to bring their tools up to the expectations of today’s academic environment. There are a lot of projects going on to try and bring these ideals to life. One example is The British Library’s Research Information Centre (RIC) and another is an eJournal Publishing Service. All of these examples are built on Microsoft products and can be found online here. You can also find code online at codeplex.com.
One tool that sounded neat to me was the author add-in tool, which lets you get the rules for the publication you’re writing for and add your own metadata as you’re writing the article in Microsoft Word. My only problem is that while these add-on tools are open source and free to download, you still need the Microsoft software to use them – which I have – but you get my point.
Kristian Hammond finished up with his presentation Frictionless Information: Adding Value in the Age of Google.
Coming from the IT world, he’s trying to understand our world – the world of publishers and content providers. He thinks we create really high-value, fabulous content – and that it used to be that people were willing to pay for that – but they’re not anymore. He listed our pressing problems as:
- Social Media
- Content, content, content
- Bounce (somebody finds something on Google that leads them to your site and they bounce on and then bounce off – never to return again)
His department decided that the way around these problems is focussing only on the user: giving the user what they want, when they want it, without taking them away from what they’re doing. He doesn’t care what the source is (Blogs, Services, Web, News, Opinion, Video, etc.). Nothing is going to stop people from using web resources – so let’s embrace it – bring together the high-class content and mix it with the other content – provide it all to the user.
His solution to this is the Relevance Engine. While he’s writing, it’s reading; while he’s reading, it’s reading. It builds a gist – what it thinks the document is about – and gives additional information about it from anywhere on the web, from all sources. Because it’s reading the context of the document you’re working on, it’s going to find better results than if you typed in a query. This can be done both on the desktop and on the web – from a piece of indexed text we can find anything! The better the indexing, the better the results.
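Here is how I picture the ‘gist’ idea, as a very rough Python sketch. This is my own guess at the general approach, not his engine: the `gist` and `rank` functions, the stopword list and the sample resources are all hypothetical. The gist is just the most frequent meaningful words in whatever you’re working on, and resources are ranked by how many of those words they contain.

```python
import re
from collections import Counter

# A tiny stopword list, just enough for the example.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it",
             "for", "on", "that", "with", "as", "are"}


def words_of(text):
    """Lowercase the text and return its alphabetic words in order."""
    return re.findall(r"[a-z]+", text.lower())


def gist(text, size=5):
    """Return the `size` most frequent non-stopword terms as a crude gist."""
    terms = [w for w in words_of(text) if w not in STOPWORDS]
    return [term for term, _ in Counter(terms).most_common(size)]


def rank(draft, resources, size=5):
    """Rank resource titles by how many gist terms of the draft they contain.

    Resources sharing no gist term with the draft are left out.
    """
    terms = set(gist(draft, size))
    scored = []
    for title, body in resources.items():
        overlap = len(terms & set(words_of(body)))
        if overlap:
            scored.append((overlap, title))
    return [title for _, title in sorted(scored, key=lambda p: (-p[0], p[1]))]


draft = "Semantic search uses metadata. Semantic metadata improves search results."
resources = {
    "Indexing guide": "Good metadata makes semantic search work.",
    "Cooking blog": "Bake the bread for an hour.",
}
print(rank(draft, resources, size=3))
```

Notice that nobody typed a query anywhere: the draft itself is the query, which is exactly the ‘no text box’ idea.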
It all comes down to loving your user, protecting your user from the horror of the text box!! Providing your user with your amazing content so that they never have to go looking for it elsewhere.
Wow, what an animated speaker – I wish he had been one of my professors when I was in school. I bet his class is so much fun!!