Workshop - Free Your Metadata: Why Does it Matter?
Monday, February 13, 2012, 12:00-1:30 pm
Place: Hornbake Library, South Wing, Room 2119
Note: If you would like to participate in the hands-on portion of the presentation, please pre-install Google Refine on your laptop and bring it with you.
The economic downturn in the US and Europe has forced cultural heritage institutions to adopt a more pragmatic stance towards metadata creation and to deliver short-term results to grant providers. In parallel, the concept of Linked and Open Data (LOD) has gained momentum. In this presentation, we want to focus on metadata cleaning and reconciliation, two elementary steps in bringing cultural heritage collections into the Linked Data cloud. After an initial cleaning process, involving for example the detection of duplicates and the unification of encoding formats, metadata are reconciled by mapping a domain-specific and/or local vocabulary to another, more commonly used vocabulary that is already part of the Semantic Web. We believe that the integration of heterogeneous collections can be managed by using subject vocabularies for cross-linking between collections, since major classifications and thesauri (e.g. LCSH, DDC, RAMEAU) have been made available following Linked Data principles. Re-using these established terms for indexing cultural heritage resources is one of the major opportunities Linked Data offers to digital library projects, but there is a common belief that LOD publishing still requires expert knowledge of Semantic Web technologies. We will therefore demonstrate how Semantic Web novices can start experimenting on their own with non-expert software such as Google Refine. All operations needed to reconcile metadata with controlled vocabularies that are already part of the Linked Data cloud will be presented in detail. More information regarding the research project can be found on freeyourmetadata.org.
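For readers who want a feel for the two steps before the workshop, the following is a minimal Python sketch of cleaning (unifying encoding forms and dropping duplicates) followed by reconciliation (mapping local labels to a controlled vocabulary). It is an illustration only and not how Google Refine works internally: the function names, the sample labels, and the example.org identifiers are all assumptions made for this sketch, and real reconciliation services generally use fuzzier matching than the exact comparison shown here.

```python
import unicodedata


def normalize(label):
    """Unify Unicode form, case, and whitespace so equivalent labels compare equal."""
    text = unicodedata.normalize("NFC", label)
    return " ".join(text.casefold().split())


def deduplicate(labels):
    """Keep the first occurrence of each label, comparing normalized forms."""
    seen = set()
    unique = []
    for label in labels:
        key = normalize(label)
        if key not in seen:
            seen.add(key)
            unique.append(label)
    return unique


def reconcile(labels, vocabulary):
    """Map each local label to a vocabulary identifier by exact normalized match.

    Labels without a match map to None so they can be reviewed by hand.
    """
    index = {normalize(term): uri for term, uri in vocabulary.items()}
    return {label: index.get(normalize(label)) for label in labels}


# Hypothetical controlled vocabulary; the URIs are placeholders, not real LCSH identifiers.
vocabulary = {
    "Photography": "http://example.org/vocab/photography",
    "Caf\u00e9 culture": "http://example.org/vocab/cafe-culture",
}

# Hypothetical local metadata with a case variant, stray whitespace,
# and a decomposed accent (e + combining acute) instead of a precomposed one.
local_labels = ["photography", "Photography ", "Cafe\u0301 culture", "Steam engines"]

cleaned = deduplicate(local_labels)
matches = reconcile(cleaned, vocabulary)

for label, uri in matches.items():
    print(label, "->", uri)
```

In this sketch, labels that find no counterpart in the vocabulary are kept with a value of None rather than discarded, so they remain visible for manual review.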
Bios:
Seth van Hooland holds the chair in Digital Information at the Information and Communication Science department of the Université Libre de Bruxelles (ULB), Belgium. His research focuses on metadata in the cultural heritage sector and documentation practices in large public and private bodies. Van Hooland also works as a consultant on the topic of digital cultural heritage and is active as a trainer in the domain of records and document management for the European Commission. He is a member of the Dublin Core Metadata Initiative (DCMI) Advisory Board and is also involved in the dissemination of CollectiveAccess, an open-source collection management system.
Max De Wilde is a PhD student and teaching assistant at the Université Libre de Bruxelles (ULB), Department of Information and Communication Sciences. He holds a master's degree in linguistics from the ULB and an advanced master's degree in computational linguistics from the University of Antwerp. He is currently preparing a doctoral thesis on the impact of language-independent information extraction on document retrieval. He also works as a full-time assistant and supervises practical classes for master's-level students in a number of courses related to information technology.
Ruben Verborgh received his Master's degree in Computer Science Engineering from Ghent University, Belgium, in 2010. He is a researcher with the Multimedia Lab at Ghent University, Department of Electronics and Information Systems (ELIS), and the Interdisciplinary Institute for Broadband Technology (IBBT). His interests include Semantic Web technologies, multimedia annotation, artificial intelligence, and their relation to multimedia processing. He is currently working on moving multimedia algorithms to the Web and on associated problems, such as semantic service descriptions.
