Overview of OAI-PMH and support in Samvera

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) facilitates the sharing of metadata via an endpoint, or base URL, that exposes metadata for harvesting through HTTP requests. Librarians working with metadata rely on OAI-PMH to make structured data available to a variety of platforms with differing requirements, so flexibility in how data can be structured via OAI-PMH is key. Current use cases for OAI-PMH include, but are not limited to, sharing metadata with DPLA, EBSCO Discovery Service, WorldCat’s Digital Collections Gateway, and local catalogs (Primo VE), and importing metadata and files with Bulkrax.

Although support for OAI-PMH isn’t present by default in all Samvera repository solutions, it can be configured for use in Hyrax using the BlacklightOAIProvider plug-in and ruby-oai library. The current version of Hyku (version 5) has Blacklight OAI built into it. Note that the only metadata prefix that is supported out of the box using this plug-in is Dublin Core. Other established and custom metadata formats can be configured. There is no OAI-PMH support in Avalon.

Configure OAI with Hyrax

Detailed step-by-step instructions for creating an OAI-PMH service endpoint in a Hyrax application can be found in the README.md file of the BlacklightOAIProvider plug-in GitHub repository.

As indicated in the README, OAI-PMH provider parameters are customized in the catalog controller (app/controllers/catalog_controller.rb). The OAI-PMH endpoint receives requests from metadata harvesters as HTTP GET or POST requests whose arguments are key=value pairs, then sends back the response serialized as XML. To customize how metadata fields appear in the XML, define the field_semantics in the Solr document (app/models/solr_document.rb).

Configure OAI with Hyku

The current version of Hyku has Blacklight OAI built into it. As in the Hyrax instructions, it is configured in the catalog controller (app/controllers/catalog_controller.rb), with the metadata field specifics in the field_semantics section of the Solr document (app/models/solr_document.rb).

Samvera defaults to oai_dc as its metadata format in the OAI-PMH feed. However, many repositories have custom formats. SoftServ has implemented code in Hyku for easier customization of OAI-PMH data (https://playbook-staging.notch8.com/en/samvera/oai-feeds). For example, Hyku for Consortia uses the custom metadata prefix oai_hyku in its Hyku Commons repository.

The standard OAI XSLT transformation, which renders the feed in a browser, does not fully support displaying custom metadata prefixes. See the example below. The raw XML is still available: right-click the page and choose View Page Source. All the XML data with the custom prefixes, including the custom header specs and custom encodings, can be retrieved this way relatively easily, or a developer can configure the display. The information can also be generated from a Solr service query.

[Figure: Example of custom metadata format display]

OAI-PMH Feed Display & Queries

The OAI-PMH feed is available at the /catalog/oai path of your Samvera-based app (e.g. [URL]/catalog/oai).*

*Note: Opening the feed URL without a verb argument returns the error code “badVerb”. This is expected, because every OAI-PMH request must include a verb.
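A harvesting script can detect this kind of error by looking for error elements in the response. The following is a minimal sketch using only the Python standard library; the sample response body, the example.org URL, and the function name oai_errors are illustrative assumptions modeled on the error format in the OAI-PMH specification, not output captured from a Samvera application.

```python
import xml.etree.ElementTree as ET

# Namespace used by all OAI-PMH 2.0 response elements.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Illustrative response body (assumption), shaped like an OAI-PMH error response.
SAMPLE_ERROR = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2024-01-01T00:00:00Z</responseDate>
  <request>https://repository.example.org/catalog/oai</request>
  <error code="badVerb">Illegal OAI verb</error>
</OAI-PMH>"""


def oai_errors(xml_text):
    """Return a list of (code, message) pairs found in an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    return [
        (error.get("code"), (error.text or "").strip())
        for error in root.findall("oai:error", OAI_NS)
    ]


if __name__ == "__main__":
    for code, message in oai_errors(SAMPLE_ERROR):
        print(f"{code}: {message}")
```

An empty list means the response carried no error element, so the script can go on to read the records.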

As per the OAI-PMH specification, there are various queries that can be performed on the records in your Samvera repository. The string [URL] displayed in these examples should be replaced with the repository’s URL when following the query’s syntax.
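For scripted harvesting, the same query strings can be assembled in code rather than typed by hand. The sketch below assumes the feed lives at /catalog/oai as described above; the function name build_oai_url and the repository.example.org address are hypothetical and stand in for [URL].

```python
from urllib.parse import urlencode


def build_oai_url(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    endpoint = base_url.rstrip("/") + "/catalog/oai"
    # The verb is placed first for readability; OAI-PMH does not require an order.
    query = urlencode({"verb": verb, **arguments})
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    # Hypothetical repository URL, used only for illustration.
    base = "https://repository.example.org"
    print(build_oai_url(base, "Identify"))
    print(build_oai_url(base, "ListRecords", metadataPrefix="oai_dc"))
```

Using urlencode rather than string concatenation means argument values containing reserved characters are percent-encoded automatically.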

Identify

To view general information about your OAI-PMH feed, including your Request URL, click the Identify link or enter the following query.

[URL]/catalog/oai?verb=Identify

List records

To view all records in the repository, click the ListRecords link or enter the following query. 

[URL]/catalog/oai?verb=ListRecords

If the repository has more than one metadata prefix, or a harvester requires the argument (the OAI-PMH specification lists metadataPrefix as required for ListRecords), specify the prefix as in the following queries.

[URL]/catalog/oai?verb=ListRecords&metadataPrefix=oai_dc

[URL]/catalog/oai?verb=ListRecords&metadataPrefix=oai_hyku
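Once a ListRecords response has been retrieved, the records can be read with any XML parser. The following sketch reads identifiers and titles from an oai_dc response using the Python standard library. The sample XML, the identifier oai:example:abc123, and the function name parse_records are illustrative assumptions; a custom prefix such as oai_hyku would need its own namespace and element names.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative response body (assumption), not taken from a real repository.
SAMPLE_RECORDS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example:abc123</identifier>
        <datestamp>2024-01-01T00:00:00Z</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample title</dc:title>
          <dc:creator>Sample, Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return one dict per record with its identifier and dc:title values."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall("oai:ListRecords/oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        # A record may carry several titles, or none if it is marked deleted.
        titles = [title.text for title in record.findall(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    return records


if __name__ == "__main__":
    for item in parse_records(SAMPLE_RECORDS):
        print(item["identifier"], item["titles"])
```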

List sets

Whether the repository uses collections, admin sets, or both, records can be queried by set. For a list of the sets, click the ListSets link or enter the following query.

[URL]/catalog/oai?verb=ListSets

List metadata formats

To view all the metadata formats in the OAI feed, click the link for ListMetadataFormats or enter the following query.

[URL]/catalog/oai?verb=ListMetadataFormats

Unless other metadata formats have been configured, the default metadata prefix is oai_dc.

List identifiers

To view all identifiers in the repository, click the ListIdentifiers link or enter the following query.

[URL]/catalog/oai?verb=ListIdentifiers

[URL]/catalog/oai?verb=ListIdentifiers&metadataPrefix=oai_dc

List records by set

Sets are configured locally by an institution, so the set values available depend on the particular implementation. To view all records in a set, click the ListSets link or enter a query with the set argument, following the syntax examples below.

[URL]/catalog/oai?verb=ListRecords&metadataPrefix=oai_dc&set=collection:[collection name]

[URL]/catalog/oai?verb=ListRecords&metadataPrefix=oai_dc&set=unit:[internal identifier]
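Because set values differ between institutions, a script can read them from a ListSets response and then build one ListRecords query per set. The sketch below assumes a response shaped like the sample shown; the set names, the example.org endpoint, and the function names set_specs and records_url_for_set are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Illustrative ListSets response (assumption); real setSpec values are local.
SAMPLE_LIST_SETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set>
      <setSpec>collection:maps</setSpec>
      <setName>Collection: Maps</setName>
    </set>
    <set>
      <setSpec>collection:letters</setSpec>
      <setName>Collection: Letters</setName>
    </set>
  </ListSets>
</OAI-PMH>"""


def set_specs(xml_text):
    """Return the setSpec value of every set in a ListSets response."""
    root = ET.fromstring(xml_text)
    return [
        item.findtext("oai:setSpec", namespaces=OAI_NS)
        for item in root.findall("oai:ListSets/oai:set", OAI_NS)
    ]


def records_url_for_set(endpoint, set_spec, metadata_prefix="oai_dc"):
    """Build a ListRecords query restricted to one set."""
    query = urlencode(
        {"verb": "ListRecords", "metadataPrefix": metadata_prefix, "set": set_spec}
    )
    return f"{endpoint}?{query}"


if __name__ == "__main__":
    endpoint = "https://repository.example.org/catalog/oai"  # hypothetical
    for spec in set_specs(SAMPLE_LIST_SETS):
        print(records_url_for_set(endpoint, spec))
```

The colon in a set value is percent-encoded as %3A in the generated URL, which is equivalent to the literal colon shown in the queries above.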

Resumption tokens

OAI-PMH feeds typically consist of a large amount of data, so list responses are returned in batches. For most queries, you will likely need to use resumption tokens to get the entire data set. Use the Resume link, or copy the resumptionToken string from the response and pass it as the resumptionToken argument of the next request (with the same verb) to continue pulling results.
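The loop of requesting a batch, reading its resumption token, and requesting the next batch can be automated. The following is a minimal sketch using the Python standard library, not a full harvester: the function names fetch and harvest and the example.org endpoint are assumptions, and there is no retry or rate-limit handling. The fetch function is passed in as a parameter so the loop can be tried against canned responses before pointing it at a live feed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Clark-notation prefix for elements in the OAI-PMH 2.0 namespace.
OAI = "{http://www.openarchives.org/OAI/2.0/}"


def fetch(url):
    """Download one response body over HTTP GET."""
    with urlopen(url) as response:
        return response.read()


def harvest(endpoint, metadata_prefix="oai_dc", fetch=fetch):
    """Yield every record element, following resumption tokens to the end."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        root = ET.fromstring(fetch(f"{endpoint}?{urlencode(params)}"))
        error = root.find(f"{OAI}error")
        if error is not None:
            raise RuntimeError(f"{error.get('code')}: {error.text}")
        yield from root.iter(f"{OAI}record")
        token = root.find(f".//{OAI}resumptionToken")
        # A missing or empty resumptionToken marks the last batch.
        if token is None or not (token.text or "").strip():
            break
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token.text.strip()}


if __name__ == "__main__":
    # Hypothetical endpoint; replace with [URL]/catalog/oai for a real feed.
    for record in harvest("https://repository.example.org/catalog/oai"):
        print(record.findtext(f"{OAI}header/{OAI}identifier"))
```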
