OAI-PMH is the Open Archives Initiative Protocol for Metadata Harvesting. Essentially, OAI-PMH is a set of rules that allow Data Providers to share their metadata with Service Providers.
A Data Provider is any entity using OAI-PMH to allow access to their data. In the context of Spydus, a Data Provider is a library using OAI-PMH to allow access to bibliographic (and archival description) information.
A Service Provider uses metadata harvested from a Data Provider as the basis for its own services. In the context of Spydus, a Service Provider might be a third party that harvests a library's bibliographic (and archival description) data in order to maintain a database of holdings at institutions at state and national levels (e.g. Libraries Australia or EBSCO).
OAI-PMH is performed using a client application with a specific URL syntax, and uses a number of request and response "verbs" in order to facilitate the retrieval of information from a Data Provider. Some verb examples include:
More information on specific URL syntax will be provided later in this factsheet. For full details on specifications of the OAI-PMH standards, or full verb dictionary and explanations, see the OpenArchives website.
Spydus currently supports the ability of Data Providers (libraries) to provide bibliographic (and archive) information to Service Providers in the MARCXML or DublinCore formats. MARCXML is supported by services such as Libraries Australia and the EBSCO Discovery Service.
The most common application of OAI-PMH in libraries is for discovery and maintenance of collections. That is, allowing new material to be indexed, and unavailable content to be removed, by the Service Provider.
As OAI-PMH allows the Service Provider to filter a harvest by the date that records were last updated, libraries need only ensure that the other criteria appropriately filter the data to be harvested (e.g. excluding stack locations, and deciding whether or not to include reference material and/or electronic resource collections).
The recommended way to prepare record sets in Spydus is to use the Saved Query functionality.
A useful query for many Service Providers is one capturing all content that is available in the library's catalogue. Below is an example of how such a query might be composed:
The Saved Query Description will be visible to Service Providers, so the description should identify the record set in a meaningful way for the Service Provider.
For a record set to be available to a Service Provider via OAI-PMH, the set must be 'exposed' by adding it to the OAI Set Editor. To do this:
Note that the setName and setSpec details for the set have been populated from the existing Saved Query description and IRN. The Format column identifies the set as either a Saved Query or a Saved List.
Each vendor works slightly differently, but typically there are just two pieces of information that a Service Provider will require in order to harvest from a library's repository:
Once the record sets have been exposed and made available to Service Providers, the content within those sets can be harvested. As mentioned previously, there is a specific syntax for OAI-PMH URLs that must be adhered to in order to successfully retrieve the set. The URL syntax for OAI-PMH in Spydus is:
Identifiers will usually only be used to retrieve an individual record, but arguments may be used to further refine sets. Some examples will be provided below under relevant headings. Multiple additional arguments may be appended to a URL using the & separator, e.g. baseURL?verb&argument1&argument2&argument3
In Spydus, the base URL is the library's Spydus domain with an additional path on the server. The additional path is the same for all Spydus libraries, but differs slightly depending on the record type being harvested: /oai/data/oai2_0 for bibliographic records, or /oai/data/oai2_0_arc for archival description records.
e.g. https://libraryname.spydus.com/oai/data/oai2_0 for bibliographic records or https://libraryname.spydus.com/oai/data/oai2_0_arc for archival description records
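The URL pattern described above (base URL, then a verb, then optional arguments joined with &) can be sketched in Python as follows. This is only an illustration: the function name build_oai_url and its parameters are hypothetical, and libraryname.spydus.com is the placeholder host used throughout this factsheet.

```python
from urllib.parse import urlencode

# Server paths quoted in this factsheet.
BIB_PATH = "/oai/data/oai2_0"      # bibliographic records
ARC_PATH = "/oai/data/oai2_0_arc"  # archival description records


def build_oai_url(host, verb, arguments=None, archival=False):
    """Return baseURL?verb=...&argument=... for an OAI-PMH request (hypothetical helper)."""
    base_url = f"https://{host}{ARC_PATH if archival else BIB_PATH}"
    params = {"verb": verb, **(arguments or {})}
    # Leave ':' and '/' unescaped so setSpec values and identifiers
    # read the same way as the examples in this factsheet.
    return f"{base_url}?{urlencode(params, safe=':/')}"


print(build_oai_url("libraryname.spydus.com", "Identify"))
print(build_oai_url("libraryname.spydus.com", "ListRecords",
                    {"set": "SQRY:123456", "from": "2020-02-01"}))
```

Arguments are passed as a dictionary because from is a reserved word in Python and cannot be used as a keyword argument.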
There are six OAI-PMH verbs that may be used to harvest information from Spydus. These verbs are consistent for both bibliographic records and for archival description records.
The Identify verb returns basic information about the repository being queried. An Identify URL would look like this:
https://libraryname.spydus.com/oai/data/oai2_0?verb=Identify
And the response would look like this:
The ListIdentifiers verb will return a list of the headers and unique identifiers for all bibliographic records. In Spydus, the unique identifier is the BRN of a bibliographic record. A ListIdentifiers URL would look like this:
https://libraryname.spydus.com/oai/data/oai2_0?verb=ListIdentifiers
And the response would look like this:
The syntax of the Identifier is oai:institutioncode:brn/BRN. For example, where the institution code is ABCD and the BRN of a record is 123456, the identifier will be oai:ABCD:brn/123456
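A harvester that needs the institution code or BRN from an identifier could split it along the lines below. This is a minimal sketch: the function names are hypothetical, and it assumes the BRN is always numeric and that the prefix is always brn, which this factsheet only shows for bibliographic records.

```python
def parse_identifier(identifier):
    """Split 'oai:institutioncode:brn/BRN' into (institution_code, brn)."""
    scheme, institution_code, local_part = identifier.split(":", 2)
    prefix, _, brn = local_part.partition("/")
    # Assumption: BRNs are numeric and the local prefix is 'brn'.
    if scheme != "oai" or prefix != "brn" or not brn.isdigit():
        raise ValueError(f"Unexpected identifier format: {identifier!r}")
    return institution_code, int(brn)


def format_identifier(institution_code, brn):
    """Build an identifier from an institution code and a BRN."""
    return f"oai:{institution_code}:brn/{brn}"


print(parse_identifier("oai:ABCD:brn/123456"))
print(format_identifier("ABCD", 123456))
```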
The ListMetadataFormats verb returns the supported metadata formats in the repository. Currently, Spydus supports:
A ListMetadataFormats URL would look like this:
https://libraryname.spydus.com/oai/data/oai2_0?verb=ListMetadataFormats
And the response would look like this:
To output in a specified format, use the metadataPrefix argument.
To output in the MARCXML format, use the marcxml prefix. To output in DublinCore format, use the oai_dc prefix.
The ListSets verb returns the details of any sets (Saved Queries or Saved Lists) that have been made available for harvesting (by adding them to the OAI Set Editor in Maintenance). A ListSets URL will look like this:
https://libraryname.spydus.com/oai/data/oai2_0?verb=ListSets
And the response will look like this:
The setSpec value can be used as an argument with the ListRecords verb to harvest the set. The setName is not used in the URL or arguments, but should be appropriately descriptive so that the required setSpec value can be recognised.
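To pick out setSpec values programmatically, a ListSets response can be parsed with a standard XML library. The sketch below uses the element names and namespace defined by the OAI-PMH 2.0 specification. The sample XML is a trimmed, hypothetical response (including the setName text), not actual Spydus output.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical, trimmed ListSets response used only for illustration.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set>
      <setSpec>SQRY:123456</setSpec>
      <setName>All available catalogue records</setName>
    </set>
  </ListSets>
</OAI-PMH>
"""


def list_sets(xml_text):
    """Return a {setSpec: setName} mapping from a ListSets response."""
    root = ET.fromstring(xml_text.strip())
    return {
        s.findtext("oai:setSpec", namespaces=OAI_NS):
            s.findtext("oai:setName", namespaces=OAI_NS)
        for s in root.iterfind(".//oai:set", OAI_NS)
    }


print(list_sets(SAMPLE_RESPONSE))
```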
The ListRecords verb will return the details of records in a Data Provider's repository. Without additional arguments, the ListRecords verb can be invoked to return all bibliographic records in a library's repository. It is recommended to use the set argument to restrict to prepared record sets. Starting with the ListRecords verb URL:
https://libraryname.spydus.com/oai/data/oai2_0?verb=ListRecords
...the setSpec value of the required set is added as the set argument. e.g. where the setSpec value is SQRY:123456, the ListRecords URL with the set argument will be:
https://libraryname.spydus.com/oai/data/oai2_0?verb=ListRecords&set=SQRY:123456
The set may be further refined based on the datestamp field in the record header by using the optional from and until arguments. The datestamp field corresponds to the Last Updated date in Spydus.
The from and until arguments would usually be added by the Service Provider, based on the last time that a harvest was performed. e.g. If a harvest was last performed on the Deleted Items set (setSpec SQRY:123456) on 31/01/2020, the from argument would target the following day.
https://libraryname.spydus.com/oai/data/oai2_0?verb=ListRecords&set=SQRY:123456&from=2020-02-01
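The "following day" calculation can be sketched as below. The function name next_from_argument is hypothetical, and the sketch assumes the Service Provider records the date of its last harvest. OAI-PMH dates use the YYYY-MM-DD form shown in the URL above.

```python
from datetime import date, timedelta


def next_from_argument(last_harvest):
    """Return the day after the last harvest, formatted as YYYY-MM-DD."""
    return (last_harvest + timedelta(days=1)).isoformat()


# Last harvest on 31/01/2020, as in the example above.
print(next_from_argument(date(2020, 1, 31)))  # 2020-02-01
```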
Additional arguments may be specified if desired (metadataPrefix, resumptionToken), and more information on these arguments is available in Cataloguing Maintenance help articles, and at the OpenArchives website.
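As background on resumptionToken: under the OAI-PMH specification, a large result may be returned in several parts, with each partial response carrying a resumptionToken that is sent back (together with the verb only) to request the next part. The sketch below illustrates that loop. The function harvest_pages and the fetch callable are hypothetical names, and how Spydus pages its responses is not described in this factsheet.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def harvest_pages(fetch, base_url, arguments):
    """Yield each parsed ListRecords response, following resumptionToken values.

    `fetch` is any callable taking (base_url, params) and returning the
    response XML as text, so the loop itself needs no network access.
    """
    params = {"verb": "ListRecords", **arguments}
    while True:
        root = ET.fromstring(fetch(base_url, params))
        yield root
        token = root.findtext(".//oai:resumptionToken", default="",
                              namespaces=OAI_NS).strip()
        if not token:
            break  # a missing or empty token marks the final part
        # Per the specification, resumptionToken is sent with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": token}
```

A real fetch callable might build the URL from base_url and params and read it with urllib.request.urlopen. Keeping the fetch step separate makes the paging logic easy to test with canned responses.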
The GetRecord verb, together with an Identifier, can retrieve a single record from a repository. As mentioned above under the ListIdentifiers heading, the syntax of an Identifier is oai:institutioncode:brn/BRN. For example, where the institution code is ABCD and the BRN of a record is 123456, the identifier will be oai:ABCD:brn/123456
Using the GetRecord verb and an identifier, the URL would look like this:
https://libraryname.spydus.com/oai/data/oai2_0?verb=GetRecord&identifier=oai:ABCD:brn/123456
This factsheet is only a brief introduction to the OAI-PMH standard, and is far from exhaustive. Some additional information is available in the Cataloguing Maintenance help articles. For full details on the specifications, and extensive information on additional functions, visit the OpenArchives website.