Metadata Call 2022-04-26
Time: 2:00pm-3:00pm Eastern
Call-In Info: Join our Cloud HD Video Meeting
Community Notes: Samvera Metadata Interest Group Meeting: 2022-04-26
Moderator(s): Nora Z.
Notetaker: also Nora Z.
Attendees:
Julie Hardesty (Indiana University)
Nora Zimmerman (Lafayette College)
Cara Key (Oregon State University)
Annamarie Klose (Ohio State University)
Rob Kaufman (Scientist.com)
Agenda:
Subgroup Reports
URI Selection Working Group
Should this group go on hiatus?
UOregon is working on a soft launch of their new platform, scheduled for July, so Ryan has limited availability at the present time.
Julie: may try meeting with Ryan Wick over the summer or in early Fall
Roadmaps Alignment Group Update (Annamarie Klose)
Last meeting was very brief. ICLA/CCLA requirements have been removed for the Samvera Community
Hyrax is currently looking at Blacklight/Bootstrap/Rails upgrades – this may have impacts on the OAI work?
This is targeting version 7.0.x
Issues/Questions
Rob: there seems to be motivation for reigniting work on M3 and Houndstooth in several different sectors of the community
The AllinsonFlex work, and Valkyrization work, are both in these directions.
Three parts: React.js interface, dynamic schema editing, and schema versioning
Julie: IU is working on this at the Admin Set level as well
He has observed that the heavy developer labor Samvera metadata schemas require is a major turn-off to many potential adopters
Coming out of the AllinsonFlex work, all of which relies on the M3/Houndstooth specification, there was a bit of discussion of the addition of ‘contexts’
Can contexts be added to the M3 spec, and how?
Tamsin’s implementation of dynamic metadata in Hyrax 3, where they provide YAML schemas for worktypes, is a subset of the M3 spec
This is in the proof-of-concept phase, and Tamsin is on parental leave
Annamarie: we did have potential DPLA mappings in the Hyrax 3 YAML
Enhancements could be made to the overriding specification in order to make it more broadly implementable, e.g. OAI mappings
A dynamic metadata configuration interface that could add Dublin Core and Qualified Dublin Core fields, and set OAI mappings, would solve numerous community problems
If this were to be a WG, it would need to be developer-driven
Group agreed to add this topic to the agenda at our next meeting to continue the discussion
Project Sharing
Discussion Topics
OAI documentation collection
How is OAI-PMH connected to Hyrax / Hyku implementations in the community?
Annamarie – communicated with OhioHub to ask if ResourceSync is being used by anyone, and heard back that no one in OhioHub is using it in production
Rob – ATLA (conglomerate/consortium of ~20 organizations) had some movement towards potentially encouraging its use, but they determined that it is not widely used
Notch8 is involved in around 10 Hyku implementations; Bulkrax supports OAI import as a core feature. Used by 4 or 5 of their client repositories. Relies heavily on the ruby-oai gem.
Theoretically XML could be generated and hosted on a dumb server and used to publish OAI metadata
The ruby-oai gem is also used on several Blacklight/Spotlight projects that are not necessarily Samvera projects
In most of Notch8’s OAI implementations, they enhance the Bulkrax OAI importer, specifically to include the files themselves (located either through an inferred URL built from data in the OAI record, e.g. an ID field, or through a Dublin Core field). This is used for importing data, not just for exposing it to harvesters
For example, a university is using this to expose data from one repository to two other services, as an import/roundtripping service
“Where the files go” does not seem to be part of the OAI specification, for either the thumbnails or the primary files
Annamarie: We put links to thumbnail and handle in the dc:identifier field. However, it could be in a different field. https://library.osu.edu/dc/api/oai?verb=ListRecords&metadataPrefix=oai_dc&set=unit:xp68kg24f
Best practice recommendations that include a reference to the files in the metadata records themselves would be helpful
Adding additional formats to blacklight-oai is on Notch8’s roadmap, but it is hard, partly because it also requires changes to the ruby-oai gem
If Bulkrax is installed, ruby-oai is included, and Bulkrax provides an interface for mapping your metadata fields
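To illustrate the point about file and thumbnail links travelling in dc:identifier, here is a minimal Python sketch of how a harvester might pull those links out of a ListRecords response. It is an illustration only, not the Bulkrax importer or any repository's actual practice. The sample response, the example.org URLs, and the function names are all hypothetical; only the OAI-PMH and Dublin Core namespaces come from the published specifications.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def identifier_links(oai_xml):
    """Map each record's OAI identifier to the URLs in its dc:identifier fields."""
    root = ET.fromstring(oai_xml)
    links = {}
    for record in root.iterfind(".//oai:record", NS):
        header_id = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        # Keep only values that look like links; local IDs are skipped.
        urls = [
            el.text.strip()
            for el in record.iterfind(".//dc:identifier", NS)
            if el.text and el.text.strip().startswith(("http://", "https://"))
        ]
        links[header_id] = urls
    return links


# Hypothetical response fragment, used here in place of a live request.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:abc123</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
          <dc:identifier>https://example.org/handle/abc123</dc:identifier>
          <dc:identifier>https://example.org/thumbs/abc123.jpg</dc:identifier>
          <dc:identifier>local-id-42</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://example.org/oai", set_spec="unit:abc"))
    print(identifier_links(SAMPLE))
```

Because the specification does not say which identifier is the handle and which is the thumbnail, a harvester working this way still has to guess from the URL itself, which is the gap a best-practice recommendation would close.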
In Hyku, there is an interface for selecting metadata prefix and other customizable fields, and each tenant has a separate OAI feed that is somewhat sparsely populated. Used in production by the British Library project, and by some other Ubiquity Press Hyku repositories
Previously, the blacklight-oai gem was only configurable at boot. Due to the multi-tenant problem, Rob believes it is now possible to configure which fields are exposed in the OAI feed at runtime
The OAI feature appears to be something alongside Hyrax/Hyku, not out-of-the-box but using open source tools
Having Dublin Core harvesting out-of-the-box in the OAI feed would be very helpful
ArchivesSpace and CONTENTdm both have OAI feed mapping interfaces at the repository level
blacklight-oai takes the results of a Solr index search and maps them to OAI records
UCLA oral history site is a pure Blacklight app that uses OAI
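As a rough illustration of the "Solr results mapped to OAI" idea, the Python sketch below turns one Solr-style document into an oai_dc element using a field mapping table. This is not the blacklight-oai gem's code or configuration format. The Solr field names, the mapping table, and the function name are hypothetical, chosen only to show the shape of the mapping step that a configuration interface would need to expose.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)

# Hypothetical mapping from Solr field names to Dublin Core elements.
FIELD_MAP = {
    "title_tesim": "title",
    "creator_tesim": "creator",
    "identifier_tesim": "identifier",
}


def solr_doc_to_oai_dc(doc, field_map=FIELD_MAP):
    """Build an oai_dc:dc element from a Solr document (a dict of fields)."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for solr_field, dc_element in field_map.items():
        values = doc.get(solr_field, [])
        if isinstance(values, str):
            # Single-valued Solr fields come back as plain strings.
            values = [values]
        for value in values:
            child = ET.SubElement(root, f"{{{DC}}}{dc_element}")
            child.text = value
    return root


if __name__ == "__main__":
    doc = {
        "title_tesim": ["Oral history interview"],
        "creator_tesim": "Example Interviewer",
        "unmapped_field_ssim": ["ignored"],
    }
    print(ET.tostring(solr_doc_to_oai_dc(doc), encoding="unicode"))
```

Fields missing from the mapping table are simply left out of the feed, which is one way a per-tenant feed can end up sparsely populated.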
Issue 3328 Make OAI-PMH support tested and clear
Hyku applies as an additional example
There has been new information (from Ben Armintor) that there will be upcoming changes to the blacklight-oai and ruby-oai gems
He is prepping releases of ruby-oai (v1.2.0) and then two releases of blacklight_oai_provider (v7.0 and v7.15, pegged against blacklight releases) this week.
It would be valuable for the documentation to make clear that OAI-PMH can be used to expose and share data for harvesting, as well as for ingest, but that these are two different tasks/projects
Also, to mention to users/prospective users that this feature does not ship out of the box, but rather is an additional implementation
What is British Library using their OAI feed to contribute to?
Rob is willing to help with documentation, like making a demo video about OAI feature work in Hyku
The group agreed to digest this information and continue the discussion at our next meeting
Next call: May 24th, 2022 from 2-3pm Eastern