MODS and RDF: UCSB Application Profile

Samvera Community Wiki



Justin Coyne's original email read as follows:

To those interested in importing MODS and persisting it as RDF: we've worked with UCSB to have a go at this for some of their images. We based this work on OregonDigital, but we extended it quite a bit to support MODS elements that OregonDigital doesn't. I don't think this will work for every MODS file, nor for every data model (see metadata.rb for ours), but it might be a good place to start. Some of this code depends on UCSB-specific tags in <mods:extension> for linking to images.

You start the script by giving it a directory that has a bunch of MODS files and a directory with corresponding image files.  For example:

script/import_mods_records ../mods_demo_set ../images/spc-flying-a

The script works by parsing the MODS file into an in-memory Hash representation, then sending this to the appropriate ObjectFactory (either CollectionFactory or ImageFactory) which creates the objects.


Here are links to some of the code:
https://github.com/curationexperts/alexandria-v2/blob/master/script/import_mods_records
https://github.com/curationexperts/alexandria-v2/blob/master/lib/importer/mods_importer.rb
https://github.com/curationexperts/alexandria-v2/blob/master/lib/importer/mods_parser.rb
https://github.com/curationexperts/alexandria-v2/blob/master/lib/importer/factories/image_factory.rb
https://github.com/curationexperts/alexandria-v2/blob/master/lib/importer/factories/collection_factory.rb
https://github.com/curationexperts/alexandria-v2/blob/master/lib/importer/factories/object_factory.rb
https://github.com/curationexperts/alexandria-v2/blob/master/app/models/concerns/metadata.rb
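
The flow described in the email (parse a MODS file into a Hash, then hand that Hash to a factory) can be sketched roughly as below. This is not the code from the repository linked above. The factory bodies, the helper names parse_mods and factory_for, the rule that a typeOfResource element with collection="yes" marks a collection, the sample XML, and the use of REXML are all assumptions made for illustration.

```ruby
require 'rexml/document'

MODS_NS = { 'mods' => 'http://www.loc.gov/mods/v3' }.freeze

# Hypothetical stand-ins for the real factories. They only report what
# they would create instead of persisting anything.
class ObjectFactory
  def initialize(attributes)
    @attributes = attributes
  end

  def run
    "#{self.class.name} would create: #{@attributes[:title]}"
  end
end

class CollectionFactory < ObjectFactory; end
class ImageFactory < ObjectFactory; end

# Parse a MODS document into a plain Hash (two fields only, for brevity).
def parse_mods(xml)
  doc = REXML::Document.new(xml)
  title = REXML::XPath.first(doc, '//mods:titleInfo/mods:title', MODS_NS)
  type  = REXML::XPath.first(doc, '//mods:typeOfResource', MODS_NS)
  {
    title: title&.text&.strip,
    # Assumption: collection="yes" on typeOfResource marks a collection record.
    collection: !type.nil? && type.attributes['collection'] == 'yes'
  }
end

# Pick the factory class that matches the parsed record.
def factory_for(attributes)
  attributes[:collection] ? CollectionFactory : ImageFactory
end

# Made-up sample record, not taken from the UCSB fixtures.
SAMPLE = <<~XML
  <mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
    <mods:titleInfo><mods:title>Sample collection</mods:title></mods:titleInfo>
    <mods:typeOfResource collection="yes">still image</mods:typeOfResource>
  </mods:mods>
XML

attrs = parse_mods(SAMPLE)
puts factory_for(attrs).new(attrs).run
```

Keeping the parse step separate from the factories means the Hash is the only contract between them, so a different parser could feed the same factories.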

Most of their mappings are in the metadata.rb file in the list above. For example, "accession_number" is mapped to "http://opaquenamespace.org/ns/cco/accessionNumber". Many properties are mapped to the "OARGUN" vocabulary, which essentially uses that same opaquenamespace.org namespace and comes from the following gem: https://github.com/curationexperts/oargun.
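
To make the idea of a field-to-predicate mapping concrete, here is a minimal sketch that turns a Hash of field values into N-Triples lines. Only the accession_number predicate comes from the text above. The PREDICATES table, the helper names, the subject URI, and the sample value are hypothetical, and the real project declares its properties in metadata.rb rather than in a plain Hash.

```ruby
# Field name => predicate URI. Only this one entry is taken from the
# page; a real profile would list every field.
PREDICATES = {
  accession_number: 'http://opaquenamespace.org/ns/cco/accessionNumber'
}.freeze

# Escape a value for use as an N-Triples string literal
# (backslashes, double quotes, and newlines only).
def literal(value)
  escaped = value.to_s
                 .gsub('\\') { '\\\\' }
                 .gsub('"') { '\\"' }
                 .gsub("\n") { '\\n' }
  %("#{escaped}")
end

# Build one triple per value; unmapped fields raise KeyError.
def to_ntriples(subject_uri, attributes)
  attributes.flat_map do |field, values|
    predicate = PREDICATES.fetch(field)
    Array(values).map { |v| "<#{subject_uri}> <#{predicate}> #{literal(v)} ." }
  end
end

# Made-up subject URI and value, for illustration only.
puts to_ntriples('http://example.org/records/1', accession_number: ['1994.12'])
```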

I ran the import script that Justin mentioned in his email. The "before" XML files are located at: https://github.com/curationexperts/alexandria-v2/tree/master/spec/fixtures/mods . The output for these sample records can be seen at: . (The namespace definitions are in a file in the root of the zip. UPDATE: dates have been added, and a namespace for them has been added to the root.)

An earlier email of note describes how approximate date ranges were handled. In essence, it said:

We chose to go with the Europeana Data Model and added predicates from CIDOC CRM (end_is_qualified_by, beginning_is_qualified_by)


Here's some code:
https://github.com/curationexperts/alexandria-v2/blob/master/app/models/concerns/metadata.rb#L100-L104
https://github.com/curationexperts/alexandria-v2/blob/master/app/models/time_span.rb

CIDOC CRM properties:
http://erlangen-crm.org/docs/ecrm/100302/objectproperties/P79.beginning_is_qualified_by___-343678885.html
http://erlangen-crm.org/docs/ecrm/091125/objectproperties/P80.end_is_qualified_by___-1512738953.html

-Justin
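
As a rough illustration of the approach in that email, the sketch below models a time span with begin and end values (as in the Europeana Data Model) plus the two CIDOC CRM qualifiers. The field names, the prefixed names used as labels, and the sample dates are assumptions. The actual predicates and class are in the metadata.rb and time_span.rb files linked above.

```ruby
# Hypothetical value object: start/finish hold the dates, and the
# qualifier fields say how firm each endpoint is.
TimeSpan = Struct.new(:start, :finish, :start_qualifier, :finish_qualifier,
                      keyword_init: true) do
  # Label => value pairs, skipping any part that was not supplied.
  # The prefixed names are illustrative labels, not resolved URIs.
  def statements
    {
      'edm:begin' => start,
      'edm:end' => finish,
      'crm:P79_beginning_is_qualified_by' => start_qualifier,
      'crm:P80_end_is_qualified_by' => finish_qualifier
    }.compact
  end
end

# Made-up example: a range where both endpoints are approximate.
span = TimeSpan.new(start: '1910', finish: '1915',
                    start_qualifier: 'approximate',
                    finish_qualifier: 'approximate')
span.statements.each { |predicate, value| puts "#{predicate} #{value.inspect}" }
```

Attaching a qualifier to each endpoint separately allows a range with one firm date and one approximate date, which a single "circa" flag on the whole range could not express.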