MODS and RDF: UCSB Application Profile

The original email sent by Justin Coyne was the following:

To those interested in importing MODS and persisting it as RDF: we've worked with UCSB to have a go at this for some of their images. We based this work on OregonDigital, but we extended it quite a bit to support MODS elements that OregonDigital doesn't. I don't think this will work for every MODS file, nor for every data model (see metadata.rb for ours), but it might be a good place to start. Some of this code depends on UCSB-specific tags in <mods:extension> for linking to images.

You start the script by giving it a directory that has a bunch of MODS files and a directory with corresponding image files.  For example:

script/import_mods_records ../mods_demo_set ../images/spc-flying-a

The script works by parsing the MODS file into an in-memory Hash representation, then sending this to the appropriate ObjectFactory (either CollectionFactory or ImageFactory), which creates the objects.
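
The parse-then-dispatch flow described above can be sketched roughly as follows. This is only an illustration, not the actual import code: the two factory classes here are hypothetical stand-ins that return plain Hashes, the parser reads just two fields, and choosing the factory from the collection="yes" attribute on <mods:typeOfResource> is an assumption about how a collection record might be recognized.

```ruby
require 'rexml/document'

# Hypothetical stand-ins for the factories named above. The real classes
# create repository objects; these just return a Hash of what they received.
class ImageFactory
  def self.build(attributes)
    { type: :image }.merge(attributes)
  end
end

class CollectionFactory
  def self.build(attributes)
    { type: :collection }.merge(attributes)
  end
end

MODS_NS = { 'mods' => 'http://www.loc.gov/mods/v3' }.freeze

# Parse a MODS document into a plain Hash (two fields only, for illustration).
def parse_mods(xml)
  doc = REXML::Document.new(xml)
  title = REXML::XPath.first(doc, '//mods:titleInfo/mods:title', MODS_NS)
  type  = REXML::XPath.first(doc, '//mods:typeOfResource', MODS_NS)
  {
    title: title && title.text,
    # Assumption: a record is a collection when typeOfResource says so.
    collection: !type.nil? && type.attributes['collection'] == 'yes'
  }
end

# Pick the factory that matches the parsed attributes.
def factory_for(attributes)
  attributes[:collection] ? CollectionFactory : ImageFactory
end

# Made-up sample record.
SAMPLE = <<~XML
  <mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
    <mods:titleInfo><mods:title>Sample photograph</mods:title></mods:titleInfo>
    <mods:typeOfResource>still image</mods:typeOfResource>
  </mods:mods>
XML

attributes = parse_mods(SAMPLE)
record = factory_for(attributes).build(attributes)
puts record.inspect
```

Keeping the parsed record as a plain Hash means the factories never touch XML, which is presumably why the same parsed structure can be handed to either factory.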

Most of their mappings are in the metadata.rb files in the list above. For example, "accession_number" is mapped to "http://opaquenamespace.org/ns/cco/accessionNumber". There are also a number of mappings to the "OARGUN" vocabulary, which essentially uses the same opaquenamespace.org namespace and comes from the following gem: https://github.com/curationexperts/oargun.
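
To make the idea of a field-to-predicate mapping concrete, here is a minimal sketch. It is not how metadata.rb declares its properties; it only shows the effect of such a mapping. The PREDICATES table and the to_ntriples helper are hypothetical names, only the accession_number entry comes from the text above, and N-Triples is used purely because it is easy to read, not because the import persists data in that form.

```ruby
# Field-to-predicate table. Only accession_number comes from the text above;
# a real metadata.rb defines many more properties.
PREDICATES = {
  accession_number: 'http://opaquenamespace.org/ns/cco/accessionNumber'
}.freeze

# Serialize a Hash of field values as N-Triples lines for one subject.
# Only backslashes and double quotes are escaped, which is enough for
# simple single-line literals.
def to_ntriples(subject_uri, attributes)
  attributes.flat_map do |field, values|
    predicate = PREDICATES.fetch(field) # raises KeyError for unmapped fields
    Array(values).map do |value|
      escaped = value.to_s.gsub('\\') { '\\\\' }.gsub('"') { '\\"' }
      %(<#{subject_uri}> <#{predicate}> "#{escaped}" .)
    end
  end
end

# The subject URI and the value are made up for illustration.
puts to_ntriples('http://example.org/records/1', { accession_number: 'ABC-123' })
```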

I have run the import script that Justin mentioned in his email. The "before" XML files are located at: https://github.com/curationexperts/alexandria-v2/tree/master/spec/fixtures/mods. The output for these sample records can be seen in ucsb_samples.zip. (The namespace definitions are in a file in the root of the zip. UPDATE: dates have been added, and a namespace for them has been added to the root.)

An earlier email of note describes how approximate date ranges were handled. That email is essentially:

-Justin