MODS and RDF: UCSB Application Profile
The original email sent by Justin Coyne was the following:
To those interested in importing MODS and persisting as RDF: we've worked with UCSB to have a go at this for some of their images. We based this work on OregonDigital, but we extended it quite a bit to support MODS elements that OregonDigital doesn't. I don't think this will work for every MODS file, nor for every data model (see metadata.rb for ours), but it might be a good place to start. Some of this code depends on UCSB-specific tags for linking to images in <mods:extension>.
You start the script by giving it a directory that has a bunch of MODS files and a directory with corresponding image files. For example:
script/import_mods_records ../mods_demo_set ../images/spc-flying-a
The script works by parsing the MODS file into an in-memory Hash representation, then sending this to the appropriate ObjectFactory (either CollectionFactory or ImageFactory), which creates the objects.
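The parse-then-dispatch flow described in the email can be sketched roughly as follows. This is a minimal illustration, not the code in mods_parser.rb or the factories: the factory classes here are hypothetical stand-ins that only hold the parsed Hash, the Hash keys (:title, :collection) are invented for the example, and the rule for picking a factory (the MODS typeOfResource collection="yes" attribute) is an assumption about how collections could be told apart from images.

```ruby
require 'rexml/document'

MODS_NS = { 'mods' => 'http://www.loc.gov/mods/v3' }.freeze

# Hypothetical stand-ins for the real factories: they only keep the Hash.
class ObjectFactory
  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
  end

  def run
    "#{self.class.name} would create an object titled #{attributes[:title].inspect}"
  end
end

class CollectionFactory < ObjectFactory; end
class ImageFactory < ObjectFactory; end

# Parse a MODS document into a plain in-memory Hash.
# The keys are illustrative, not the importer's real attribute names.
def parse_mods(xml)
  doc = REXML::Document.new(xml)
  title = REXML::XPath.first(doc, '//mods:titleInfo/mods:title', MODS_NS)
  type = REXML::XPath.first(doc, '//mods:typeOfResource', MODS_NS)
  {
    title: title && title.text.to_s.strip,
    # Assumption: a collection record is flagged with collection="yes".
    collection: !type.nil? && type.attributes['collection'] == 'yes'
  }
end

# Send the Hash to the appropriate factory.
def factory_for(attributes)
  (attributes[:collection] ? CollectionFactory : ImageFactory).new(attributes)
end

SAMPLE_MODS = <<~XML
  <mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
    <mods:titleInfo><mods:title>Sample image</mods:title></mods:titleInfo>
    <mods:typeOfResource>still image</mods:typeOfResource>
  </mods:mods>
XML

puts factory_for(parse_mods(SAMPLE_MODS)).run
```

Keeping the parser's output as a plain Hash means the factories never touch XML, which is presumably why the same factories could be reused for other input formats.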
Here are links to some of the code:
https://github.com/curationexperts/alexandria-v2/blob/master/script/import_mods_records
https://github.com/curationexperts/alexandria-v2/blob/master/lib/importer/mods_importer.rb
https://github.com/curationexperts/alexandria-v2/blob/master/lib/importer/mods_parser.rb
https://github.com/curationexperts/alexandria-v2/blob/master/lib/importer/factories/image_factory.rb
https://github.com/curationexperts/alexandria-v2/blob/master/lib/importer/factories/collection_factory.rb
https://github.com/curationexperts/alexandria-v2/blob/master/lib/importer/factories/object_factory.rb
https://github.com/curationexperts/alexandria-v2/blob/master/app/models/concerns/metadata.rb
Most of their mappings are in the metadata.rb file in the list above. For example, "accession_number" is mapped to "http://opaquenamespace.org/ns/cco/accessionNumber". There are a bunch of mappings to the "OARGUN" vocabulary, which essentially uses that same opaquenamespace.org namespace and comes from the following gem: https://github.com/curationexperts/oargun.
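To make the idea of a property-to-predicate mapping concrete, here is a small sketch that turns a parsed attribute Hash into N-Triples lines. Only the accession_number mapping comes from the text above; the PREDICATES table, the to_ntriples helper, and the example subject URI are hypothetical, and the real project declares its mappings in metadata.rb rather than in a lookup table like this.

```ruby
# Attribute name => predicate URI. Only this one entry is taken from the
# page; further entries would have to be read out of metadata.rb.
PREDICATES = {
  accession_number: 'http://opaquenamespace.org/ns/cco/accessionNumber'
}.freeze

# Escape backslashes, double quotes, and newlines for an N-Triples literal.
def escape_literal(value)
  value.to_s.gsub('\\') { '\\\\' }.gsub('"') { '\\"' }.gsub("\n") { '\\n' }
end

# Build one N-Triples line per value; attributes without a mapping are skipped.
def to_ntriples(subject_uri, attributes)
  attributes.flat_map do |name, values|
    predicate = PREDICATES[name]
    next [] unless predicate

    Array(values).map do |value|
      %(<#{subject_uri}> <#{predicate}> "#{escape_literal(value)}" .)
    end
  end
end

# The subject URI and the accession number are made up for illustration.
puts to_ntriples('http://example.org/objects/1', { accession_number: ['1998-001'] })
```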
I have run the import script that Justin mentioned in his email. The "before" XML files are located at: https://github.com/curationexperts/alexandria-v2/tree/master/spec/fixtures/mods. The output for these sample records can be seen in ucsb_samples.zip. (The namespace definitions are in a file in the root of the zip. UPDATE: dates have been added, and a namespace for them has been added to the file in the root.)
An earlier email of note describes how approximate date ranges were handled. That email is essentially:
CIDOC CRM properties:
We chose to go with the Europeana Data Model and added predicates from CIDOC CRM (end_is_qualified_by, beginning_is_qualified_by).
Here's some code:
https://github.com/curationexperts/alexandria-v2/blob/master/app/models/concerns/metadata.rb#L100-L104
https://github.com/curationexperts/alexandria-v2/blob/master/app/models/time_span.rb
http://erlangen-crm.org/docs/ecrm/100302/objectproperties/P79.beginning_is_qualified_by___-343678885.html
http://erlangen-crm.org/docs/ecrm/091125/objectproperties/P80.end_is_qualified_by___-1512738953.html
-Justin
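As a rough illustration of the approach in that email, an approximate date range can be modelled as a time span whose begin and end dates each carry an optional qualifier. The sketch below is not the project's time_span.rb: the TimeSpan struct, its field names, and the prefixed predicate names (edm:begin, edm:end, and the crm: prefix) are assumptions made for the example; only the P79/P80 property names come from the links above. The qualifier value "approximate" mirrors the MODS date qualifier attribute.

```ruby
# Hypothetical model of a date range with optional qualifiers.
TimeSpan = Struct.new(:start, :finish, :start_qualifier, :finish_qualifier,
                      keyword_init: true) do
  # Predicate => value pairs for this span; unset fields are left out.
  # The prefixed names are assumed, not copied from the project.
  def statements
    {
      'edm:begin' => start,
      'edm:end' => finish,
      'crm:P79_beginning_is_qualified_by' => start_qualifier,
      'crm:P80_end_is_qualified_by' => finish_qualifier
    }.compact
  end
end

# "circa 1910 to 1915": only the beginning of the range is uncertain.
span = TimeSpan.new(start: '1910', finish: '1915', start_qualifier: 'approximate')
span.statements.each { |predicate, value| puts "#{predicate} #{value.inspect}" }
```

Qualifying each end separately lets a record say that one end of a range is a firm date while the other is a guess, which a single "circa" flag on the whole range could not express.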