MODS and RDF Call 2015-08-10

Time: 9am PDT / Noon EDT

Call-In Info: 712-775-7035 (Access Code: 960009)

Homework Reminder: Add your titleInfo MODS information to: MODS Title Individual Institution Usage And RDF Conversion

Moderator: Steven Anderson (Boston Public Library)

Primary Notetaker: cmharlow (raw etherpad note link: http://etherpad.wikimedia.org/p/RDF-MODS-20150810)

Attendees:

sanderson (BPL)
Danny Pucci (BPL)
Chuck Schoppet (National Agricultural Library)
Juliet Hardesty (Indiana University)
cmharlow (University of Tennessee)
jen young (Northwestern University)
Karen Miller (Northwestern University)
ksgerrity (Amherst College)
Kelcy Shepherd (Amherst College)
Simon O'Riordan (Emory University)
Rebecca Fraimow (WGBH)
Bria Lynn Parker (University of Maryland)
Nick Ruest (York University)
saverkamp (NYPL)
Sara Rubinow (NYPL)

Agenda:

Introductions
Have each institution that could contribute briefly go over what they provided for their MODS title usage.
1. Moving through list here: https://wiki.duraspace.org/display/hydra/MODS+Title+Individual+Institution+Usage+And+RDF+Conversion
2. York University:
  1. Documented what mods:title examples they are using now.
    1. Didn't try to map it to RDF.
  2. Cases with translated titles, nonSort used in cases, alternative, uniform, then standard titleInfo/title.
  3. Question: For the Burmese monograph, you have brackets around the title, so you do not use the supplied attribute in MODS?
    1. Answer: brackets are supposed to mean that it is a transliterated title
  4. Question: titleInfo/title@type=uniform is series title?
    1. From MARC/XML transform, took as found
  5. Question: Use relatedItem, then the part element?
    1. relatedItem is used in special collections and will have a link
    2. some of Burmese ones have part for series/journal issue
  6. When running against certain records in PBCore, RDF is designed to build relationships between modular items, so whenever possible, have a new item we can create relationships with.
3. Boston Public Library
  1. Two different versions:
  2. Simplified:
    1. Left column examples from current MODS XML Application Profile, middle column RDF, right column what it looks like when transformed back.
    2. Elements:
      1. opaque:prefDisplay: the single "uncomplicated" (i.e., no language version and no multiples) title to use in the display of the system. Also useful for others querying your SPARQL endpoint.
      2. skos:prefLabel: the usage="primary" equivalent to MODS XML. Can be multiple in the case of parallel titles. Has the language qualifier.
      3. dc:title: a listing of all the non-alternative titles. These will include translated titles.
      4. dc:alternative: all of the alternative titles of the object (or type="alternative" in MODS).
      5. opaque:titleSupplied: Whether the title of the object was created by the cataloger or not.
      6. opaque:altTitleSupplied: Whether the alternative title(s) of the object was created by the cataloger or not.
      7. dce:title: used for uniform titles and points to the linked data URI for that title.
      8. skosxl: same as dce:title.
    3. Flaws:
      1. Lacks the nonSort, subtitle, and part element breakdowns since those are concatenated to a single string.
      2. Not easily expandable if there is some use case we missed.
  3. Complicated:
    1. Elements:
      1. Similar elements overall but with locally minted URIs for your title.
      2. Due to the locally minted URIs, you can assign new properties to each title and more easily break out its parts for nonSort, subtitle, etc.
    2. Flaws:
      1. Duplicates the title string to a much greater extent, which can make keeping the copies in sync complicated. You have to update the title in multiple places.
      2. Minting local URIs adds additional complexity.
4. University of Maryland:
  1. Not using MODS but a local version currently. Straightforward examples.
  2. Question: Uniform titles?
    1. Not using uniform titles currently, not in the schema
5. NYPL
  1. Similar to U of Md - trying to keep it simple, and working on it in tandem with another LD initiative (putting a layer on top of all collections/systems).
  2. Primary purpose: support some existing functions in digital collections, around discovery and navigation.
  3. Essential model: one DC.title and everything else mapped to DC.alternative with language values (if available).
  4. If it is already a uniform title, it will have the label, but the original URI / source is not captured.
  5. Question: parallel titles?
    1. Don't really have parallel titles
  6. Thinking about how to handle translated titles. These are likely to be already described in the catalog (or should be), so not much is done for them in search and display currently, and there is no attempt to replicate them in digital collections.
  7. Question: Item with title for alternative scripts - but both go to alternate, what is that exactly?
    1. Not certain, from MARC to MODS XSLT (standard LoC xslt?)
    2. Not trying to replicate data from catalog, but rather point people back to catalog
    3. Response from others on call: those alternate titles from MARC could be from other sources (cover, spine, etc.)
    4. The primary use of the primary title is to help with the Solr index
    5. Response: similar to what is done in MARC as well
6. Indiana University
  1. Tried to find samples for digital repository stored in MODS, or information called from library catalog in MODS, and working with that
  2. Haven't found every example of titleInfo types, but thinks there will be examples of all use in their repository
  3. Took each MODS record and ran it through the LoC MODS/XML to MODS/RDF transform, then took the output RDF/XML and transformed it to RDF N-Triples
  4. Third column in sample has actual Fedora use. Leaning towards using DC when there are parts and subtitles: bring them together, concatenate into a single title, then find a way to express series/uniform titles
  5. Unless Series is object in Fedora repository, not sure we need to then model it for relations in Fedora
    1. Argument for having object for Series: provides more opportunity for growth in future, for tracing
    2. If you're more worried about just getting objects pulled together, just getting stuff in Fedora and described, this is fine
    3. Not sure about having objects in Fedora just for descriptive elements
7. Amherst College
  1. Don't use lots of attributes, do use some subelements, but relatively straightforward
  2. Most of the material in the collection is digitized archival material, so little MARC to MODS conversion
  3. Question: displayLabel use? (example of "Variant Title")
    1. Do rely on displayLabel a lot, note to discuss with programmer on how to handle that data
    2. Used in interface as label for field.
    3. Indiana University: they do the same thing
  4. Question: so what are some other uses of displayLabel?
    1. If title came from cover or other part of piece, be more specific than alternate title
8. Emory University
  1. The only thing in their system currently is a simplified flat single title.
  2. In ingest form, there are fields for alt, other titles though. Not sure where that information is being stored.
  3. Since our data is so simplified, we probably will not be using MODSRDF
    1. We'll update the doc when we decide
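As a rough illustration of the "simplified" approach several institutions described above (concatenating nonSort, title, and subtitle into a single string, then using dc:title for non-alternative titles and dc:alternative for alternatives), here is a minimal Python sketch. It is not any institution's actual transform: the sample record, the function names, and the ": " separator before the subtitle are all assumptions, and uniform titles, language tags, and the opaque/skos properties are left out.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical sample record, invented for illustration only.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo usage="primary">
    <nonSort>The</nonSort>
    <title>Harbor at dawn</title>
    <subTitle>a view from the pier</subTitle>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Harbor view</title>
  </titleInfo>
</mods>"""


def flatten_title(title_info):
    """Concatenate nonSort, title, and subTitle into one display string."""
    def text(tag):
        el = title_info.find(f"mods:{tag}", MODS_NS)
        return el.text.strip() if el is not None and el.text else ""

    main = f"{text('nonSort')} {text('title')}".strip()
    sub = text("subTitle")
    # The ": " separator is an assumed punctuation choice.
    return f"{main}: {sub}" if sub else main


def simple_title_statements(mods_xml):
    """Return (property, string) pairs for each top-level titleInfo."""
    root = ET.fromstring(mods_xml)
    statements = []
    for title_info in root.findall("mods:titleInfo", MODS_NS):
        is_alt = title_info.get("type") == "alternative"
        prop = "dc:alternative" if is_alt else "dc:title"
        statements.append((prop, flatten_title(title_info)))
    return statements


for prop, value in simple_title_statements(SAMPLE):
    print(prop, "=", value)
```

The flaw noted for the simplified model shows up directly here: once the parts are joined into one string, nonSort and subtitle can no longer be recovered when transforming back to MODS XML.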
Discussion on RDF modeling from those examples.
1. Question: how to move to next point? dcterms title/alternative used most from examples...
  1. Could we offer a minimal versus an extended version?
    1. Yes, this could be a way to approach
  2. Question: For some institutions, will they need to transform back into MODS for a different system?
  3. For Boston's use cases, when they go to MODS/RDF, they also need to be able to translate back to MODS/XML
    1. Boston has external systems that use their MODS XML currently. (DPLA harvest, large touchscreen display system that is part of a renovation, etc).
  4. Question: Do we want to support then being able to transform back to MODS?
    1. Amherst: We want at least to get back to MODS
    2. Different ways of implementing MODS, will need to continually discuss levels of granularity for MODS through process. May be alright losing some granularity in the process.
  5. Question: can Boston explain a bit more their use of multiple namespaces?
    1. Allows for multiple sources to understand the data. (Those consuming outside of the library world would recognize "skos:prefLabel" while "dc:title" would be foreign to them).
    2. Allow one to differentiate between a "primary usage title" and some other "non-alternative title" (usually cataloger translated). This is done by comparing the "skos:prefLabel" and "dc:title" elements. If there are two (or more) prefLabels, you know it is a case of parallel titles which is essential for correct research citation. Meanwhile, one "skos:prefLabel" but multiple "dc:title" elements indicates that they are cataloger translated titles.
    3. You will have to use multiple namespaces eventually anyway... not everything will fit into Dublin Core elements when we move beyond title.
    4. The "opaque" namespace is used by Oregon Digital and UCSB among others in the Hydra community. It seemed the best place for the "prefDisplay" concept, which is based on what Getty did in their vocabularies with "prefLabelGVP". Having a single primary uncomplicated label really helps external queries of your SPARQL endpoint and reduces the logic one needs to figure out which language of a title to use when displaying them. It also adds a place to put non-official title information (like an extra qualifier of where the title is from).
  6. Do we want to require core elements and rest are optional then?
    1. Sounds like a good approach.
    2. Also gets to levels of granularity people may/may not want to use.
    3. Goes back to high level goals, we could have same conversations and conclusions on each element if this isn't better fleshed out...
      1. Need a vote / decision on whether or not to capture different levels of granularity or to just focus on 'core'/simplified?
      2. Just can't assume that we support a minimal and then a more complex version of this for those institutions that want more granularity.
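The comparison Boston described between "skos:prefLabel" and "dc:title" can be sketched as a small Python function. This is only an illustration of the rule as stated in the notes; the function name, the return labels, and the example titles are hypothetical, and the inputs are assumed to be plain lists of title strings already pulled from the RDF.

```python
def classify_titles(pref_labels, dc_titles):
    """Apply the prefLabel / dc:title comparison described in the notes.

    pref_labels: values of skos:prefLabel for one object
    dc_titles:   values of dc:title for the same object
    """
    if len(pref_labels) >= 2:
        # Two or more primary-usage titles: a case of parallel titles.
        return "parallel"
    if len(pref_labels) == 1 and len(dc_titles) > 1:
        # One primary title plus extra non-alternative titles:
        # the extras are cataloger translated titles.
        return "translated"
    return "single"


# Hypothetical example values.
print(classify_titles(["Le port", "The harbor"], ["Le port", "The harbor"]))
print(classify_titles(["Le port"], ["Le port", "The harbor"]))
print(classify_titles(["The harbor"], ["The harbor"]))
```

A consumer that only needs one display string could skip this logic entirely and read "opaque:prefDisplay", which is the motivation given for that property.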
End of meeting wrap-up
1. A straw poll of some kind will go out for institutions to vote on what this group will support in its discussions (re: simplified vs more complex examples).
  1. It will include a field to offer anonymous comments on what institutions have done thus far for feedback to be compiled and later posted.
2. Institutions should make any further refinement they want to their models and just think on what approaches were presented.
3. Hopefully we can better figure out this element during the next meeting!
Next meeting: August 24th at 9:00 AM PDT / Noon EDT.