MODS and RDF Call 2015-11-16

Time: 9am PDT / Noon EDT

Call-In Info: 712-775-7035 (Access Code: 960009)

Homework Reminder: 

Moderator: Steven Anderson (Boston Public Library)

Primary Notetaker:  Juliet Hardesty (etherpad link: https://etherpad.wikimedia.org/p/RDF-MODS-20151116) 

Attendees:

Agenda:

  1. Conversion code update
    1. Username and password are on the site now and also in the notes from the last meeting.
    2. Steven DiDomenico updated Fedora and is deploying Solr on Northwestern-hosted site.
    3. Continuing to develop the instance and add features; where should announcements go? GitHub readme, elsewhere?
      1. Use the GitHub readme and add a link to the Fedora example site.
  2. MODS Genre Items
    1. NYPL example use case (https://docs.google.com/document/d/11XyFEyLN8_LZp1Yb2Y-qlgD-kogqmeEIC6gOItmj32s/edit?usp=sharing)
      1. for local genre resolution.
      2. NYPL is minting local terms, so this would be a consistent way of identifying those.
      3. Can go from a SKOS concept to a scheme or a number of schemes, so might have different types of schemes in use (broader/narrower terms).
      4. Might not really affect the recommendation for MODS to RDF.
      5. NYPL approach is connected more to larger-scale RDF vocabulary management beyond Fedora and digital object management.
      6. Have to dereference and cache those terms somewhere anyway, so it is easier to store them locally and index from that; then terms we manage and terms managed elsewhere don't have to be indexed separately.
        1. Means that Fedora records have to be updated with the remote sources' labels.
          1. How is this different from text being stored in Solr and using a caching sidecar (i.e., could use a Marmotta-style triple store that would handle dereferencing URIs)?
      7. Not something we would use as part of our recommendation, but a good use case to have as an example for folks in a similar situation.
    2. Collaboration Document: https://goo.gl/jX2NOy
      1. marcgt is not a linked data list; map these 50 terms to AAT?
        1. Marcgt in genre form terms? http://id.loc.gov/vocabulary/genreFormSchemes.html ?
          1. Just the schemes for possible genre vocabularies are located there... individual term definitions are not.
        2. What are the advantages of using AAT for these terms? They might not all map from marcgt to AAT.
          1. Doesn't all need to be AAT... just a consistent mapping to some equivalent vocabulary term.
          2. Just that marcgt is fairly common, so it could be helpful to have those mapped to Linked Data, but can leave that as a text string?
        3. In the end, decision was to not include any default marcgt conversion mapping. Would just become string literal values... conversion to a linked data source would have to be run by the individual institution first.
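The genre decision above can be sketched as follows. This is only an illustration, not the conversion code itself: the function name, the dictionary shape, and the example.org URI are hypothetical. Without an institution-supplied mapping, every genre term stays a string literal.

```python
def convert_genre(term, authority=None, local_mapping=None):
    """Return a URI node if the institution supplied a mapping, else a literal.

    local_mapping is a hypothetical dict keyed by (authority, term); no
    default marcgt mapping is shipped, so it is empty unless provided.
    """
    mapping = local_mapping or {}
    uri = mapping.get((authority, term))
    if uri is not None:
        return {"type": "uri", "value": uri}
    # Default behavior: keep the genre as a plain string literal.
    return {"type": "literal", "value": term}


# Hypothetical institution-run mapping; the URI is a placeholder.
my_mapping = {("marcgt", "map"): "http://example.org/genre/map"}

default_result = convert_genre("map", authority="marcgt")
mapped_result = convert_genre("map", authority="marcgt", local_mapping=my_mapping)
```

Here `default_result` is a literal and `mapped_result` is a URI node, which mirrors the idea that conversion to a linked data source is run by the individual institution first.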
  3. MODS OriginInfo Dates individual institution mappings.
    1. Amherst
      1. Didn't accommodate attributes as much; how to accommodate keyDates? For date ranges, have to have app logic to decide which date is the keyDate - probably the first date, but the app decides this.
        1. BPL has application logic as to which dates they consider most important if multiple date types exist (i.e., dateCreated over dateIssued and such).
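A minimal sketch of the kind of application logic described above for choosing a key date. The priority list beyond "dateCreated over dateIssued" and the dictionary layout are assumptions for illustration, not Amherst's or BPL's actual code.

```python
# Assumed priority order; only "dateCreated over dateIssued" comes from the notes.
DATE_PRIORITY = ["dateCreated", "dateIssued", "copyrightDate"]


def pick_key_date(dates):
    """Pick the most important date from a list of MODS-like date dicts.

    Each dict has 'type' and 'value', plus optional 'keyDate' and 'point'
    keys mirroring the MODS attributes of the same names.
    """
    # An explicit keyDate="yes" always wins.
    for d in dates:
        if d.get("keyDate") == "yes":
            return d
    # Otherwise fall back to the priority order of date types.
    for date_type in DATE_PRIORITY:
        matches = [d for d in dates if d["type"] == date_type]
        if matches:
            # For ranges, prefer the start point; else take the first date.
            for d in matches:
                if d.get("point") == "start":
                    return d
            return matches[0]
    # No known type: take the first date, or None if there are no dates.
    return dates[0] if dates else None


example = [
    {"type": "dateIssued", "value": "1920"},
    {"type": "dateCreated", "value": "1910", "point": "start"},
    {"type": "dateCreated", "value": "1915", "point": "end"},
]
chosen = pick_key_date(example)
```

With no explicit keyDate in `example`, the start of the dateCreated range is chosen.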
    2. BPL
      1. EDTF standard in draft from LOC; date range can have open end date, can have unknown for beginning date. Essentially all of the date logic is encoded into that string format.
        1. Using dcterms for mapping and syntax from EDTF.
        2. No inferred concept in EDTF, but there is approximate using tilde (~); add a note (maybe in SKOS) saying date was supplied by cataloger.
        3. Is it important to have a way to note that date is inferred or not?
          1. If info is available, it’s helpful, but not used often.
          2. Amherst and BPL already have those notes when a date was inferred - but for others, generate notes automatically, or just have the translation include the approximate notation and not include a note?
            1. Drop inferred, make it approximate, make date questionable.
            2. Archival materials are often undated so wouldn't want to recommend supplying a note.
            3. Amherst would map each occurrence (approx and questionable).
            4. Maybe it is better to drop any ~ or ? and allow notes to clarify (if they are supplied).
              1. This last option is what was decided.
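The option that was decided (drop any ~ or ? and let notes carry the clarification) could look roughly like this. It is a sketch under assumptions: it only handles qualifiers at the end of a date or of either side of an interval, and the function names and the dictionary returned are hypothetical.

```python
import re

# One or more "~" or "?" at the end of a date, i.e. just before the "/"
# of an interval or at the end of the whole string.
_QUALIFIERS = re.compile(r"[~?]+(?=/|$)")


def strip_qualifiers(edtf_value):
    """Remove trailing approximate (~) and uncertain (?) markers."""
    return _QUALIFIERS.sub("", edtf_value)


def date_statement(edtf_value, note=None):
    """Build a date record; the note is kept only if one was supplied."""
    statement = {"date": strip_qualifiers(edtf_value)}
    if note:
        statement["note"] = note
    return statement


with_note = date_statement("1984~", note="Date supplied by cataloger")
interval = strip_qualifiers("1984?~/2004-06~")
```

No note is generated automatically here, which fits the concern that archival materials are often undated.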
    3. UCSB - locally-managed Fedora object - TimeSpan
      1. zip file attached to wiki page - https://wiki.duraspace.org/pages/viewpageattachments.action?pageId=69830402&metadataLink=true
        1. Root has namespaces for the objects and for the date objects. Each folder has the main object and its date part.
      2. EDTF vs edm:TimeSpan model? 
        1. EDM has to mint an object for each date. But each part of the date logic is then more clearly broken out (begin, end, qualified or not, etc).
        2. EDTF is just a string literal so needs no minted objects. However, all of the date logic is in that string so other systems would need to understand EDTF to figure out your date start / date end / qualifiers / etc.
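To illustrate the trade-off, here is a rough sketch of what a consuming system would have to do to pull begin/end out of an EDTF interval string, which an edm:TimeSpan-style object states explicitly. It assumes the draft's "unknown" and "open" interval keywords and ignores qualifiers and other EDTF features; the function name and returned keys are hypothetical.

```python
def split_interval(edtf_value):
    """Split a simple EDTF string into begin/end parts.

    A single date is treated as beginning and ending on the same value.
    """
    if "/" in edtf_value:
        begin, end = edtf_value.split("/", 1)
    else:
        begin = end = edtf_value
    return {
        # "unknown" and "open" carry no concrete date value.
        "begin": None if begin in ("unknown", "open") else begin,
        "end": None if end in ("unknown", "open") else end,
        # Keep track of whether the range was explicitly left open.
        "open_ended": end == "open",
    }


single = split_interval("1999")
ongoing = split_interval("1999/open")
```

In this sketch a single date and an open-ended range come out differently (`open_ended` is False for `single` and True for `ongoing`), but only because the consumer knows how to read the string.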
    4. Indiana University
      1. Have an idea of which dates to map, but we use different encodings so have to consider what to do with those - will report more next time.
    5. NYPL 
      1. Options for either EDTF or edm:TimeSpan.
        1. Right now equal options.
        2. Question of representation: if something was created in 1999 and something else has timespan 1999-present, how is that distinguished in edm:TimeSpan?
          1. Unsure on this.
  4. End Notes:
    1. Look over pain points and discuss dates more next time.
    2. Continue to update code on translation site.