MODS and RDF Call 2015-07-27

Time: 9am PDT / Noon EDT

Call-In Info: 712-775-7035 (Access Code: 960009)

Moderator: Steven Anderson (Boston Public Library)

Primary Notetaker:  cmharlow (raw etherpad note link: http://etherpad.wikimedia.org/p/RDF-MODS-20150727)

Attendees:

Agenda:

  1. Introductions
    1. Attendees gave an overview of their institutional stacks and their interest in MODS/RDF.
    2. Mixture of Hydra (~5), Islandora (~3), and Custom (~2) systems.
       
  2. Scheduling Survey Question Results
    1. Call every 2 weeks was near unanimous response.
    2. A little over 50% chose the response "Creation of a shared 'unofficial standard' Application Profile that we crosswalk to".
       
  3. Focus / what is in scope.
    1. Primary goals: discussing problematic MODS XML mappings and moving toward use of an unofficial standard.
    2. Anything out of scope?
      1. Question: Are we talking about MODS generally or talking specifically to Hydra?
        1. It is a Hydra subgroup but wanted to be as general as possible
        2. Eventually coming out of this will be an unofficial standard, which can be used in diff platforms. A subset of Hydra people could implement this, for example.
        3. Nothing hydra-specific about that portion of the group/goal.
      2. Question: Are we looking at specific ontologies to map to? Like BIBFRAME?
        1. This is something that will probably come up, we can discuss a general good approach to this.
        2. Looks like a lot of institutions map to a variety of schemas.
        3. Comments regarding one attendee's position:
          1. University of Maryland doesn't currently use MODS; they use a MODS-like schema, but still find it useful to participate.
          2. As far as MODS/RDF goes, they are leaning towards not using MODS/RDF and instead accommodating MODS-like metadata in something that is more 'linked data native'.
      3. Question: Do we want to have stable recommendation to go with or just list options?
        1. Nick's response: good to come up with options.
        2. Steven (BPL): would have a stable recommendation as a valid outcome of the group. But wouldn't enforce that people have to use that recommendation and would still discuss and share other options institutions could go with for various MODS elements.
      4. Concern from Bri: how MODS/RDF uses blank nodes currently
        1. Steve from NW: This brings up the question: are we targeting certain systems to make sure this works in those systems, or developing a standard separately from those systems?
        2. Steven (BPL): Almost everyone here is using Fedora Commons and trying to use Fedora 4. So focus on that (though it could work on other backend systems... just not tested or researched).
        3. Nick: Can expand to include any LD platform and spec, include working with other platforms like Marmotta.

  4. Timeline(s)
    1. From the survey, the majority were 6+ months away from doing this potential metadata migration.
    2. In terms of this group, would we try to have an unofficial draft in 4 months or so?
      1. Comments of that being good. Suggestion for end of the year?
        1. Those who responded agreed on the end of the calendar year as the goal.
      2. Emory: they are doing their own evaluation in parallel over the next 3-6 months, and part of that evaluation is whether to go with RDF or stay with XML metadata, so they are moving a bit more quickly.
        1. Response: but the discussion here will still be helpful
      3. Question: Jen from NW: What are we talking about as the output at the end of the year?
        1. Steven (BPL): combination of
          1. a series of recommendations and options for moving to RDF and becoming LD-compliant, a summary of some discussions, and options for conversion
          2. the actual "unofficial standard" mapping
            1. there might be breakouts from this for different platforms to develop code components in the various communities
      4. Christina is interested in using multiple namespaces, then using MODS RDF where the granularity is missing.
        1. Complete MODS RDF might be an interesting project. But going to one that can support multiple namespaces is better.
           
  5. Discuss current reference materials available from other institutions (see MODS and RDF Descriptive Metadata Subgroup).
    1. MODS RDF V2: https://github.com/blunalucero/MODS-RDF
      1. How to make MODS RDF work better. Still discussing as part of a group that Columbia leads.
      2. Calls are open to everyone. Anyone can comment or submit issues on the github repo.
      3. Wasn't worried so much about the blank node issue. Used a lot of elements from MADS and BIBFRAME.
      4. No one is currently using it thus far - theoretical at the moment.
      5. MODSRDFV1 transformation is missing many things and was never completed. Not meant for production. MODSRDFV2 transform still being worked on.
        1. Question on writing to get that V1 transform improved. Christina would be interested in doing that but may not be focus of this group.
    2. Emory University's MODS/RDF documents
      1. Emily: local metadata working group, completed year long project to normalize, decentralize metadata practices, and get their own institutional application profile.
        1. Baseline encoding has been MODS with other standards mixed in.
        2. Significant investment in MODS, not sure if moving forward they will stay with MODS, especially as looking at RDF and scenario of mixed namespaces.
        3. Made some use cases, questions.
        4. Have to make a practical recommendation soon for their own platform migration timeframe.
        5. Do they want to stay in XML or move to a more fluid RDF metadata schema in Hydra? They are not yet sure whether they will move from MODS XML.
        6. They do have a RDF prototype mapping document:
          1. local core vocabularies and mapping to MODS.
          2. ran into some stumbling blocks with MODS in Hydra. 
          3. An RDF namespaces document listing properties that are already defined in Ruby gems they can use in Hydra, i.e., what they can use that already exists.
          4. Running into the same questions as others, plus the technical need to get this working in Hydra.
          5. A lot of the Hydra tutorials are simple DC, flat schema, so they have more questions about bnodes and more complicated metadata.
          6. The Hydra gem for different types of identifiers is an example of something already existing that they can use; it seems straightforward to use.
        7. Questions:
          1. Julie H.: Wondering about the identified core elements/fields in MODS: is part of what you're contemplating which core MODS elements to keep in RDF, while also keeping the XML record?
            1. Emily's response: went to Hydra camp last spring, where this scenario was discussed - use lightweight search and discovery in RDF, something that works well with Solr, but also store a richer XML metadata doc.
            2. They are considering it, also for other types of metadata (like PREMIS), but not trying to model that right away.
          2. Julie H.: Idea of making use of multiple namespaces - in terms of functional use, to get something into RDF, but what does this mean long term? The more namespaces that are brought in, is it possible that you end up at some point with an obscure namespace that is not supported long term... are we introducing that problem?
            1. Steven (BPL): this group can look to use namespaces that have communities that can support them, so they don't just disappear.
            2. Steven (BPL): This is why we are interested in this type of unofficial mapping. If we just pick mappings that work for us, they may not be the ones others happen to pick, which puts our metadata at risk if they disappear due to lack of community support (or, even if they don't, fewer linked data sources may understand that element).
            3. Emily: we've been pretty liberal at Emory so far in which namespaces we're mapping to in our prototype.
            4. RDF-ruby, RDF-vocab - ruby gems they are using, lots of vocabularies already defined there.
            5. Julie: at Open Repo, some of the folks who had made those gems, one in particular for MODS, said that's not something that people are trying to use.
            6. Emily: they've heard from other institutions already using RDF in Hydra that there is a use case for switching out predicates and namespaces rather easily; with RDF, it is easier to change a mapping that doesn't work after 6 months than if they go the serialized XML route.
    3. Amherst git repo:
      1. (With Aaron Coburn not on the call, Steven (BPL) to talk very briefly)
      2. Amherst link has initial mapping started by Aaron Coburn, something basic and starting to look at mapping there.
      3. They currently have MODS represented as MODS/RDF version 1, for IR metadata only. Would not recommend anyone else use MODSRDFV1 in Fedora Commons.
      4. Fedora 4 Metadata Group (library/IT) in early stages of mapping from Fedora 3.
      5. Currently use MODS for descriptive metadata for all digital collections (along with VRA Core, Dublin Core, Darwin Core). Will want to continue to output MODS XML but assume that move to Fedora 4 will entail use of multiple namespaces in RDF.
    4. UC Santa Barbara:
      1. (With none of them on call, Steven (BPL) to talk briefly)
      2. Their system was done by Digital Curation Experts and they had some MODS XML records previously. The new system is purely Fedora 4 with RDF metadata.
      3. They use a lot of existing vocabs, DC, gem from Oregon Digital, custom namespaces they defined for that project.
      4. A lot of things in that system that refer back to namespace used by Oregon Digital themselves. (opaquenamespace.org)
      5. Bunch of examples of MODS XML records that are then converted to their Fedora 4 RDF Application Profile. These can be seen at:  MODS and RDF: UCSB Application Profile.
        1. First time Steven (BPL) has seen relatively complex MODS converted to Fedora 4 RDF with multiple namespaces.
        2. Namespaces used are in the root of the zip of the output from converting their MODS sample files.
      6. They do a lot of work on date ranges, they ended up using the Europeana Data Model for that with some extensions mentioned on their Application Profile page.
      7. Questions: 
        1. Emily Porter: Poked around in the opaque namespaces in use out there. Is the Hydra Descriptive Metadata group possibly looking at something about using shared namespaces? Is there a Hydra namespace that might be evolving?
          1. Steven (BPL): not sure, will bring up the question at the next Descriptive Metadata group.
    5. UC San Diego:
      1. (With no one from there on the call, just an example of a MODS-like schema that they then are mapping to multiple linked data namespaces).
    6. WGBH PBCore to RDF Project:
      1. Early stages of project, they are going example to example.
      2. Something of interest for work in this group, to see how a different schema is being taken to RDF.
      3. Questions:
        1. Steven (BPL): Multiple namespaces?
          1. Answer: Working with EBUCore to get properties that were absent and work those into PBCore/EBUCore as part of the ontology
          2. But EBUCore does pull in properties that exist from other ontologies as much as possible, so they do use other namespaces
        2. Question [Emily?]: PBCore is fairly hierarchical schema, so you have same challenge with blank nodes and such?
          1. Yes, very much so.
          2. We are looking at how we can flatten out those hierarchies as much as possible, and at what information we can/should bring over. We are also keeping an XML copy of the original record linked to the RDF record, so other/legacy platforms can process it and the original hierarchies stay attached.
             
  6. Conclusions
    1. Seems like most people want to have multiple namespaces, not just a MODS/XML to MODS/RDF mapping?
      1. No objections raised.
    2. Goal for next meeting:
      1. Start talking about what type of vocabularies and namespaces we want to include?
      2. What types of initial fields/mappings we want to use?
      3. Any take aways?
      4. Josh: sit down with your MODS or other XML metadata records and look at what is not obvious to map to existing RDF MAP, focus in on problem areas
        1. If everyone did that, we could have a draft list of areas where we need solutions.
      5. Christina:
        1. Have mapping spreadsheet as reference document.
      6. Steven (BPL):
        1. Doing all of that might be a bit much initially. Space has been added on the main wiki for those that have the ability to produce those types of extensive documents though.
        2. For next meeting, could have each institution provide information on their titleInfo element and what attributes / subelements they use for that in their system. They can then propose how they would personally map if they had to do so tomorrow.
    3. Homework for group:
      1. Review resources mentioned during this call now that there is more context for them.
      2. Submit how you model object titles and how you would convert that to RDF at: MODS Title Individual Institution Usage And RDF Conversion. Can use this for our discussion next time and start actual mapping work!
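
As a concrete starting point for the titleInfo homework, here is a minimal sketch (not a group recommendation) of how one institution might flatten MODS titleInfo elements into RDF-style triples. It assumes a mapping the group has not agreed on: an untyped titleInfo goes to dcterms:title, and any typed titleInfo (alternative, translated, and so on) goes to dcterms:alternative. The sample record, the subject URI, and the function names `flatten_title` and `titles_to_triples` are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical sample record, for illustration only.
SAMPLE_MODS = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo>
    <nonSort>The </nonSort>
    <title>Boston Harbor</title>
    <subTitle>a survey</subTitle>
  </titleInfo>
  <titleInfo type="alternative">
    <title>Harbor of Boston</title>
  </titleInfo>
</mods>"""


def flatten_title(title_info):
    """Join nonSort, title, and subTitle into one literal string."""
    def text(tag):
        el = title_info.find(f"{{{MODS_NS}}}{tag}")
        return (el.text or "").strip() if el is not None else ""

    main = " ".join(part for part in (text("nonSort"), text("title")) if part)
    sub = text("subTitle")
    return f"{main}: {sub}" if sub else main


def titles_to_triples(mods_xml, subject_uri):
    """Return (subject, predicate, literal) tuples for each titleInfo.

    Assumed mapping: untyped titleInfo -> dcterms:title,
    any typed titleInfo -> dcterms:alternative.
    """
    root = ET.fromstring(mods_xml)
    triples = []
    for title_info in root.findall(f"{{{MODS_NS}}}titleInfo"):
        local_name = "alternative" if title_info.get("type") else "title"
        triples.append((subject_uri, DCTERMS + local_name, flatten_title(title_info)))
    return triples


if __name__ == "__main__":
    for triple in titles_to_triples(SAMPLE_MODS, "http://example.org/object/1"):
        print(triple)
```

Concatenating nonSort, title, and subTitle into a single literal avoids a blank node but loses the sub-element boundaries, which is the kind of trade-off each institution could note when submitting its own usage.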
         
  7. Next Call
    1. 2 weeks from today - August 10th at 9:00 AM PDT, Noon EDT
    2. Call Page:  MODS and RDF Call 2015-08-10