Descriptive Metadata Call 2016-04-13

Time: 1:00pm EDT / 10:00am PDT 

Call-In Info: Google Hangout:  https://plus.google.com/hangouts/_/g2jey2y5cjcnggkmymxziudmw4a 

New Hangout Link: https://hangouts.google.com/call/bat5imirxfdzbkeq23uze2ljzae

Moderator:  carolyn.hansen (U. of Cincinnati)

Notetaker:  mcmillwh (U. of Cincinnati)

Attendees: 

Agenda: Metadata data modeling 

  1. Review of specific usage questions; see: https://docs.google.com/spreadsheets/d/1YnunRMNS9T6j7cgsYFxqMPZoTvyGY-ZofZvdFGAykb8/edit?usp=sharing (updated to spreadsheet)
    Note: For background on UCSD/UCSB data modeling, see notes from last call:  Descriptive Metadata Call 2016-03-02
    1. more institutions have added data to the spreadsheet
    2. discussion of individual pain points
      1. Coding of dates in name records
        1. DPLA and EDM documentation is confusing
        2. e.g. Coding birth and death dates in an author record, not in the author's name as a string
          1. if date is uncertain, it can be difficult to conform to NACO standards while keeping it machine-readable
          2. use case: UCSD has local records that don't have external author records, but they'd like to be able to describe the relationship between agents and not have the date as a string in the author's name
        3. Boston Public Library is using schema:birthDate and schema:deathDate when the date is machine-parseable
          1. it only made sense to break it out into a separate record if a machine could use it
      2. Handling of corporate names with subordinate units (e.g. |a University of Cincinnati |b Libraries)
        1. has anyone found a solution for cases where a URI exists for the corporate name, but not for the corporate name plus subordinate unit?
          1. if a URI exists only for |a, are you using only that URI?
          2. at UCSB, if the full name heading isn't established in the NACO authority file, it's treated as a local name
            1. the hope is to get the local names into the national file and use that URI
          3. this presents challenges for harvesting MARC records
        2. Geographic subdivisions
          1. different headings referring to the same place are appearing because of this
            1. in MARC: Santa Barbara (Calif.)
            2. in LD: California--Santa Barbara
          2. talk of implementing FAST in MARC records to aid ingest into repository
            1. will also aid in discovery layer using facets
          3. Are you making curatorial decisions?
            1. for some records if you broke them out, the context would be lost
            2. UCSD: we've identified the master record for digital objects.
              1. in some cases, it's the MARC record
                1. these are harvested and there's a mapping that breaks the headings out, so they're stored as facets, not as a pre-coordinated string
              2. in other cases, the master record is in the repository
                1. these don't use pre-coordinated headings
                2. if in the past they had pre-coordinated headings, the equivalents are substituted
            3. the larger trend is moving away from complex, pre-coordinated headings towards more faceted terms, almost like indexing terms
              1. pre-coordinated headings are very precise, but were meant for card catalogs
            4. UCSD is going through a process to convert headings in their digital repo
              1. for DAMS, they piloted the conversion and looked for loss of meaning when breaking headings into facets
              2. matched them to FAST headings, so more of a conversion than a deconstruction
              3. FAST seems to keep things together that need to be together - there are still hyphens
              4. Considered going to curators (esp. for large subject areas) to make sure meaning is not lost in the conversion to FAST
              5. ongoing harvest from MARC - the records from OCLC have FAST in them and we're hoping to use those to ingest items from MARC
                1. may not help with rare materials
              6. The geographic headings were terrible in FAST, so another vocab might be used
                1. finding a vocabulary that's consistent in coverage is hard
                2. FAST headings had good human-readable labels, but the underlying URIs for the concepts were inconsistent - TGN seems better
            5. UCSB talked about retaining LC labels for geographic headings, but they're using Open GeoNames to provide back-end info
            6. Problems with countries - terms for entities that no longer exist?
              1. Open GeoNames handles historic names
                1. historic names are there, but they are not handled very well. For example, some only have a point coordinate (no bounding box) and others don’t exist. Some, like Istanbul/Constantinople, have historic names attached to the current record (which does include a bounding box)
              2. TGN and LC have them
  2. Other items? 
    1. none
  3. Next steps (do we have a project deliverable?)
    1. Continue adding to the spreadsheet
    2. at the next meeting, we'll discuss the divergence between BIBFRAME and Hydra - how can we bring them back together?
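
Illustration for item 1.2.1 (coding of dates in name records): a minimal Python sketch of the "only break dates out when a machine can use them" rule discussed above. It is not any institution's actual mapping. The function name, the regular expression, and the use of schema:name as the key for the heading are assumptions made for illustration. Only plain four-digit life dates at the end of a heading are treated as machine-parseable; anything uncertain (e.g. "approximately", "?") is left in the name string untouched.

```python
import re

# Hypothetical pattern: a heading ending in plain four-digit life dates,
# e.g. "Austen, Jane, 1775-1817" or "Doe, Jane, 1950-" (open death date).
LIFE_DATES = re.compile(
    r"^(?P<name>.+?),\s*(?P<birth>\d{4})-(?P<death>\d{4})?\.?$"
)


def split_name_heading(heading):
    """Return a dict of properties for a name heading.

    Dates are broken out into schema:birthDate / schema:deathDate only
    when they match the strict pattern above; otherwise the heading is
    kept whole so no uncertain date is forced into a machine-readable field.
    """
    text = heading.strip()
    match = LIFE_DATES.match(text)
    if match is None:
        # Not machine-parseable: keep the full string as the name.
        return {"schema:name": text}

    record = {
        "schema:name": match.group("name"),
        "schema:birthDate": match.group("birth"),
    }
    if match.group("death"):
        record["schema:deathDate"] = match.group("death")
    return record


if __name__ == "__main__":
    for example in (
        "Austen, Jane, 1775-1817",
        "Doe, Jane, 1950-",
        "Smith, John, approximately 1700-1760",
    ):
        print(split_name_heading(example))
```

A real mapping would likely need to cover more date forms (for example EDTF-style uncertain dates); this sketch deliberately errs on the side of leaving the heading as a string.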