Samvera Newspapers Interest Group Call: 2018-02-01

Time: 1:00 PM EST / 10:00 AM PST

Call-In Info: 712-775-7035 (Access Code: 960009)

Moderator: Eben English (Boston Public Library)

Notetaker: TK (Etherpad link: https://etherpad.wikimedia.org/p/Samvera_Newspapers_Interest_Group_Call__2018-02-01)

Attendees:

  • Cliff Wulfman (Princeton)
  • Nick Homenda (Indiana University)
  • Gordon Leacock (University of Michigan)
  • Brian McBride (University of Utah)
  • Peter Binkley (University of Alberta) (joined late)

Agenda

  1. IMLS Grant Update

  2. PCDM Profile updates: https://docs.google.com/document/d/1T_gKqkKoik7h9WweYB46S9NrwAXmJTB0g3hCrqH3Q6Q/edit?usp=sharing
    1. Diagram simplified
    2. NewspaperArticleFileSet
    3. NewspaperTitle now pcdm:Object

  3. Newspaper search results UX (page-level objects)
    1. Page-level results
      1. ChronAm: search results
      2. Michigan Daily Digital Archives: search results
      3. Utah Digital Newspapers: search results
      4. NewspaperArchive.com: search results (screenshot)
      5. Newspapers.com: search results (screenshot)
    2. Issue-level results
      1. Archive.org: search results
      2. HathiTrust: search results
    3. Hybrid
      1. World Digital Library: search results

  4. Grant documentation
    1. Ingest Scenarios
    2. Design Overviews:
      1. NewspaperWorks (admin gem)
      2. NewspaperViews (display gem)
    3. Metadata Profile
      1. Article types controlled vocabulary (tab)

  5. Content Examples: https://drive.google.com/drive/folders/0BwKKtxaBVqjEbE5zMFdWUEU4WGM?usp=sharing
    1. Still need: CONTENTdm, TEI, Olive

  6. Intel sharing from other groups/projects

  7. Next meeting: Thursday March 1, 1 PM EST

Notes

  1. The PCDM-based metadata model has been refined and simplified.
  2. Implementation of the metadata model in Hyrax is progressing but is not yet complete. We ran into some friction with the BasicMetadata module in Hyrax, but found a workaround.
  3. Hyrax, by default, excludes file sets from search results. Further research is needed if we want to display pages as part of search results.
    1. What is the best way to handle results when a term appears across multiple pages of the same issue?
    2. What is the atomic unit: article or page?
    3. University of Michigan ended up using the page as the returned result (University Newspaper collection).
    4. Discussed existing newspaper systems, focusing on how best to display and handle returned search results: by page, by issue, or hybrid (see agenda item 3 for links to existing systems).
      1. If the collection is large, we may need context, and we may want to offer different search output options based on a system setting.
      2. If the collection is small, context might be unnecessary.
      3. A one-size-fits-all model appears difficult and may not be achievable.
  4. Briefly discussed the ingest scenarios. General consensus that the scenarios capture most, if not all, cases. University of Alberta uses XML-ALTA as their baseline for ingesting newspapers.
  5. Peter Binkley (University of Alberta) will provide an example of their newspaper ingest package. Nick Homenda will contact his colleagues at IUPUI to request that they upload an example newspaper ingest package.
  6. Two items shared from other groups/projects:
    1. The IIIF text granularity group has determined that the use cases provided to them are not solid enough to create a specification.
    2. University of Michigan, under the Bentley program, is adding new newspapers to their collection.
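
Illustration for note 3 (hybrid results): the question of how to handle a term that appears on multiple pages of the same issue can be sketched as grouping page-level hits under their parent issue. This is a minimal Python sketch, not project code and not the Hyrax or Solr API; the field names (issue_id, page, score) and the sample hits are hypothetical and exist only for illustration.

```python
from collections import OrderedDict

# Hypothetical page-level hits, already sorted by relevance.
# Field names are illustrative assumptions, not a real index schema.
hits = [
    {"issue_id": "issue-1", "page": 3, "score": 0.9},
    {"issue_id": "issue-2", "page": 1, "score": 0.8},
    {"issue_id": "issue-1", "page": 5, "score": 0.7},
]


def group_hits_by_issue(page_hits):
    """Collapse page-level hits into one entry per issue.

    Issues keep the order of their first (best-ranked) matching page,
    and each issue lists every page on which the term was found.
    """
    grouped = OrderedDict()
    for hit in page_hits:
        grouped.setdefault(hit["issue_id"], []).append(hit["page"])
    return [{"issue_id": issue, "pages": pages} for issue, pages in grouped.items()]


results = group_hits_by_issue(hits)
for entry in results:
    print(entry["issue_id"], entry["pages"])
```

A display layer could then show one row per issue with links to the matching pages, or fall back to the flat page list for small collections where the extra context is unnecessary.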