Metadata Call 2026-02-24

Time: 2:00-3:00 pm Eastern / 1:00-2:00 pm Central / 11:00 am-12:00 pm Pacific

Community Notes: https://docs.google.com/document/d/1G8wmi9R3q1uHatQLa3JfmlVr1-laTO5vgYzUZz-B7QU/edit?tab=t.0

 

Moderator(s): Annamarie Klose

Notetaker: Emma Beck

Attendees: 

  • @Annamarie Klose, Ohio State University

  • @Emma Beck, University of Louisville

  • @Benjamin Riesenberg, University of Oregon

  • @Anna Goslen, UNC-Chapel Hill

  • @Tiffany Chan, University of Victoria

  • @Morgan McKeehan (she/her/hers), Oregon State University Libraries & Press

  • @Juliet Hardesty, Indiana University

  • @Sudha Anand, Indiana University

  • @Nick Steinwachs, Notch 8

  • @Sarah Proctor, Notch 8

  • Steve McDonald, Tufts University

Agenda: 

  • Samvera updates

    • Bulkrax updates are in progress; the last demo is tomorrow with Notch8. Test it at demo.hykudemo.org/ (contact Nic Don for access)

  • Accessibility presentations - Video recording and WebVTT captions file

    • Patrick Burden, West Virginia University - Postponed due to a scheduling conflict.

    • Stacy Snyder, University at Buffalo - recorded 

      • Accessibility Coordinator for the Libraries at the University at Buffalo

      • Content in digital collections is not accessible

      • Generative AI to create alt-text for images

      • Tested 3 generative AI services

      • 4k(?) images from various categories of content. Compared the alt-text and descriptions each service produced; ChatGPT was the best

      • Using generative AI for OCR and text

      • Seeing lots of improvements since originally starting this work

      • 3 students doing digital collections (DC) remediation

    • Steve McDonald, Tufts University

      • The Library and Library IT group have been doing sprints to take on AI projects. These run in parallel with one happening across the university

      • Experimented with creating transcripts for video and audio using Whisper

      • Library volunteers are doing validation of the work

      • Working on 30 videos ranging from 3 minutes to a couple of hours

      • OCR is the new generative AI sprint - Institutional Books Initiative

      • Using Tesseract for OCR 

      • Limited test produced few hallucinations 

    • Annamarie Klose, Ohio State University

      • 44k records in digital collections 

      • Focusing on publicly available content, then institutional records, then private content

      • File level accessibility support

      • Supporting accessibility through: AI, repurposing existing summary and description values, and outsourcing to vendors

      • Have to use university-approved vendors

      • Using human-in-the-loop review

      • Claude Sonnet workflow

      • Adding to alt-text “detailed description can be found in the description field”

      • Tracking progress in Airtable

      • Relying on those who submit content to update their content accessibility

    • Questions:

    • Did I just hear that OSU’s accessibility features are going back to Hyrax main?

      • YES

      • Nothing is up on GitHub yet. Information has been shared with the Hyrax interest group

    • What type of documents are people using for their OCR remediation and testing?

      • UB - worked on handwriting from the 21st century and printed items from the 1800s forward. Better results from ChatGPT than from dedicated OCR software. PDFs are a different can of worms; uses ABBYY FineReader for PDFs.

      • Tufts - focusing on OCR for digitized books, almost exclusively out-of-copyright material

      • OCR with handwritten text - some recommend ChatGPT, but results are not great all around

      • Various tools for converting to PDF on demand. One idea:

      • Presentation from Patrick at NISO last week. Shared notes

    • Various policies vs risk for accessible content? Archives exception

      • IU: We have an on-demand accessibility remediation form, and most of our requests have been for time-based content. For PDF accessibility, the remediation group was trained on Equidox, which has a limited number of licenses shared across IU campuses

      • UB - digital collections and IR are living objects, so they are not considered archives. University Archives uses Preservica and is using the exception there. Some overlap between DC/IR and Preservica

      • Tufts - all digital materials acquired, adopted, developed, or updated after April 19th must be accessible. Anything before then must be made accessible upon request. Collections uploaded before that April date do not need to be accessible

      • Where does information literacy come into play?

    • Humans do better at editing than starting from scratch. Using AI to create a chart or other content gives a draft to edit rather than starting from scratch

  • Open discussion (time permitting)

Next meeting: March 24, 2026