Meeting Agenda - May 28th, 2015

Date and Time:

May 28th

1pm ET / 12pm CT / 10am PT

Call in details:

US: 1-866-398-2885

Participant code: 2819057339#

Attendees:

Julie Rudder - moderator

Hannah Frost - notes

Agenda:

Intros: New people? (From the notetaker: sorry, I missed the name of the new participant!)

Presentation:  Will Cowan from IU gives an overview of IU's plans for using Avalon and HydraDAM2. 

HydraDAM 2

  • Taking suggestions for better name!
  • Funded by Mellon through 2016
  • Primary purpose:  Combine IU's experience with video processing with WGBH's HydraDAM and preservation workflows to arrive at best practices.
  • HydraDAM currently runs on Fedora 3; need to consider extending this system to Fedora 4. In that process, develop content models for AV preservation
    • descriptive
    • structural
    • technical
    • provenance metadata
    • RDF capabilities
  • Look at new storage models
    • HSM at IU: most content on tape, some cached on disk
    • Want to be able to use that HSM structure inside Fedora 4 and related Hydra applications
    • WGBH is interested in how to track files stored on offline LTO
    • Want to arrive at a solution that addresses these storage concerns
    • Video files are large and present particular challenges for ingest workflows and different storage architectures
  • HydraDAM originally built on Sufia. Can we upgrade to Sufia 6, which enables connection to Fedora 4? Currently looking into what that will entail.
  • Key aspect of Fedora 4: ModeShape connectors
    • Java-based environment used extensively by JBoss 
    • Has projection capability: represents federated objects across a file system.
    • So instead of ingesting files into Fedora 4, can they be represented in Fedora as federated objects stored outside of Fedora?
    • IU's MDPI project needs this support for bulk ingest of 10-12 TB per day
  • Also want to incorporate Avalon as a means of accessing AV managed in Fedora 
    • Another grant-funded effort running in parallel with HydraDAM 2 project
  • IU treats preservation and access as separate concerns
    • HydraDAM 2 is for preservation
    • Avalon is for access
    • Not required to use both, but HydraDAM 2 should make it easy to access the content it contains via Avalon
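
For a sense of scale, the 10-12 TB per day figure for MDPI bulk ingest can be converted to a sustained transfer rate. This is a rough sketch, not something presented at the meeting; it assumes decimal terabytes (10^12 bytes) and ingest spread evenly over 24 hours, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def sustained_rate_mb_per_s(tb_per_day: float) -> float:
    """Average MB/s needed to move tb_per_day within 24 hours.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 MB = 1e6 bytes.
    """
    bytes_per_day = tb_per_day * 1e12
    return bytes_per_day / SECONDS_PER_DAY / 1e6

for tb in (10, 12):
    print(f"{tb} TB/day is about {sustained_rate_mb_per_s(tb):.0f} MB/s sustained")
```

Under those assumptions the result is roughly 116 to 139 MB/s around the clock, which helps explain the interest in representing files in place rather than copying each one into the repository.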

Questions for Will and Jon

Linda Newman asks: how does Fedora 4's projection capability really provide an advantage beyond external file links used in Fedora 3? 

  • Good question!
  • Better integrity between files and Fedora 
  • Better (less manual) metadata processing?
  • More performant fixity checking?
  • Impact of latency? - Fedora still needs some work to deal well with high latency
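
As background for the fixity point: a fixity check recomputes a file's checksum and compares it with a previously stored value. The sketch below streams a large file in chunks so it never has to fit in memory. It assumes SHA-256 and uses hypothetical helper names (`file_digest`, `verify_fixity`); it is not how Fedora 4 implements fixity, and on tape-backed HSM storage each such read may first require a recall from tape.

```python
import hashlib

def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_fixity(path, expected_hex):
    """True if the file's current digest matches the stored (expected) value."""
    return file_digest(path) == expected_hex.lower()
```

A caller would store the digest at ingest time and later call `verify_fixity(path, stored_digest)` on a schedule; a `False` result signals that the file has changed or been corrupted.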

Linda again: does IU have a file size cut off that determines storage location?

  • Not really.
  • Goal is to use projection for any file stored in the HSM system

Julie: Will HydraDAM 2 be developed for broader adoption, beyond IU and WGBH?

  • There is that potential, but the effort to engage others and write the necessary documentation, installers, etc. is not funded.

Julie: Will all files at IU be in Avalon?

  • Yes, but for the time being the files are staged as the systems are developed and eventually integrated

Julie: Handling of descriptive metadata? Where is the record of record? How are things synchronized?

  • At IU, MARC records in ILS are the authoritative records. Not yet worked out how and when changes there will propagate to Avalon and HydraDAM 2 past the initial ingest
  • For non-MARC items, metadata record of record will be elsewhere: Avalon or external systems (databases, etc.)

Linda: Lessons of HydraDAM 2 should be applicable to other content in large sizes, such as large data sets

  • IU has already learned a lot about projection, and the Fedora 4 developers. Happy to share

Jon Dunn observes that the cost to store content in DPN / APTrust will be almost 10x what it costs to store locally.

 

Next meeting topic/other work:  

Julie reviews our ideas from last meeting

  • AMIA Hack Day ideas?
  • Share our use cases
  • PCDM - Drew/WGBH and Julie Hardesty/IU – Julie R. will ask if they can join in late June to explain PCDM and how it fits for AV content
  • Plus: HydraConnect planning