2023-10-24 Hyrax Fedora 6 Working Group

 Date

Oct 24, 2023 (Hybrid meeting at Samvera Connect in Philadelphia)

 Participants

  • @Jon Dunn

  • @Randall Floyd

  • @Daniel Pierce

  • @Juliet Hardesty

  • @Collin Brittle - notetaker

  • @Ayoub Belemlih

  • @Bradley Watson

  • @Kevin Kochanski

  • @Emily Porter

  • @Ben Pennell

  • @Dan Field

  • @Rob Kaufman

  • @Rebekah Kati

  • @Jeremy Friesen

  • @Karen Cariani

  • @tamsin johnson

  • @Stuart Kenny



  • Apologies

  • @Arran Griffith

 Goals

  •  

 Discussion topics

Time

Item

Presenter

Notes

Notes

Background: use cases doc: https://docs.google.com/document/d/15CPJxVIhriCcnzn2boomfWAJvZ3mwN7KDiPHLvvlS2w/edit?usp=sharing

Daniel presented an update on the Hyrax-Valkyrie-Fedora 6 work as part of this morning’s Hyrax update.

Need to consider a new chair/convener for the WG

  • Contact Arran Griffith or Dan Field if interested

Two options for metadata storage

Binary storage, first proposed by Oxford, versus text storage

Binary storage is similar to what Valkyrie does

Tamsin mentioned working on producing a graph from Valkyrie, but that needs predicates, text.

Binary JSON would be saved on disk as a file, and returned as a file.

Metadata is not as well preserved if stored as a binary blob. Triples offer better preservation because they are human readable.

A JSON (text) file would be better
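As a rough illustration of the text options discussed above, the sketch below writes one set of metadata both as JSON text and as N-Triples-style lines. The field names, subject URI, and field-to-predicate mapping are assumptions made up for this example, not Hyrax or Valkyrie behavior. It shows why a graph representation needs predicates while a JSON file does not.

```python
import json

# Hypothetical metadata for one object; field names are illustrative only.
metadata = {"title": ["Example work"], "creator": ["A. Author"]}

# Assumed mapping from field names to predicates (Dublin Core terms).
PREDICATES = {
    "title": "http://purl.org/dc/terms/title",
    "creator": "http://purl.org/dc/terms/creator",
}


def to_json_text(meta):
    """Serialize metadata as human-readable JSON text."""
    return json.dumps(meta, indent=2, sort_keys=True)


def to_ntriples(subject, meta, predicates):
    """Serialize metadata as N-Triples-style lines; every field needs a predicate."""
    lines = []
    for field in sorted(meta):
        for value in meta[field]:
            # Escape backslashes and double quotes inside the literal.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'<{subject}> <{predicates[field]}> "{escaped}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_json_text(metadata))
    print(to_ntriples("http://example.org/obj/1", metadata, PREDICATES))
```

Both outputs can be read without special tooling, which is the preservation argument made in the discussion. An opaque binary blob would need the original software to interpret it.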

Postgres adapter is adaptable to use other SQL storage, but is designed to work with Postgres first.

Plan is to implement JSON as a secondary format once the initial work is done.

Need to loop in Metadata IG?

Potential for storing some metadata as binary if there is less interaction with it.

Sirenia exists as a proof of concept, is ready to play with. https://github.com/samvera/hyrax/blob/main/docker-compose-sirenia.yml

May become part of Dev Congress

Would be nice to start doing performance testing.

A nurax-style environment would be helpful - in the works

Issues w/ Hyrax + Fedora 4 start around 4K AF objects in a collection, or 100 filesets in an object (which is around 40K+ objects in Fedora). Can force the issue by limiting the amount of memory that Fedora has.

There is a test script that can be shared.

Can use Bulkrax to script up performance testing

Bulkrax is mostly ready for use with Valkyrie
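A minimal sketch of how test data for such a run might be generated: it builds a CSV describing one collection with a configurable number of works, sized here to the roughly 4K-object threshold mentioned above. The column names (source_identifier, model, title, parents) and the GenericWork model name are assumptions for illustration and would need to match the local Bulkrax configuration. This is not the test script mentioned in the notes.

```python
import csv
import io


def build_import_csv(collection_id, work_count):
    """Return CSV text for one collection containing `work_count` works."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Assumed column names; adjust to the importer's field mappings.
    writer.writerow(["source_identifier", "model", "title", "parents"])
    writer.writerow([collection_id, "Collection", "Performance test collection", ""])
    for n in range(1, work_count + 1):
        writer.writerow([
            f"{collection_id}-work-{n:05d}",
            "GenericWork",  # hypothetical work type
            f"Test work {n}",
            collection_id,  # each work points at the collection as its parent
        ])
    return buffer.getvalue()


if __name__ == "__main__":
    # 4000 works approximates the collection size where issues were reported.
    csv_text = build_import_csv("perf-1", 4000)
    print(csv_text.splitlines()[0])
    print(len(csv_text.splitlines()), "lines")
```

Varying work_count across runs would make it possible to see where response times start to degrade.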

The representation of filesets as being part of the work node exacerbates the performance issue.

Group should revisit the charter and discuss if changes are necessary given the progress in implementation.

Is migration in-scope-able? SoftServ has a metadata migration adapter (no files yet).

Largest questions are around representation of binaries in Fedora

Another issue is the pairtree ID problem.
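For readers unfamiliar with pairtree paths, the sketch below shows the general idea: an identifier is split into short segments that become nested path components, followed by the full identifier. The two-character segment length and four levels are assumptions chosen for illustration. The notes do not record what the specific problem is, so this only shows the kind of ID-to-path mapping under discussion.

```python
def pairtree_path(identifier, pair_len=2, levels=4):
    """Build a pairtree-style path: leading segments, then the full identifier."""
    pairs = [
        identifier[i:i + pair_len]
        for i in range(0, len(identifier), pair_len)
    ]
    # Keep only the first `levels` segments as directory components.
    return "/".join(pairs[:levels] + [identifier])


if __name__ == "__main__":
    print(pairtree_path("abcd1234ef"))
```

With these assumed settings, the identifier abcd1234ef maps to ab/cd/12/34/abcd1234ef, so the stored path differs from the identifier an application uses.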

Another future topic is Archival Groups

How or if they are created may become a configuration option in the adapter.

Performance differences between atomic and archival versioning? On Dan’s TODO list.

Now working against bare metal, will be working with TACC.

Fedora UI, analytics, user admin are also going to get an overhaul.

How is provenance tracked? It’s not, but that use case has come up for Fedora before.

Hyrax interacts with Fedora as one, anonymous user.

Fedora provides a log of actions that could be used as an audit log.

To track provenance effectively, Hyrax would need to change.

 

 

 Action items

 Decisions