2023-10-24 Hyrax Fedora 6 Working Group
Date
Oct 24, 2023 (Hybrid meeting at Samvera Connect in Philadelphia)
Participants
@Jon Dunn
@Randall Floyd
@Daniel Pierce
@Juliet Hardesty
@Collin Brittle - notetaker
@Ayoub Belemlih
@Bradley Watson
@Kevin Kochanski
@Emily Porter
@Ben Pennell
@Dan Field
@Rob Kaufman
@Rebekah Kati
@Jeremy Friesen
@Karen Cariani
@tamsin johnson
@Stuart Kenny
Apologies
@Arran Griffith
Goals
Discussion topics
Notes
Background: use cases doc: https://docs.google.com/document/d/15CPJxVIhriCcnzn2boomfWAJvZ3mwN7KDiPHLvvlS2w/edit?usp=sharing
Daniel presented an update on the Hyrax-Valkyrie-Fedora 6 work as part of this morning’s Hyrax update.
Need to consider a new chair/convener for the WG
Contact Arran Griffith or Dan Field if interested
Two options for metadata storage
Binary storage, first proposed by Oxford, versus text storage
Binary storage is similar to what Valkyrie does
Tamsin mentioned work on producing a graph from Valkyrie, but that requires predicates and text.
Binary JSON would be saved on disk as a file, and returned as a file.
Metadata stored as a binary blob is not as well preserved. Triples are better for preservation because they are human-readable.
A JSON (text) file would be better
Postgres adapter is adaptable to use other SQL storage, but is designed to work with Postgres first.
Plan is to implement JSON as a secondary format once the initial work is done.
Need to loop in Metadata IG?
Potential for storing some metadata as binary if there is less interaction with it.
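A minimal sketch of the text-versus-binary trade-off discussed above. The record fields are hypothetical, and zlib compression is only a stand-in for an opaque binary encoding; the notes do not specify the actual binary format, and this is not how the Valkyrie adapter serializes metadata.

```python
import json
import zlib

# Hypothetical metadata record; field names are illustrative only.
record = {
    "id": "work-1",
    "title": ["Example work"],
    "creator": ["Example, Author"],
}


def to_text(metadata: dict) -> str:
    """Serialize as indented, key-sorted JSON text that a person can read."""
    return json.dumps(metadata, indent=2, sort_keys=True)


def to_binary(metadata: dict) -> bytes:
    """Serialize as an opaque blob (zlib is only a stand-in for a binary format)."""
    return zlib.compress(json.dumps(metadata, sort_keys=True).encode("utf-8"))


def from_binary(blob: bytes) -> dict:
    """Decode a blob produced by to_binary back into a dict."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


text_form = to_text(record)
binary_form = to_binary(record)

# Both forms round-trip to the same data.
assert json.loads(text_form) == record
assert from_binary(binary_form) == record

# Only the text form can be inspected without the matching decoder.
print(text_form)
print(binary_form[:16])
```

Both forms carry the same information; the difference for preservation is that the text file remains legible with any text viewer, while the blob depends on the decoder remaining available.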
Sirenia exists as a proof of concept and is ready to try out: https://github.com/samvera/hyrax/blob/main/docker-compose-sirenia.yml
May become part of Dev Congress
Would be nice to start doing performance testing.
A nurax-style environment would be helpful - in the works
Issues with Hyrax + Fedora 4 start at around 4K ActiveFedora (AF) objects in a collection, or 100 filesets in an object (which is around 40K+ objects in Fedora). The issue can be forced by limiting the amount of memory available to Fedora.
There is a test script that can be shared.
Can use Bulkrax to script up performance testing
Bulkrax is mostly ready for use with Valkyrie
The representation of filesets as being part of the work node exacerbates the performance issue.
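One possible way to prepare input for Bulkrax-driven performance testing is to generate a CSV describing a large collection. This is only a sketch, not the test script mentioned above: the column names are assumptions modeled on a typical Bulkrax CSV import, and the model names and identifiers are hypothetical, so all of them should be checked against the target application's field mappings.

```python
import csv
import io

# Column names are assumptions; verify against the application's Bulkrax mappings.
FIELDS = ["source_identifier", "model", "title", "parents"]


def build_rows(n_works: int, collection_id: str = "perf-collection") -> list:
    """Return one collection row followed by n_works member work rows."""
    rows = [{
        "source_identifier": collection_id,
        "model": "Collection",
        "title": "Performance test collection",
        "parents": "",
    }]
    for i in range(1, n_works + 1):
        rows.append({
            "source_identifier": f"{collection_id}-work-{i:05d}",
            "model": "GenericWork",  # hypothetical work type
            "title": f"Performance test work {i}",
            "parents": collection_id,
        })
    return rows


def to_csv(rows: list) -> str:
    """Render the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# 4,000 members is the collection size at which the notes report problems.
rows = build_rows(4000)
csv_text = to_csv(rows)
print(len(rows))
print(csv_text.splitlines()[0])
```

Varying `n_works` around the 4K threshold would make it possible to compare behavior just below and just above the point where issues were reported.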
Group should revisit the charter and discuss if changes are necessary given the progress in implementation.
Can migration be brought into scope? SoftServ has a metadata migration adapter (no files yet).
Largest questions are around representation of binaries in Fedora
Another issue is the pairtree ID problem.
Another future topic is Archival Groups
How or if they are created may become a configuration option in the adapter.
Performance differences between atomic and archival versioning? On Dan’s TODO list.
Currently working against bare metal; will be working with TACC.
Fedora UI, analytics, user admin are also going to get an overhaul.
How is provenance tracked? It’s not, but that use case has come up for Fedora before.
Hyrax interacts with Fedora as a single, anonymous user.
Fedora provides a log of actions that could be used as an audit log.
To track provenance effectively, Hyrax would need to change.