2024-10-25 Hyrax Fedora 6 Working Group
Date
Oct 25, 2024 @ 9am Eastern
Participants
@Arran Griffith
@Ben Pennell
@Bradley Watson
@Collin Brittle
@Dan Field
@Daniel Pierce
@Emily Porter
@Heather Greer Klein
@Jon Dunn
@Juliet Hardesty
@Kate Dohe
Scott Prater
@Randall Floyd
@Rebekah Kati
@Tom Wrobel
indicates note taker
Goals
Updates on progress for ongoing development work
Discussion topics
Item |
---|
Welcome |
Updates |
What work remains? |
Wrap Up. Next meeting: (not listed). Next note taker: Scott Prater |
Notes
Welcome and introductions
New: Scott Prater, Chair-Elect of Fedora Governance
Updates
New Hyrax Product Owner (Rebekah)
Rebekah is transitioning out next sprint and will likely still attend this group, but the new PO, Nick Homenda (Tufts University), needs to be invited. Arran to reach out to Nick.
Emory Updates
Provided updates at Fedora Virtual Showcase
Baseline vanilla performance testing with Sirenia Docker: Brad wrote a process to create minimal metadata and simple files. Good results locally; the AWS base instance has been more challenging because of configuration (they don't want to bring Emory-specific processes in).
Running larger test loads in the preproduction environment: 1K works/batch, averaging around 7 files/work. Working more with Sidekiq to resolve issues. Using Bulkrax 8.1 and discovered a looping issue, which a Bulkrax update may resolve. Works are created quickly, but file-level processing is slow.
Part of the job is having Hyrax create the derivatives on ingest, as well as data entities for preservation events.
Getting the underpinnings in place for full-text search and extraction (PDFs, etc.), but that text is staying at the application layer and not in Fedora. The main goal is getting that text into Solr, not necessarily longer-term preservation.
OCFL: a little concerned about OCFL object bloat, but happy to have it.
Does F6 need to be reindexed periodically after larger ingests? It should be indexing upon batch receipt. Monitoring the Fedora resource count as well.
Performance question: trying to start with AWS defaults in the pre-production environment. However, Fedora's Tomcat is running out of space due to log files. What's the recommended size for a Fedora Tomcat space allocation? Fedora can be very verbose, so watch that, but info or error level shouldn't generate anything major in regular use. Log rotation: Dan will look into it more.
At Wisconsin - usually set log level high during initial testing, review carefully, resolve issues, then rotate out and step the level down for regular production.
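As a rough illustration of the log rotation idea discussed above (not Fedora or Tomcat configuration; those are configured through their own logging settings), here is a minimal Python sketch of size-capped rotation. The function names, file name, and size limits are all hypothetical and chosen only to show how a cap on file size plus a fixed number of backups bounds total disk use.

```python
import logging
import logging.handlers
import os
import tempfile


def build_rotating_logger(log_dir, max_bytes=1024, backup_count=3):
    """Return a logger that rolls app.log over before it reaches max_bytes.

    At most backup_count rotated files are kept, so disk use stays bounded
    at roughly (backup_count + 1) * max_bytes. All values are illustrative.
    """
    logger = logging.getLogger("rotation-sketch")
    logger.setLevel(logging.INFO)  # DEBUG messages are dropped at this level
    logger.propagate = False
    # Remove handlers left over from an earlier call so output is not duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, "app.log"),
        maxBytes=max_bytes,
        backupCount=backup_count,
    )
    logger.addHandler(handler)
    return logger


def total_log_bytes(log_dir):
    """Sum the sizes of every file in log_dir."""
    return sum(
        os.path.getsize(os.path.join(log_dir, name))
        for name in os.listdir(log_dir)
    )
```

With this kind of policy, stepping the log level down after initial testing reduces how fast files fill, while the size cap and backup count limit how much space the logs can ever take.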
Unsupported Hyrax features
PUI fixity checking
Statistics dashboard: mimetypes, etc. FCRstats endpoint: can these be brought back into the Hyrax UI? Will be working on it.
Are there other stats that are important to people?
Can see a count of binary files, but not mimetypes at the binary level. That seems like a Samvera issue at file ingest: the mimetype isn't being passed along.
Is Fedora doing MIME typing on the fly? No, not through the API; it will take what you supply at ingest.
FCRstats has been very useful. Will create tickets in the Fedora JIRA to discuss at a tech meeting.
HasModel: breakdown of filesets, Fedora resources
FCRsearch endpoint: discussion at Samvera NE. Valkyrie FITS Fedora adapter was really written for F4 and doesn't take advantage of some specific features in F6. The adapter could still be used, though some features aren't implemented yet.
How to query model type through the UI? There didn't seem to be a direct way to query.
How flexible and extensible is the field search? Dan can investigate further.
(Not for nothing, we have these queries at UMD in our Archelon UI)
Updates from Samvera Europe Meetup
Strong community interest in F6-Hyrax support
Also interest in F6 preservation services
Sirenia Testing Work
Next Hyrax update release (5.1) is underway
Sirenia and F6 work is part of that
Accessibility focus in next release as well
Dependency upgrades are also required
Will stable.nurax be part of that? Part of the plan, will be the next site for performance testing
Fedora 6.5.1 is at RC stage. Includes new OCFL-java S3 implementation for significantly improved performance
Action items
Check whether more work needs to be done to cover all the predicates from Hyrax so that they get persisted to Fedora with a real URI, particularly the FITS ones.
Discussed Items: