Hydra Tech Call 2015-09-16
Time: 9:00am PDT / Noon EDT
Call-In Info: 1-641-715-3660, access code 651025
Moderator: Carolyn Cole
Notetaker: Nikitas Tampakis
Attendees:
- Carolyn Cole (Penn State)
- Nikitas Tampakis (Princeton)
- Lakeisha Robinson (Yale)
- Anna Headley (Chemical Heritage Foundation)
- Colin Gross (UMich)
- Lynette Rayle (Cornell)
- Corey Harper (NYU)
- Steven Ng (Temple)
- Justin Coyne (Data Curation Experts)
- Mike Giarlo (Penn State)
- Trey Terrell (Princeton)
- Drew Myers (WGBH)
Agenda:
- Call for agenda items
- Derivatives - current examples for generating derivatives: https://gist.github.com/elrayle/9a72ffc0c879927b327b
- a la carte API - Lynette to make tickets in hydra-works and hydra-derivatives to make full-text extraction a configurable derivative in hydra-derivatives.
- Currently full-text extraction isn't in Derivatives; it's in hydra-works (recently pulled down from curation concerns).
- Characterization currently encapsulates full-text extraction. It was suggested to move full-text extraction out of characterization and into derivatives, so that it is possible to specify which formats the full-text extraction service runs on.
- When using the Hydra Works PersistOutputFile service (https://github.com/projecthydra-labs/hydra-works/blob/master/lib/hydra/works/services/generic_file/persist_derivative.rb), defining a custom makes_derivatives proc currently appends to the set of derivatives defined in hydra-works: https://github.com/projecthydra-labs/hydra-works/blob/master/lib/hydra/works/models/concerns/generic_file/derivatives.rb#L12-L22. It was suggested that custom derivatives override the defaults instead.
- side-loading and derivatives - Discuss in next -tech call
- Nathan Rogers not on the call. He expressed interest in minimizing calls to Fedora when batch ingesting files.
- Note: calling create_derivatives isn't required to create the derivatives.
- Dive into Hydra PCDM - review https://github.com/projecthydra-labs/hydra-pcdm/wiki/Dive-into-Hydra-PCDM
- Characterization - Colin to continue working on moving characterization from curation concerns into hydra-works. Follow-up discussion to continue on hydra-tech e-mail.
- E-mail thread: https://groups.google.com/forum/#!topic/hydra-tech/KWH-bUo1F3s
- Should the generic file model in curation concerns include all characterization properties, or should different formats be included a la carte? Consensus seemed to be to include a top-level characterization base class, and then allow behaviors for specific formats to be included in addition to the base.
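Illustration of the append-versus-override question raised under Derivatives: the sketch below is plain Ruby and does not use the hydra-works or hydra-derivatives API. The class name `DerivativeConfig`, its methods, and the derivative names `:thumbnail`, `:access_copy`, and `:webm` are all hypothetical, chosen only to show how the two behaviors differ.

```ruby
# Hypothetical stand-in for a model's derivative configuration.
# This is NOT the hydra-works API; all names here are assumptions.
class DerivativeConfig
  # Made-up defaults standing in for the set defined in hydra-works.
  DEFAULTS = [:thumbnail, :access_copy].freeze

  attr_reader :derivatives

  def initialize
    @derivatives = DEFAULTS.dup
  end

  # Behavior described in the notes: custom entries are added to the defaults.
  def append(*custom)
    @derivatives = (@derivatives + custom).uniq
    self
  end

  # Suggested behavior: custom entries replace the defaults entirely.
  def override(*custom)
    @derivatives = custom.uniq
    self
  end
end

appended   = DerivativeConfig.new.append(:webm).derivatives
overridden = DerivativeConfig.new.override(:webm).derivatives
```

With append semantics the defaults still run alongside the custom derivative; with override semantics only the custom derivative remains, which is the change suggested on the call.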
Next call
- Date: September 30, 2015 (skipping 9/23 due to HydraConnect)
- Moderator: Justin Coyne
- Notetaker: Colin Gross