Hydra Tech Call 2015-09-16

Time: 9:00am PDT / Noon EDT

Call-In Info: 1-641-715-3660, access code 651025

Moderator: Carolyn Cole

Notetaker: Nikitas Tampakis 

Attendees:

  • Carolyn Cole (Penn State)
  • Nikitas Tampakis (Princeton)
  • Lakeisha Robinson (Yale)
  • Anna Headley (Chemical Heritage Foundation)
  • Colin Gross (UMich)
  • Lynette Rayle (Cornell)
  • Corey Harper (NYU)
  • Steven Ng (Temple)
  • Justin Coyne (Data Curation Experts)
  • Mike Giarlo (Penn State)
  • Trey Terrell (Princeton)
  • Drew Myers (WGBH)

Agenda:

  1. Call for agenda items
    1. Derivatives - current examples for generating derivatives: https://gist.github.com/elrayle/9a72ffc0c879927b327b
      1. a la carte API - Lynette to create tickets in hydra-works and hydra-derivatives to make full-text extraction a configurable derivative in hydra-derivatives.
        1. Full-text extraction currently lives in hydra-works (recently pulled down from curation concerns), not in hydra-derivatives.
        2. Characterization currently encapsulates full-text extraction. It was suggested to move full-text extraction out of characterization and into derivatives, so that it is possible to specify which formats the full-text extraction service should run on.
        3. When using the Hydra Works PersistOutputFile service (https://github.com/projecthydra-labs/hydra-works/blob/master/lib/hydra/works/services/generic_file/persist_derivative.rb), defining a custom makes_derivatives proc currently appends to the set of derivatives defined in hydra-works: https://github.com/projecthydra-labs/hydra-works/blob/master/lib/hydra/works/models/concerns/generic_file/derivatives.rb#L12-L22. It was suggested that custom derivatives should override the defaults instead of being appended to them.
      2. side-loading and derivatives - Discuss on the next hydra-tech call
        1. Nathan Rogers not on the call. He expressed interest in minimizing calls to Fedora when batch ingesting files.
        2. Note: calling create_derivatives isn't required to create the derivatives.
    2. Dive in to Hydra PCDM - review https://github.com/projecthydra-labs/hydra-pcdm/wiki/Dive-into-Hydra-PCDM
    3. Characterization - Colin to continue working on moving characterization from curation concerns into hydra-works. Follow-up discussion to continue on the hydra-tech e-mail list.
      1. E-mail thread: https://groups.google.com/forum/#!topic/hydra-tech/KWH-bUo1F3s
      2. Should the generic file model in curation concerns include all characterization properties? Or should different formats be included a la carte? Consensus seemed to be to include a top-level characterization base class, and then allow behaviors for specific formats to be included in addition to the base.
  2. Next call
    1. Date: September 30, 2015 (skipping 9/23 due to HydraConnect)
    2. Moderator: Justin Coyne
    3. Notetaker: Colin Gross
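
Illustrative sketch for the characterization item above: the consensus (a top-level characterization base, with format-specific behaviors included a la carte on top of it) might look roughly like the plain-Ruby example below. This is only a sketch under stated assumptions. Every module, class, and term name here is hypothetical and chosen for illustration; none of it is taken from hydra-works or curation concerns, and a real implementation would define properties on the Fedora-backed model rather than with attr_accessor.

```ruby
# Hypothetical sketch: these names are NOT the hydra-works or
# curation concerns API. Plain Ruby only, no Hydra gems required.

# Base behavior: terms every file gets, whatever its format.
module BaseCharacterization
  TERMS = %i[mime_type file_size].freeze

  def self.included(klass)
    klass.send(:attr_accessor, *TERMS)
  end
end

# Format-specific behaviors, included a la carte in addition to the base.
module ImageCharacterization
  TERMS = %i[width height].freeze

  def self.included(klass)
    klass.send(:attr_accessor, *TERMS)
  end
end

module AudioCharacterization
  TERMS = %i[duration sample_rate].freeze

  def self.included(klass)
    klass.send(:attr_accessor, *TERMS)
  end
end

# Known characterization modules, base first (assumed registry).
CHARACTERIZATION_MODULES = [
  BaseCharacterization,
  ImageCharacterization,
  AudioCharacterization
].freeze

# List the terms contributed by the modules a class actually includes.
def characterization_terms(klass)
  CHARACTERIZATION_MODULES
    .select { |mod| klass.include?(mod) }
    .flat_map { |mod| mod::TERMS }
end

# An image model opts in to the image behavior on top of the base.
class ImageFile
  include BaseCharacterization
  include ImageCharacterization
end

# A text model only needs the base terms.
class TextFile
  include BaseCharacterization
end
```

With this layout, a model for a given format carries only the terms it needs: ImageFile responds to width and height, while TextFile exposes just the base terms.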