June 2, 2014

Date/Time: June 2, 2014 @ 4pm EDT

Dial-in Info: Phone number and access code

Moderator: Adam Wead

Notetaker: Matthew Farrell

Attendees:

Nathan Tallman
Eira Tansey
Mike Giarlo
Michael Levy
Paul Ruderman

Agenda & Notes

Call for agenda items
1. Nothing offered
Discuss Mike Giarlo's summation of use cases: Hydra ArchivesSpace Integration Features
1. Mike summarized what he'd done (described in his message reproduced at the end of these notes) before opening the floor for discussion and questions.
2. Generally, where is this project headed? For example, will we agree upon a set of community best practices, or propose or develop a new Hydra head?
  1. Ideally, a gem that can talk to both Hydra and ASpace
  2. Lean and widely usable: something that individual institutions could adopt and adapt, layering functions into existing Hydra apps and ignoring the aspects they wish to ignore.
3. Some discussion about whether scoping to ASpace has made HAWG too exclusionary
  1. Many of our user stories, and the functions they describe, are not specific to ASpace. ASpace is a product that many institutions are looking at and/or currently implementing, and so it is viewed as a good starting point
  2. Once we've worked on integration, we can open this group up to other discussions (e.g., BitCurator)
4. Timeline and next steps
  1. Around the 3rd week of June, Mike will return to the user stories (linked above), the five broad minimum functions (below), and feedback provided by HAWG members (via the Hydra-tech list or individual email) and refine the summary
  2. At the next monthly call, the group will discuss integration functionality and start determining asynchronous activities.
  3. How development will proceed at that point is a bit of an open question; it could run the gamut between
    1. development centered at one or two institutions, using pull requests on GitHub to determine development priorities
    2. a grant project with multiple institutions participating in an official capacity
  4. Potential deliverable: develop a gem that handles functionality on the Hydra side and articulates what is required from the other system (using ASpace as an example)
    1. i.e., focus on the arrows from the Hydra API rather than the boxes
Next meeting
- Date: July 7
- Moderator: Nathan Tallman
- Notetaker: Matthew Farrell

At LAMDevConX last week, I volunteered to read through all of HAWG's user stories about integration between repositories/catalogs (Blacklight/Hydra/Fedora) and archival collection management systems (ArchivesSpace) and boil them down. You'll find a summary of this here:

Hydra ArchivesSpace Integration Features

I came up with 13 more or less unique (though some highly interrelated) user stories, and they break down thusly in terms of directionality:

* 4 with data flowing from ArchivesSpace into Blacklight/Hydra/Fedora
* 4 with data flowing from Blacklight/Hydra/Fedora into ArchivesSpace
* 5 with data flowing both ways

You'll notice that user stories focusing on faceting, filtering, crosswalking, access controls/rights, full-text indexing, folder/batch ingest, and checksum verification without an explicit mention of ArchivesSpace aren't reflected in this list. These functions are all provided by Hydra, so it was unclear to me how ArchivesSpace was involved. (Which isn't to say that these user stories won't be useful, but that first I'd like to nail down the possible integration points before focusing on what Blacklight/Hydra/Fedora will do with this data once it's got it.) If that's a faulty assumption, please let me know.

Boiling down the 13 features that I extracted from the user stories, it seems to me that the functions that are generally needed are:

* Bidirectional linking between ArchivesSpace and Hydra to associate an object in one system with an object in the other.
* These links may need some semantic disambiguation beyond "associated with," and so a list of relevant relationships (from one or more ontologies) between archival and digital objects would come in handy.
* Persistent, unique identifiers to make associations "stick."
* For an identified thing in ArchivesSpace or Hydra, the ability to grab its (descriptive, technical, provenance, event, finding aid) metadata.
* Query based on an identifier or set of identifiers.
* Apply filters to queries.

I'd welcome your comments and questions on the above!
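
Notetaker's illustration: the functions listed above (typed bidirectional links, persistent identifiers, metadata retrieval, and filtered queries by identifier) can be sketched as a small in-memory model. This is only a rough sketch to make the discussion concrete, not a proposed design. It does not call any real ArchivesSpace or Hydra API; every class name, method name, identifier, and relationship label below is a hypothetical placeholder.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, str]


@dataclass(frozen=True)
class Link:
    """A typed association between two persistent identifiers."""
    source_id: str
    target_id: str
    relationship: str  # hypothetical label; a real list would come from an ontology


@dataclass
class LinkRegistry:
    """In-memory stand-in for whatever would actually store links and metadata."""
    links: List[Link] = field(default_factory=list)
    metadata: Dict[str, Record] = field(default_factory=dict)

    def register(self, identifier: str, record: Record) -> None:
        """Store metadata under a persistent, unique identifier."""
        self.metadata[identifier] = dict(record)

    def link(self, source_id: str, target_id: str,
             relationship: str, inverse: str) -> None:
        """Record the association in both directions, each with its own label."""
        self.links.append(Link(source_id, target_id, relationship))
        self.links.append(Link(target_id, source_id, inverse))

    def linked(self, identifier: str) -> List[Tuple[str, str]]:
        """Return (relationship, other identifier) pairs for one object."""
        return [(l.relationship, l.target_id)
                for l in self.links if l.source_id == identifier]

    def query(self, identifiers: Iterable[str],
              *filters: Callable[[Record], bool]) -> List[Tuple[str, Record]]:
        """Fetch metadata for a set of identifiers, keeping records that pass every filter."""
        results = []
        for identifier in identifiers:
            record = self.metadata.get(identifier)
            if record is not None and all(f(record) for f in filters):
                results.append((identifier, record))
        return results


# Hypothetical identifiers and metadata, for illustration only.
registry = LinkRegistry()
registry.register("aspace:ao-1", {"system": "ArchivesSpace", "title": "Correspondence"})
registry.register("hydra:obj-9", {"system": "Hydra", "title": "Scanned letter"})
registry.link("hydra:obj-9", "aspace:ao-1", "isSurrogateOf", "hasSurrogate")

from_aspace = registry.linked("aspace:ao-1")
hydra_only = registry.query(["aspace:ao-1", "hydra:obj-9", "unknown:0"],
                            lambda r: r["system"] == "Hydra")
```

Storing each link twice, once per direction, keeps lookups simple from either side; a real integration would more likely have each system hold its own half of the link and agree only on the identifiers and the relationship vocabulary.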