EAD in Hydra-Blacklight-Atrium

9/11/12

EAD in Hydra / BL

Compare EAD in Fedora vs. EAD in Solr
Identify common approaches, common needs
Identify opportunities for collective advancement

Indiana, Northwestern, Notre Dame, Virginia = Fedora then solr
Stanford, RockHall = solr w/ digital objects in Fedora

Common Requirements:

EAD Generation (in AT, Archon or Oxygen)
Digital Item linking in Finding Aid
Finding Aid presentation (search and hierarchical browse)
Item Level Discovery

Stanford Workflow

AT produces EAD
EAD mined for item-level descMD for digital objects in Fedora
links to digital objects put back into AT
EAD w/ PURLs indexed into solr
presentation via Blacklight
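
A minimal sketch of the "EAD mined for item-level descMD" step, not Stanford's actual code. It assumes un-namespaced EAD 2002 (DTD style) in which items are marked with level="item"; the sample finding aid, the function names and the output keys are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, un-namespaced EAD fragment used only for illustration.
STANFORD_SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did><unittitle>Example Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="item" id="item-1">
          <did>
            <unittitle>Letter to a colleague</unittitle>
            <unitdate>1923</unitdate>
          </did>
        </c02>
        <c02 level="item" id="item-2">
          <did><unittitle>Postcard</unittitle></did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>"""


def _text(parent, tag):
    """Return the whitespace-normalized text of parent/tag, or '' if absent."""
    el = parent.find(tag) if parent is not None else None
    if el is None:
        return ""
    return " ".join("".join(el.itertext()).split())


def mine_item_descmd(ead_xml):
    """Collect one small descriptive record per level="item" component."""
    root = ET.fromstring(ead_xml)
    records = []
    for el in root.iter():
        if el.get("level") != "item":
            continue
        did = el.find("did")
        records.append({
            "id": el.get("id"),
            "title": _text(did, "unittitle"),
            "date": _text(did, "unitdate"),
        })
    return records


item_records = mine_item_descmd(STANFORD_SAMPLE_EAD)
```

Each record could then seed the descMD of a digital object; how the resulting link is written back into AT is not covered here.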

Indiana Workflow

EAD generation in Oxygen
storage in Xubmit (VCS)
export to XTF for user presentation
EAD ingested into Fedora as one whole object (part of digital collection)
EAD mined to provide MODS descMD for individual digital objects (item level)
additional process enriches the XTF presentation with links to digital objects

Notre Dame

AT generates EAD
Heracles workflow system orchestrates processes for digitization, ingest, indexing
EAD ingested into Fedora
digital objects ingested into Fedora
indexing into solr
presentation via BL/Atrium

Northwestern

Archon produced EAD
JMS workflow-based ingest into Fedora as a single object / data stream
indexed into solr
presentation via Blacklight

not currently atomizing EAD on ingest into Fedora
no current method for providing links to individual digital objects. Would like to work on this, but want support for collapsing results, so that users do not drown in hits for components that share the same descMD
see http://findingaids.library.northwestern.edu/
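
One way to picture the "collapsing results" need, as a sketch only: group hits whose descriptive fields are identical and show one representative per group. The hit structure, the field names and the function name are assumptions, not Northwestern's index schema.

```python
from collections import OrderedDict

# Hypothetical search hits; "title" and "date" stand in for the shared descMD.
SAMPLE_HITS = [
    {"id": "c1", "title": "Photographs", "date": "1960"},
    {"id": "c2", "title": "Photographs", "date": "1960"},
    {"id": "c3", "title": "Scrapbook", "date": "1961"},
]


def collapse_by_descmd(hits, keys=("title", "date")):
    """Group hits with identical descriptive fields, preserving first-seen order."""
    groups = OrderedDict()
    for hit in hits:
        signature = tuple(hit.get(k, "") for k in keys)
        group = groups.setdefault(
            signature, {"representative": hit, "member_ids": []}
        )
        group["member_ids"].append(hit["id"])
    return list(groups.values())


collapsed = collapse_by_descmd(SAMPLE_HITS)
```

Here the two "Photographs" components collapse into one display row that still records both component ids, so links to the individual digital objects are not lost.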

RockHall

AT generates EAD
index using solr_ead gem, with OM and solrizer
solr (containing both MARC & EAD)
Blacklight

http://catalog.rockhall.com https://github.com/awead/solr_ead

indexing scheme filters out intermediate components (subseries, etc.)
some performance issues with collections that have thousands of items
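
A rough illustration of an indexing scheme that filters out intermediate components. This is a Python sketch under stated assumptions, not the solr_ead gem's behavior: it assumes un-namespaced EAD, treats level="series" and level="subseries" as intermediate, and uses made-up Solr field names (title_t, level_s, parent_titles_s).

```python
import xml.etree.ElementTree as ET

# Assumption: these level values count as "intermediate" and get no document.
INTERMEDIATE_LEVELS = {"series", "subseries"}

ROCKHALL_SAMPLE_EAD = """<ead><archdesc level="collection">
  <did><unittitle>Example Collection</unittitle></did>
  <dsc>
    <c01 level="series" id="s1"><did><unittitle>Correspondence</unittitle></did>
      <c02 level="subseries" id="ss1"><did><unittitle>Outgoing</unittitle></did>
        <c03 level="file" id="f1"><did><unittitle>Letters, 1950</unittitle></did></c03>
        <c03 level="item" id="i1"><did><unittitle>Telegram</unittitle></did></c03>
      </c02>
    </c01>
  </dsc>
</archdesc></ead>"""


def is_component(el):
    """True for <c> and numbered components <c01> .. <c12>."""
    tag = el.tag
    return tag == "c" or (len(tag) == 3 and tag[0] == "c" and tag[1:].isdigit())


def title_of(el):
    title = el.find("did/unittitle")
    return " ".join("".join(title.itertext()).split()) if title is not None else ""


def component_docs(ead_xml, ead_id):
    """Emit one flat dict per non-intermediate component, keeping parent titles."""
    docs = []

    def walk(el, parents):
        for child in el:
            if not is_component(child):
                continue
            title = title_of(child)
            if child.get("level") not in INTERMEDIATE_LEVELS:
                docs.append({
                    "id": f"{ead_id}-{child.get('id')}",
                    "title_t": title,
                    "level_s": child.get("level"),
                    "parent_titles_s": list(parents),
                })
            walk(child, parents + [title])

    dsc = ET.fromstring(ead_xml).find(".//dsc")
    if dsc is not None:
        walk(dsc, [])
    return docs


docs = component_docs(ROCKHALL_SAMPLE_EAD, "example")
```

Series and subseries produce no documents of their own, but their titles travel with each descendant so the hierarchy can still be shown in a result.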

Virginia

EAD is produced by heterogeneous workflows due to evolving workflow or different sources (Special Collections vs Law).
Current effort is to atomize EAD in Fedora, then index via SOLR and disseminate via Blacklight.
Why we atomize EAD:
- We have broken the EAD up (losslessly) to support appropriate linking to other objects in fedora.
- One of the major challenges of EAD is that it represents an abstract view of the collection that is almost always incomplete with regard to actual real-world items. For instance, even the most rigorous EADs do not go into the details about an item that are needed to present a digital representation of it, such as the individual scanned pages. Even more common are EADs that don't describe the collection at the item level at all, but instead just refer to a series or some other logical grouping of items. Our approach allows us to easily link digitized materials onto the logical components. EAD isn't suited for this level of detail.
- Furthermore, our approach allows us to represent the same resource in multiple contexts.
- It also allows us to link our finding aids with the MARC records used for accessioning and holdings information (as well as additional descriptive information).
- We chose such an ambitious and complex representation of these finding aids in Fedora in order to support a wider array of use cases. It makes some of our work normalizing the current data more difficult, but because we embarked on this *before* there was a comprehensive effort to normalize the descriptive output of special collections (and other sources), we need our data model to be independent of the data formats we currently have available (i.e., we won't always be getting EAD XML files anything like the ones we have now, if we even get them at all).
- Status: We are working on a pilot project to flexibly disseminate hierarchical content. Project will be discussed at DLF 2012.
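
The atomizing idea above can be sketched as follows. This is an illustration under assumptions, not Virginia's data model: it assumes un-namespaced EAD, keeps each component's own (non-component) XML verbatim so nothing is discarded, records parent/child order, and leaves a slot for linking digital objects. All names and the PID "example:123" are hypothetical.

```python
import xml.etree.ElementTree as ET

VIRGINIA_SAMPLE_EAD = """<ead><archdesc level="collection" id="coll">
  <did><unittitle>Example Collection</unittitle></did>
  <dsc>
    <c01 level="series" id="s1"><did><unittitle>Correspondence</unittitle></did>
      <c02 level="file" id="f1"><did><unittitle>Folder 1</unittitle></did></c02>
      <c02 level="file" id="f2"><did><unittitle>Folder 2</unittitle></did></c02>
    </c01>
  </dsc>
</archdesc></ead>"""


def is_component(el):
    """True for <c> and numbered components <c01> .. <c12>."""
    tag = el.tag
    return tag == "c" or (len(tag) == 3 and tag[0] == "c" and tag[1:].isdigit())


def atomize(ead_xml):
    """Split archdesc and every component into separate, linked node records."""
    nodes = {}
    counter = 0

    def visit(el, parent_id):
        nonlocal counter
        counter += 1
        node_id = el.get("id") or f"node-{counter}"
        nodes[node_id] = {
            "parent": parent_id,
            "children": [],          # ordered child ids
            "level": el.get("level"),
            # Everything that is not a child component stays with this node.
            "own_xml": [
                ET.tostring(child, encoding="unicode")
                for child in el
                if child.tag != "dsc" and not is_component(child)
            ],
            "digital_objects": [],   # ids of linked digital objects
        }
        if parent_id is not None:
            nodes[parent_id]["children"].append(node_id)
        dsc = el.find("dsc")
        container = dsc if dsc is not None else el
        for child in container:
            if is_component(child):
                visit(child, node_id)

    visit(ET.fromstring(ead_xml).find("archdesc"), None)
    return nodes


def link_digital_object(nodes, node_id, object_id):
    """Attach a digital object to a logical component."""
    nodes[node_id]["digital_objects"].append(object_id)


nodes = atomize(VIRGINIA_SAMPLE_EAD)
link_digital_object(nodes, "f1", "example:123")
```

Because each node carries its parent id, its ordered children and its own XML, the hierarchy can be rebuilt for display, while scanned pages attach to a folder without the EAD having to describe them.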


Follow Ons:

explore making Atrium a tool for EAD delivery (faceted search & browse as well as hierarchical browse)
- have a default EAD hierarchical browse template using collection & exhibit generators
explore strategies for successful
share mock ups of desired output (Stanford -> ND based on Italian collection)
Stanford looking to have a workable EAD search/browse solution (hopefully using Atrium) by Winter Quarter
- target usable (but Beta) product by C4L (Feb 2013)
solr_ead gem from awead is looking for use & feedback
