EAD in Hydra-Blacklight-Atrium

9/11/12

EAD in Hydra / BL

Compare EAD in Fedora vs. EAD in Solr
Identify common approaches, common needs
Identify opportunities for collective advancement

Indiana, Northwestern, Notre Dame, Virginia = Fedora then solr
Stanford, RockHall = solr w/ digital objects in Fedora

Common Requirements:

  • EAD Generation (in AT, Archon or Oxygen)
  • Digital Item linking in Finding Aid
  • Finding Aid presentation (search and hierarchical browse)
  • Item Level Discovery

Stanford Workflow

  • AT produces EAD
  • EAD mined for item-level descMD for digital objects in Fedora
  • links to digital objects put back into AT
  • EAD w/ PURLs indexed into solr
  • presentation via Blacklight
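
The "EAD mined for item-level descMD" step can be illustrated with a minimal sketch. It assumes a namespace-free EAD fragment and item components marked with level="item"; the sample finding aid, the mine_items helper, and the dictionary keys are hypothetical and do not represent Stanford's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, namespace-free EAD fragment. Real finding aids are larger
# and usually declare an XML namespace that the lookups would need to handle.
SAMPLE_EAD = """\
<ead>
  <archdesc level="collection">
    <did><unittitle>Sample Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="item" id="item1">
          <did>
            <unittitle>Letter to a colleague</unittitle>
            <unitdate>1912</unitdate>
          </did>
        </c02>
        <c02 level="item" id="item2">
          <did><unittitle>Telegram</unittitle></did>
        </c02>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""


def mine_items(ead_xml):
    """Return one dict of descriptive metadata per item-level component."""
    root = ET.fromstring(ead_xml)
    items = []
    for elem in root.iter():  # document order
        if elem.get("level") != "item":
            continue
        did = elem.find("did")
        if did is None:
            continue  # no descriptive identification block to mine
        items.append({
            "id": elem.get("id"),
            "title": did.findtext("unittitle", default="").strip(),
            "date": did.findtext("unitdate"),  # None when the EAD has no date
        })
    return items


for item in mine_items(SAMPLE_EAD):
    print(item)
```

Each dictionary could then seed the descMD of one digital object; how the resulting PURLs are written back into AT is outside this sketch.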

Indiana Workflow

  • EAD generation in Oxygen
  • storage in Xubmit (VCS)
  • export to XTF for user presentation
  • EAD ingested into Fedora as one whole object (part of digital collection)
  • EAD mined to provide MODS descMD for individual digital objects (item level)
  • additional process enriches the XTF presentation with links to digital objects
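
As a rough illustration of the "EAD mined to provide MODS descMD" step, the sketch below builds a very small MODS record from values that would already have been pulled out of an item-level component. The item_to_mods function and its parameters are hypothetical; the element choices (titleInfo, originInfo/dateCreated, relatedItem type="host") are one plausible mapping, not Indiana's actual crosswalk.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)


def _tag(name):
    """Qualify a local element name with the MODS namespace."""
    return f"{{{MODS_NS}}}{name}"


def item_to_mods(title, date=None, collection_title=None):
    """Build a minimal MODS element for one item (hypothetical mapping)."""
    mods = ET.Element(_tag("mods"))
    title_info = ET.SubElement(mods, _tag("titleInfo"))
    ET.SubElement(title_info, _tag("title")).text = title
    if date:
        origin = ET.SubElement(mods, _tag("originInfo"))
        ET.SubElement(origin, _tag("dateCreated")).text = date
    if collection_title:
        # Record the finding aid's collection as the host of this item.
        host = ET.SubElement(mods, _tag("relatedItem"), {"type": "host"})
        host_title = ET.SubElement(host, _tag("titleInfo"))
        ET.SubElement(host_title, _tag("title")).text = collection_title
    return mods


record = item_to_mods("Letter to a colleague", "1912", "Sample Papers")
print(ET.tostring(record, encoding="unicode"))
```

Keeping the collection title in the item record is one way to preserve context once the item is separated from the finding aid.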

Notre Dame

  • AT generates EAD
  • Heracles workflow system orchestrates processes for digitization, ingest, indexing
  • EAD ingested into Fedora
  • digital objects ingested into Fedora
  • indexing into solr
  • presentation via BL/Atrium

Northwestern

  • Archon produced EAD
  • JMS workflow-based ingest into Fedora as a single object / data stream
  • indexed into solr
  • presentation via Blacklight

Not currently atomizing EAD on ingest into Fedora.
No current method for providing links to individual digital objects. Would like to work on this, but want support for collapsing results so that users are not drowned in hits for components with the same descMD.
see http://findingaids.library.northwestern.edu/
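
The wish to collapse results for components that share the same descMD could look something like the following client-side sketch. The hit dictionaries, the collapse_hits function, and the choice of title plus date as the collapse key are all assumptions for illustration; Solr's result grouping (group=true with group.field) may be a server-side alternative worth evaluating.

```python
from collections import OrderedDict


def collapse_hits(hits, key_fields=("title", "date")):
    """Group hits whose descriptive metadata matches on key_fields.

    Returns one entry per distinct key, in first-seen order, holding a
    representative hit, the number of collapsed hits, and their ids.
    """
    groups = OrderedDict()
    for hit in hits:
        key = tuple(hit.get(field) for field in key_fields)
        if key not in groups:
            groups[key] = {"representative": hit, "count": 0, "ids": []}
        groups[key]["count"] += 1
        groups[key]["ids"].append(hit["id"])
    return list(groups.values())


# Hypothetical search hits: three components share identical descMD.
hits = [
    {"id": "c1", "title": "Photographs", "date": "1950"},
    {"id": "c2", "title": "Photographs", "date": "1950"},
    {"id": "c3", "title": "Minutes", "date": "1951"},
    {"id": "c4", "title": "Photographs", "date": "1950"},
]

for group in collapse_hits(hits):
    print(group["representative"]["title"], group["count"], group["ids"])
```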

RockHall

  • AT generates EAD
  • index using solr_EAD gem, with OM and solrizer
  • solr (containing both MARC & EAD)
  • Blacklight

http://catalog.rockhall.com
https://github.com/awead/solr_ead

indexing scheme filters out intermediate components (subseries, etc.)
some performance issues with collections that have thousands of items
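
To make the idea of filtering out intermediate components concrete, here is a minimal sketch that walks an EAD component tree and emits index documents only for components with no child components, carrying ancestor titles along as context. It is not the solr_ead gem's implementation; the leaf_docs function, the sample EAD, and the document field names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches unnumbered <c> and numbered <c01>..<c12> component tags.
COMPONENT_TAG = re.compile(r"^c(\d\d)?$")

SAMPLE_EAD = """\
<ead><archdesc><did><unittitle>Sample Papers</unittitle></did><dsc>
  <c01 id="s1" level="series"><did><unittitle>Correspondence</unittitle></did>
    <c02 id="ss1" level="subseries"><did><unittitle>Outgoing</unittitle></did>
      <c03 id="i1" level="item"><did><unittitle>Letter</unittitle></did></c03>
    </c02>
  </c01>
  <c01 id="i2" level="item"><did><unittitle>Scrapbook</unittitle></did></c01>
</dsc></archdesc></ead>
"""


def is_component(elem):
    return COMPONENT_TAG.match(elem.tag) is not None


def leaf_docs(parent, ancestors=()):
    """Return index documents for leaf components only."""
    docs = []
    for child in parent:
        if not is_component(child):
            # Descend through wrappers such as <archdesc> and <dsc>.
            docs.extend(leaf_docs(child, ancestors))
            continue
        title = (child.findtext("did/unittitle") or "").strip()
        if any(is_component(c) for c in child):
            # Intermediate component: skip it, but remember its title.
            docs.extend(leaf_docs(child, ancestors + (title,)))
        else:
            docs.append({
                "id": child.get("id"),
                "title": title,
                "parent_titles": list(ancestors),
            })
    return docs


for doc in leaf_docs(ET.fromstring(SAMPLE_EAD)):
    print(doc)
```

Because the skipped series and subseries titles travel with each leaf document, hierarchy can still be shown in a result without indexing every intermediate level.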

Virginia

  • EAD is produced by heterogeneous workflows due to evolving workflow or different sources (Special Collections vs Law).
  • Current effort is to atomize EAD in Fedora, then index via solr and disseminate via Blacklight.
  • The why behind atomizing EAD:
    • We have broken the EAD up (losslessly) to support appropriate linking to other objects in fedora.
    • One of the major challenges of EAD is that it represents an abstract view of the collection that is almost always incomplete with regard to actual real-world items. For instance, even the most rigorous EADs do not go into the details about an item that are needed to present a digital representation of it, such as the individual scanned pages. Even more common are EADs that do not describe the collection at the item level at all, but instead just refer to a series or some other logical grouping of items. Our approach allows us to easily link digitized materials onto the logical components. EAD isn't suited for this level of detail.
    • Furthermore, our approach allows us to represent the same resource in multiple contexts.  
    • It also allows us to link our finding aids with the MARC records used for accessioning and holdings information (as well as additional descriptive information).
    • We chose such an ambitious and complex representation of these finding aids in Fedora in order to support a wider array of use cases. It makes some of our work normalizing the current data more difficult, but because we embarked on this *before* there was a comprehensive effort to normalize the descriptive output of special collections (and other sources), we need our data model to be independent of the data formats we currently have available (i.e., we won't always be getting EAD XML files anything like the ones we have now, if we even get them at all).
    • Status: We are working on a pilot project to flexibly disseminate hierarchical content. Project will be discussed at DLF 2012.
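
The notion of breaking an EAD up losslessly can be sketched as follows: each component becomes its own record holding only its own markup plus a parent id and a sibling position, and the tree can be rebuilt from those records. This is an illustration only, not Virginia's Fedora data model. It assumes unnumbered <c> components with unique id attributes, nested components placed after a component's other children, and no significant whitespace between elements; atomize, reassemble, and the record keys are hypothetical names.

```python
import copy
import xml.etree.ElementTree as ET

SAMPLE = (
    '<c id="series1" level="series"><did><unittitle>Correspondence</unittitle></did>'
    '<c id="item1" level="item"><did><unittitle>Letter</unittitle></did></c>'
    '<c id="item2" level="item"><did><unittitle>Telegram</unittitle></did></c>'
    '</c>'
)


def atomize(component, parent_id=None, position=0, out=None):
    """Split a component tree into flat records, one per component."""
    if out is None:
        out = []
    own = copy.deepcopy(component)
    for child in list(own):
        if child.tag == "c":
            own.remove(child)  # keep only this component's own markup
    out.append({
        "id": component.get("id"),
        "parent": parent_id,
        "position": position,
        "xml": ET.tostring(own, encoding="unicode"),
    })
    children = [c for c in component if c.tag == "c"]
    for pos, child in enumerate(children):
        atomize(child, component.get("id"), pos, out)
    return out


def reassemble(records):
    """Rebuild the component tree from the flat records."""
    elems = {r["id"]: ET.fromstring(r["xml"]) for r in records}
    root = None
    # Sorting by position keeps siblings in their original order.
    for r in sorted(records, key=lambda r: r["position"]):
        if r["parent"] is None:
            root = elems[r["id"]]
        else:
            elems[r["parent"]].append(elems[r["id"]])
    return root


original = ET.fromstring(SAMPLE)
records = atomize(original)
rebuilt = reassemble(records)
print(ET.tostring(rebuilt, encoding="unicode") == ET.tostring(original, encoding="unicode"))
```

Once each component is a separate record, links to digitized materials or to MARC records can be attached to the record rather than squeezed into the EAD itself.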

Follow Ons:

  • explore making Atrium a tool for EAD delivery (faceted search & browse as well as hierarchical browse)
    • have a default EAD hierarchical browse template using collection & exhibit generators
  • explore strategies for successful
  • share mock ups of desired output (Stanford -> ND based on Italian collection)
  • Stanford looking to have a workable EAD search/browse solution (hopefully using Atrium) by Winter Quarter
    • target usable (but Beta) product by C4L (Feb 2013)
  • solr_EAD gem from Awead looking for use & feedback