Hydra North Use Cases
Since we haven't got our contributor agreements filed yet, I'm parking these here rather than submitting a PR. - pbinkley
METS/ALTO Newspaper Issues
Given a newspaper issue that has been digitized in the METS/ALTO format, with article-level MODS records embedded in the METS
As a repository I want to
store the digital images and the METS/ALTO files, maintaining their relationships
store the article-level MODS as discrete items, maintaining their relationships to zones on pages as specified in the METS/ALTO structures
search and display individual articles
extract the text of a given article from its zones for indexing
display the zones for a given article
So that articles can be presented as top-level objects in the discovery system with views of the appropriate zones on the appropriate pages.
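As a rough illustration of the "extract the text of a given article from its zones" item, the sketch below pulls the words out of selected ALTO text blocks. It assumes the block IDs belonging to an article have already been read from the METS structure; the sample page, the IDs `P1_TB1` and `P1_TB2`, and the function names are hypothetical and only there to make the example runnable.

```python
import xml.etree.ElementTree as ET

# Tiny hypothetical ALTO page with two text blocks (zones).
ALTO_SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1">
      <PrintSpace>
        <TextBlock ID="P1_TB1">
          <TextLine>
            <String CONTENT="City"/><SP/><String CONTENT="council"/>
          </TextLine>
          <TextLine>
            <String CONTENT="meets"/><SP/><String CONTENT="tonight"/>
          </TextLine>
        </TextBlock>
        <TextBlock ID="P1_TB2">
          <TextLine>
            <String CONTENT="Weather"/><SP/><String CONTENT="report"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def local_name(tag):
    """Strip the '{namespace}' prefix so the ALTO version does not matter."""
    return tag.rsplit("}", 1)[-1]


def article_text(alto_xml, block_ids):
    """Return the text of the given zones, one output line per TextLine."""
    root = ET.fromstring(alto_xml)
    blocks = {
        el.get("ID"): el for el in root.iter() if local_name(el.tag) == "TextBlock"
    }
    lines = []
    for block_id in block_ids:  # keep the reading order supplied by the caller
        for line in blocks[block_id].iter():
            if local_name(line.tag) != "TextLine":
                continue
            words = [
                child.get("CONTENT", "")
                for child in line
                if local_name(child.tag) == "String"
            ]
            lines.append(" ".join(words))
    return "\n".join(lines)


print(article_text(ALTO_SAMPLE, ["P1_TB1"]))
```

An article that spans several pages would need one such call per ALTO file, with the results concatenated in the order given by the METS structure.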
METS/ALTO Monographs
Given a monograph that has been digitized in the METS/ALTO format, with metadata in the form of a MODS record
As a repository I want to
store the digital images and the METS/ALTO files, maintaining their relationships
extract the full text from the ALTO files for indexing
display search results in the form of page images with overlaid highlighting to show the positions of the search terms (i.e. the nesting of text blocks within the page is used in the display)
enable specifying sections of a work (e.g. chapters) which may be discoverable (i.e. have their own metadata) or merely navigable (e.g. from a table of contents in the monograph-level metadata record)
display a table of contents, with links to individual pages
display the book in the Internet Archive bookreader, with full-text search enabled
So that the book is navigable and searchable.
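The highlighting item above depends on the word coordinates that ALTO records for each `String` element. The following is a minimal sketch, not a description of any existing Hydra component: the sample page and the function name `highlight_boxes` are made up for illustration, and matching is a simple case-insensitive comparison after trimming punctuation.

```python
import xml.etree.ElementTree as ET

# Hypothetical ALTO page; coordinates are in the file's own measurement unit.
ALTO_PAGE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout>
    <Page ID="P1" WIDTH="2000" HEIGHT="3000">
      <PrintSpace>
        <TextBlock ID="TB1">
          <TextLine>
            <String CONTENT="The" HPOS="100" VPOS="200" WIDTH="90" HEIGHT="40"/>
            <String CONTENT="Railway," HPOS="200" VPOS="200" WIDTH="210" HEIGHT="40"/>
          </TextLine>
          <TextLine>
            <String CONTENT="railway" HPOS="100" VPOS="260" WIDTH="190" HEIGHT="40"/>
          </TextLine>
        </TextBlock>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""


def highlight_boxes(alto_xml, term):
    """Return an (x, y, width, height) tuple for every word matching `term`."""
    wanted = term.casefold()
    boxes = []
    for el in ET.fromstring(alto_xml).iter():
        if el.tag.rsplit("}", 1)[-1] != "String":  # ignore the namespace prefix
            continue
        word = el.get("CONTENT", "").strip(".,;:!?\"'()").casefold()
        if word == wanted:
            boxes.append(
                tuple(float(el.get(k)) for k in ("HPOS", "VPOS", "WIDTH", "HEIGHT"))
            )
    return boxes


boxes = highlight_boxes(ALTO_PAGE, "railway")
print(boxes)
```

A viewer would still have to scale these boxes from the ALTO measurement unit to the pixel size of the image actually displayed.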
Scrapbooks
Given a digitized scrapbook which has been scanned at the page level as well as at the level of each item (clipping, picture, etc.) attached to a given page, for which MODS records have been created both at the scrapbook level and at the attached item level
As a repository I want to
make the scrapbook browsable at the page level, with the ability to view individual items independently
make the scrapbook and the individual items discoverable as top-level items in the discovery system
display the individual items in the context of their page and independently
So that the contents of the scrapbook are accessible in as rich a way as possible.
Research Data Sets
Given a research project with multiple researchers producing a number of datasets
As a data curator I want to
archive these datasets while maintaining relationships between a project and its datasets and sub-datasets, if any, so that at any point in the future users are able to pull all related datasets
archive metadata at the project level as well as at each dataset level
archive researcher information and maintain relationships with datasets they produced so that users are able to see contributions of a particular researcher
version datasets at the object level, the file level, or both
optimize storage when multiple versions of large objects or files are kept
So that future users are able to discover, interpret and reuse these research contributions.
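One common way to meet the storage requirement for versioned files is to store each file's bytes once, keyed by checksum, and let every version be a manifest that points at those checksums. The sketch below shows that idea with a toy in-memory store; the class `VersionedStore` and the file names are hypothetical, and a real repository would persist the blobs and manifests rather than hold them in memory.

```python
import hashlib


class VersionedStore:
    """Toy store: identical file contents are kept once, keyed by SHA-256."""

    def __init__(self):
        self.blobs = {}     # digest -> bytes
        self.versions = []  # one manifest per version: {filename: digest}

    def commit(self, files):
        """Record a new dataset version from a {filename: bytes} mapping."""
        manifest = {}
        for name, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            self.blobs.setdefault(digest, data)  # unchanged files add nothing
            manifest[name] = digest
        self.versions.append(manifest)
        return len(self.versions)  # 1-based version number

    def checkout(self, version):
        """Rebuild the {filename: bytes} mapping for a given version."""
        manifest = self.versions[version - 1]
        return {name: self.blobs[digest] for name, digest in manifest.items()}

    def stored_bytes(self):
        """Total size of the unique blobs actually held."""
        return sum(len(data) for data in self.blobs.values())


store = VersionedStore()
v1 = store.commit({"readings.csv": b"a,b\n1,2\n", "README.txt": b"first draft"})
v2 = store.commit({"readings.csv": b"a,b\n1,2\n", "README.txt": b"second draft"})
print(v1, v2, len(store.blobs), store.stored_bytes())
```

Here the unchanged `readings.csv` is stored once even though it appears in both versions. This works at the whole-file level only; large files that change slightly between versions would need a chunked or delta-based scheme instead.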
Archival documents
Given a digitized archival collection with multiple object types and formats,
As an archivist I want to
provide access to manuscript letters, where multiple text pages may be on a single folded physical page, and the order may not be consistent
allow flexibility in storing archival units (e.g. some at folder level, some at item level)
So that archival collections can be presented to users with as much richness of content and navigation as we can afford in our digitization and metadata work.
Annotated items
Given a research project that makes heavy use of linked open data to manage relationships among objects, where the objects may include entities (person, organization, place, custom, bib-record, work, annotation, event) and their representations (document, event, image, map, audio, video)
As a researcher I want to
maintain the relationship between an entity and all its representations and annotations
store aggregation details, whether the aggregation is at the collection level or the project level
store information about a representation object pointing to multiple entities
archive all major revisions of entities/representations/annotations
So that an annotated object can be properly presented and associated with all its representations.
Website in a WARC
Given a WARC file for a website which is part of a bigger collection, e.g. Government Websites
As a web archiving coordinator I want to
keep WARC files from a particular area (e.g. Government Websites) under one collection, so that users have access to all the related WARC files in one place
provide access to all the PDFs harvested within a single WARC file, so that users interested in a specific object can find and use it
provide bulk or otherwise efficient access to the collection
So that researchers can study the collection or perform computational analysis on it (e.g. data mining).
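To make the "PDFs harvested within a single WARC file" item concrete, here is a minimal sketch that walks the records of an uncompressed WARC stream and lists the responses whose HTTP Content-Type is `application/pdf`. The parser is deliberately simplistic and hand-written for illustration; the helper names and the `example.gov` URLs are hypothetical, and production use would more likely rely on an established WARC library and handle gzip-compressed files.

```python
import io


def iter_warc_records(stream):
    """Yield (warc_headers, content_block) from an uncompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return                    # end of file
        if not line.strip():
            continue                  # blank lines separating records
        headers = {}                  # `line` was the "WARC/1.0" version line
        while True:
            header_line = stream.readline()
            if not header_line.strip():
                break                 # blank line ends the WARC header block
            name, _, value = header_line.decode("utf-8").partition(":")
            headers[name.strip().lower()] = value.strip()
        yield headers, stream.read(int(headers.get("content-length", "0")))


def list_pdfs(stream):
    """Return (target URI, payload size) for each PDF response record."""
    found = []
    for headers, block in iter_warc_records(stream):
        if headers.get("warc-type") != "response":
            continue
        http_head, _, payload = block.partition(b"\r\n\r\n")
        for raw in http_head.split(b"\r\n")[1:]:  # skip the HTTP status line
            name, _, value = raw.decode("latin-1").partition(":")
            if (name.strip().lower() == "content-type"
                    and value.strip().lower().startswith("application/pdf")):
                found.append((headers.get("warc-target-uri"), len(payload)))
    return found


def make_response_record(uri, content_type, payload):
    """Build a fake response record (test data only, not a full WARC record)."""
    http = f"HTTP/1.1 200 OK\r\nContent-Type: {content_type}\r\n\r\n".encode() + payload
    head = (f"WARC/1.0\r\nWARC-Type: response\r\nWARC-Target-URI: {uri}\r\n"
            f"Content-Length: {len(http)}\r\n\r\n").encode()
    return head + http + b"\r\n\r\n"


sample = io.BytesIO(
    make_response_record("http://example.gov/report.pdf", "application/pdf", b"%PDF-1.4 fake")
    + make_response_record("http://example.gov/index.html", "text/html", b"<html></html>")
)
pdfs = list_pdfs(sample)
print(pdfs)
```

The resulting list of URIs could feed an index, so that each harvested PDF becomes findable on its own while the WARC file remains the stored object.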