SoaF Use Cases

Use cases by file format

These scenarios describe the need to call up segments within a file in Hydra applications, organized by file format (image, time-based media, PDF, disk image, XML, etc.).

Time-based Media

Avalon uses W3C Media Fragments to call up segments of a file. For example, Track 2 is called up with https://pawpaw.dlib.indiana.edu/media_objects/avalon:1854/section/avalon:1855?t=131.0,332.0. The web page that shows the player uses this spec to call up a fragment of an audio or video media object. Avalon does not store URIs containing these parameters; instead, it stores start and end time points in custom XML as a bit stream on the file object (MasterFile) that is part of the media object (MediaObject).
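As an illustration of how a client could read the time range out of such a URI, here is a minimal Python sketch. The function name parse_time_range is hypothetical and is not part of Avalon. The sketch handles only times given in plain seconds, not the other time formats that the Media Fragments spec allows.

```python
from urllib.parse import urlsplit, parse_qs


def parse_time_range(uri):
    """Return (start, end) in seconds from a t=start,end parameter, or None.

    Hypothetical helper: accepts plain seconds only.
    """
    parts = urlsplit(uri)
    # Look for the t dimension in the query first, then in the fragment.
    for component in (parts.query, parts.fragment):
        values = parse_qs(component).get("t")
        if values:
            start_text, _, end_text = values[0].partition(",")
            start = float(start_text) if start_text else 0.0
            end = float(end_text) if end_text else None  # None = to the end
            return start, end
    return None


uri = ("https://pawpaw.dlib.indiana.edu/media_objects/"
       "avalon:1854/section/avalon:1855?t=131.0,332.0")
print(parse_time_range(uri))
```

For the Track 2 example, this yields a start of 131.0 seconds and an end of 332.0 seconds.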

There are needs and issues for referring to segments of a file that Avalon has not handled yet. Right now a single text label is allowed for each start and end time point but no more descriptive metadata capabilities are available. The custom XML being used requires start and end times to be in a certain element (<Span>) and that element is not allowed to contain any further elements.

Audiovisual files will also receive end-user annotations: playlists, private annotations, or private segments. These do not belong with the MasterFile object, but they still need to be stored somewhere and called up by the same method (W3C Media Fragments).

Time-based Media - Visual-only

One more way to refer to segments of a time-based media file is spatially, specifically for visual time-based media. For example, showing a static portion of the screen view for a video file during playback, such as the top left corner of a screen view (regardless of zooming, panning, or other camera movements).
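W3C Media Fragments also defines a spatial dimension (xywh) that can be combined with the temporal one, which could cover this case. The following Python sketch builds such a fragment. The helper name media_fragment is hypothetical, and nothing here implies that Avalon or any particular player currently honors the spatial dimension.

```python
def media_fragment(start=None, end=None, region=None, unit="percent"):
    """Build a Media Fragments string such as '#t=10,20&xywh=percent:0,0,50,50'.

    region is an (x, y, width, height) tuple; unit is 'percent' or 'pixel'.
    Hypothetical helper for illustration only.
    """
    parts = []
    if start is not None or end is not None:
        start_text = "" if start is None else f"{start:g}"
        end_text = "" if end is None else f"{end:g}"
        parts.append(f"t={start_text},{end_text}" if end_text else f"t={start_text}")
    if region is not None:
        x, y, w, h = region
        parts.append(f"xywh={unit}:{x},{y},{w},{h}")
    return "#" + "&".join(parts) if parts else ""


# Top left quarter of the frame between 131 and 332 seconds.
print(media_fragment(131, 332, region=(0, 0, 50, 50)))
```

Because the region is fixed for the whole time range, it matches the "static portion of the screen view" described above; it does not follow objects as the camera moves.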

XML-encoded texts

XML texts stored in and delivered by Hydra systems should support search, retrieval, and display features that have come to be expected from other platforms.  XML documents to be considered include both fully-encoded texts as well as XML wrappers for structuring page images with OCR.

 

Use cases to consider:

  • Selecting a specific region of an XML document:
    • Restrict a search to all instances of an element
    • Retrieve content from a specific instance of an element for display
  • Explicitly describing the relationships between parts of a Work that is represented in the repository both as multiple PCDM objects and within a single XML file object. 
    • A digitally reformatted book is modeled in PCDM: the book as a whole is a Work, and each page of the book is a child Object that has a File of the page scan. If a TEI-encoded document describing the same book also exists, how can we relate the page Objects to the corresponding regions of the XML document, and vice versa?
    • What if logical divisions were added to that XML document? How could we represent the relationships between those regions and the groups of page Objects they "contain"?
    • How could a user, who encounters a page image as an object in the repository, be referred to the segment of a related XML document that contains the OCR transcript?
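The selection cases above can be sketched in Python with the standard library. Everything in the sample is an assumption made for illustration: the TEI snippet is invented, and using the facs attribute on pb (page break) elements to carry page Object identifiers is only one possible way to link the two representations.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented sample; the facs values stand in for PCDM page Object identifiers.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <div xml:id="ch1" type="chapter">
      <pb n="1" facs="page-object-1"/>
      <p>First page text.</p>
      <pb n="2" facs="page-object-2"/>
      <p>Second page text.</p>
    </div>
  </body></text>
</TEI>"""

root = ET.fromstring(SAMPLE)


def all_instances(root, name):
    """All instances of an element, e.g. to restrict a search to them."""
    return list(root.iter(TEI + name))


def by_id(root, xml_id):
    """One specific instance of an element, looked up by its xml:id."""
    for element in root.iter():
        if element.get(XML_ID) == xml_id:
            return element
    return None


def pages_in(element):
    """Page identifiers for the page breaks that fall inside a region."""
    return [pb.get("facs") for pb in element.iter(TEI + "pb")]


chapter = by_id(root, "ch1")
print(len(all_instances(root, "p")), pages_in(chapter))
```

The last function shows the direction from a logical division to the page Objects it "contains"; going from a page Object back to its transcript would need the reverse lookup on the same identifiers.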

PDFs

Use cases include the ability to refer to (and possibly link directly to) regions of a PDF in the following manners:

  • A specific page of a PDF
    • Supported natively by passing '#page=n' to the PDF viewer
  • A range of pages in a PDF
  • A predefined anchor point ("destination" in Adobe parlance) in a PDF
    • Supported natively by passing '#nameddest=x' to the PDF viewer
    • Requires that destination is already defined in the PDF
  • An arbitrary region of a PDF
    • Using two destinations as endpoints
    • Using two endpoints that aren't defined as destinations in the PDF
      • Means of last resort, when defining destinations in the PDF is not possible. 
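The natively supported cases can be expressed as simple link builders. This Python sketch uses hypothetical function names and an example.org URL. Since the list above names no native parameter for a page range, the range helper only returns one link per page as a workaround.

```python
from urllib.parse import quote


def pdf_page_link(pdf_url, page):
    """Link to a specific page using the '#page=n' open parameter."""
    if page < 1:
        raise ValueError("PDF page numbers start at 1")
    return f"{pdf_url}#page={page}"


def pdf_destination_link(pdf_url, name):
    """Link to a named destination that is already defined in the PDF."""
    return f"{pdf_url}#nameddest={quote(name, safe='')}"


def pdf_page_range_links(pdf_url, first, last):
    """Workaround for a page range: one '#page=n' link per page."""
    return [pdf_page_link(pdf_url, n) for n in range(first, last + 1)]


url = "https://example.org/files/report.pdf"  # hypothetical file URL
print(pdf_page_link(url, 3))
print(pdf_destination_link(url, "Chapter 1"))
print(pdf_page_range_links(url, 5, 7))
```

Whether the viewer honors these parameters depends on the PDF viewer in use.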

Images

Use case - Image files need to be referable by size, to show full or smaller versions of an image, and by region, to show portions of an image. This should work for any raster-based image file type (.jpg, .gif, .jp2, .png).
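One existing convention that addresses both size and region is the IIIF Image API URL pattern. This page does not name IIIF, so treat the Python sketch below as an assumption about one possible approach. The function name, server base URL, and identifier are hypothetical.

```python
from urllib.parse import quote


def image_request_url(base, identifier, region="full", size="max", fmt="jpg"):
    """Build a URL of the form {base}/{id}/{region}/{size}/0/default.{fmt}.

    region: "full" or an (x, y, width, height) tuple in pixels.
    size:   "max" (IIIF Image API 3.0; 2.x uses "full") or a
            (width, height) tuple where one value may be None.
    """
    if isinstance(region, str):
        region_part = region
    else:
        region_part = ",".join(str(value) for value in region)
    if isinstance(size, str):
        size_part = size
    else:
        width, height = size
        size_part = f"{'' if width is None else width},{'' if height is None else height}"
    return f"{base}/{quote(identifier, safe='')}/{region_part}/{size_part}/0/default.{fmt}"


base = "https://example.org/iiif"  # hypothetical image server
print(image_request_url(base, "map1"))                          # full image
print(image_request_url(base, "map1", size=(250, None)))        # smaller version
print(image_request_url(base, "map1", region=(0, 0, 500, 400)))  # portion
```

The source file type does not appear in the request, which fits the requirement that the same references work across raster formats.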

SVG - Use case - Apply SVG vectors as a mask to highlight segments of another image file (like a map in GeoBlacklight). Vector-based images can define rectangular or non-rectangular shapes.


Other possible use cases not yet defined - some with OA Fragment Selector, some not

XHTML - http://tools.ietf.org/rfc/rfc3236 - namedSection
Plain text - http://tools.ietf.org/rfc/rfc5147 - char=0,10
RDF/XML - http://tools.ietf.org/rfc/rfc3870 - namedResource
CSV - http://tools.ietf.org/rfc/rfc7111 - row=5-7
EPUB - http://www.idpf.org/epub/linking/cfi/epub-cfi.html - epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)
Disk image - no fragment selector available
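To make two of the fragment examples above concrete, here is a minimal Python sketch that resolves char=0,10 against plain text and row=5-7 against CSV. The function names are hypothetical. The sketch covers only the range forms shown in the list, treats character positions as zero-based gaps between characters, and counts every CSV record from 1; the RFCs should be checked for the full syntax and for how header rows are treated.

```python
import csv
import io


def select_chars(text, fragment):
    """Resolve a 'char=start,end' range against a plain text string."""
    scheme, _, value = fragment.partition("=")
    if scheme != "char" or "," not in value:
        raise ValueError("this sketch only supports char=start,end ranges")
    start_text, _, end_text = value.partition(",")
    start = int(start_text) if start_text else 0
    end = int(end_text) if end_text else len(text)
    return text[start:end]


def select_rows(csv_text, fragment):
    """Resolve a 'row=first-last' (or 'row=n') selection against CSV text."""
    scheme, _, value = fragment.partition("=")
    if scheme != "row":
        raise ValueError("this sketch only supports row= selections")
    first_text, _, last_text = value.partition("-")
    first = int(first_text)
    last = int(last_text) if last_text else first
    rows = list(csv.reader(io.StringIO(csv_text)))
    return rows[first - 1:last]  # rows are counted from 1 in this sketch


sample_text = "Hello, world! More text follows."
sample_csv = "\n".join(f"r{i},v{i}" for i in range(1, 9))
print(select_chars(sample_text, "char=0,10"))
print(select_rows(sample_csv, "row=5-7"))
```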