SoaF call 2015-11-02

Connection Info:
4pm Eastern
Google Hangout: https://hangouts.google.com/call/nkb5sb5bu5hamuzubmvqmrvnjea 

Attendees: 
Juliet Hardesty
wgcowan
Jeremy Morse
Esmé Cowles 

Agenda

OA Fragment Selector by format

XHTML - http://tools.ietf.org/rfc/rfc3236 - namedSection
PDF - http://tools.ietf.org/rfc/rfc3778 - page=10&viewrect=50,50,640,480
Plain text - http://tools.ietf.org/rfc/rfc5147 - char=0,10
XML - http://tools.ietf.org/rfc/rfc3023 - xpointer(/a/b/c)
RDF/XML - http://tools.ietf.org/rfc/rfc3870 - namedResource
CSV - http://tools.ietf.org/rfc/rfc7111 - row=5-7
Time-based media [spatial or temporal] - http://www.w3.org/TR/media-frags/ - xywh=50,50,640,480 or t=30,60
SVG - http://www.w3.org/TR/SVG/ - svgView(viewBox(50,50,640,480))
EPUB - http://www.idpf.org/epub/linking/cfi/epub-cfi.html - epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)
IIIF - http://iiif.io/api/annex/openannotation/index.html - xywh=100,150,500,30 or pct:0,0,10,10
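
As a rough illustration of how one row of the list above would be carried in an annotation, the sketch below builds an OA fragment selector as a Python dict. The `conformsTo` and `value` keys follow the OA fragment selector as we understand it. The helper names `fragment_selector` and `target`, the short keys in `SPECS`, and the source URL are made up for this example.

```python
import json

# Spec URLs copied from the list above; the short keys are hypothetical labels.
SPECS = {
    "pdf": "http://tools.ietf.org/rfc/rfc3778",
    "text": "http://tools.ietf.org/rfc/rfc5147",
    "csv": "http://tools.ietf.org/rfc/rfc7111",
    "media": "http://www.w3.org/TR/media-frags/",
}


def fragment_selector(fmt, value):
    """Return a FragmentSelector dict for one of the formats in SPECS."""
    return {
        "@type": "FragmentSelector",
        "conformsTo": SPECS[fmt],
        "value": value,
    }


def target(source, fmt, value):
    """Wrap a source URL and a fragment selector into an annotation target."""
    return {"source": source, "selector": fragment_selector(fmt, value)}


if __name__ == "__main__":
    # Placeholder source: seconds 30 to 60 of a video.
    print(json.dumps(target("http://example.org/video1", "media", "t=30,60"), indent=2))
```

The point of `conformsTo` is that the `value` string alone is ambiguous; the spec URL says how to read it.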

OA also includes the following selectors, which have no external fragment specification. They select by quoted text, by character position, or (for something like a disk image) by byte range:

Text quote selector - the body is the annotation content and the target is the fragment of the source where the annotation applies

{
  "@id": "http://example.org/anno16",
  "@type": "Annotation",
  "body": {"@id": "http://example.org/comment1"},
  "target": {
    "source": "http://example.org/page1",
    "selector": {
      "@type": "TextQuoteSelector",
      "exact": "annotation",
      "prefix": "this is an ",
      "suffix": " that has some"
    }
  }
}

Text position selector

{
  "@id": "http://example.org/anno17",
  "@type": "Annotation",
  "body": {"@id": "http://example.org/review1"},
  "target": {
    "source": "http://example.org/ebook1",
    "selector": {
      "@type": "TextPositionSelector",
      "start": 412,
      "end": 795
    }
  }
}
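
To make the difference between the quote and position selectors concrete, here is a minimal sketch of resolving a text quote (exact, prefix, suffix) to character offsets, and of selecting by offsets directly. The function names are hypothetical, the sample text is invented, and the end offset is assumed to be exclusive.

```python
def resolve_text_quote(text, exact, prefix="", suffix=""):
    """Find prefix+exact+suffix in text; return (start, end) of exact, or None."""
    needle = prefix + exact + suffix
    hit = text.find(needle)
    if hit == -1:
        return None
    start = hit + len(prefix)
    return start, start + len(exact)


def select_by_position(text, start, end):
    """Return the characters from start up to, but not including, end."""
    return text[start:end]


if __name__ == "__main__":
    page = "Note that this is an annotation that has some text around it."
    span = resolve_text_quote(page, "annotation", "this is an ", " that has some")
    print(span)
    print(select_by_position(page, *span))
```

A quote selector survives small edits elsewhere in the text because it is matched by content; a position selector is cheaper to apply but breaks if anything before it changes.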

Data position selector

{
  "@id": "http://example.org/anno18",
  "@type": "Annotation",
  "body": {"@id": "http://example.org/note1"},
  "target": {
    "source": "http://example.org/diskimg1",
    "selector": {
      "@type": "oa:DataPositionSelector",
      "start": 4096,
      "end": 4104
    }
  }
}
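
A minimal sketch of what applying a data position selector to a disk image could look like: seek to `start` and read up to `end`. The function name is hypothetical, and treating `end` as exclusive is an assumption that should be checked against the OA text.

```python
import io


def read_data_segment(stream, start, end):
    """Read bytes from start up to, but not including, end (assumed semantics)."""
    if start < 0 or end < start:
        raise ValueError("need 0 <= start <= end")
    stream.seek(start)
    return stream.read(end - start)


if __name__ == "__main__":
    # Stand-in for a disk image: 8 KiB of zeros with a marker at offset 4096.
    image = bytearray(8192)
    image[4096:4104] = b"SEGMENT!"
    print(read_data_segment(io.BytesIO(bytes(image)), 4096, 4104))
```

Any seekable binary file object works in place of `io.BytesIO`, for example a file opened with `open(path, "rb")`.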

Notes

  • Discussion of Use Cases:
    • Different file formats
    • Time-based media - segment of a video file, per the W3C Media Fragments specification
      • Segments created that way
      • User annotations - managed differently
    • XML encoded text example
      • general case - being able to select a region of a text for search and display
      • retrieve a chapter for example
      • restricting search
      • Specific case - XML describes something represented elsewhere in the repository
        • a reference to a digital page in the xml document
        • would work with both OCR file and the tiffs that make it up
        • how to represent in PCDM? how to relate the two schemes XML and PCDM?
      • how to link between all representations - indexed OCR, page TIFFS, and XML
    • Will attended OA conferences 4-5 years ago
      • how can you identify and annotate not just an entire image but a segment within an image
      • extended that to talk about being able to do that in a video as well, spatial and temporal
    • Similar to the XML encoding linked to an image: a page with an image on it, where the image itself is what needs to be referenced
      • important for newspapers, where a page can have 3-4 articles and multiple images that are connected to some articles and not others
    • Irregular shapes when you get into maps, manuscripts that are fragmented
  • OA Selector
    • Three pieces 
    • Byte range for disk image
    • Fragment selector by format
    • File types by fragments (see IIIF)
    • Calling up a fragment by extending the URL
    • Line up an XML document as a transcript for a page image.
  • How do the text selector, text position selector, text quote selector, and data position selector relate to the fragment selector?
  • Are these other things considered but not adopted? There seems to be overlap in capabilities. Why a text position selector when you have a text fragment selector?
  • What is the relation between the selectors in section 4.2 and the selectors discussed (text selector, text position selector, text quote selector, data position selector)?
  • Do we have a clear idea of the OA selector? It is open-ended.
  • Do we need to specify the exact selectors we will cover, not just say we will do the W3C OA selector specification?
  • Good starting point, but we need to do some more research into annotation fragment selectors.
  • Image, PDF, and disk image use cases spelled out.
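
The note above about calling up a fragment by extending the URL can be sketched as follows. This only handles the plain pixel form of `xywh`; the function names and the example URL are hypothetical, and nothing here is taken from an existing implementation.

```python
from urllib.parse import urldefrag


def with_fragment(url, fragment):
    """Return url with any existing fragment replaced by the given one."""
    base, _old = urldefrag(url)
    return base + "#" + fragment


def parse_xywh(fragment):
    """Parse 'xywh=x,y,w,h' (pixel form only) into four ints."""
    key, _, value = fragment.partition("=")
    if key != "xywh":
        raise ValueError("not an xywh fragment: " + fragment)
    x, y, w, h = (int(n) for n in value.split(","))
    return x, y, w, h


if __name__ == "__main__":
    url = with_fragment("http://example.org/page1.jpg", "xywh=50,50,640,480")
    print(url)
    print(parse_xywh(urldefrag(url).fragment))
```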

Action Items:
  • Julie: check with Rob Sanderson or John Stroop on where IIIF and other fragment specs are listed besides OA; also the image use case
  • Julie: Expand on time-based media use case to include spatial.
  • Jeremy: PDF use case
  • Julie: draft wiki page for segment of a file recommendation - pretty sure we can point to OA
    • OA is open-ended, though, so we need to list out standards we considered.
    • Point to use cases.

Next meeting: 16 November at 4 PM Eastern.