SoaF call 2015-11-02

Samvera Community Wiki



Connection Info:
4pm Eastern
Google Hangout: https://hangouts.google.com/call/nkb5sb5bu5hamuzubmvqmrvnjea 

Attendees: 
@Juliet Hardesty
@wgcowan
@Jeremy Morse
@Esmé Cowles 

Agenda

OA Fragment Selector by format

XHTML - http://tools.ietf.org/rfc/rfc3236 - namedSection
PDF - http://tools.ietf.org/rfc/rfc3778 - page=10&viewrect=50,50,640,480
Plain text - http://tools.ietf.org/rfc/rfc5147 - char=0,10
XML - http://tools.ietf.org/rfc/rfc3023 - xpointer(/a/b/c)
RDF/XML - http://tools.ietf.org/rfc/rfc3870 - namedResource
CSV - http://tools.ietf.org/rfc/rfc7111 - row=5-7
Time-based media [spatial or temporal] - http://www.w3.org/TR/media-frags/ - xywh=50,50,640,480 or t=30,60
SVG - http://www.w3.org/TR/SVG/ - svgView(viewBox(50,50,640,480))
EPUB - http://www.idpf.org/epub/linking/cfi/epub-cfi.html - epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)
IIIF - http://iiif.io/api/annex/openannotation/index.html - xywh=100,150,500,30 or pct:0,0,10,10
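As a rough illustration of how an entry from the list above might be wrapped in an OA fragment selector, the Python sketch below pairs a fragment value with the specification it conforms to. The helper name `fragment_target`, the short format labels, and the example source URL are hypothetical; the `conformsTo` and `value` keys are assumed to follow the OA FragmentSelector pattern.

```python
import json

# Specification URIs copied from the list above, keyed by a short
# format label (the labels themselves are made up for this sketch).
FRAGMENT_SPECS = {
    "pdf": "http://tools.ietf.org/rfc/rfc3778",
    "text": "http://tools.ietf.org/rfc/rfc5147",
    "csv": "http://tools.ietf.org/rfc/rfc7111",
    "media": "http://www.w3.org/TR/media-frags/",
}


def fragment_target(source, fmt, value):
    """Build an annotation target that selects a fragment of `source`.

    Hypothetical helper: `fmt` picks the specification the fragment
    value conforms to, and `value` is the fragment string itself.
    """
    return {
        "source": source,
        "selector": {
            "@type": "FragmentSelector",
            "conformsTo": FRAGMENT_SPECS[fmt],
            "value": value,
        },
    }


# A 30-60 second temporal segment of a (made-up) video resource.
target = fragment_target("http://example.org/video1", "media", "t=30,60")
print(json.dumps(target, indent=2))
```

Recording the specification alongside the value is what would let a client decide how to interpret a string such as `t=30,60` or `row=5-7`.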

The following selectors are also included in OA but have no separate format specification behind them. The data position selector offers a way to refer to segments of a disk image by byte range:

Text quote selector - the annotation is the body, and the target is the fragment of text that the annotation applies to

{
"@id": "http://example.org/anno16",
"@type": "Annotation",
"body": {"@id": "http://example.org/comment1"},
"target": {
"source": "http://example.org/page1",
"selector": {
"@type": "TextQuoteSelector",
"exact": "anotation",
"prefix": "this is an ",
"suffix": " that has some"
}
}
}

Text position selector

{
"@id": "http://example.org/anno17",
"@type": "Annotation",
"body": {"@id": "http://example.org/review1"},
"target": {
"source": "http://example.org/ebook1",
"selector": {
"@type": "TextPositionSelector",
"start": 412,
"end": 795
}
}
}

Data position selector

{
"@id": "http://example.org/anno18",
"@type": "Annotation",
"body": {"@id": "http://example.org/note1"},
"target": {
"source": "http://example.org/diskimg1",
"selector": {
"@type": "DataPositionSelector",
"start": 4096,
"end": 4104
}
}
}
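To make the difference between the three selectors concrete, here is a minimal Python sketch of how each might be resolved against content already loaded into memory. The function names and the sample text and bytes are hypothetical. The sketch assumes OA's convention that `start` is included and `end` is excluded, and it uses a naive first-match search for the text quote selector.

```python
def resolve_text_position(text, selector):
    """Return the characters from start (inclusive) to end (exclusive)."""
    return text[selector["start"]:selector["end"]]


def resolve_text_quote(text, selector):
    """Return (start, end) offsets of the quoted text, or None if absent.

    The prefix and suffix are used only to disambiguate the match.
    """
    prefix = selector.get("prefix", "")
    suffix = selector.get("suffix", "")
    hit = text.find(prefix + selector["exact"] + suffix)
    if hit == -1:
        return None
    start = hit + len(prefix)
    return start, start + len(selector["exact"])


def resolve_data_position(data, selector):
    """Return the bytes from start (inclusive) to end (exclusive)."""
    return data[selector["start"]:selector["end"]]


# Made-up page text containing the phrase from the text quote example.
page = "Here this is an anotation that has some text."
quote = {"exact": "anotation", "prefix": "this is an ", "suffix": " that has some"}
offsets = resolve_text_quote(page, quote)
print(offsets, page[offsets[0]:offsets[1]])

# Made-up 5120-byte "disk image"; the selector picks out 8 bytes.
image = bytes(range(256)) * 20
chunk = resolve_data_position(image, {"start": 4096, "end": 4104})
print(len(chunk))
```

A text quote selector survives small edits elsewhere in the document, whereas a position selector is cheaper to resolve but breaks as soon as the offsets shift. That trade-off may be part of why OA offers both.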

Notes

  • Discussion of Use Cases:

    • Different file formats

    • Time-based media - segment of a video file, per the W3C Media Fragments specification

      • Segments created that way

      • User annotations - managed differently

    • XML encoded text example

      • general case - being able to select a region of a text for search and display

      • retrieve a chapter for example

      • restricting search

      • Specific case - XML describes something represented elsewhere in the repository

        • a reference to a digital page in the xml document

        • would work with both the OCR file and the TIFFs that make it up

        • how to represent in PCDM? how to relate the two schemes XML and PCDM?

      • how to link between all representations - indexed OCR, page TIFFS, and XML

    • Will attended OA conferences 4-5 years ago

      • how can you identify and annotate not just an entire image but a segment within an image

      • extended that to talk about being able to do that in a video as well, spatial and temporal

    • Similar to XML encoding linked to an image is the case of a page with an image on it, where the image is what needs to be referenced

      • important for newspapers, can have 3-4 articles, multiple images that are connected to some articles and not others

    • Irregular shapes when you get into maps, manuscripts that are fragmented

  • OA Selector

    • Three pieces 

    • Byte range for disk image

    • Fragment selector by format

    • File types by fragments (see IIIF)

    • Calling up a fragment by extending the url

    • Line up an XML document as a transcript for a page image.

  • How do the text selector, text position selector, text quote selector, and data position selector relate to the fragment selector?

  • Were these other selectors considered but not adopted? There seems to be an overlap in capabilities. Why have a text position selector when there is a text fragment selector?

  • What is the relation between the selectors in section 4.2 and the selectors discussed (text selector, text position selector, text quote selector, data position selector)?

  • Do we have a clear idea of the OA selector? open-ended

  • Do we need to specify the exact selectors we will cover, not just say we will do the W3C OA selector specification?

  • Good starting point, but we need to do some more research into annotation fragment selectors.

  • Image, PDF, disk image use case spelled out.
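The idea noted above of calling up a fragment by extending the URL can be sketched in Python with the standard library. The function name `split_media_fragment` and the example URLs are hypothetical. The sketch assumes plain-seconds `t=` values and comma-separated `xywh=` values with an optional unit prefix such as `pct:`, and it does not attempt to cover the full W3C Media Fragments grammar.

```python
from urllib.parse import parse_qs, urldefrag


def split_media_fragment(url):
    """Split a URL into its source and any t= / xywh= fragment dimensions."""
    base, fragment = urldefrag(url)
    # Fragment dimensions use name=value pairs joined by "&",
    # so the query-string parser can be reused on the fragment.
    dims = {name: values[-1] for name, values in parse_qs(fragment).items()}

    result = {"source": base}
    if "t" in dims:
        start, _, end = dims["t"].partition(",")
        # A missing start means "from the beginning"; a missing end
        # means "until the end of the media" (represented as None).
        result["t"] = (float(start or 0), float(end) if end else None)
    if "xywh" in dims:
        unit, _, numbers = dims["xywh"].rpartition(":")
        # With no unit prefix, the values are treated as pixels.
        result["xywh"] = (unit or "pixel",
                          tuple(float(n) for n in numbers.split(",")))
    return result


print(split_media_fragment("http://example.org/video1#t=30,60"))
print(split_media_fragment("http://example.org/image1#xywh=pct:0,0,10,10"))
```

Keeping the source separate from the parsed fragment mirrors the source-plus-selector split in the OA examples, which may make it easier to move between the URL form and the annotation form.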


Action Items:

  • Julie: check with Rob Sanderson or Jon Stroop on where IIIF and other fragment specs are listed besides OA; also the image use case

  • Julie: Expand on time-based media use case to include spatial.

  • Jeremy: PDF use case

  • Julie: draft wiki page for segment of a file recommendation - pretty sure we can point to OA

    • OA is open-ended, though, so we need to list out standards we considered.

    • Point to use cases.

Next meeting 16 November at 4 PM Eastern.