Hydra:Works @ Code4Lib Notes

Hydra:Works @ Oregon University

February 12th and 13th, 2015





Question:  Should this data modeling be at the Hydra level or the Fedora level?


This data modeling would be useful not just for the Hydra community but for the entire Fedora community, and should exist outside of Hydra.  The modeling should only include abstractions of relationships and not include functionality.





Our summary of the shared content model included a review of model diagrams created at the earlier meeting in Portland and the Application Profile Google Doc.  We reviewed some of the basic assertions outlined in both documents as they relate to Works, Collections, and Files.  We extended the conversation to complex Use Cases that include descriptive or non-member works and how we should handle them, e.g. proxy nodes, donor agreements, terms of use, and thumbnails.  Because Collections may not have hasFile and we want to distinguish between members and aggregates, these would be classified as ore:aggregates.



  can have collections

  can have member objects (works)

  can have components

   Use of predicate over subClassing:  describes

  Summary level of collection:  hasMember


 Types of Works that Describe

  thumbnail proxy

  larger proxy

  text proxy

  Types of Works used for attaching a non-member work

  Donor Agreement

  Terms of Use 

  Thumbnail / Representative

  Provenance / curatorial


  Exemplary Record

  Featured Record

  Preferred Member

What do we do with these describe documents?

 Because Collections may not have hasFile and we want to distinguish between members and aggregates, these would be classified as ore:aggregates.

 Assertion: Use works within works to encapsulate descriptive metadata for file provenance.


Complex Object Examples:

 1. Scholarly Submission (Thesis - PDF, Dataset-CSV, Thumbnail - PNG)

  Thesis (Work)

  Dataset (Work)

  Content (Work)

  Thumbnail (File)





We attempted to come up with a more representative name for our shared content model.  The emphasis is on "attempted".  Although we couldn’t reach any consensus, several ideas were thrown around, with the most support for Object.  Some of the ideas included: Asset, Concern, Subject, Resource, FOB, DOB, Container, Object, and Entity.  Suggested prefixes were Generic, Submitted, or Digital.


Should we stick with works?

 What is the thing in Fedora:  node, object, or container?

Likes: standardizing on LDP

Fedora.info - Fedora namespace





Rob Sanderson guided us through the W3C standard for linked data called LDP.  The Linked Data Platform could be used to organize our objects and data streams in Fedora 4.

Here are two resources for LDP:  alturl.com/b485n and www.w3.org/TR/ldp.

LDP is a platform for managing interactions with linked (RDF source) and non-linked data and containers.  We discussed the use of basic, direct, and indirect containers to create a standard that would be manageable in Fedora 4 and linkable in RDF using REST and SPARQL queries.  The structure for writing these relationships to Fedora 4 on works and files would be:




This structure will also work for the descriptive and non-member works 




and collections




Algorithm for updating title


  POST <> dc:title "Fish"


  GET /works/1

  /works/1 dc:title "Fish"

  PUT /works/1
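
The GET and PUT steps above amount to a read-modify-write cycle. Below is a minimal Python sketch of that cycle against an in-memory stand-in for the repository. `FakeStore`, its ETag scheme, and the `update_title` helper are hypothetical names used only for illustration; none of them are Fedora 4 or ActiveFedora API. The ETag check is an assumption, added to show one way a stale overwrite could be refused.

```python
import hashlib


class FakeStore:
    """In-memory stand-in for a repository (hypothetical, not a Fedora API)."""

    def __init__(self):
        self.resources = {}  # path -> dict of predicate -> value

    @staticmethod
    def _etag(triples):
        # Assumption: the ETag is derived from the current content.
        return hashlib.sha1(repr(sorted(triples.items())).encode()).hexdigest()

    def get(self, path):
        # Return a copy of the description plus its current ETag.
        triples = dict(self.resources[path])
        return triples, self._etag(triples)

    def put(self, path, triples, if_match):
        # Refuse the overwrite if the resource changed since it was read.
        if self._etag(self.resources[path]) != if_match:
            raise ValueError("412 Precondition Failed: resource changed")
        self.resources[path] = dict(triples)


def update_title(store, path, new_title):
    # GET the description, change one property, PUT the whole thing back.
    triples, etag = store.get(path)
    triples["dc:title"] = new_title
    store.put(path, triples, if_match=etag)


store = FakeStore()
store.resources["/works/1"] = {"dc:title": "Untitled", "rdf:type": "Work"}
update_title(store, "/works/1", "Fish")
print(store.resources["/works/1"]["dc:title"])
```

Because PUT replaces the whole description, the sketch sends back every property it read, not only the changed title.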


Topics to be discussed at LDCX:

Use of direct containers to automatically make assertions about newly created resources.

hasMembership Resource

hasFile:  /works/1/files  automatically

works/1 hasFile files/f1
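
As a rough illustration of the direct-container idea listed above, here is a toy Python model, not a real LDP client or server. The class name, the auto-generated child names, and the bare `hasFile` predicate string are assumptions for this sketch. It only shows the behavior under discussion: creating a resource inside `/works/1/files` also asserts a `hasFile` triple on `/works/1`.

```python
class DirectContainer:
    """Toy model of an LDP direct container (illustration only)."""

    def __init__(self, path, membership_resource, has_member_relation, graph):
        self.path = path
        self.membership_resource = membership_resource
        self.has_member_relation = has_member_relation
        self.graph = graph  # shared set of (subject, predicate, object) triples
        self.count = 0

    def post(self, slug=None):
        # Creating a child records containment and, automatically,
        # the membership triple on the membership resource.
        self.count += 1
        child = f"{self.path}/{slug or 'f' + str(self.count)}"
        self.graph.add((self.path, "ldp:contains", child))
        self.graph.add((self.membership_resource, self.has_member_relation, child))
        return child


graph = set()
files = DirectContainer("/works/1/files", "/works/1", "hasFile", graph)
f1 = files.post()
for triple in sorted(graph):
    print(triple)
```

In this sketch the new file lives at `/works/1/files/f1`, and the client never writes the `hasFile` triple itself.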

There was agreement that this would be a good way to manage the objects, but we would need to check with Chris Beer and Andrew Woods to make sure that Fedora 4 can handle LDP, including things like:

You can't change the e-tags when updating things.

What happens with reciprocal properties?

What tests do we need to write?





Should we have our own fedora convention for deleting resources?

Interact with REST / SPARQL

  PATCH and PUTS (overwrite)



 Membership using PATCH

 ldp - structural metadata

 very REST - SPARQL for metadata
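
To make the PATCH-versus-PUT distinction concrete, the following Python sketch builds, but does not send, a PATCH request carrying a SPARQL Update body that adds a single membership triple and leaves the rest of the resource untouched. The base URL, the paths, and the `http://example.org/ns#hasMember` predicate URI are placeholders assumed for illustration; the notes do not settle on a namespace.

```python
from urllib.request import Request


def membership_patch(base_url, collection_path, member_uri, predicate_uri):
    """Build (but do not send) a PATCH that adds one membership triple."""
    # <> is a relative IRI that refers to the resource being patched.
    body = (
        "INSERT {\n"
        f"  <> <{predicate_uri}> <{member_uri}> .\n"
        "} WHERE { }\n"
    )
    return Request(
        base_url + collection_path,
        data=body.encode("utf-8"),
        method="PATCH",
        headers={"Content-Type": "application/sparql-update"},
    )


req = membership_patch(
    "http://localhost:8080/rest",          # placeholder repository URL
    "/collections/1",
    "http://localhost:8080/rest/works/1",
    "http://example.org/ns#hasMember",     # placeholder predicate
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

A PUT to the same URL would instead have to carry the full replacement description, which is why it is noted above as an overwrite.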





There was a little discussion around ordering efficiencies and RDF.  There was still consensus around ORE ordering, but there were conversations around the use of serialized arrays, doubly linked lists, and microsyntax in implementations.

Need a plan to implement ORE: There has been a big push since our return on the RDF Proxy List Repo:  https://github.com/projecthydra-labs/rdf-proxy_list.  

POST <> proxyIn Coll

proxyFor W

                type ore:Proxy
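
One way to picture the proxy approach sketched above is as a doubly linked list of proxies, one per work. The Python below is an illustrative sketch over plain tuples, not the rdf-proxy_list implementation. The proxy URIs (`#proxy0`, ...) and the use of `iana:next` / `iana:prev` as link predicates are assumptions that the notes do not specify.

```python
def build_proxies(collection, works):
    """Return triples for one ore:Proxy per work, linked in order."""
    triples = []
    proxies = [f"{collection}#proxy{i}" for i in range(len(works))]
    for i, (proxy, work) in enumerate(zip(proxies, works)):
        triples.append((proxy, "rdf:type", "ore:Proxy"))
        triples.append((proxy, "ore:proxyIn", collection))
        triples.append((proxy, "ore:proxyFor", work))
        if i > 0:
            triples.append((proxy, "iana:prev", proxies[i - 1]))
        if i < len(proxies) - 1:
            triples.append((proxy, "iana:next", proxies[i + 1]))
    return triples


def ordered_members(collection, triples):
    """Walk the next links to recover the member order."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {})[p] = o
    # The head is the proxy in this collection that has no prev link.
    current = next(
        s for s, props in by_subject.items()
        if props.get("ore:proxyIn") == collection and "iana:prev" not in props
    )
    order = []
    while current is not None:
        order.append(by_subject[current]["ore:proxyFor"])
        current = by_subject[current].get("iana:next")
    return order


triples = build_proxies("/collections/c", ["/works/b", "/works/a", "/works/c"])
print(ordered_members("/collections/c", triples))
```

The order lives entirely in the proxies, so the same work could appear at different positions in different collections without changing the work itself.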





Once the decision was reached to treat this as a Fedora Content Model, we started to organize around the functionality that we would like in order to make things more compatible with Hydra heads.  The first point was that Active Fedora would need to be updated to use the LDP conventions.  Then we started to model out Sufia into works models.  Concerns about where functionality should reside and how behavior should be implemented were also voiced.  I think it remains unclear what will be in the Hydra:Works gem once the shared content model is in another FCDM gem.


submitted_resource to be used instead of generic_files?

one parent - one file?

hydra namespace to be used?

associations go in hydra:works?

 Types of Hydra:Works:


  How do I distinguish the leaf items?



    Technical Metadata

    use of file

   predicate for indexing files in solr

   leaf_works are defined by class


  How do I distinguish a submitted_resource?


  How do I distinguish the compartment?

 Where will we put this?



  model/services/sufia - separate structure from behavior

  Separation of Concerns: A gem that represents data model structure and a gem that represents data model behavior.

 What do we have to do in Active Triples?



   Rules for removal

   type and inheritance i

  Active Fedora 9 will need changes to allow for LDP containers

      indirect containers - should go into ActiveFedora

   Add API that says I have a subcontainer at this place that should be direct or indirect

   Active Fedora API wrappers around hydra:works

   Build Abstractions up front

  Timeframe for Hydra:Works including ordering:  Mid-April




Topics for LDCX:  A few RDF concerns were raised about namespacing, changing graphs, direct pointers, transcoding, and state.  





Our LDP Structure for Admin Sets will resemble this:




Topics for discussion at LDCX:

 Non-RDF Resources: A File in Fedora is a non-RDF resource.

 WebACL inheritance


 Costly to move from one Admin Set to another with redirect.

 Model… AdminSet can have class





Needs to be discussed further.


Topics for LDCX:

How does this look in Hydra 9 and ActiveFedora 9?

WebACL has an RDF representation of rights in Fedora





The action items are: 

  1.   Update Data Model Document (Rob, Esme, Tom, and Jon)

  2.   Coordinate with Fedora Works Repo (Thomas)

  3.   Other Metadata conversation (Karen)

  4.   Hydra:Tech on LDP containers (Rob)

  5.   Topics for LDCX - March 23