Object cModels and datastreams






For the most part, the University of Hull has followed the Hydra guidelines for cModels and datastreams closely; however, we discovered a number of situations where we needed to make slight changes. These are detailed here:

commonMetadata cModel

Whilst implementing our complex objects to represent electronic theses and dissertations we noticed a small problem. Child objects need to subscribe to the commonMetadata cModel so that they can have their own rightsMetadata. However, the children do not have a descMetadata datastream and so we have made this optional in our implementation of the cModel.

compoundContent cModel

Hydra does not yet offer a content model to describe a compound object, so we have invented our own for now. Following the pattern of additive cModels, it is intended to be used in addition to the genericContent cModel, which specifies the first delivery datastream (content). Our complementary compoundContent cModel specifies a further five optional content datastreams (five being an entirely arbitrary number) called content02 to content06.
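As a small illustration of the naming scheme just described, the following Python sketch lists the datastream IDs an object could carry when it subscribes to both genericContent and compoundContent. The function name and the `extra` parameter are assumptions made for this example only; they are not part of the cModel definition.

```python
def compound_datastream_ids(extra=5):
    """Return 'content' followed by the optional IDs content02, content03, ...

    'extra' is the number of additional optional datastreams; five matches
    the (arbitrary) figure used by the compoundContent cModel.
    """
    ids = ["content"]  # first delivery datastream, from genericContent
    ids += [f"content{n:02d}" for n in range(2, 2 + extra)]
    return ids


print(compound_datastream_ids())
# ['content', 'content02', 'content03', 'content04', 'content05', 'content06']
```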

contentMetadata

Whilst largely adopting the schema for contentMetadata suggested by Stanford University, we have found it necessary to modify it slightly and to implement some of the possible extensions. For example:

<contentMetadata type="journalArticle" xmlns="http://hydra-collab.hull.ac.uk/schemas/contentMetadata/v1">
    <resource sequence="1" id="text" type="Journal article" contains="content" displayLabel="Journal article" objectID="hull-res:nnnn" serviceDef="hull-sDef:journalArticle" dsID="content" serviceMethod="getContent" visible="true|false">
        <file format="pdf" id="content" mimeType="application/pdf" size="nnnnnn">
            <location type="url">http://hydra.hull.ac.uk/assets/hull:2376/content</location>
        </file>
    </resource>
</contentMetadata>

We have added three attributes to the <resource> tag: dsID, serviceDef and serviceMethod. These support work with virtual contentMetadata datastreams in complex (parent-children) objects, and they also support the construction of URLs that may ultimately contain dissemination methods. Note in passing that the use of these attributes obviates the need for a resourceInfo datastream in the children.

dsID identifies the datastream that the <file> is linked with; serviceDef and serviceMethod optionally identify the dissemination needed to retrieve it. With this information the <file> and <location> elements can be built for virtual contentMetadata datastreams. Using the dsID, the MIME type and size can be retrieved from a child object, whilst serviceDef and serviceMethod allow the location URL to be built. We wondered whether these service attributes were strictly necessary, because the dissemination could be inferred from the child's cModels. However, we realised that there may be occasions when the obvious dissemination is not what is wanted. (Perhaps someone wants to return a pdf as LaTeX?)
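The assembly step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the MIME type, size and URL are taken as already looked up from the child object, the XML namespace is left out for brevity, and the format attribute seen in the sample is omitted because the text does not say how it is derived.

```python
import xml.etree.ElementTree as ET


def build_file_element(resource, mime_type, size, url):
    """Append a <file> with a nested <location> to a <resource> element.

    The <file> id is taken from the resource's dsID attribute; mime_type,
    size and url are assumed to have been retrieved or built elsewhere.
    """
    file_el = ET.SubElement(resource, "file", {
        "id": resource.get("dsID"),
        "mimeType": mime_type,
        "size": "" if size is None else str(size),  # "" when size is unknown
    })
    location = ET.SubElement(file_el, "location", {"type": "url"})
    location.text = url
    return file_el


resource = ET.Element("resource", {"dsID": "content", "objectID": "hull-res:nnnn"})
build_file_element(resource, "application/pdf", 123456,
                   "http://hydra.hull.ac.uk/assets/hull:2376/content")
print(ET.tostring(resource, encoding="unicode"))
```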

"visible" is a property that determines whether a datastream is visible to end users or just to those with editing rights (essentially our library staff). If the "visible" property is missing, it is assumed to be "true". When it is set to "false", the datastream (perhaps the archive file) comes up in red on the splash page for our editors: there, but not to be touched!

In addition, we allow a file size of "" (an empty size attribute) for historic cases where this information is not available.
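A consumer of this datastream has to cope with both conventions: a missing "visible" attribute and an empty size. The Python sketch below shows one way that might be handled. The function names are hypothetical, the namespace is omitted for brevity, and treating any value other than "false" as visible is an assumption of this example rather than something the schema states.

```python
import xml.etree.ElementTree as ET


def is_visible(resource):
    """A missing 'visible' counts as true; only an explicit 'false' hides it."""
    return resource.get("visible", "true").strip().lower() != "false"


def file_size(file_el):
    """Return the size as an int, or None when the attribute is empty or absent."""
    raw = (file_el.get("size") or "").strip()
    return int(raw) if raw else None


resource = ET.fromstring(
    '<resource dsID="content02" visible="false">'
    '<file id="content02" mimeType="application/zip" size=""/>'
    '</resource>'
)
print(is_visible(resource))              # False
print(file_size(resource.find("file")))  # None
```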

URLs

In doing this work we have realised that the URL structure we had been working towards for content download is not the best design. We have restructured it so that the service definition and method are no longer part of the basic URL, e.g.:

http://hydra.hull.ac.uk/assets/hull:nnnn/content

Here 'content' is a specific reference to the datastream to be returned (see dsID above), and a specific disseminator can optionally be added to the end, e.g.:

http://hydra.hull.ac.uk/assets/hull:nnnn/content/serviceDef/serviceMethod
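The two URL forms can be produced by a single helper, sketched here in Python. The function name and the decision to reject a service definition without a method (or the reverse) are assumptions of this example; the base URL is the one shown in the examples above.

```python
BASE_URL = "http://hydra.hull.ac.uk/assets"


def content_url(pid, ds_id="content", service_def=None, service_method=None):
    """Build a download URL for a datastream, optionally with a disseminator."""
    url = f"{BASE_URL}/{pid}/{ds_id}"
    if service_def is None and service_method is None:
        return url
    if service_def is None or service_method is None:
        # Assumption: the disseminator only makes sense as a pair.
        raise ValueError("service_def and service_method must be given together")
    return f"{url}/{service_def}/{service_method}"


print(content_url("hull:2376"))
# http://hydra.hull.ac.uk/assets/hull:2376/content
print(content_url("hull:2376", "content", "hull-sDef:journalArticle", "getContent"))
# http://hydra.hull.ac.uk/assets/hull:2376/content/hull-sDef:journalArticle/getContent
```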

genericContent

We are very largely using the Hydra cModels set out on the Hydra wiki page. However, for content, as opposed to metadata, we had a problem. We have around 30 content types, many of which could be used against the genericContent cModel, but some of which need modified display and edit characteristics and therefore different Ruby/AF models.

To cope with this, and so as not to use anything potentially unreliable like <genre> in the MODS metadata as a hook, we have created multiple copies of the genericContent cModel 'Lego brick', each with a different name: 'Presentation', 'journalArticle', 'Report' and so on. Think of them as the same brick but in different colours.
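The idea of using the cModel name as the hook can be sketched as a simple lookup. The real models are Ruby/AF classes; the Python below is a language-neutral illustration only, and every class name, the mapping and the fallback behaviour are hypothetical.

```python
class GenericContent:
    """Default display/edit behaviour (hypothetical stand-in for a model)."""
    label = "generic"


class Presentation(GenericContent):
    label = "presentation"


class JournalArticle(GenericContent):
    label = "journal article"


# Hypothetical mapping from renamed genericContent cModels to models.
MODEL_FOR_CMODEL = {
    "Presentation": Presentation,
    "journalArticle": JournalArticle,
}


def model_for(cmodel_names):
    """Pick the first specialised model whose cModel the object subscribes to."""
    for name in cmodel_names:
        if name in MODEL_FOR_CMODEL:
            return MODEL_FOR_CMODEL[name]
    return GenericContent  # types with no special needs share the default


print(model_for(["commonMetadata", "journalArticle"]).label)  # journal article
print(model_for(["commonMetadata", "Report"]).label)          # generic
```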