Hydra Works Data Model
This is a refinement of the Portland Common Data Model, as used by Samvera applications and frameworks, including Hyrax and Figgy. Note: This is not a proposal for a new data model, but an attempt to document the data model that has been in place for some time.
Classes and Properties
- Collection (pcdm:Collection): an ordered or unordered group of Objects, which may correspond to an archival collection, a user collection, or any grouping of resources.
- administrative metadata (creation/modification dates, visibility, access controls)
- descriptive metadata (title, etc.)
- ordered_members (pcdm:hasMember)
- Object (pcdm:Object): an intellectual work, or a component part thereof (such as a volume in a multi-volume book set)
- administrative metadata (creation/modification dates, visibility, access controls, embargo, lease)
- descriptive metadata (title, etc.)
- ordered_members (pcdm:hasMember)
- member_of (pcdm:memberOf, unordered)
- logical_structure (pcdm:hasRelatedObject): a hierarchy of Ranges to represent the logical structure of the work (such as chapters in a book, movements/acts/scenes in music and plays, etc.)
- FileSet (pcdm:Object): the most granular piece of a work, which corresponds to one or more related digital files (such as a page in a book, a song in a recording, etc.)
- administrative metadata (creation/modification dates, visibility, access controls)
- descriptive metadata (title)
- file_metadata (pcdm:hasFile)
- FileMetadata (pcdm:File): the metadata related to a digital file
- administrative metadata (creation/modification dates)
- technical metadata (use, mime type, size, height, width, bitrate, checksums, etc.)
- file_identifiers (pointer to file storage)
- File (Valkyrie::File): the actual digital file itself
- content (bitstream)
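As a rough illustration of how the classes above relate, the following sketch uses plain Ruby Structs. It does not use the Valkyrie or Hyrax classes: the names `WorkObject`, `StoredFile`, and `STORAGE`, the `toy://` identifiers, and the sample values are all assumptions made for the example.

```ruby
# Hypothetical plain-Ruby stand-ins for the classes listed above.
# These are NOT the Valkyrie or Hyrax classes; names and fields are illustrative.
Collection   = Struct.new(:title, :ordered_members, keyword_init: true)
WorkObject   = Struct.new(:title, :ordered_members, :member_of, :logical_structure, keyword_init: true)
FileSet      = Struct.new(:title, :file_metadata, keyword_init: true)
FileMetadata = Struct.new(:use, :mime_type, :size, :file_identifiers, keyword_init: true)
StoredFile   = Struct.new(:id, :content, keyword_init: true)

# A toy lookup table standing in for whatever system holds the bitstreams.
STORAGE = {
  "toy://page1.tif" => StoredFile.new(id: "toy://page1.tif", content: "TIFF-BYTES"),
  "toy://page1.jp2" => StoredFile.new(id: "toy://page1.jp2", content: "JP2-BYTES")
}.freeze

# One FileSet groups the original and its derivative.
page1 = FileSet.new(
  title: "Page 1",
  file_metadata: [
    FileMetadata.new(use: :original, mime_type: "image/tiff", size: 10,
                     file_identifiers: ["toy://page1.tif"]),
    FileMetadata.new(use: :derivative, mime_type: "image/jp2", size: 9,
                     file_identifiers: ["toy://page1.jp2"])
  ]
)

book = WorkObject.new(title: "Example Book", ordered_members: [page1],
                      member_of: [], logical_structure: [])

# Walk from a work down to the bitstream of each original file.
# For simplicity this assumes every member is a FileSet (no child works).
def original_contents(work)
  work.ordered_members.flat_map do |file_set|
    file_set.file_metadata
            .select { |fm| fm.use == :original }
            .flat_map { |fm| fm.file_identifiers.map { |id| STORAGE.fetch(id).content } }
  end
end

puts original_contents(book).inspect
```

The point of the sketch is the chain Object, FileSet, FileMetadata, File: the work never holds bytes directly, only a path to them through `file_identifiers`.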
Diagram
Notes
- Collection Membership
- PCDM allows Collection/Object membership to go in either direction (object pcdm:memberOf collection / collection pcdm:hasMember object).
- Ordered collection membership must use the hasMember form, because the links must all be in the same place to order them.
- For performance and convenience reasons (e.g., indexing the names of the collections an object is a member of), it is common to have collection membership that does not require ordering use the memberOf form.
- In practice, the hasMember form is often used for one-to-many relationships (e.g., an application only allows a page to be in a single book), and the memberOf form is often used for many-to-many relationships (e.g., an application allows an object to be in many collections).
- Permissions and order may be overriding concerns here (e.g., a user has permission to modify their playlist, but not the song they want to include, so the hasMember form must be used despite the many-to-many relationship tending towards the memberOf form).
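The two membership directions described above can be sketched with a small in-memory example. This is a minimal illustration under stated assumptions, not how Hyrax or Valkyrie store relationships: `Resource`, `member_ids`, `member_of_ids`, and the helper methods are hypothetical names.

```ruby
# Hypothetical in-memory resources; field names are illustrative only.
# member_ids    ~ the hasMember form (links stored on the parent, so order is kept)
# member_of_ids ~ the memberOf form (links stored on the child, no order)
Resource = Struct.new(:id, :member_ids, :member_of_ids, keyword_init: true)

RESOURCES = [
  Resource.new(id: "coll1",     member_ids: [],                    member_of_ids: []),
  Resource.new(id: "book1",     member_ids: %w[page1 page2 page3], member_of_ids: %w[coll1]),
  Resource.new(id: "book2",     member_ids: [],                    member_of_ids: %w[coll1]),
  Resource.new(id: "song1",     member_ids: [],                    member_of_ids: []),
  Resource.new(id: "playlist1", member_ids: %w[song1],             member_of_ids: [])
].freeze

INDEX = RESOURCES.map { |r| [r.id, r] }.to_h.freeze

# hasMember form: the parent holds every link, so the array order is the member order.
def ordered_members(parent_id)
  INDEX.fetch(parent_id).member_ids
end

# memberOf form: an object's collections are read straight off the object,
# which is convenient when indexing collection names for that object.
def collections_of(resource_id)
  INDEX.fetch(resource_id).member_of_ids
end

# memberOf form, inverse direction: finding a collection's members needs a
# search across resources, and the result carries no meaningful order.
def unordered_members(collection_id)
  RESOURCES.select { |r| r.member_of_ids.include?(collection_id) }.map(&:id)
end

puts ordered_members("book1").inspect
puts collections_of("book1").inspect
puts unordered_members("coll1").inspect
# The playlist case: only playlist1 is modified; song1 is left untouched.
puts ordered_members("playlist1").inspect
```

In the playlist case the link lives on the resource the user is allowed to edit, which is why the hasMember form is needed there even though the relationship is many-to-many.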
- Files and Related Metadata
- Significantly more structure is added here in order to support the management of derivatives, technical metadata, and the way applications work with files.
- A FileSet is essentially a subclass of pcdm:Object that has a 1-to-1 mapping to digital content. This was proposed for addition to PCDM, but not supported outside of the Samvera community because of concerns about the structure being unwieldy and hurting performance. But we have found a number of key advantages to having FileSets:
- having an object representing pages, songs, etc. in order to model them and attach metadata, build IIIF manifests, etc.
- ability to group all related versions of a file (original, derivatives, extracted text, edited production masters, intermediates, etc.)
- We break the pcdm:File into two pieces: a FileMetadata object that holds the metadata, and a File (which is a simple bitstream). This reflects the fact that these are often managed by two separate systems:
- Valkyrie needs to store metadata in the MetadataAdapter, while the bits are stored in the StorageAdapter
- Fedora can store them both, but for performance and scalability reasons, applications often store the metadata in Fedora and store the bits on disk.
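The split between FileMetadata and File can be illustrated with two toy stores, one for metadata and one for bits. The classes `ToyMetadataStore` and `ToyBitStore`, their methods, and the `toy://` identifier scheme are hypothetical and are not Valkyrie's MetadataAdapter or StorageAdapter APIs. The sketch only shows the idea that the metadata record holds a pointer to bytes kept elsewhere.

```ruby
require "digest"

# Toy stand-in for a system that stores bitstreams (e.g. a disk).
class ToyBitStore
  def initialize
    @blobs = {}
  end

  # Stores the bytes and returns an identifier that metadata can point at.
  def upload(name, bytes)
    id = "toy://#{name}"
    @blobs[id] = bytes
    id
  end

  def read(id)
    @blobs.fetch(id)
  end
end

# Toy stand-in for a system that stores metadata records (e.g. a database).
class ToyMetadataStore
  def initialize
    @records = {}
  end

  def save(id, attributes)
    @records[id] = attributes
  end

  def find(id)
    @records.fetch(id)
  end
end

bits = ToyBitStore.new
metadata = ToyMetadataStore.new

original_bytes = "original page image bytes"
derivative_bytes = "smaller derivative bytes"

# The FileSet record carries only metadata plus pointers into the bit store.
metadata.save("fileset1", {
  title: "Page 1",
  file_metadata: [
    { use: :original, size: original_bytes.bytesize,
      checksum: Digest::SHA256.hexdigest(original_bytes),
      file_identifiers: [bits.upload("page1.tif", original_bytes)] },
    { use: :derivative, size: derivative_bytes.bytesize,
      checksum: Digest::SHA256.hexdigest(derivative_bytes),
      file_identifiers: [bits.upload("page1.jp2", derivative_bytes)] }
  ]
})

# Reading the content takes two steps: look up the metadata, then follow the pointer.
record = metadata.find("fileset1")
original_meta = record[:file_metadata].find { |fm| fm[:use] == :original }
content = bits.read(original_meta[:file_identifiers].first)

# A simple fixity check using the checksum kept in the metadata.
puts Digest::SHA256.hexdigest(content) == original_meta[:checksum]
```

Because the record and the bytes live in separate stores, either one could be swapped out independently, which is the motivation the notes above give for the split.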
Extensions
- Logical Structure (Princeton)
- Logical structure is modeled after IIIF Ranges, which are used to encode structure within an item, and provide navigation in a digital object viewer.
- A hierarchy of Ranges is used to model the table-of-contents structure of an object and its children. Each Range can contain child Ranges (representing sections) and/or FileSets (representing pages, songs, etc.).
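A hierarchy of Ranges like the one described above can be sketched as a small tree. This is an illustrative assumption, not Princeton's implementation: `TocRange` (named to avoid clashing with Ruby's built-in `Range`), `PageProxy`, and the helper methods are hypothetical.

```ruby
# Hypothetical tree nodes: a TocRange holds child TocRanges and/or PageProxy
# leaves; a PageProxy points at a FileSet by id.
TocRange  = Struct.new(:label, :nodes, keyword_init: true)
PageProxy = Struct.new(:file_set_id, keyword_init: true)

structure = TocRange.new(label: "Table of Contents", nodes: [
  TocRange.new(label: "Chapter 1", nodes: [
    PageProxy.new(file_set_id: "page1"),
    PageProxy.new(file_set_id: "page2")
  ]),
  TocRange.new(label: "Chapter 2", nodes: [
    TocRange.new(label: "Section 2.1", nodes: [
      PageProxy.new(file_set_id: "page3")
    ])
  ])
])

# Render the hierarchy as indented lines, as a viewer's navigation panel might.
def outline(node, depth = 0)
  indent = "  " * depth
  return ["#{indent}#{node.file_set_id}"] if node.is_a?(PageProxy)

  ["#{indent}#{node.label}"] + node.nodes.flat_map { |child| outline(child, depth + 1) }
end

# Collect the FileSet ids under a Range in reading order (depth-first).
def file_set_ids(node)
  return [node.file_set_id] if node.is_a?(PageProxy)

  node.nodes.flat_map { |child| file_set_ids(child) }
end

puts outline(structure)
puts file_set_ids(structure).inspect
```

Keeping leaves as pointers rather than copies means the same FileSet can appear in the logical structure without being duplicated or moved out of the work's ordered members.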