August 2010 - Steering Group Meeting - Agenda and Notes

Must Haves for Fall (UVA)

Set up a pilot version for a small group of faculty to work with in Oct/Nov
Expand to broader group by February
Make available to entire faculty within 9 months

Most Critical

self-deposit of articles/publications
“simple, simple, simple”
bare bones deposit workflow (see below)
interface & interaction must be polished & behave consistently
produce sustainable objects/content & metadata

Strong Desire

ETD submission
Datasets (to satisfy NSF mandate)
basic “Group support” - (set permissions for eduPerson roles)

Not Committed to

“full set of publishing services” (ie. submission to external publications)
full integration with all groups from institutional group infrastructure

Feature Details

Deposit Workflow

If you are faculty, staff or student, you can deposit (this should be configurable per-collection or per-head)
once depositor has hit “Deposit” then asset is not editable

Must Haves for Fall (Hull)

Existing solution (Muradora) with >10k objects
Ingest & management of objects in existing solution is unwieldy, especially when editing metadata
Need to convert existing objects over to Hydra(ngea) asap, release Hydrangea solution in production before Jan 2011 & phase out existing Muradora solution by summer 2011.
All “blob” content is external to Fedora

Must go live for entire University by 1st of January 2011

  • Fedora underneath, populated by hydra-compliant objects
  • from the end-user point of view looks solid, robust & usable
  • Search & Discovery view with all of the content that’s currently queued up to be converted

Most Critical

Convert existing objects (requires knowledge of exactly what a “Hydra compliant” object looks like)
Allow external solution to construct & submit hydra-compliant objects
Stable URLs
Work with custom MODS and other XML metadata that doesn’t match with Hydra mappings (must be able to declare custom, institution-specific mappings)
Allow administrators to edit xml datastreams as raw xml in the browser (in cases where there’s no explicit support for those metadata schemas)
Image thumbnails in search results & detail views
Gated discovery
“Blob” Content Deposit screen
Image content deposit screen

Strong Desire

Browsing by Collection
Facilitate Sakai & Sharepoint talking to Fedora / Hydra
Ability to view raw xml metadata datastreams
Blacklight Hierarchical Facets & Advanced Search

Not Committed to

Ingest (self-deposit) through Hydrangea is not a priority, as submission will occur through RepoMMan (existing scholarly workbench) & be copied over to Hydra(ngea) on “deposit”

Hydra Wishlist

Disseminator-mediated download (rather than assuming blob is in DS1)
Stable URLs for downloads, etc.

  • sdef names: default to assuming hydra-sDef namespace, but use sdef as-is if it includes a “:”
  • method names remain untouched, but Hydra sdefs should be re-implemented to leave “get” out of method names
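The sdef naming rule above can be sketched as a small resolver. This is only an illustration of the rule as noted: the function names and the path layout in `download_path` are hypothetical, not an agreed Hydra API.

```python
# Default namespace taken from the note above; everything else is illustrative.
DEFAULT_SDEF_NAMESPACE = "hydra-sDef"


def resolve_sdef(name: str) -> str:
    """Return a qualified sdef pid for a short or already-qualified name."""
    # A name that already contains ":" is used as-is.
    if ":" in name:
        return name
    # Otherwise assume the default hydra-sDef namespace.
    return f"{DEFAULT_SDEF_NAMESPACE}:{name}"


def download_path(pid: str, sdef: str, method: str) -> str:
    """Build a hypothetical stable download path; the method name is untouched."""
    return f"/objects/{pid}/methods/{resolve_sdef(sdef)}/{method}"
```

For example, `resolve_sdef("commonMetadata")` yields `hydra-sDef:commonMetadata`, while an institution-specific name such as `hull-sDef:image` passes through unchanged.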

final decision about how to designate parts that should be excluded from search results
DECISION: assertion of hasModel -> afmodel:FileAsset

final decision about how to indicate children (implications for AF & existing AF-created content)
DECISION:

  • use “isPartOf/hasPart” relationships to represent the relationship between objects and the files that you have uploaded into them
  • children should point to parents, not parents pointing to children (uploaded fileAsset should assert isPartOf)
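One consequence of having children point to parents is that a parent's file list has to be derived by looking up who asserts isPartOf on it. A minimal sketch, assuming relationships are available as (subject, predicate, object) triples; the pids are invented for illustration, and in practice the lookup would more likely be a Solr or resource index query.

```python
from collections import defaultdict

# Each uploaded file asset asserts isPartOf on its parent (pids are made up).
triples = [
    ("demo:file1", "isPartOf", "demo:article1"),
    ("demo:file2", "isPartOf", "demo:article1"),
    ("demo:file3", "isPartOf", "demo:dataset1"),
]


def children_by_parent(triples):
    """Index child pids under the parent they assert isPartOf on."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == "isPartOf":
            index[obj].append(subject)
    return dict(index)
```

The parent object itself carries no list of children, so adding or removing a file never requires rewriting the parent's relationships.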

FileAssets should have rightsMetadata (bears UI and access controls implications)
DECISION: Will always support the case where a child has no rightsMetadata, so ok to postpone implementation
FileAssets should have descriptive metadata (bears UI implications)
contentMetadata (replacement for METS StructMap -- contextual (parent-specific) information about object’s children, especially sequence info)

DECISION:

  • Hydrangea is a basic “starter pack” for building a hydra head
    • includes support for: Hydrangea Article, Hydrangea Dataset, & Hydra models (Image, JPEG2000, SimpleContent, CompoundContent)
  • All Hydra heads build on Hydra Framework (and are encouraged to contribute back to it)
  • Workflow is often the primary driver for creating a different hydra head
  • Content Types can be shared across heads as Rails plugins (BUT you might want to disable editing of a given content type within most heads?)

Catch-all Issues

Deposit should clear out version history of descMetadata datastream
Must create hydra content model assertions (ie. hydra-cModel: )

  • create “extension” models
  • use hydra-cModel:basicMODS
  • use hydra-cModel namespace with model names that indicate provenance & purpose (ie. hydra-cModel:HullUK-DC)

version number in footer [blacklight & hydra(ngea)]
Make “Hydra” linked in “powered by hydra”
Make a “powered by hydra” logo

review metadata quality (ie. disallow textile markup in abstracts?  allow it elsewhere?)
store escaped xml but in edit views, display as unescaped xml -- talk to Willy about how this is done in EEMS & ETDs
document textile_area helper method, discouraging use for real descriptive metadata
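The "store escaped, display unescaped" idea can be sketched as a round trip. This is not how EEMS or ETDs do it (that is still to be confirmed with Willy); it only illustrates the intent, using Python's standard `xml.sax.saxutils` helpers. The function names are hypothetical.

```python
from xml.sax.saxutils import escape, unescape


def to_stored_value(user_input: str) -> str:
    """Escape &, < and > so the value is safe inside an XML datastream."""
    return escape(user_input)


def to_edit_view_value(stored: str) -> str:
    """Reverse the escaping so the editor sees what was originally typed."""
    return unescape(stored)


typed = "<i>Hydra</i> & Fedora"
stored = to_stored_value(typed)  # "&lt;i&gt;Hydra&lt;/i&gt; &amp; Fedora"
```

The value shown in the edit view should always equal what the depositor typed, so saving an unchanged form must not double-escape the stored text.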

fix facets to display correctly (must be indexed as facets)
add baseline facets
remove title from facets list

edit & browse links have disappeared
accordion still needs work
language should be a drop-down

denote required fields
enforce required fields

prevent edit access for objects that are not supported by this head (possibly just rely on the default edit view)
(long term) extend blacklight’s user implementation to track display name for Users
helper method for getting user’s display name
document which classes & helpers need to be overridden when you integrate your institutional LDAP/Shibboleth, etc.

add suffix support to names in the forms (Jr., Sr., III, etc.)
add mapping for mods:namePart (unqualified)

Issue: node ordering when you delete (solution: update page section on successful delete)

regular html form (minimal javascript)
autosave of regular html form
display notifications of successful ajax edits

bug: invitation text in topic tags is inconsistent with the rest of the form
bug: delete “file assets” header from file assets list
bug: title invitation text truncates display of existing metadata (needs to wrap lines?)

only show file uploader in edit mode (regardless of whether user has edit permissions)
discover access allows you to see the browse view, but not see the file list
section for selecting license (ie creative commons, open data commons) for both articles & datasets

  • attribution/license, embargo
  • preferred citation?

-- store license options (for populating drop-down) in a fedora object if possible

store metadata in managed content datastreams (requires activefedora fix)
Future: deal with  

Content Types:

  • Open Access Articles
  • ETD
  • Datasets

Display & download from objects of unknown content types
Option to Display metadata datastreams as raw xml

display files in FIFO order

filter for removing anything but straight text from data input
individual files must be able to have separate permissions (not just inherited from parent)
future dated actions (embargo setting & lifting)
pid namespace should default to hydra (but be customizable) -- possibly set pid to hydra-demo in hydra-jetty fedora and document a reminder to set a real namespace in production

TEST: make sure that you can’t access a child through a different parent (ie. access a file_asset from a restricted object by referring to it as a file_asset within a public object)
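The check this test is meant to exercise could look roughly like the sketch below. The function name, argument shapes, and pids are hypothetical; the point is only that a file asset is served through a parent when it actually asserts isPartOf on that parent and the user may read that parent.

```python
def may_serve_file(parent_pid, file_pid, is_part_of, readable_parents):
    """Serve a file asset only via a parent it really belongs to.

    is_part_of: maps a file pid to the set of parent pids it asserts isPartOf on.
    readable_parents: set of parent pids the current user may read.
    """
    actual_parents = is_part_of.get(file_pid, set())
    if parent_pid not in actual_parents:
        # The file was addressed through an unrelated (e.g. public) parent.
        return False
    return parent_pid in readable_parents


is_part_of = {"demo:file1": {"demo:restricted"}}
```

With this data, requesting `demo:file1` as if it belonged to a public object is refused even though the public object itself is readable.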

UI to accommodate large numbers of groups in permissions

Permissions
add “registered user” permissions group
add collection administrator role
add repository administrator role
make clear that if “world” or “public” has permissions, all subgroups also get at least that level of permissions
replace existing groups with eduPerson groups (faculty, student, etc.)
tie into LDAP or Shib
“Publish” and “Submit for Review” buttons that change permissions & notify reviewers in ways that can be customized
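The rule that every group gets at least the level granted to “world” or “public” can be sketched as follows. The level names (discover, read, edit) follow the access types mentioned elsewhere in these notes; the group name `public`, the ordering, and the function name are assumptions for illustration.

```python
# Ordered from weakest to strongest; "none" means no access at all.
LEVELS = ["none", "discover", "read", "edit"]


def effective_level(grants, user_groups):
    """Return the strongest level among the user's groups and "public".

    grants: maps a group name to one of LEVELS.
    user_groups: groups the user belongs to (e.g. eduPerson roles).
    """
    # Everyone is implicitly a member of "public", so its grant is a floor.
    groups = set(user_groups) | {"public"}
    candidates = [grants.get(group, "none") for group in groups]
    return max(candidates, key=LEVELS.index)
```

For example, with `{"public": "read", "faculty": "discover"}` a faculty member still ends up with `read`, because the public grant acts as a lower bound.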

Collections

  • collection objects assert isCollection
  • members of collections assert isMemberOf
  • collection-level super-users (collection access[@type=”admin”] )
  • collection-level constraints on deposit permissions (collection access[@type=”create”] )
  • collection-level constraints on publish permissions (collection access[@type=”publish”] )

Nice to have:
tags as comma separated list
urls in license info rendered as links

lower barrier to entry (make installation & adoption as simple as possible)
jruby deployment

Anticipate: Needing to change permissions for all objects in a “collection” --- totally reasonable to provide bulk-update routines (possibly a robot?)

Adoption Paths

These should be documented, at least with a sense of time/resource requirements

Migrating from Fedora 2.x
Migrating from Fedora 3.x
Migrating from File System (hard drives, network, CDs, etc.)

Skinning Hydrangea
Customizing Search Experience
Adding new content types (content models)
Adding new functionality

Resources required to run Hydrangea/Fedora+Solr (ie. servers, etc)