2016-10-20—PCDM FileSets Meeting
Date and Time
October 20, 2016, 2pm EDT
Connection Information
Google Hangouts: https://hangouts.google.com/hangouts/_/artic.edu/pcdm-filesets?authuser=0
Attendees
- scossu
- matienzo
- Adam Wead
- Andrew Myers
- cam156
- Esmé Cowles
- Jennifer Lindner
- Joshua Allan Westgard
- nikhil trivedi
Regrets:
Agenda
- Post-Hydra Connect thoughts
- Comments on documents produced so far: https://drive.google.com/open?id=0ByRadxtBjDyjbTdGcGhadFJEclk
- Any missing use cases?
- Unresolved use cases (in particular Andrew Myers' case of extracted metadata files)
- FileSets Manifesto: https://docs.google.com/document/d/1ioBiNqe_bXBm0BPLBnEsbETCZciKAqNHqW253Eud2VI/edit?usp=sharing
- Review for publishing to the wider Hydra and PCDM community
- Review main implementation tasks and challenges
- Gems and projects affected
- Content migration—tooling to be provided
Minutes
General agreement on the Hydra Re-imagined One diagram, but code changes are needed to make the files discoverable -- this is possible now, but it still has to be implemented.
Andrew -- we've done that. Our strategy so far has been to pull things out of the metadata and put them into the Fileset, which means adding the Fileset model to the Solr query that returns records. Now we're talking about adding the PCDM file to the list of items that get returned and leaving the Fileset out.
A Fileset would add confusion for WGBH -- they don't have a use case for it.
Can configuration allow for the AIC use cases? We need the Fileset to carry metadata. The answer is yes: it is possible, but not simple. It would be nice to make that configurable with JSON files; see the plugin working group discussions.
All the machinery required to ingest XML and content, and to be able to go back and build up more technical metadata, would be great, but WGBH doesn't need a UI to handle this, and WGBH doesn't need to put metadata on the Fileset. Esmé -- it should be pretty easy to override the partial for the metadata editing view; this is an implementation-specific use case.
AIC needs an upload form to add a file to an existing Fileset. A field where you could pick a file and say that it describes another file would be good. Having files that are different derivatives of the same source doesn't mean we need files describing other files. We also have heavy metadata on the Fileset, such as a taxonomy about departments, just like what is on Works in Curation Concerns. We'd need all of that transferred to the Fileset. We'd want the form to upload a file and use a file-use attribute to say what kind of file it is.
So we need to build something as flexible as possible to override form partials for these implementation differences.
Adam raised a concern about a limitation in Curation Concerns that doesn't allow Filesets to have arbitrary relationships with PCDM objects (0..n). You may want a Fileset that is part of different PCDM objects -- the problem was that you can't have an arbitrary number of PCDM relationships in Curation Concerns.
In Stefano's proposal, he drew a specific diagram with multiple relationships between Filesets, so that PCDM objects can have many-to-many relationships with Filesets. In Sufia we don't have the ability for a Fileset to be a member of different PCDM objects. The relationship between a Fileset and its files is direct containment, but the relationship between a Fileset and other objects is indirect. It seems like it should be fine, but we don't know what the repercussions would be. The Hybox recommendation is that a loan document would be an object, not a file. So if you have a page and want to use it in some other collection, you need a place to hold descriptive metadata and something that can be linked to from outside that context.
So we have arbitrary relationships to do that now, but we want to differentiate between real-world object metadata and Fileset metadata. Esmé doesn't see PCDM as having anything to do with real-world objects; a PCDM object doesn't have to refer to a real-world object.
Their apps (Princeton's) have no real-world use cases, but WGBH has both and wants them both to be treated the same. Andrew doesn't think there's a need for a hierarchical relationship between real-world and digital objects.
Stefano asks whether we want to leave Filesets as carriers of metadata and digital content (whether digitized or born digital), so that our relationships are just object-to-object relationships.
Adam - This model is similar to Lerna and Islandora, where there's a document that governs how the objects are to be treated, which implies an admin role, and admin policies and sets. For archival practices, you receive a gift, you get an object that has the metadata about how to govern the gift, and also has a Fileset that represents that loan agreement. So, not impossible to do this, but seems odd.
The has_documentation could be a very similar use case.
A loan request Fileset object doesn't carry as much weight as a donor agreement. There would be one object for the loan request proper and one for the loan request document. The loan request Fileset has lots of metadata -- is this what policy objects in Lerna are designed to do? They are like what Stefano is talking about, but they make no assumption that there is any machine-readable, actionable version of that document. The APO (admin policy object) might be a separate object, though there might be links between the loan agreement, the object and the APO.
Could anybody draw this? Yes, Esme will try.
So what do we want set as next steps?
More use cases.
Mark -- we have real world objects in Hybox, but relation to PCDM is not explicit there.
The idea is that, when Fedora objects get indexed into Solr, there is a property called “hasModel”, which stores the names of the Ruby classes that represent the Fedora object. By default, the models that are considered during a Solr query are just “Collection” and any registered Work type, meaning any work types that you may have created using the generator `rails g curation_concerns:work MySpecialWork`.
But if you want files to be discoverable, you have to add the Ruby class that represents a file to the Solr query. Currently, we’re doing that by having Solr also look for `hasModel: "FileSet"`. But given some of the proposed changes, we may also need to tell Solr to consider the Ruby class that represents a pcdm:File (although I don’t know what that is offhand).
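As a rough illustration of the point above, the model restriction can be thought of as a Solr filter over the model-name field. This is only a sketch in plain Ruby, not the Curation Concerns implementation: the helper `model_filter_query` is hypothetical, and the Solr field name `has_model_ssim` is an assumption (the notes only call the property “hasModel”), so it is passed as a parameter.

```ruby
# Hypothetical helper (not part of Curation Concerns): builds a Solr filter
# query string that restricts results to the given model names.
# ASSUMPTION: the Solr field holding the model name is "has_model_ssim";
# pass a different field name if your index uses another one.
def model_filter_query(model_names, field: "has_model_ssim")
  raise ArgumentError, "at least one model name is required" if model_names.empty?

  # Quote each class name and OR them together inside one field clause.
  quoted = model_names.map { |name| %("#{name}") }
  "#{field}:(#{quoted.join(' OR ')})"
end

# Default behaviour described in the notes: Collections plus registered work types.
default_models = ["Collection", "MySpecialWork"]

# Making files discoverable means adding the class that represents them.
with_file_sets = default_models + ["FileSet"]

puts model_filter_query(default_models)
puts model_filter_query(with_file_sets)
```

If a class representing pcdm:File also has to be searched, its name would be appended to the list in the same way; which class that is remains open, as noted above.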