Notes, Hydra project meeting December 2011

Day 1 Wednesday Morning

Demonstrations and institutional updates

UVa update, Robin Ruggaber

Nothing vastly new from UVa, but a brief review of Virgo OPAC. Built on Blacklight. Lately have combined search for catalog and articles; there has been some interest in that implementation. 

Libra open access; electronic theses in, data sets coming soon, not yet fully released pending legal signoff, uses the Hydra technology stack. Have some submissions, haven't done a big push yet with faculty. Discussions now about including theses from Engineering, and whether or not to include Masters theses. Also doing other local projects, but working on moving to Rails 3, upgrading Hydra stack, will be doing that in the next quarter.

Q about data sets; there have been folks in the science libraries who have been working on data modeling, have also been talking with CDL, focus for now has been on policy. Haven't gotten into some of the interesting problems like very large data sets, etc. One thing under discussion is possibility of doing sample data sets. Q about whether any modeling on metadata is needed for data sets and what a submission process might look like for this kind of activity; lots of interest in the room. Stanford has been doing some modeling for science data and it has been challenging to keep the metadata fields under control. 

UVa also interested in geodata, in conjunction with the Scholars Lab people. 

Northwestern Update, Mike Stroming

Digital Image Library code being migrated from Hydrangea to Rails3 Hydra Head. Basic functionality is for multi-res images, can create new collections via drag and drop. Drag and drop rearrange of images in collections. Cropping tool as well. We are about 75% migrated to new Hydra. 

Crop tool for zooming in, out, rotating, creating new crops, which creates new Fedora objects. Also image uploading feature that connects to image processing workflow. All cataloged in VRA Core. 

Finding aids demo; findingaids.library.northwestern.edu which is a Blacklight implementation (not Hydra). Various facets indexed, displays Finding Aid, Fedora dissemination for each. Archon is the collection management system, finding aids are exported to EAD. Also have another photo collection (Winterton) with description in EAD and digital objects. Also have another system for scan and production management (PSDS) that is finding its way forward.  Other things on the horizon, very long list of things that we hope we will do in the future, Claire will upload the list. 

Bill framing the DIL generic head question for later discussion: image stack for multires images with accompanying disseminators, viewer. Hydra community has talked a lot about image management, question always comes up about architecture, and whether image presentation, upload, processing, metadata binding should be abstracted to a shared architecture. Useful to have discussions about these things before Northwestern bakes its assumptions into the DIL hydra head.

Stanford Hypatia update, Naomi Dushay

Have done a lot since the last partners meeting. Now 14 collection records and something like 7 of the collections have items. All born-digital materials. For Stephen Jay Gould, for example, were handed a stack of floppy disks, use the Forensic Tool Kit (FTK) to extract contents, in some cases have background images as well. In some cases have online the disk image, in other cases were able to extract individual files. This particular demo was for the AIMS grant, but it's on hold for the moment. Can navigate from a file to other files on the same disk, etc. Discovered that EAD was being created solely for accessioning into Hypatia; decided to change that process by starting with Hypatia instead, and enhancing with things that would have been added to the EAD. If need to export an EAD file, hope to be able to generate one from the app. Showing creating a new asset via Hypatia, in the minimally styled Hydra head. Drag and drop to edit relationships/add new sets. CSS to indicate clearly when there are unsaved changes. Not yet paging to show more than the first 10 items in Hypatia, but idea is to hopefully use Blacklight capability to support search & select, that would then feed into the relationship arrangement drag-and-drop. Most of what we're seeing is the result of two months of heavy development in Sept - Oct, is frozen pending discussions about what happens next. May bring Hypatia into Stanford, load in all objects for Archivists and make it look more like browsing a finding aid would look. Hypatia lost its organizing device, which was the AIMS grant, need to decide what will be community approach for moving ahead. Code for integrating FTK is on github. Currently still Rails 2.

Q about how collection memberships are expressed? Mostly using RELS-EXT. Four kinds of objects in Hypatia: collections, sets, disk images (which behave like sets in that they can have members, but also behave like items in that they can have metadata and attached files), and items. Not a mechanism at this point for ordering members, right now they're showing up in the order added. Q what reaction from archivist community has been? At DLF there were three presentations on archives stuff: ArchivesSpace, Hypatia and something by OCLC. Standing room only for ArchivesSpace and Hypatia, lots of interest and informal feedback, pleased to see working tools. N: practices were very different across the different institutions. T: two strains of requirements that we're hearing from community: ways to show finding aids interfiled with digital objects. Other is that archivists don't have sufficient tools for doing the arrangement and description of born-digital objects. M: there has been an IMLS grant funded to add EAD capability to Blacklight.
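
For reference on the RELS-EXT point above: in Fedora 3, collection membership is typically asserted as RDF/XML in the RELS-EXT datastream. A minimal sketch with hypothetical PIDs follows; the namespace and predicate are the standard Fedora ones, though the exact predicates Hypatia uses weren't captured in these notes:

    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:fedora="info:fedora/fedora-system:def/relations-external#">
      <!-- hypothetical item object pointing at its parent collection -->
      <rdf:Description rdf:about="info:fedora/hypatia:item-123">
        <fedora:isMemberOfCollection rdf:resource="info:fedora/hypatia:coll-1"/>
      </rdf:Description>
    </rdf:RDF>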

Argo demo/update, Michael

Showing the test version of Argo. Window on administrative and in-process side of digital library, management of DOR (digital object registry). Hope to have SDR functions as well (Stanford Digital Repository, preservation environment). Argo is not really a Hydra app, doesn't use the Hydra plugin or actual Hydra code, but is built on Blacklight, Fedora, ActiveFedora. First step of digitization or adding something to the system is object registration: give project name, select type (assigns an administrative policy), select workflow, select metadata source, select editor form, can assign tags. Then get a spreadsheet-like display (metadata ID, SourceID, assigns DRUID), can generate tracking sheets for managing physical items. Also front-ending API calls to register things programmatically. Developers on specific projects can bypass this form and do things programmatically. In Admin interface, tags organized into a facet (registered by, AdminPolicy, etc.). Other facets for object type, content type, owning collection, workflows. Can rotate this facet so that process is second level, status is top level. Achieved through ... something Michael would be happy to talk about at a break. Not using SOLR pivot facet functionality, but an indexing trick to render in a certain way at display time. Showing maps view; search results, drill down to individual object view: metadata section, Fedora datastreams view (pop up raw XML), life cycle view -- pops up light box showing the details of all the steps in the workflow, showing the error in context. Next steps. Showing a graphical view of the workflow, shows prerequisites. Item detail page also has sidebar to show other views: public-ish view via PURL, link in MD Toolkit Orbeon Forms view, Fedora object profile view, straight to FOXML, etc. Q about how similar this is to PSDS, whether there would be any benefit to looking at unification. Would like to have a breakout to look more closely at this. Michael: currently not using SOLRizer, are using GSearch; one thing that might be worth discussing is the notion of versioned indexes. Q about how facet vocabularies are handled. M: don't have control of a lot of the tags being used. Anyone who adds something:something ends up in the facet and there have been some issues with variations (space between "Registered" and "by" for example). 
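
The indexing trick itself wasn't captured in these notes, but one common way to get a "rotatable" facet without Solr pivot facets is to index a delimited, prefix-encoded value per workflow step and split it at display time. A minimal Ruby sketch, with hypothetical field, workflow, and step names, not necessarily what Argo actually does:

    # Illustrative only: encode workflow, step, and status into one facet value
    # so that either level can be promoted to the top when rendering.
    def workflow_facet_values(workflow, steps)
      steps.map { |step, status| "#{workflow}:#{step}:#{status}" }
    end

    solr_doc = {
      "id"       => "druid:ab123cd4567",                       # hypothetical DRUID
      "wf_facet" => workflow_facet_values("accessionWF",
                      "descriptive-metadata" => "completed",
                      "shelve"               => "waiting"),
    }
    # => ["accessionWF:descriptive-metadata:completed", "accessionWF:shelve:waiting"]
    # At display time the app splits each value on ":" and groups by whichever
    # segment should be the top level (process first, or status first).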

BREAK

Indiana University, Jon Dunn, Variations update

Just funded project, joint development by Indiana and Northwestern, trying to hire developers now. Interest in building digital media management system, for diverse types of collections with broad access needs, from archival video/film collections to teaching collections, public access, restricted access, etc. Demo with the Opencast Matterhorn component, which is primarily for lecture capture, but we are interested in the video processing workflow based on ffmpeg wrapped in OSGi services to define a video processing pipeline: validation, transcoding, speech-to-text, fully configurable and scalable across multiple servers. Chris Colvard has been playing around with basic integration between Hydra and Matterhorn (code named Hydrant). Based on 3.14 of Hydra. Demo of default metadata interface, showing a brief 4 minute video upload, attached to a basic workflow to take files, convert to Ogg Theora as a test. Showing example of file handed back from Matterhorn to Hydra. End goal would be to be able to deliver to a streaming server or delivery platform, rather than streaming from Fedora, which is what is being demo'ed. Indiana has been working with Fedora for a long time, have built out some of the things being discussed here but not in a Hydra environment: finding aids, born-digital archival material, digitized GPO content. Also doing a number of projects with Blacklight: in beta have a cross-collection digital search with Blacklight, for example, to integrate content from image collections, music, combine as MODS derived via XSLT from native format (such as EAD). Interested in exploring how to integrate with Hydra community. Q how does Variations video repository integrate with other repositories at Indiana? A integrating with existing Fedora repository, which will probably continue to contain a mix of Hydra and non-Hydra content. VoV is more about access, also other projects at Indiana having to do with video preservation. Q about who is at Indiana that is relevant to Hydra community. A Jon in Digital Library program, which has a content and services group responsible for defining services valuable to depts at IU (Dot Porter, Michelle Dalmau), there is a software dev group managed by Will Cowan, includes Mike Durbin, Wei Jang, David Jau (sp?), Randall Floyd. Mark Notess manages local support and community development for Variations and Variations on Video. Outside of the Digital Library Group there is an Info Tech services dept that works with a group in Public Services dept called User Experience group, who are working on migrating OPAC (?) to Blacklight: Mark Federsen and John P? And there is also IUPUI, who are starting to look at Fedora and possibly Hydra or Islandora. Q about how VoV will be differentiated from WGBH's new NEH grant. A WGBH grant is mostly about preservation, but there may be a link there with VoV, and ditto for the work Adam has been doing for the Rock Hall.
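
The demo's test conversion to Ogg Theora is the sort of step ffmpeg handles directly. A minimal sketch of what one pipeline step might shell out to; the paths and quality settings are illustrative, not the actual Matterhorn/Hydrant configuration:

    # Illustrative only: one pipeline step that transcodes an uploaded master
    # to Ogg Theora/Vorbis before handing the derivative back to the repository.
    input  = "uploads/lecture-master.mov"       # hypothetical paths
    output = "derivatives/lecture-access.ogv"

    ok = system("ffmpeg", "-i", input,
                "-vcodec", "libtheora", "-q:v", "6",
                "-acodec", "libvorbis", "-q:a", "4",
                output)
    raise "transcode failed" unless ok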

Rock Hall update, Adam Wead

Adam has been at the Rock Hall two years as of January, was tasked with creating an asset manager for all the libraries. Fell into Fedora/Hydrangea, are now partners and things have been moving quickly. Now in production, ingesting video, about 10 TB worth so far, more coming in every week. Overview of ingestion: have partnered with George Blood in Philadelphia, package into SIPs, ingest into Hydra, creates objects in Fedora. Initial metadata from vendor, enhanced with additional information, rigged up to a Wowza streaming server. All managed in PBCore, reviewer information, two digital video datastreams with accompanying technical information. Get both from the vendor and from MediaInfo; about 80 fields, and getting them entered was a pain, want to automate this generation as much as possible. Needed to create a process for reviewing content as it comes back from the vendor, since many of the works did not have complete metadata and sometimes decisions were needed to specify who has access to what content. Logging in as a user who is part of a reviewer group (specified in the role mapper file) gives the same search interface but adds a review column, which gives a license selection field. This is in addition to the permissions scheme, so there are two separate methods for specifying access, and expects this to be fleshed out a bit more, including linking license selection to some kind of Fedora mechanism. Also building a Blacklight interface to present EAD and MARC data; video will be restricted to museum and library, but want to be able to show the public the fact that there are videos. Q if any of this is publicly visible? A No, not sure when or if it will be. Soft opening in January, grand opening in April to correspond with the induction ceremonies, and are hoping the Blacklight instance will be publicly available in Jan, but not sure exactly how or when. Hoping it might be ready for showing at Code4Lib. Tom suggests possibly doing a screencast if a public link is not possible, just to show tools. Q are you getting transcoded files from vendor, or are you doing the transcoding? A are getting from the vendor, get a bag back from the vendor. Has a predefined form that Fedora links to when it ingests. All files sit on a share and are accessible to ingestion. Predefined structure to the bag: 
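
On the reviewer group mentioned above: Hydra's role mapper is a per-environment YAML file mapping role names to user identifiers. A minimal sketch with hypothetical group and user names, not the Rock Hall's actual configuration:

    # config/role_map_production.yml (role and user names are hypothetical)
    production:
      reviewer:
        - reviewer1@example.org
        - reviewer2@example.org
      admin:
        - repo_admin@example.org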

MediaShelf update, Matt Zumwalt

Most of what they have been working on is infrastructure stuff around getting Hydra into Rails 3, stabilizing it, and then stabilizing the release cycle. Also working on documentation, and on HydraCamp. A lot of the work has been on laying groundwork for the community and collaboration. Also worked on Hydra at Hull, which Richard will talk about. Other things they've been working on are not visible to end users, mostly to do with horizontal scaling: bulk-processing hundreds of thousands of objects resulting in multiple writes back to SOLR, so have been working on coordinating these things, identifying and trying to resolve bottlenecks, including some that address Fedora itself and helping to stabilize and scale Fedora. Will be working on a project soon for indexing hundreds of millions of objects into SOLR in a scalable way. Goal is to be able to support many simultaneous uses/users/processes/updates. Q from Eddie about whether there might be interest in a production implementation breakout, performance discussion. Eddie is release manager for next Fedora release which is a performance and stability release, interested in best practices around support and scaling. Q: is horizontal scaling dependent on a 4.x release of Fedora? A yes, though there may be some things people can do with what they have now to tune Fedora. Q the discussion about the possibility of genericizing SOLRizer, does this current work touch on those questions? A has time set aside early in the year to work on this, not sure, but right now SOLRizer is tied into OM, tied into Ruby to derive meaning that is used for indexing; this could as easily be handled in XML and separated. Opens up possibility for anyone using Fedora to use documented config specs to use it generically with SOLRizer. Other focus will be on making solutions that are near completion stable so that they can be adopted: making things easier to install, use, customize, etc. Improving relationships with the community. Q about what interest in Hydra has been in the various communities that MediaShelf works with. A there is a lot of interest generally; Hydra heads are strong, compelling solutions in the specific spaces that various users are in, and the fact that it has Fedora under it makes it stronger. Much more interest of that sort rather than interest starting out as an interest in Hydra. 

BREAK FOR LUNCH: Hull, Columbia and Notre Dame updates, possibly also Brad with a DuraSpace update. 

Day 1 Wednesday Afternoon

Afternoon session, Wednesday, December 7, 2011:

Rick Johnson - University of Notre Dame

  • The Seaside Research Portal
  • Archiving the world's first New Urbanist community
  • Each area contains an essay and associated images; e.g. multiple versions of a town plan - can animate to view town over time. 
  • Blacklight site, backed by data managed in a separate Hydra Head
  • Searching - showing search of geo-mapped materials. 
  • Fedora for images; Otherwise content is from Solr
  • Working on Blacklight Gem called Atrium
  • Atrium - Like Omeka for Blacklight; Gem on top of Blacklight like Hydra; Uses Blacklight database with some added tables;
  • Using CKEditor for adding and editing essays
  • Can associate facet filters to indicate a collection's scope
  • Create Ad-hoc collections; select from available facets for collection to show in search page;
  • Can add exhibits to a collection; 
  • Can have multiple exhibits within a collection; Can add exhibit facet filters, inheriting its collection filters as well.
  • Currently have 1 default layout; Plan to have multiple styles to choose from for collections and exhibits
  • Will migrate "Seaside" to use Atrium fully.
  • Supports hierarchical facets; Can reorder facet hierarchy for a specific exhibit
  • 'Customize this page' feature available on any page in the tree. 
  • Authorization using CanCan (see the sketch after this list)
  • Next Steps:
  • Multiple Style/Layout Templates to choose from
  • Index essay full-text content in Solr
  • Remove Fedora dependency
  • Define a collection from a static list as opposed to a Solr filter
  • Starting discussions with campus partners on data management
  • As Atrium is a Gem over Blacklight like Hydra, it will need some work to coexist with Hydra;
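
On the CanCan item above: CanCan centralizes authorization rules in an Ability class that controllers consult. A minimal sketch; the model and role names are hypothetical, not Atrium's actual ones:

    # Illustrative only: CanCan Ability class with hypothetical names.
    class Ability
      include CanCan::Ability

      def initialize(user)
        user ||= User.new                  # guest (not logged in)
        can :read, :all                    # anyone may view published exhibits
        if user.curator?                   # hypothetical role check
          can :manage, Atrium::Collection  # hypothetical model names
          can :manage, Atrium::Exhibit
        end
      end
    end

    # Controllers then guard actions with, e.g.:  authorize! :update, @exhibit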

  

Richard Green - University of Hull

  • Building Hydra over an existing repository
  • Public and "Logged in" views
  • Object types include:
    • ETDs
    • Learning materials
    • Datasets
    • Regulations
    • Conference papers…
    • Handbooks
    • HR documents
    • Meeting papers or minutes
    • Conference papers or abstracts
  • Sets:
    • Groups of objects that require context, for example
    • Marine animal datasets
    • Datasets around the Domesday Book
  • Currently written for Rails 2, but have a development server in Rails 3;
  • Sets contain splash page that provides context and then the set of members
  • Formats available: html, Excel, PDF,  QR Code
  • Everything is in MODS, UKETD_DC, and DC Metadata
  • Running Proai  
  • Repository content originates from other sources / workflows besides Hydra
  • Rails 3 development system also exposes workflow queue and current workflow state indexed / faceted. 
  • Plan to use Fedora's JMS messaging to trigger Solr indexing (see the sketch after this list);
  • Committee papers Set has a rights datastream, which is copied on new objects associated with that Set.
  • Solrizer is doing full-text indexing
  • Multiple content models supporting various types; Can hook different Ruby models in for different displays in this way.
  • In Production since end of September 2011
  • Currently in a JISC funded project - in discussion with History department about their datasets that also contain geo-data; Workflows for processing Access databases; Image collection also needs workflow support;
  • No authority control over repository cataloging
  • What's the difference between Hull "structural sets" and Stanford APOs? SS can inherit parent SS policies. APO policies do not inherit. 
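
On the JMS item above: Fedora 3 publishes an Atom message to an ActiveMQ topic (fedora.apim.update) on every API-M change, which a small listener can turn into a reindex call. A minimal sketch using the stomp gem; the reindex method is a hypothetical stand-in for whatever Hull's Solr indexing entry point is:

    require "stomp"

    # Illustrative only: listen on Fedora 3's ActiveMQ broker (STOMP port 61613
    # by default) and reindex whatever object each message refers to.
    client = Stomp::Client.new("", "", "localhost", 61613)

    client.subscribe("/topic/fedora.apim.update") do |msg|
      # The Atom body carries the affected PID; a real listener would parse the
      # XML properly rather than use a regex.
      pid = msg.body[/<summary[^>]*>([^<]+)<\/summary>/, 1]
      reindex(pid) if pid   # hypothetical method: push the object through Solrizer
    end

    client.join   # keep the subscription open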

  

Ben Armintor - Columbia University

  • Version of Argo Set management. Using RELS-EXT for aggregation; Also metadata / facet groupings. Currently using MODS relatedItem; 
  • Multiple image formats and sizes; Large tiff images too big for Ruby buffering; 
  • Audio content; Large audio files from digitization process ~ 1G; Segmented into individual interviews; ADL metadata on how splicing/segmenting occurred; METS files for pulling interview segments and their ADL together.
  • Happy with Rails 3 support for engines
  • There are also transcripts available as associated 'resources' for the audio.
  • UVa is also working with media plus transcripts
  • Interested in annotations;
  • Performs analysis of image technical metadata / geometry into RELS-EXT for decision on what to use for thumbnail and other format sizes.
  • Interested in round table on image management models; e.g. when we have multiple derivatives at different sizes
  • Indiana University has been making similar use of ADL and METS.

Brad McLean - DuraSpace

  • Ongoing conversations about Fedora scaling; Want parallel solution for cloud environments. DuraCloud is interested in this topic;
  • Looking at breaking out separate components of Fedora for distributed configuration and dynamic scaling. Also want Fedora to work well in a single system deployment.
  • Looking at clustered database solution
  1. Working on "high level storage" 
  2. DSpace with Fedora inside it. Data round-tripping DSpace <-> Fedora; Now looking at DSpace user interfaces and workflows; Blacklight for end-user pieces; Ingest workflows could be addressed with Hydra heads, one of many possible solutions in discussion. May discuss this in the context of "One week / One head" at this Hydra Partners' meeting.
  3. With Sloan grant (through 2012), working for version of DuraCloud for researchers who are using cloud-based services for their work and have archivists concerned about how that data will be preserved. 

Day 2 Thursday Morning

Exhibits https://github.com/ndliblis/atrium

Discussion:

  • What is the best path for adopting the code? Pull it into a copy of Blacklight and try to use it?
    • Not shored up enough for others to grab it. Works the same as Hydrahead in terms of development - same rake tasks, a test app is set up. No current path for working with an existing Blacklight install and adding to it; only attempted with a clean-slate Blacklight. Need setup scripts for that case. Not using the Solr setup scripts from Rails 3 yet.
  • Are there special concerns for both library collection exhibits and ad hoc faculty exhibits?
    • Currently just one group; uses existing Blacklight database with additional tables. Moved over time from thinking about placing in Fedora to placing in the database. Haven't yet considered adding in Hydra if Fedora is available.
  • How compatible would this be with Searchworks?
    • Likely to be, if authentication is implemented. Would need to restyle. Note that it is a Rails 3 gem. Looking to drop Rails 3.0 support and go to 3.1. NB Hydrahead not currently compatible with 3.1. Atrium wants to use the simpler templating in 3.1. CanCan managing permissions on actions on pages.
      • Action: Take access controls from HydraHead and move them into their own component (gem) that assumes you are using the field names as they come out of rightsMetadata; then can reuse across HydraHead, Atrium, Blacklight itself. Then folks can use CanCan in their own applications.
  • Demand for custom exhibits across multiple collections? Omeka Style.
    • Can do that now; set the collection scope based on a filter. Originally just as a Solr query; shifted to a Blacklight search. Could add back in Solr query for advanced users. Currently supports BL faceted settings and keyword search; looking to do BL advanced search.
  • What is approach to advanced templates?
    • Idea is to have multiple/two choices of layouts (across top, along the side, etc.). Color wheel to set background colors in the tool. Ideal is 4-5 options? Would like to know what layouts / themes are desired. How about allowing adding templates by sharing libraries? Easy theme overrides.
    • Should we have a theme builder? Perhaps better to have Atrium as a constrained templating system for Hydra. NB. Atrium using Compass. Note, building a CMS is an anti-goal. Look to ?kaminari?
  • What about Wordpress and Drupal users? Desirable to pull items from repository into Wordpress or Drupal. This is a popular use case for NW and Stanford
    • Use combination of Drupal feeds module (pulling feeds) with an API built into Atrium? Note that this is "level 3 " hydra / islandora compatibility.
  • Linking to records already done; inserting content not yet.
  • Common case to have a set of images, prints, finding aids, need to have a way to build exhibits; Atrium is promising, although concern over need for authentication integration.
  • Is this for pulling content out to Drupal, or to push up from Drupal?
    • Mostly the latter, although interest in exploring content creation / capture of content created in Drupal.

Wrapup:

  • Is this just linking Drupal and Wordpress to Hydra?
  • Do you want to plug Drupal content into Atrium (point to it)? (Ideally not; rather have Omeka in Blacklight - i.e. Atrium in Blacklight)
  • Want to use Drupal import from Hydra.
  • Note: University wide CMS at NW and Indiana (Cascade) that is also a candidate for integration.
  • Need shortcut URLs to collections.
  • Need to support Rails 3.0? Recommend 3.1. Rails 3.1, Atrium, and hydrahead will all be compatible soon.
  • Steps to community: How can someone else contribute code? How can someone grab it and run it? Desired timing: other users want in 9-18 months; ideally ready by C4L. Currently weak on Cucumber.
  • Intend to mix in DIL like features / coexist within Hydra.

break

DIL http://hydraproject.org/apps-demos/northwestern-digital-image-library-dil/

Discussion

  • How do we make this generic. Currently you have to adopt all the disseminators, datastreams, etc. What features are of interest?
    • Start at the top (cropping, etc.), figure out what is dependent on which components. Currently using Aware; Djatoka known to work. Question is how to abstract within the Hydra stack
  • Is DIL a head(application) or gem? How much functionality applies to other applications? How is it decomposed into gems?
    • Three images - the thumbnail, display, and edit versions. Description is separate.
  • Appears to be a problem with Hydra (ActiveFedora?) disseminators that needs to be solved. Does the rubydora library support them? Discuss tomorrow
    • Consume from existing disseminators or create them? disseminators with or without parameters? ActiveFedora doesn't support because no use case yet/until now.
  • UVA using similar technologies with Fedora, BL, SOLR, but not Hydra
  • Is the baseline jp2 for new scanned images, i.e. what should Hull do?
    • UVA does tiffs for archiving and jp2 for fedora.
    • Common to use jp2 for delivery. Preservation is either tiff or jp2, but a different profile of jp2. Stanford currently using jp2 for archiving. NB, some concern around color profiles in jp2 for preservation.
  • Add "get thumbnail", "get screen size", "get full size", "get preview", "get show view" to Hydra framework? Also need disseminators to add editor / viewer in.
    • Proposal: Construct a set of helper methods that don't know about the model, just get the URL for the image, and then layer the viewer on top. Build as gems so that you can move them around to various usages. (See the sketch after this list.)
  • Stanford working w/ 8 libraries to sort out image delivery, tools (djatoka), API; expected to have API out in January. Get image, at resolution, at quality, regions. Compatible with linked data from OAC.
  • Store cropping and rotation information as another stream, apply with disseminators. N.B. Chris Beer took same approach. Helper methods for videoplayer and image zoomer in OpenVault. Some prior art in Rails.
  • * Action item: Map out the structure of the gems / helper methods / API and what to borrow from OpenVault. Also do this for description. Much interest in the abstraction *
  • NW and ND using ?mdb2?. NW wants to move to Fedora; ND looking at ?mdb3?.
    • Timeline: working on defining key objectives. Expected within calendar year.
  • For video, need to abstract the object from the format.
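
On the helper-method proposal above: a minimal sketch of model-agnostic helpers that only build delivery URLs, here against a djatoka-style OpenURL resolver; the resolver address, pixel sizes, and method names are all hypothetical, with any viewer layered on top:

    require "cgi"

    # Illustrative helpers: given an image identifier, return a delivery URL
    # at a requested size; nothing here knows about the Fedora model.
    module ImageDeliveryHelper
      RESOLVER = "http://repo.example.edu/adore-djatoka/resolver"  # hypothetical

      def image_url(rft_id, scale)
        params = {
          "url_ver"    => "Z39.88-2004",
          "svc_id"     => "info:lanl-repo/svc/getRegion",
          "svc.format" => "image/jpeg",
          "svc.scale"  => scale,
          "rft_id"     => rft_id,
        }
        RESOLVER + "?" + params.map { |k, v| "#{k}=#{CGI.escape(v.to_s)}" }.join("&")
      end

      def thumbnail_url(rft_id); image_url(rft_id, 120);  end
      def screen_url(rft_id);    image_url(rft_id, 800);  end
      def full_url(rft_id);      image_url(rft_id, 3000); end
    end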

Wrapup:

  • What do we tell the rest of the world that needs image facilities and guidance on what Hydra will do? Can we announce the intention in a new space on website?
  • Use the duraspace jira for DIL, please!
  • Access control is out of scope (handled elsewhere)

Variations on Video (VoV)

  • Build/evolve a community-based project
  • Use the Hydra/DuraSpace project infrastructure (e.g. JIRA, confluence, Hudson) to make the process as transparent and open to contribution as possible
  • Managing the project/project structure
    • Mix of waterfall & agile for Hypatia
    • Claire: important to communicate infrastructure requirements, e.g. what streaming servers are being used.
    • Claire: Three levels identified at DLF (question) :
      • demos every two-weeks
      • reviewing and commenting on stories/priorities for upcoming sprints
      • hands-on code, testing (e.g. hosted sandbox)
  • John: how the work on VoV relates to other media projects, e.g. RockHall, WGBH
    • need a pbcore cataloging interface, which RockHall already has
  • John: access control will be a big issue
    • in particular, integrating auth w/ streaming services
    • Steve: wrt access control, want to make sure we're building something that provides interoperability, that is not unique to VoV
  • Claire: at DLF, discussion of a hosted VoV solution for institutions that want to implement/adopt but lack IT resources to do so on their own
  • John: 3 yr grant, but development is front-loaded, so most development will happen in 2012.

Argo/PSDS

  • Path to adoption for Argo
    • Michael: very tightly bound to Stanford infrastructure, field names, how DOR and SDR are laid out. Not clear how pieces should be split out. Work-do implementation
      • Michael: views & helpers look for field names that assume how the metadata is set up, how Stanford indexes. Not part of an especially flexible configuration system
  • What are the features of Argo that ppl are interested in?
    • Job queue mgmt
    • generic repository mgmt viewer/view on repository as repository content
    • hierarchical tags
    • argo as the system that exposes the workflow state
    • Claire: treat (nearly) all registration as templatized
    • integration with purl service
  • PSDS
    • workflow (notion of job & item, with milestones)
  • Non-view management features for Argo
    • set embargo: adding embargo periods or releasing early (PENDING)
    • reset workflow state to waiting (PENDING)
  • Next steps
    • dependency issues: DOR's generic types & how they relate to AF. DOR's svc gem tightly bound to AF 3.04. Tied to cbeer's facet branch of BL
    • Matt: will try to set up Argo against a non-Stanford Fedora repo.

Day 2 Thursday afternoon

Communications: Supporting a Hydra community across time zones and other countries

  • How do we try to ensure that US-centric meetings and calls are accessible to Hydra users whose working day does not overlap with them comfortably? The current committers' call at 8:30 PST is at the extreme of what can be achieved in the UK - it's not good for mainland Europe (or Singapore!)

Need to make sure Hydra isn't so US-centric. Difficult to have phone calls.

- How do we address issues related to language (English-only)? May be something that can be addressed at some other date for phone meetings, but there are concerns now about internationalization of the codebase. It's easy to internationalize a Rails app; maintainers manage a core file, but a separate doc is maintained that tracks translations (a minimal sketch follows this list). (This is a separate topic possibly needing more discussion).
- How do we address time slots for phone meetings? Richard says the current time works for the UK right now, but as we expand it won't include others. Could try to switch times every week so it will only occasionally inconvenience groups at the edges of zones. The Apache model may have relevance in order to support asynchronous, long-term retention (all decisions made through email). The timing of the calls may not be an issue yet, though, as we don't have international callers.
- Duraspace handles this problem with Fedora committer's call by having an IRC chat that is logged.
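
On the internationalization point above: Rails ships with an I18n layer where the maintainers' core file is just a per-locale YAML dictionary that translators can copy and translate key for key. A minimal sketch with hypothetical keys:

    # config/locales/en.yml (maintained by the core team):
    #   en:
    #     hydra:
    #       search: "Search the repository"
    #
    # config/locales/de.yml (maintained by translators, same keys):
    #   de:
    #     hydra:
    #       search: "Repository durchsuchen"
    #
    # Views and controllers look strings up by key for the current locale:
    I18n.locale = :de
    I18n.t("hydra.search")   # => "Repository durchsuchen"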

Action items:
1. Robin will reach out to Egypt to see potential integration points and if there are any issues with them being on the call, and potential issues in the future.
2. Sensitivity to scheduling times based on callers' locations
3. Conversations should be sent out via email, or a message sent saying that the (logged) IRC chat has the info
4. IRC chat room should be logged

Release schedule

Tom: At last meeting at NU, we decided to have better coordination of component releases, and more transparency of releases, to do more formal releases and management. Should we stay with this or fine-tune?

Naomi: Release schedule has been feature-based--need to figure out a way to have release managers switch mid-stream depending on what components are being worked on. A solution may be to look at the features for an upcoming release and select the person most appropriate to be release manager.

Matt: The idea of time-based releases is to prevent the next point release being held up waiting on a set list of features we want; whatever is complete goes out on schedule. Minimizes backwards-incompatible changes as well. We should put off backwards-incompatible changes until it's worth doing them. Moving to Rails 3.1 would make things backwards-incompatible, which would trigger a Hydra 4.0 release. We are already doing time-based releases by having a release every week--so we're doing both time-based and feature-based releases.

If someone needs to have support for Rails 2.0, they can make changes to a branch, but this encourages people to use newer code as time goes on.

As a user, there are three questions that should be answered:
Which version is the one I should use?
If I'm running version x, what fixes are in place?
If I'm running version x, is there something I need to do?
Some of these can be addressed by the release manager, component leads, and having documented a set of configurations that are known to work together.

We also need to decide how testing should work, if Hudson should test over multiple builds. Is it reasonable for a release manager to oversee all of this testing?

Objectives:
* Need to facilitate rotation of release managers
* promote appearance of competence to community
    * avoid appearance of helter-skelter patching
    * or promote appearance of vibrant & dynamic project w/frequent updates
* make clear that we use semantic versioning
    * Major number bumps for incompatible changes
    * Minor number bumps for new features
    * Point number bumps for bug fixes
* make feature releases news items so that managers, as well as developers, can pick up on them
* allow enough time for appropriate testing & validation (how do we know what's enough time? break out for additional discussion)
* allow release managers (as a group) to do integration testing / validation of components' interoperability at predictable intervals.

Action items:
Let's discuss the release process every quarter and evaluate where we are and what changes we need to make.

Rotating Responsibilities

We've had a notion of a release manager but never had specifics of what they were going to do. Not sure if we should have any rotating duties?
Naomi wrote this, would like feedback (feel free to make changes in wiki): https://wiki.duraspace.org/display/hydra/Release+Manager+Responsibilities
We walked through the bullet points of this document, and made changes in the meeting.

Separate item that wasn't on the list: Making sure fresh installations work correctly (should help new users).

QA person could help with testing and evaluation, Stanford has an open position for a QA person right now.

Current release managers are listed, added column for next release managers for each component.

We're having trouble finding someone to be the next Hydra Head lead. The responsibilities for this will require a lot of time. Solution may be for Matt Z to stay on until he's able to train someone else. Adam Wead or Michael Giarlo could be possibilities, Matt will contact Michael and see what his interest is.