Deployment and DevOps
Attendees
Matthew Critchlow (UCSD)
Tim Marconi (UCSD)
Ron Stanonik (UCSD)
bess (Stanford)
David Chandek-Stark (Duke)
Jim Coble (Duke)
Steven Ng (Chinese Historical Society of Southern California)
aliciac (DCE)
Ian Lessing (UCSB)
Dermot Frost (Trinity College, Dublin)
Jimmy Tang (Trinity College, Dublin)
Glen Horton (University of Cincinnati)
Topics (some unanswered or not discussed):
Challenges of going from development to production?
Fedora?
Hardware specs relative to collection size and users?
How long will the migration/ingest process take?
For folks in production: what was easy, what was challenging, and what does the prod environment look like?
Streaming video?
How are people dealing with migrating Fedora instances?
New Hydra Head: separate repository or new?
Questions:
Chef, Puppet, or Ansible? The community is using all three. Can we organize or rally around this?
Consider Avalon as a starting point.
Campus IT or external hosts are nervous about granting the access needed to run configuration management, or have already chosen among Chef/Puppet/Ansible.
Ideally, Sufia/Hydramata would have both Chef and Puppet scripts for deployment so adopters can choose.
Single-machine or multi-machine deployment?
Stanford: load balanced with F5. Searchworks: generally a shared Solr cluster and a single Fedora instance that apps connect to. Trying to load balance as resources become available.
UCSB: HAProxy in front of all sites. It handles load balancing and acts as a security wall.
Slashdotted?
Stanford: Yes, crashed. Spun up new server instances and was able to keep up with traffic. Puppet helped significantly with this.
UCSD: got Reddited.
Monitoring Systems?
Nagios (Stanford, UCSD, others)
Rails apps? The “Is it working” gem: https://github.com/tribune/is_it_working
Nagios hits the “is it working” URL and reports out
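The Nagios check described above could look something like the following. This is only a minimal sketch, not any attendee's actual setup. The function names, the command-line shape, and the rule that only an HTTP 200 response counts as healthy are assumptions for illustration. The exit codes follow the standard Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
import urllib.error
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(status_code):
    """Map an HTTP status code to a Nagios exit code.

    Assumption: only HTTP 200 means the app reports itself healthy.
    """
    return OK if status_code == 200 else CRITICAL


def check_url(url, timeout=10.0):
    """Fetch the status URL and return (exit_code, one-line message)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; keep the status code.
        status = err.code
    except ValueError as err:
        # Malformed URL, e.g. missing scheme.
        return UNKNOWN, f"UNKNOWN - bad URL {url!r}: {err}"
    except OSError as err:
        # Connection refused, DNS failure, timeout, and similar.
        return CRITICAL, f"CRITICAL - cannot reach {url}: {err}"
    code = classify(status)
    label = "OK" if code == OK else "CRITICAL"
    return code, f"{label} - {url} returned HTTP {status}"


def main(argv):
    """Entry point: argv[1] is the status URL. Returns the exit code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: check_status_url.py URL")
        return UNKNOWN
    code, message = check_url(argv[1])
    print(message)
    return code
```

To use this as a Nagios command, the script would end with `sys.exit(main(sys.argv))` and be pointed at whatever path the status endpoint is mounted on in the Rails app (the path depends on how the gem is configured).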
OSes? CentOS, Scientific Linux (Trinity)
CentOS repos are often out of date
Stanford: RHEL
UCSD: moving away from RHEL due to new Red Hat university licensing
Server configuration management?
50% of the room already using it. 100% moving towards or considering it.
Short Status Notes
Trinity College: A few months from going to production. Distributed file systems.
Using http://buildbot.net/
UCSB: Deploying in the next few months. Worried about deploying a stable, performance-tuned system that can keep up with dev. Want staging to resemble production.
Stanford: How do we anticipate demand and sizing? GIS/spatial data infrastructure brings new questions.
migrating huge amounts of content from SDR1
created positions for devops, could possibly share the job descriptions for others to reference
still struggling with what operations procedures should be
gotten much better w/ Puppet
20-ish instances of Hydra, using different Ruby gems
Puppet helped locate/patch during a Ruby vulnerability
Devs have a burndown box, a VM assigned to them. When they want to deploy an app, they write a Puppet manifest for it: devs start from a base script and make minor manifest tweaks for their app. Once it works, a ticket goes to a devops person who reviews it. Once it passes review, it is deployed to the server.
considering going to Vagrant for laptops (so devs can get up to speed quicker)
All Puppet scripts are in GitHub, but currently private. Only sysadmins can merge PRs to master. Devs work on scripts in branches.
Action Items:
The group agreed that a HydraCamp covering operations/devops would be a great step forward
DCE may have some sample scripts from different engagements that could be a good starting point
Case Studies / White Papers
starting point (recognized as largely out of date): https://wiki.duraspace.org/display/hydra/Deployment+Hardware+Information
the group requests white papers from the community
case studies for people already in production to share w/ community.
https://wiki.duraspace.org/display/hydra/Deployment+Hardware+Information
Trinity College volunteered
Stanford volunteered
requested case studies
ops, dev, or devops - what does that look like?
fewer than half are on hardware they don’t control
migrating Fedora instances
how to care for and feed a production Fedora (daily/weekly/monthly tasks, maintenance)
communication strategies
press releases from collection migration
cloud repository instance
campus IT or other external environment hosted
University of Cincinnati has an Excel spreadsheet on production hardware configuration and deployment from all hardware partners, which they will share
will add to existing wiki page
A RailsCasts-style deployment video: Steven Ng will take the lead on this concept