2016-04-01 Virtuoso, Rya and where we go from here

Time: 9:00am PDT / Noon EDT

WebEx Info: Join WebEx meeting - Meeting # 649 933 963, Meeting password: htIG0401 (hotel-tango-Igloo-Golf-zero-four-zero-one; I'm not sure you need the password.)

Audio Connection:  Computer, or 1-855-244-8681 Call-in toll-free number (US/Canada), or 1-650-479-3207 Call-in toll number (US/Canada)

Moderator: E. Lynette Rayle (Cornell)

Notetaker: tamsin woo

Attendees:

Agenda:

  1. Next Call
    1. date/time: 2016-05-20 
    2. Moderator: 
    3. Notetaker: 
  2. Call for additional agenda items
  3. Status of Triplestore Implementations in Ruby
    1. Aaron Coburn - Rya (Apache project)
      1. Apache triplestore that runs on top of Hadoop/Accumulo
      2. Highly scalable
      3. SPARQL queries are mapped into Pig for MapReduce
      4. This is a fairly new project and is a lot of work to set up, largely due to Hadoop
      5. Q: Were you able to load data in?
        1. Just barely got it running. Can attest to the fact that it is hard to set up.
      6. Q: Do we have a sense of scale & performance?
        1. Not really. As a guess, simple queries are probably more costly, and complex queries are where you might see the benefits.
      7. Q: How much is kept in memory?
        1. Don't really know, would need to look into Accumulo.
      8. Q: SPARQL Update support?
        1. Yes. In principle, the existing Ruby SPARQL::Client::Repository should work.
      9. Rya paper: http://sqrrl.com/media/Rya_CloudI20121.pdf
    2. Corey Harper - Virtuoso
      1. Jim Blake joins to talk about LD4L Virtuoso scalability
        1. Working with around 1 billion triples
        2. Virtuoso crashes after many consecutive SPARQL queries
        3. The workaround is to run 1.5 million queries, then restart Virtuoso
        4. Each query requests all properties of a named node, along with the associated rdf:type statements.
        5. Using the latest stable release (probably version 7?)
      2. Notes Document: http://bit.ly/virtuoso-hydra
      3. Corey had done some work on OpenRefine, discovered that the DERI LinkedData plugin
      4. The UI is called Virtuoso Conductor
      5. The RDF.rb support is community-supported and over 2 years old; it doesn't work with RDF.rb 1.1.x+
      6. Virtuoso Index Scheme documentation: 
  4. Where Next - Direction of the Group
    1. Goals
      1. Can we push toward LDP as a common gateway?
      2. Have the application talk to a triplestore directly, with eventual persistence to Fedora
      3. Look at the Fedora API work: which Fedora functions map to a triplestore backend?
        1. See: https://github.com/fcrepo4-labs/derby
        2. Derby talk next meeting.
    2. General Graph Database discussion?
      1. Has anyone used Neo4j?
        1. Hector is willing to do a Neo4j overview next meeting.
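
Illustration for the Virtuoso discussion under item 3: a minimal Python sketch of the query pattern and the restart workaround, as one reading of the notes. The exact LD4L queries were not recorded, so the query text, the helper names `node_query` and `batches`, and the example URIs are all assumptions, not the project's actual code. The idea is to build one query per node, then group queries into fixed-size batches so the server can be restarted between batches.

```python
from itertools import islice
from typing import Iterable, Iterator, List

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def node_query(node_uri: str) -> str:
    """Build a SPARQL query for all properties of a node.

    Assumed shape: every (property, object) pair for the node, plus the
    rdf:type of each object when it has one.
    """
    return (
        "SELECT ?p ?o ?type WHERE { "
        f"<{node_uri}> ?p ?o . "
        f"OPTIONAL {{ ?o <{RDF_TYPE}> ?type }} "
        "}"
    )


def batches(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Hypothetical node URIs; a real run would read them from the dataset.
    nodes = [f"http://example.org/node/{i}" for i in range(5)]
    # The notes mention 1.5 million queries per restart; 2 keeps the demo short.
    for batch in batches((node_query(n) for n in nodes), size=2):
        print(f"run {len(batch)} queries, then restart the triplestore")
```

Sending the queries and restarting Virtuoso are left out on purpose, since the notes do not say how either step was done.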