Linked Data Fragments Call with Ruben Verborgh 2015-04-14

Meeting on 2015-04-14



Present: Steven Anderson, Corey Harper, Tom Johnson, Trey Terrell, Ruben Verborgh



Corey introduced the topic: caching issues for remote resources. How do we do this?



Trey: Caching isn't hard because triples change a lot; it's hard because servers are often down.



Tom: DCMI Types -- hitting a cache that sits in Ruby works, but for bigger data sets and larger sizes this will be a challenge

  • DPLA is interested in a reconciliation endpoint, but also in front-end uses like Trey's use case

  • Hosting LD Fragments servers for others to use is a potential DPLA goal


Trey: Thinks he understands the goal of TPF & LDF, but wants to hear it directly. What is the problem set?



Ruben:

  • Availability on the Web. 

  • Two options today: get a data dump & do stuff locally, or query live.

  • Many data sets aren't queryable.

  • Those that are suffer from downtime

  • LDF is a conceptual framework to say "This API offers that kind of fragments"

  • Data dumps have one fragment: the entire dataset

  • SPARQL endpoints have many highly specific fragments, many of which are expensive to compute

  • Can we find different types of fragments that divide the workload differently? (Triple Pattern Fragments are an example.)

  • Moves some intelligence and business logic to client side.

  • Clients solve complex queries by splitting them into smaller queries the server can handle, depending on its interface.
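
A minimal sketch of what a single triple-pattern request could look like from Ruby. The endpoint URL and the fetch_fragment helper are hypothetical; a real TPF client discovers the query form and its parameter names from the server's hypermedia controls rather than hard-coding subject/predicate/object as done here.

```ruby
require 'net/http'
require 'uri'

# Hypothetical fragments endpoint; a real client reads the query form
# (and its parameter names) from the server's hypermedia controls.
ENDPOINT = URI('http://example.org/fragments/dataset')

# Ask the server for one triple pattern; empty values act as wildcards.
def fetch_fragment(subject: '', predicate: '', object: '')
  uri = ENDPOINT.dup
  uri.query = URI.encode_www_form(subject: subject, predicate: predicate, object: object)
  request = Net::HTTP::Get.new(uri)
  request['Accept'] = 'text/turtle'
  Net::HTTP.start(uri.hostname, uri.port) { |http| http.request(request) }
end

# A client answers a bigger query by combining such single-pattern requests,
# e.g. all subjects of one type first, then a per-subject label lookup.
puts fetch_fragment(predicate: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type').body
```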
     

Q from Trey: The high-availability theory is that the server does less work, so it's easier to keep up?
 

Ruben: That's part of it. Most APIs on the Web have far less expensive requests than SPARQL endpoints.

  • 1st, the TPF API is low-cost for the server.

  • 2nd, Web is optimized for caching.

  • Overlapping queries can reuse the same fragments and are more cacheable (see the cache sketch after this list)

  • The publication (http://linkeddatafragments.org/publications/iswc2014.pdf) includes availability and per-request cost data showing this approach is cheaper
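
To illustrate the caching point above: fragment requests are plain HTTP GETs, so ordinary web caches (browser, proxy, CDN) can serve repeats; the same idea in miniature on the client side is a memo keyed by the fragment URL. A rough sketch, with hypothetical names:

```ruby
require 'net/http'
require 'uri'

# Naive in-memory cache keyed by fragment URL: when two queries decompose
# into overlapping triple patterns, the repeated fragment is served without
# a second request. Standard HTTP caches can do the same, since these are
# plain, cacheable GETs.
FRAGMENT_CACHE = {}

def cached_fragment(url)
  FRAGMENT_CACHE[url] ||= Net::HTTP.get(URI(url))
end
```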
     

Pushing this to the client side -- the ability to combine data from multiple data streams

  • Trey's Primary use case is "I have stuff, I need labels"

  • Can have an interface that says: ask me for a subject and I'll always give you the label (see the label-lookup sketch after this list)

  • Layers of abstraction
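
A sketch of the "I have stuff, I need labels" case, assuming a fragments endpoint that answers the single pattern <subject> rdfs:label ?label. The endpoint URL and the labels_for helper are made up, and RDF.rb (the rdf gem plus a reader such as rdf-turtle) is assumed for parsing:

```ruby
require 'rdf'   # RDF.rb; a reader gem such as rdf-turtle is also needed
require 'cgi'

# Hypothetical fragments endpoint that serves label triples.
LABEL_ENDPOINT = 'http://example.org/fragments/labels'
RDFS_LABEL     = 'http://www.w3.org/2000/01/rdf-schema#label'

# Fetch the fragment for <subject> rdfs:label ?label and return the labels
# found on its first page.
def labels_for(subject_uri)
  url = "#{LABEL_ENDPOINT}?subject=#{CGI.escape(subject_uri)}" \
        "&predicate=#{CGI.escape(RDFS_LABEL)}"
  graph = RDF::Graph.load(url)   # picks a reader via content negotiation
  graph.query([RDF::URI.new(subject_uri), RDF::RDFS.label, nil]).map(&:object)
end

puts labels_for('http://dbpedia.org/resource/Belgium')
```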
     

Reconciliation: 

  • Now experimenting with full-text search (see the sketch after this list)

  • Example: http://data-test.linkeddatafragments.org/dbpedia2014-es?subject=&predicate=&object=*belgium*

  • Corey: Question about ranking, probabilistic matching.

  • Ruben: These examples have some ranking, since they come from Elasticsearch

  • Could have interfaces that support explicit scoring

  • Corey: Even support "just give me your top match" interfaces

  • Some LD Frags might take responsibility for the ranking

  • This is powerful, since we don't trust the LC server's ranking

  • could support different ranking methods

  • Reliably combine different data sources 
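
A rough sketch of hitting the experimental full-text interface from the example URL above; the *term* wildcard syntax is specific to that test server and may change:

```ruby
require 'net/http'
require 'uri'

# Experimental full-text fragment interface from the example above;
# the *term* wildcard syntax belongs to that test server.
uri = URI('http://data-test.linkeddatafragments.org/dbpedia2014-es')
uri.query = URI.encode_www_form(subject: '', predicate: '', object: '*belgium*')

# Candidate triples; per the notes, ranking comes from the Elasticsearch index.
puts Net::HTTP.get_response(uri).body
```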
     

Reusability:

  • Support for multiple interfaces, which are composed of interface features

  • Allows us to keep a lot of this functionality out of Hydra and have a nice, clear interface: separation-of-concerns goodness

  • Figure out which interfaces are useful to whom

  • This allows for reusable interfaces
     

Next steps:

  • Hydra folks should think about a Ruby implementation.

  • Spec exists.

  • Implementations in JavaScript, Java, Perl

  • Tom's interested in setting up a geospatial fragments server & integrating with Two Fishes

  • Reverse geocoding