Where are collections indexed now?

Thoughts on indexing nested collections

At Notre Dame, we have implemented Nested Collections and leveraged the following indexing strategy:

https://github.com/ndlib/curate-indexer (which could be better named)
We have a method for reindexing a relationship (https://github.com/ndlib/curate-indexer/blob/master/lib/curate/indexer.rb#L20) and reindexing the whole repository (https://github.com/ndlib/curate-indexer/blob/master/lib/curate/indexer.rb#L36)
The gem was developed without concern for the persistence layer; instead it relies on an adapter (tested via an InMemoryAdapter) whose interface is defined in the AbstractAdapter.
Our implementation details for CurateND's indexing adapter are found in Curate::LibraryCollectionIndexingAdapter; we also added a module for IsMemberOfLibraryCollection.
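
To illustrate the adapter idea described above, here is a minimal sketch of how an indexer can stay independent of the persistence layer. Everything in it is hypothetical: the `IndexingSketch` module, the method names `find_parent_ids` and `write_index_entry`, and the `reindex` helper are assumptions made for illustration, not the actual interface of curate-indexer's AbstractAdapter.

```ruby
# Illustrative only: these names are NOT taken from the curate-indexer gem.
module IndexingSketch
  # Hypothetical interface. A persistence-specific adapter overrides each method.
  class AbstractAdapter
    def find_parent_ids(_id)
      raise NotImplementedError, "#{self.class} must implement find_parent_ids"
    end

    def write_index_entry(_id, _parent_ids)
      raise NotImplementedError, "#{self.class} must implement write_index_entry"
    end
  end

  # Test double that keeps relationships and index entries in plain hashes.
  class InMemoryAdapter < AbstractAdapter
    attr_reader :index

    def initialize(parents = {})
      @parents = parents
      @index = {}
    end

    def find_parent_ids(id)
      @parents.fetch(id, [])
    end

    def write_index_entry(id, parent_ids)
      @index[id] = parent_ids
    end
  end

  # Reindexes one object by talking only to the adapter interface.
  def self.reindex(id, adapter:)
    adapter.write_index_entry(id, adapter.find_parent_ids(id))
  end
end
```

With this shape, the indexing logic can be exercised in tests against the in-memory double, while a production adapter supplies the same methods backed by the real repository and index.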

Potential Pitfalls

Nested collections can create infinite loops (e.g. A is in B, which is in C, which is in A). At Notre Dame we adopted a maximum depth (aka time_to_live) for graph traversal; another option is cycle detection, but that might not be performant. Also, with nested collections, reindexing everything can take considerable time, so those tasks should be relegated to background jobs.
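
The maximum-depth guard can be sketched as follows. This is a simplified illustration, not the gem's implementation: the `MEMBER_OF` graph, the `MaxDepthExceeded` error, the `ancestors_of` method, and the default depth of 15 are all assumptions chosen for the example.

```ruby
# Hypothetical membership graph: each key is an object, each value lists the
# collections it is directly a member of. A -> B -> C -> A forms a cycle.
MEMBER_OF = {
  "A" => ["B"],
  "B" => ["C"],
  "C" => ["A"],
  "D" => []
}.freeze

# Raised when traversal goes deeper than the allowed number of levels.
class MaxDepthExceeded < StandardError; end

# Collects the ancestors of `id`, spending one unit of time_to_live per level.
# A cycle never terminates on its own, so it eventually exhausts the budget
# and raises instead of looping forever.
def ancestors_of(id, graph, time_to_live: 15, found: [])
  raise MaxDepthExceeded, "gave up while visiting #{id}" if time_to_live <= 0

  graph.fetch(id, []).each do |parent|
    found << parent
    ancestors_of(parent, graph, time_to_live: time_to_live - 1, found: found)
  end
  found.uniq
end

begin
  ancestors_of("A", MEMBER_OF, time_to_live: 5)
rescue MaxDepthExceeded => e
  warn "Possible cycle: #{e.message}"
end
```

The trade-off is that a depth limit cannot distinguish a true cycle from a legitimately deep hierarchy, so the limit needs to be set above the deepest nesting expected in practice.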
