Hyrax Analytics User Stories

As an administrator:

  • I can report out the number of collections in my repository.
  • I can report out the total size of collections in my repository (in number of items, number of files, and file size).
  • I can report out the total number of unique objects in my repository.
  • I can report out the total views (number of times the unique files were accessed).
  • I can report out the total number of queries conducted (during the reporting period).
  • I can view the number of unique downloads for objects in any given collection.

  • I can integrate statistics from a variety of sources into Hyrax (beyond Google Analytics).
  • I can view the number of times my collections were accessed by month. 
  • I can view the number of unique views of an object (separate out user vs. administrator, but give option to view either).
  • I can view graphics of visibility, in-progress items (i.e., in workflow), and other such administrative content on a dashboard.
  • I can use a calendar 'widget' to generate statistics (by month, group of months, fiscal year, etc.).
  • I can access analytics via a REST endpoint: JSON output of most viewed author, most downloaded, and other such information (to contribute to a larger-scale reporting system, e.g. Mexico National, ARL, etc.).
  • I can aggregate statistics by file type (MIME type) or resource type.
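
The REST endpoint and file-type aggregation stories above could yield a payload along the following lines. This is only a minimal sketch under stated assumptions, not an existing Hyrax API: the event records, their field names (`work_id`, `author`, `mime_type`, `kind`), and the output keys are all hypothetical.

```python
import json
from collections import Counter

# Hypothetical usage events; every field name here is an assumption.
EVENTS = [
    {"work_id": "w1", "author": "Garcia", "mime_type": "application/pdf", "kind": "download"},
    {"work_id": "w1", "author": "Garcia", "mime_type": "application/pdf", "kind": "view"},
    {"work_id": "w2", "author": "Lee", "mime_type": "image/tiff", "kind": "download"},
    {"work_id": "w1", "author": "Garcia", "mime_type": "application/pdf", "kind": "download"},
    {"work_id": "w2", "author": "Lee", "mime_type": "image/tiff", "kind": "view"},
    {"work_id": "w3", "author": "Lee", "mime_type": "image/tiff", "kind": "view"},
]


def top_key(counter):
    """Return the key with the highest count, or None if the counter is empty."""
    return counter.most_common(1)[0][0] if counter else None


def build_report(events):
    """Summarise events into a JSON-serialisable dict (hypothetical output keys)."""
    views_by_author = Counter(e["author"] for e in events if e["kind"] == "view")
    downloads_by_work = Counter(e["work_id"] for e in events if e["kind"] == "download")
    downloads_by_mime = Counter(e["mime_type"] for e in events if e["kind"] == "download")
    return {
        "most_viewed_author": top_key(views_by_author),
        "most_downloaded_work": top_key(downloads_by_work),
        "downloads_by_mime_type": dict(downloads_by_mime),
    }


if __name__ == "__main__":
    print(json.dumps(build_report(EVENTS), indent=2))
```

A larger-scale reporting system could then consume the JSON without needing to know how the repository stores its usage events.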



As an end-user:

  • I can execute a search/browse or navigate to a collection (user or administrative) and, based on the results, download a file of pageviews and downloads for those items.
  • I can execute a search/browse or navigate to a collection (user or administrative) and, based on the results, view a table and/or figure of pageviews and downloads for these items.
  • I can save a search/report as part of my user profile.
  • I can see the Top 100 most viewed works (show as thumbnails & number of views).
  • I can see, on my user profile, pageviews, downloads, and any other available statistics about My Works.
  • As a Depositor of one or more works in the repository, I need access to statistics about the discovery and use of my works. Statistics of interest about each work include:

    • Total number of times the page with my work has been visited.

    • Number of times a page with my work has been visited over time.

    • Number of unique visitors of my work.

    • Number of times a page with my work has been visited by user.

    • List of referring URLs.

    • Geo-location of visitors (based on IP address or, for privacy, a partial IP address).

    • Total number of downloads of a file in a work.

    • Time series of downloads.
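
The "partial IP address" option in the geo-location item above could be implemented by zeroing the host portion of each address before it is stored or geo-located. The sketch below uses only Python's standard `ipaddress` module; the function name `truncate_ip` and the prefix lengths (24 bits for IPv4, 48 bits for IPv6) are assumptions chosen for illustration, not a documented Hyrax behaviour.

```python
import ipaddress


def truncate_ip(address: str) -> str:
    """Keep only the network part of an address (assumed /24 for IPv4, /48 for IPv6)."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    # strict=False lets us pass a host address and have the host bits zeroed.
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    print(truncate_ip("192.0.2.123"))
    print(truncate_ip("2001:db8:abcd:1234::1"))
```

Geo-location on the truncated address is coarser, which is the intended privacy trade-off.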


Item-Level Statistics:

  • I can see a table and/or chart of monthly pageviews and downloads for the item.
  • I can download the daily (or some other more granular timestep) pageviews and downloads for the item.
  • As available, I can view other statistical information for the item (AKA altmetrics).
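
The relationship between the monthly table and the more granular daily download in the stories above can be sketched as a simple roll-up. The row layout `(day, pageviews, downloads)` and the sample values are hypothetical; real counts would come from whichever analytics source is configured.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily rows for one item: (day, pageviews, downloads).
DAILY = [
    (date(2017, 1, 30), 4, 1),
    (date(2017, 1, 31), 6, 0),
    (date(2017, 2, 1), 3, 2),
]


def monthly_totals(daily_rows):
    """Sum daily pageviews and downloads into per-month (pageviews, downloads) pairs."""
    totals = defaultdict(lambda: [0, 0])
    for day, pageviews, downloads in daily_rows:
        key = day.strftime("%Y-%m")
        totals[key][0] += pageviews
        totals[key][1] += downloads
    return {month: tuple(counts) for month, counts in sorted(totals.items())}


if __name__ == "__main__":
    for month, (pageviews, downloads) in monthly_totals(DAILY).items():
        print(month, pageviews, downloads)
```

Storing the finest available timestep and deriving coarser views from it keeps both stories satisfiable from one data set.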


Collection-Level Statistics:

  • I can embed the x most viewed and downloaded items in this collection on the collection landing page.


File Downloads:

  • Where statistics data are downloadable, downloads are available on a monthly or daily basis.
  • I can export figures and tables from Hyrax.
  • File download statistics should include "direct-link" visits, such as a bookmarked URI for a PDF that does not open an HTML page (this particularly affects Google Analytics).


Setup and Configuration of Analytics in Hyrax:

  • As a Repository Manager configuring my single tenant repository instance, I need to be able to enter the Google Analytics ID associated with my Google Analytics account so that repository usage activities are recorded in Google Analytics. (Gary notes this will be added to general settings in Hyku).

  • As a Repository Manager using a hosted repository in a multi-tenant context, I need to only access the usage data about the content in my repository, and not the usage data of other tenants of the service.

  • As a Repository Manager configuring my single tenant repository instance, I need to be able to enter the OAuth information associated with my Google Analytics account so that repository usage activities are displayed in the UI.

  • Statistics and related UI should be flippable so that a repository administrator can turn them off as needed.

  • As a Repository Manager, I should be able to configure a page or pages on which analytics will not be collected.  This is particularly useful when using external monitoring software that requests a page from the repository and compares it to a stored value to confirm the stack is working as expected.
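
Several of the configuration stories above (the on/off switch, excluded pages, and tenant isolation) can be sketched together. Everything named here is hypothetical: the `AnalyticsSettings` class, its fields, the example paths, and the `tenant` key on event records are illustrative assumptions, not existing Hyrax or Hyku settings.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class AnalyticsSettings:
    """Hypothetical per-repository analytics settings."""
    enabled: bool = True                                 # the on/off switch
    excluded_paths: list = field(default_factory=list)   # glob patterns that are never tracked

    def should_track(self, path: str) -> bool:
        """Return True if a request for `path` should be recorded."""
        if not self.enabled:
            return False
        return not any(fnmatchcase(path, pattern) for pattern in self.excluded_paths)


def events_for_tenant(events, tenant_id):
    """Return only the events belonging to one tenant (assumed `tenant` key)."""
    return [event for event in events if event["tenant"] == tenant_id]


if __name__ == "__main__":
    settings = AnalyticsSettings(excluded_paths=["/healthcheck", "/status/*"])
    print(settings.should_track("/healthcheck"))   # monitoring probe, not tracked
    print(settings.should_track("/works/abc"))     # ordinary page, tracked
```

Excluding a monitoring URL this way keeps automated health checks from inflating pageview counts.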


Types of Metrics of Interest:

  • Pageviews
  • Downloads
  • Citations
  • Additions to collections
  • Other activity (tweets, shares, etc.)
  • Usage map
  • Granularity:


Questions and Definitions:

  • What is a download?
  • What causes a download event to be recorded?
    • A view should not count as a download.
  • Analytics for items with child works: do we want to report aggregated statistics for nested works?
  • What do aggregated statistics mean (where are the data coming from)?
  • Could this avoid being specific to Google products and perhaps support something like Piwik? ← yes!
  • Are we going to rely somewhat on local server logs?
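
To make the view-versus-download question concrete, here is one possible working definition applied to request records such as those parsed from local server logs. It is a sketch for discussion, not a decision: the record fields (`path`, `status`) and the URL prefixes `/downloads/` and `/works/` are assumptions.

```python
from collections import Counter

# Assumed URL prefixes; a real deployment would need to confirm its own routes.
DOWNLOAD_PREFIX = "/downloads/"
VIEW_PREFIX = "/works/"


def classify(request):
    """Return "download", "view", or None for one request record."""
    if request["status"] != 200:
        return None  # ignore redirects, errors, and similar responses
    path = request["path"]
    if path.startswith(DOWNLOAD_PREFIX):
        return "download"  # includes direct-link visits that never load a page
    if path.startswith(VIEW_PREFIX):
        return "view"
    return None


def count_events(requests):
    """Count views and downloads separately so that a view never counts as a download."""
    kinds = (classify(request) for request in requests)
    return Counter(kind for kind in kinds if kind is not None)


if __name__ == "__main__":
    sample = [
        {"path": "/works/abc", "status": 200},
        {"path": "/downloads/abc-file1", "status": 200},
        {"path": "/downloads/abc-file1", "status": 404},
    ]
    print(count_events(sample))
```

Under this definition a bookmarked file URL counts as a download with no matching view, which is the direct-link case raised under File Downloads.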


Resources: