Improving the Bulk Import Experience for Hyrax and Hyku: Community-Funded Bulkrax Sprints
Bulk import is the primary pathway for adding collections to Hyrax and Hyku repositories at scale. The component that handles this function, Bulkrax, is shared across both platforms. Whether migrating legacy collections, ingesting faculty research, or processing large donations, the ability to efficiently load content determines how quickly a repository delivers value to its institution and users.
The current Bulkrax experience in Hyrax and Hyku is a barrier to that value. The Hyku community has worked to define what Bulkrax needs in order to be easy to use, and has contracted for the first phase of work. The Hyku and Hyrax user communities are encouraged to contribute funding to extend the contract through Phase 2 and Phase 3, so that all of the identified issues are resolved and an easy-to-use Bulkrax can be released.
Link to Executive Summary / 1-pager for fundraising
The Problem
Repository managers and staff currently face a frustrating cycle when performing bulk imports in Hyrax and Hyku:
Errors are discovered too late. Problems with import files aren't surfaced until after a job has started—sometimes hours into processing. Staff must then diagnose failures by clicking through individual records, often without clear guidance on what went wrong.
First attempts rarely succeed. Based on community feedback, fewer than 30% of bulk imports succeed on the first try. Each failed attempt requires investigation, correction, and resubmission—a cycle that can consume hours of skilled staff time.
The learning curve is steep. Institutions cannot easily delegate bulk import tasks to student workers, interns, or new staff. The current interface assumes familiarity with the internal workings of the software, limiting who can contribute to repository growth.
The result: staff spend more time troubleshooting imports than building collections.
The Solution
A community initiative is underway to transform the Bulkrax experience through pre-import validation—quickly catching problems before processing begins, not after a time-consuming import attempt.
What changes:
| Current | Planned |
|---|---|
| Errors discovered post-import | Errors caught before import starts |
| Cryptic messages requiring investigation | Clear, actionable feedback with fix suggestions |
| Multiple failed attempts | Higher first-attempt success rate |
| Expert-only workflow | Delegable to trained staff |
The improved system will validate import files immediately upon upload, clearly distinguish between blocking errors and minor warnings, provide specific guidance for resolution, and prevent submission until critical issues are addressed.
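As a rough illustration of the idea, header-level validation with a split between blocking errors and warnings could look like the Python sketch below. It is not Bulkrax code: the column names in `REQUIRED` and `OPTIONAL`, the choice to treat unrecognized columns as warnings, and the `validate_headers` function are all assumptions made for this example.

```python
import csv
import difflib
import io

# Hypothetical column names, for illustration only. A real importer would
# read these from its own field-mapping configuration.
REQUIRED = {"source_identifier", "title", "model"}
OPTIONAL = {"creator", "description", "parents", "file"}


def validate_headers(csv_text):
    """Inspect only the header row and sort findings into errors and warnings."""
    headers = next(csv.reader(io.StringIO(csv_text)), [])
    seen = {h.strip().lower() for h in headers}
    known = REQUIRED | OPTIONAL
    errors, warnings = [], []

    # Assumption: an unrecognized column is a warning, with a typo hint
    # when a known column name is a close match.
    for name in sorted(seen - known):
        guess = difflib.get_close_matches(name, sorted(known), n=1, cutoff=0.7)
        hint = f" Did you mean '{guess[0]}'?" if guess else ""
        warnings.append(f"Unrecognized column '{name}'.{hint}")

    # Assumption: a missing required column is a blocking error.
    for name in sorted(REQUIRED - seen):
        errors.append(f"Missing required column '{name}'.")

    # Submission is allowed only when there are no blocking errors.
    return {"errors": errors, "warnings": warnings, "can_submit": not errors}


report = validate_headers("source_identifier,titel,model,creator\n1,A,Work,X\n")
for message in report["errors"] + report["warnings"]:
    print(message)
```

In this example the misspelled `titel` header produces both a blocking error (the required `title` column is missing) and a warning that suggests the likely fix, so the file would be held back before any import job starts.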
Projected Outcomes
| Metric | Current State | Target |
|---|---|---|
| First-attempt success rate | ~30% | >70% |
| Time to diagnose errors | 15–30 minutes | <2 minutes |
| Support requests for import failures | Frequent | Reduced by 50% |
Why This Matters
For Staff: Less time troubleshooting means more time for meaningful repository work—metadata quality, collection development, user engagement.
For Institutions: Faster content ingestion accelerates time-to-value for digitization projects, grant deliverables, and strategic initiatives.
For Sustainability: A more approachable interface expands the pool of staff or students who can contribute, reducing single points of failure and key-person dependencies.
Phases to Complete this Work
Phase 1 of this work is Header-Level Validation, catching structural CSV problems before import (a stretch goal of making progress on Phase 2 is also included in the initial plan). The Hyku Sustainability Grant has funded half of Phase 1, and Notch8 has agreed to match those funds.
Phases 2 and 3 require funding from other institutions to unlock deeper validation capabilities and an improved user experience:
| Bulkrax Work Phase | Capability | Funding Required |
|---|---|---|
| Phase 1 | Header validation, missing columns, typo suggestions | $50,000 (✓ Funded) |
| Phase 2 | Row-level validation, duplicate detection, parent-child checks | $27,000 ($14,000 funded, $13,000 pledge waiting on final approval) |
| Phase 3 | Value-level validation, controlled vocabulary checks, date formats | $31,000 (requires community funding; targeting second half of 2026) |
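To make the Phase 2 capabilities more concrete, here is a minimal Python sketch of row-level checks for duplicate identifiers and parent-child references. It is an illustration under stated assumptions rather than the planned implementation: the `source_identifier` and `parents` column names, the `|` separator for multiple parents, and the rule that a parent must appear in the same file are all assumed for this example.

```python
import csv
import io
from collections import Counter


def validate_rows(csv_text):
    """Report duplicate identifiers and parent references with no matching row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    counts = Counter(row["source_identifier"] for row in rows)
    ids = set(counts)
    problems = []

    # Duplicate detection: every identifier should appear exactly once.
    for identifier, n in counts.items():
        if n > 1:
            problems.append(f"Identifier '{identifier}' appears {n} times.")

    # Parent-child check. Line 1 is the header, so data rows start at line 2.
    # Assumption: parents are '|'-separated and must be rows in this file.
    for line_no, row in enumerate(rows, start=2):
        for parent in filter(None, (row.get("parents") or "").split("|")):
            if parent not in ids:
                problems.append(
                    f"Line {line_no}: parent '{parent}' not found in this file."
                )

    return problems


sample = (
    "source_identifier,title,parents\n"
    "c1,Collection,\n"
    "w1,Work one,c1\n"
    "w1,Work one again,c1\n"
    "w2,Work two,c9\n"
)
problems = validate_rows(sample)
for message in problems:
    print(message)
```

A production check would likely also need to accept parents that already exist in the repository, which this sketch deliberately leaves out.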
Why is This Work a Community Co-Investment?
The proposed Bulkrax importer usability improvements are substantial and will greatly benefit both the Hyrax and Hyku user communities. The Notch8 development team has been contracted for the Phase 1 portion of this work, and has the expertise to complete Phases 2 and 3 in subsequent sprints. The team worked with the Hyku community to describe the features Bulkrax needs to be delightful to use, and invested time and effort in outlining the technical work needed to reach that state.
All contributing institutions will have direct input on feature prioritization and early access to development builds for testing.
Timeline & Budget
Target release: Q1 2026
Phase 1: $50,000 (funded by IMLS grant funds, matched by Notch8)
Phase 2: $27,000 ($14,000 funded, $13,000 remains to be funded)
Phase 3: $31,000
If 10 institutions each pledge $5,800 toward this effort ($58,000 in total, the combined cost of Phases 2 and 3), all three phases of work will be fully funded. Some institutions may be able to contribute more to help cover those that can’t reach the full contribution goal, so pledges of any amount are encouraged.
Pledged Contributions
Hyku Sustainability Grant - $25,000 for Phase 1
Notch8 - $25,000 completing Phase 1
Amigos Library Services - $9,000 for Phase 2
University of North Carolina, Chapel Hill - $5,000 for Phase 2
By distributing the cost across our community, we can move forward quickly without burdening any single institution. Your contribution, regardless of size, helps maintain the collaborative spirit that makes the Samvera Community special.
If we're not able to reach the target budget, this work will still need to be completed, but it will happen more slowly and without the momentum of dedicated sprints.
How to Pledge
Email heather@samvera.org with your pledge amount and when you would like to be invoiced.