R3 Recycling: Concluding Notes
Creating a protocol for the R3 repository revealed the value of establishing repository goals and inventorying relevant data. It became clear early in the development process that decisions regarding design and policy options could be meaningfully made only with clear aims for the repository in mind. Continually refining R3’s goals throughout this process, and eventually prioritizing them in order of importance, all with an eye towards end-user and depositor needs, allowed us to see when compromising on ideal practices might be of value to our community.
Inventorying extant recycling-related data provided insight into the feasibility of potential curation actions, particularly as examining domain datasets revealed that some of our initial assumptions about the data relevant to our repository, such as those relating to licensing and transformation, were incorrect. Examining real datasets also allowed us to grasp how much relevant data could be excluded from the repository as a result of policy decisions, forcing us to consider carefully which trade-offs were appropriate.
Looking to the future, we anticipate that scaling the R3 pilot project to a regional or national recycling data repository would require significant reassessment of our procedures and policies. The repository protocol was designed around a limited geographic scope, which inherently restricts the population of potential end-users and depositors. Similarly, the R3 protocol reflects a narrow focus on a domain in which only moderate volumes of data are currently produced. Should either of those factors change, whether through deliberate expansion of the repository or unanticipated growth in the amount of relevant recycling data, many aspects of the protocol may need adjustment. Maintaining the repository on a strictly submission basis, for instance, might prove overly burdensome, necessitating a shift to a structured self-deposit process or the introduction of submission or user fees to support additional staff. The current metadata schema may require expansion, not only to adapt the contents of geographically bound controlled vocabularies, such as the one developed for the dct:spatial field, but also to assess how findability would be affected should the repository encompass many more datasets on similar topics. Analysis of how users interact with the repository and, in particular, of the metadata elements they rely upon to locate datasets of interest, may suggest how the R3 schema can be expanded to accommodate more data and more users without compromising the search experience.
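As a rough illustration of the usage analysis described above, the following Python sketch tallies how often each metadata element appears in search sessions. It is only a sketch under stated assumptions: the log structure, the `element_usage` function, and every element name other than dct:spatial are hypothetical and do not describe the R3 schema or any existing R3 tooling.

```python
from collections import Counter

# Hypothetical search-log records. Each record lists the metadata elements
# a user searched or filtered on during one session. Only dct:spatial is
# named in the protocol; the other element names are illustrative assumptions.
search_log = [
    {"session": 1, "elements": ["dct:spatial", "dct:subject"]},
    {"session": 2, "elements": ["dct:spatial"]},
    {"session": 3, "elements": ["dct:subject", "dct:temporal"]},
    {"session": 4, "elements": ["dct:spatial", "dct:spatial"]},
]


def element_usage(log):
    """Return the share of sessions in which each metadata element was used."""
    total = len(log)
    if total == 0:
        return {}
    counts = Counter()
    for record in log:
        # Count each element at most once per session.
        counts.update(set(record["elements"]))
    return {element: n / total for element, n in counts.most_common()}


usage = element_usage(search_log)
for element, share in usage.items():
    print(f"{element}: used in {share:.0%} of sessions")
```

Elements that are rarely used to locate datasets might be candidates for simplification, while heavily used ones, such as a geographic field, might warrant finer-grained vocabularies as the collection grows.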