HOW WE ARCHIVE

METHODS & TOOLS

Methodology

The Rohingya Genocide Archive strives to employ a methodology and workflow that is transparent, replicable, and collaborative. The purpose of this page is to provide general information about RGA’s approach to collecting and archiving. If you are interested in learning about our workflow in more detail (including step-by-step instructions on how to do it yourself), we welcome you to enroll in our forthcoming e-learning course.

Here is a basic overview of RGA’s collection methodology and tools, including some “archiving tips” aimed at groups who may be considering starting their own archival initiatives.

Selection
Capture / Ingest
Packaging
Cataloging
Storage
Access

Selection

Selection of content for the archive is performed by archivists who have expert knowledge of the Rohingya context. Their selection decisions are based on:

  • The RGA mission.
  • Selection guidelines that specify the collection goals, and the subject matter, temporal scope, geographic scope, and format/types that the RGA will collect.
  • Criteria for assessing and prioritizing sources.

These guidelines and criteria were first developed in 2018 during in-person meetings between Rohingya Vision and WITNESS, and updated in 2023. 

Capture / Ingest

Archivists aim to capture video content from the original source, or from as close to the original source as they can. They also aim to capture content in its native format if possible, or otherwise in the highest quality and most sustainable format available. To download video content along with a JSON info file, archivists use the open-source tools youtube-dl or yt-dlp. For chat apps, they use the platform-provided download feature.
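
As an illustration of capturing a video together with its JSON info file, here is a minimal sketch that builds a yt-dlp command line. It is not RGA's actual workflow: the function name, the output folder, and the choice to name files by video ID are assumptions made for this example. The flags themselves (`--write-info-json`, `--no-playlist`, `-o`) are standard yt-dlp options.

```python
from pathlib import Path


def build_capture_command(url: str, out_dir: str) -> list[str]:
    """Build a yt-dlp command that saves a video plus its JSON info file."""
    # Name files by the platform's video ID so the video and its
    # .info.json sidecar stay paired (a hypothetical naming choice).
    template = str(Path(out_dir) / "%(id)s.%(ext)s")
    return [
        "yt-dlp",
        "--write-info-json",  # also write the metadata as a .info.json file
        "--no-playlist",      # download only the selected video
        "-o", template,       # output filename template
        url,
    ]


# Example with a placeholder URL. To actually download, pass the list to
# subprocess.run(cmd, check=True) on a machine with yt-dlp installed.
cmd = build_capture_command("https://example.org/watch?v=PLACEHOLDER", "captures")
print(" ".join(cmd))
```

Building the command as a list, rather than a single string, avoids shell quoting problems when a URL contains characters such as `&` or `?`.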

When capturing selected content, archivists also capture additional information that strengthens the ability to identify, understand, and establish the authenticity of the selected content. Starting in 2018, archivists collected additional information using platform APIs / API tools, such as Facebook Graph and twurl. Over time, however, platforms started restricting access to and use of these tools. In 2022, OHCHR and the Human Rights Center at the University of California, Berkeley, School of Law published The Berkeley Protocol on Digital Open Source Investigations, which provided guidance for the first time on what investigators should collect when capturing open source information. RGA updated its protocols on collecting additional information in 2023 based on the Berkeley Protocol, with additional consideration given to technical constraints and the security needs of its archivists. These protocols apply to new records only, and will not be retroactively applied to already archived content.

Packaging

To prepare a collected video for storage and preservation, an archivist packages it together with its additional information into a “Bag,” a self-describing container that adheres to the BagIt specification. BagIt is a file packaging standard that facilitates reliable storage and transfer and is widely used in archives. To do this, RGA uses the Digital Archivist’s Resource Tool (DART), an open-source application developed by the APTrust consortium that provides a graphical interface for packaging files into bags. Following a widely adopted specification such as BagIt enables RGA packages to be easily created and validated with community-supported tools. Previously, RGA used a similar open-source BagIt tool called Exactly, which is no longer being actively developed.

The standard RGA bag adheres to the BagIt specification as well as to the RGA BagIt Profile, which describes a few simple descriptive tags that are to be included in each RGA package.
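
To show what a bag looks like on disk, here is a minimal sketch that writes a BagIt-style package using only the Python standard library. RGA creates its bags with DART, not with this code. The function names are hypothetical, and the tags passed in are placeholders, since the contents of the RGA BagIt Profile are not described here. The sketch writes only the bag declaration, a SHA-256 payload manifest, and `bag-info.txt`; it omits optional parts of the specification such as tag manifests, and its check covers checksums only, not full validation.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_bag(bag_dir: Path, payload_files: list[Path], tags: dict[str, str]) -> Path:
    """Copy files into bag_dir/data and write the BagIt tag files."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    manifest_lines = []
    for source in payload_files:
        target = data_dir / source.name
        shutil.copy2(source, target)
        # Manifest paths are relative to the bag's base directory.
        manifest_lines.append(f"{sha256_of(target)}  data/{source.name}")
    # Bag declaration required by the BagIt specification (RFC 8493).
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    # bag-info.txt holds descriptive tags as "Label: value" lines.
    (bag_dir / "bag-info.txt").write_text(
        "".join(f"{label}: {value}\n" for label, value in tags.items()),
        encoding="utf-8",
    )
    return bag_dir


def verify_bag(bag_dir: Path) -> bool:
    """Recompute each payload checksum and compare it to the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        expected, relative_path = line.split(maxsplit=1)
        if sha256_of(bag_dir / relative_path) != expected:
            return False
    return True
```

A call such as `make_bag(Path("bag_0001"), [Path("video.mp4"), Path("video.info.json")], {"Source-Organization": "Example Archive"})` would produce a folder containing `bagit.txt`, `bag-info.txt`, `manifest-sha256.txt`, and a `data` directory holding the two files. Re-running `verify_bag` later is one way to detect whether a stored file has changed.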

Cataloging

Archivists catalog each video in the collection according to a simple data model first developed in 2018 and updated in 2023. The catalog primarily includes provenance information, descriptive information taken from the source, and geolocation information. In 2023, RGA added a small number of additional attributes to facilitate preview, search, and navigation within its new cataloging platform. 
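
For groups designing their own catalog, the three kinds of information named above could be modeled roughly as follows. This is a hypothetical sketch, not the RGA data model: every class and field name here is an assumption chosen for illustration, and the sample values are placeholders.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Provenance:
    """Where and when the item was captured (hypothetical fields)."""
    source_url: str
    platform: str
    captured_on: str  # ISO 8601 date, e.g. "2023-05-01"


@dataclass
class Geolocation:
    """Where the recorded events took place, as far as known."""
    place_name: str
    latitude: Optional[float] = None
    longitude: Optional[float] = None


@dataclass
class CatalogRecord:
    """One catalog entry describing one archived video."""
    record_id: str
    provenance: Provenance
    source_title: str        # descriptive information taken from the source
    source_description: str
    geolocation: Optional[Geolocation] = None


record = CatalogRecord(
    record_id="EXAMPLE-0001",
    provenance=Provenance(
        source_url="https://example.org/watch?v=PLACEHOLDER",
        platform="example-platform",
        captured_on="2023-05-01",
    ),
    source_title="Placeholder title",
    source_description="Placeholder description",
)
# asdict() turns the nested record into plain dictionaries, which is
# convenient for exporting to a spreadsheet row or a JSON file.
row = asdict(record)
```

Keeping information copied from the source separate from information added by the archivist makes it easier to tell later what was asserted by the uploader and what was established during cataloging.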

The catalog was first built in a spreadsheet starting in 2018. In 2023, RGA partnered with HURIDOCS to migrate the catalog to Uwazi, the open-source web-based database application developed by HURIDOCS. The migration to Uwazi improves navigation of the collection and enables RGA to facilitate more collaboration on a secure managed platform going forward.

Storage

The RGA collection was stored offline starting in 2018. In 2023, RGA began the process of migrating the collection to secure cloud storage with online and offline backup.

Access

The RGA collection is currently not publicly accessible. To date, access to catalog records and videos in the collection has been provided on a one-on-one basis to vetted users for specific justice and accountability purposes. To reach someone at RGA, please contact: info@rohingyagenocidearchive.org

Archive Management, Policies, and Planning

RGA was founded as a joint project between Rohingya Vision (RVision) and WITNESS, and is currently managed by RVision with WITNESS support. Its archival policies were developed through a collaborative process between RVision and WITNESS team members. The technical workflows were designed by an archivist on the WITNESS team who provided training and developed training materials for the RGA team.

Our vision is for RGA to be a sustainable Rohingya-led initiative whose policies, plans, and workflows continue to evolve to meet the needs of the Rohingya community and their fight for justice and accountability.