Indexing EAD in ArcLight

Darren Hardy edited this page May 10, 2017 · 26 revisions

Now that you have your ArcLight application up and running, we need to index data into it.

Download sample EAD

First, we need to download or otherwise obtain our EADs. Let's create a directory within our application to store them.

$ mkdir eads

Now let's add some data there.

# This command will save one of our test datasets to the directory you just created
$ wget -P eads/ https://raw.githubusercontent.com/sul-dlss/arclight/master/spec/fixtures/ead/nlm/alphaomegaalpha.xml
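A failed download can leave an empty file or an HTML error page in place of the EAD. The following is a minimal sanity check, not part of ArcLight: the function name `ead_looks_valid` is hypothetical, and it assumes the root element is written as `<ead ...>` without a namespace prefix. It does not validate the XML.

```shell
# Hypothetical helper (not part of ArcLight): succeed only if the file is
# non-empty and contains an opening <ead> tag somewhere.
ead_looks_valid() {
  ead_file="$1"
  # -s is true only for an existing file with a size greater than zero
  [ -s "$ead_file" ] && grep -Eq '<ead([[:space:]>]|$)' "$ead_file"
}
```

For example, `ead_looks_valid eads/alphaomegaalpha.xml && echo "looks like an EAD"` prints the message only when the check passes.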

Repository configuration

Next, we need to run the indexing task and tell it which "Repository" the EAD file belongs to. By default, your generated ArcLight application should include a config/repositories.yml file, which describes the repositories for your instance. For example, we want to link the EAD alphaomegaalpha.xml to the first repository in that file, nlm:

nlm:
  name: 'National Library of Medicine. History of Medicine Division'
  description: 'NLM’s History of Medicine Division collects, preserves, makes available, and interprets for diverse audiences one of the world’s richest collections of historical material related to human health and disease.'
  building: 'Building 38, Room 1E-21'
  address1: '8600 Rockville Pike'
  address2: ''
  city: 'Bethesda'
  state: 'MD'
  zip: '20894'
  country: 'USA'
  phone: ''
  contact_info: '[email protected]'
  thumbnail_url: "https://collections.nlm.nih.gov/pageturnerserver/ajaxp?theurl=http://localhost:8080/fedora/get/nlm:nlmuid-101421040-img/THUMB"

We recommend that your config/repositories.yml contain only the repositories for which you have EADs to index.
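To see which repositories are currently configured, you can list the top-level keys of the file. This is a rough sketch that assumes each repository key starts at the beginning of a line and uses only letters, digits, underscores, and hyphens, as `nlm` does above. The function name `list_repository_keys` is hypothetical, and the text match is a stand-in for real YAML parsing.

```shell
# Hypothetical helper: print one top-level key per line from the
# repositories file (defaults to config/repositories.yml).
list_repository_keys() {
  config_file="${1:-config/repositories.yml}"
  # Indented lines (the fields of each repository) and comments do not match
  sed -n 's/^\([A-Za-z0-9_-][A-Za-z0-9_-]*\):.*$/\1/p' "$config_file"
}
```

Running `list_repository_keys` from the application root prints the keys so you can compare them against the EADs you actually have.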

Indexing a single file

We can now use the arclight:index task in ArcLight to index our EAD.

$ FILE=./eads/alphaomegaalpha.xml REPOSITORY_ID=nlm bundle exec rake arclight:index
Loading ./eads/alphaomegaalpha.xml into index...
Indexed ./eads/alphaomegaalpha.xml (in 0.837 secs).

Adding more finding aids and repositories

You can add new repositories to the config/repositories.yml file. The key that begins a repository is the same value you will use as the REPOSITORY_ID in the indexing rake task.
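Because the REPOSITORY_ID must match a key in the file, it can help to check for the key before indexing. The sketch below is an assumption-laden convenience, not an ArcLight feature: `repository_configured` is a hypothetical name, and it matches text instead of parsing YAML, so it only suits simple keys without regular-expression metacharacters.

```shell
# Hypothetical helper: succeed only if "<repo_id>:" begins an unindented
# line of the repositories file (defaults to config/repositories.yml).
repository_configured() {
  repo_id="$1"
  config_file="${2:-config/repositories.yml}"
  grep -q "^${repo_id}:" "$config_file"
}
```

For example, `repository_configured nlm || echo "nlm is missing from config/repositories.yml"` reports a missing key before you run the rake task.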

We recommend that you organize EADs by repository, placing each repository's files in a directory named after the repository's key. Then run the arclight:index_dir rake task with the DIR and REPOSITORY_ID environment variables to index every file in that directory into the same repository:

# this assumes there's a directory with EAD files called /tmp/sul-spec, and a repository configured with the ID "spec"
$ DIR=/tmp/sul-spec REPOSITORY_ID=spec bundle exec rake arclight:index_dir
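If you follow the one-directory-per-repository layout, the same task can be run once per directory. This is a sketch under stated assumptions: the layout `eads/<repository key>/` and the function name `index_all_repositories` are hypothetical, and the `RAKE` variable is only an illustrative hook for substituting the command (for instance with `echo` for a dry run). It uses only the DIR and REPOSITORY_ID variables described above.

```shell
# Hypothetical helper: run arclight:index_dir once for every subdirectory
# of the base directory, using the subdirectory name as the REPOSITORY_ID.
index_all_repositories() {
  base_dir="${1:-eads}"
  for dir in "$base_dir"/*/; do
    [ -d "$dir" ] || continue          # skip when the glob matches nothing
    dir="${dir%/}"                     # drop the trailing slash
    repo_id="$(basename "$dir")"
    echo "Indexing $dir into repository $repo_id"
    # RAKE is an assumed override; by default this runs "bundle exec rake"
    DIR="$dir" REPOSITORY_ID="$repo_id" ${RAKE:-bundle exec rake} arclight:index_dir || return 1
  done
}
```

Calling `index_all_repositories eads` from the application root stops at the first directory whose indexing fails, so a missing repository key is noticed early.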