Carnegie Hall Rose Archives maintains a series of scripts to transform its historical performance history data from its source in a SQL database into the Resource Description Framework (RDF) for publication as linked open data.

CarnegieHall/linked-data

linked-data

OVERVIEW

The purpose of this repository is to share Carnegie Hall's performance history as linked open data, and resources related to its creation and maintenance. For updates in 2019, follow our progress here.

🔴 Explore Carnegie Hall's linked open data here.

CONTENTS

The Carnegie Hall Rose Archives believes in showing its work. To that end, this repository includes:

CARNEGIE HALL PERFORMANCE HISTORY AS LINKED OPEN DATA

About the Data Set

The initial release encompassed performance history data from 1891 through the end of the 2015-16 concert season (July 15, 2016). Since August 2019, the data has been updated weekly and now encompasses performance history data from 1891 to the present.

What Does "Performance History" Mean at CH?

Since it opened in 1891, Carnegie Hall has been a center of cultural and political expression, presenting and providing a venue for many different types of music and culture across multiple performance spaces. Since its transition to a not-for-profit institution in 1960, Carnegie Hall has continued to deepen its commitment to music education and community outreach by presenting concerts and events in neighborhoods throughout New York City, across the United States, and worldwide.

The Carnegie Hall Rose Archives maintains a database, the Orchestra Planning and Administration System (OPAS), with the goal of tracking every event – musical and nonmusical – that has occurred in the public performance spaces of CH since 1891. Since our archives were not established until 1986, there are some gaps in these records, which we continue to fill using sources like digitized newspaper listings and reviews; many missing pieces – concert programs, posters, etc. – are donated to us, or we buy them on eBay. This database now covers more than 55,000 events across nearly all musical genres, as well as theatrical, dance, and spoken word events, meetings, lectures, civic rallies, and political conventions. It also includes corresponding records for more than 110,000 artists, 25,000 composers, and over 100,000 musical works.

In 2013, Carnegie Hall began publishing some of these records to our online Performance History Search. The Performance History Search has records for more than 49,000 events from 1891 to the present. Data cleanup efforts are ongoing, and new records are published each month to that HTML presentation. The Carnegie Hall linked data prototype uses this published data set.

Data Structure

How is the Carnegie Hall (CH) performance history represented as linked open data? Characteristics about CH performance events fall into two categories:

  1. Information that applies to the entire event.
  2. Information that applies to each presentation of a work during an event (a work performance).

The separation of a work performance from the event enables us to provide specificity. Statements link performers to a particular work performance, rather than generically to an entire event. Let's explore the event data structure further:

  1. Each event has its own Uniform Resource Identifier (URI) and includes metadata related to:

    • Date/Time (ISO 8601 date/time string)
    • Venue
    • Title (who performed or what took place)
    • Entities who participate in the entirety of the program, like a conductor and/or an orchestra.
  2. Each component of an event, e.g. each work performed, is a sub-event with its own URI. Work performance metadata includes:

    • Works (musical and non-musical)
    • Performers
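
The two-level structure above can be sketched in a few lines of Python. This is only an illustration of the event / work performance split, not the actual CH data model: the class names, field names, and every `example.org` URI and label below are made-up placeholders. See the schema documentation in this repository for the real namespaces and vocabularies.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

# NOTE: all names, labels, and example.org URIs here are hypothetical
# placeholders, not Carnegie Hall's real URI scheme or vocabulary.


@dataclass
class WorkPerformance:
    """A sub-event: one presentation of a work during an event."""
    uri: str
    work: str
    performers: List[str] = field(default_factory=list)


@dataclass
class Event:
    """An event, carrying metadata that applies to the entire program."""
    uri: str
    start: datetime  # parsed from an ISO 8601 date/time string
    venue: str
    title: str
    # Entities who take part in the whole program, e.g. conductor, orchestra
    participants: List[str] = field(default_factory=list)
    work_performances: List[WorkPerformance] = field(default_factory=list)


def performers_of(event: Event, work: str) -> List[str]:
    """Return performers linked to one work performance, not the whole event."""
    for wp in event.work_performances:
        if wp.work == work:
            return wp.performers
    return []


event = Event(
    uri="http://example.org/events/1",
    start=datetime.fromisoformat("1900-01-01T20:00:00"),
    venue="Example Hall",
    title="Example Orchestra",
    participants=["Example Conductor", "Example Orchestra"],
    work_performances=[
        WorkPerformance("http://example.org/events/1/work_01", "Example Symphony"),
        WorkPerformance(
            "http://example.org/events/1/work_02",
            "Example Concerto",
            ["Example Soloist"],
        ),
    ],
)

print(performers_of(event, "Example Concerto"))
```

Because the soloist is attached to the concerto's work performance rather than to the event, a lookup for the symphony returns no work-specific performers, while the conductor and orchestra stay at the event level.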

Interested in the CH LOD data model, namespaces, URI schemas, vocabularies, and ontologies? Check out CH's in-depth data structure and schema documentation in this repository.

Potential Future Work

Although the CH LOD includes about 4.5 million triples, there is still information missing from or out of scope of this initial release. Below is a sample of excluded content and topics. See how to get involved if you have feedback about the list of information not currently in the dataset.

  • Some past performance records are missing; such data will be added as it becomes available.
  • Complete, accurate biographical data is not always available for performers and composers. To the extent that this information has been provided to Carnegie Hall or is available from published authority sources, it has been added to the dataset. Existing Carnegie Hall URIs will remain stable, but additional or revised statements (e.g. newly acquired birth/death dates, corrected spellings, etc.) may be added at any time.
  • Additional external authority IDs: we plan to add more external authority IDs for entities and creative works.
  • Credited non-performing roles, e.g. choral/ensemble preparation, technical roles, etc., are not included in the initial release.
  • Building LOD at Carnegie Hall - How did the Carnegie Hall Archives get from an internal database to 4.5 million triples containing open data from a dozen ontologies and vocabularies?

GET INVOLVED

Provide Feedback or Report Issues

Want to help Carnegie Hall improve the performance history data? Use the Issues page to share:

  • Feedback - What was useful about the data or the resources in this repository?
  • Recommendations - Have a great idea about the content, structure, or resources described here that you'd like to share with us?
  • Sample Queries - Did you write an interesting SPARQL query others might find useful?
  • Issues or Inaccuracies - Notice something out of place or incorrect?

Build Something & Share It

Did you use the CH LOD to build an interesting visualization, or port it into a new project? We'd love to see it! Submit a link and a description of how you used the data set.

USAGE AND LICENSE

USAGE GUIDELINES

DATA

Carnegie Hall offers the CH Performance History as Linked Open Data dataset as-is and makes no representations or warranties of any kind concerning the contents. Please see the data license statement below.

If you have questions about the dataset or its usage, please submit a new 'Issue' or email archives at carnegiehall dot org.

SCRIPTS

This code is provided “as is” and for you to use at your own risk. The information included in the contents of this repository is not necessarily complete. Carnegie Hall offers the scripts as-is and makes no representations or warranties of any kind.

We plan to update the scripts regularly. We welcome any feedback. Please let us know if you have found the contents of this repository useful!

PUBLIC DEDICATION AND LICENSE

DATA DEDICATION

Carnegie Hall is releasing this performance history dataset with a Creative Commons CC0 1.0 Universal Public Domain Data Dedication.

The Carnegie Hall Performance History dataset includes data from the GeoNames geographical database, which is licensed under a Creative Commons Attribution 3.0 License.

REPOSITORY AND SCRIPTS LICENSE

The MIT License (MIT)

Copyright (c) 2017 Carnegie Hall

All contents are released under the terms described in the MIT License included in this repository.

ACKNOWLEDGEMENTS

Thank you to Matt Miller and Gabe Mangiante for their contributions to this project.

Thank you to the following organizations for inspiration and commitment to the open data community:
