datarootsio/vesuvius-ink-detection

The Vesuvius Ink Detection Challenge

This repository contains scripts, notebooks and code for the Vesuvius ink detection challenge.

The Ink Detection Challenge is a subproblem of the larger Vesuvius Challenge, which offers a $700k prize to the first team able to read, from 3D X-ray scans, an unopened Herculaneum scroll that was carbonised during the eruption of Mount Vesuvius.

The Ink Detection Challenge focuses on detecting ink in 3D X-ray scans. Training data comes from 3 broken-off scroll fragments that were opened physically. As ground truth, labels were annotated manually for each pixel. To let a broad range of people work and collaborate on this, a Kaggle competition was launched.

More information is available on the notion page.
Additionally, HPC code for this challenge can be found here.

Structure

images: Different images used in the README.md.

notebooks: This folder contains the different notebooks we created for the Ink detection challenge. They are designed to be simple and comprehensible to get used to the ink detection problem and the solution we propose. Feel free to fork the repo and modify some notebooks if you want to try your own solution!

kaggle-github connection: This folder contains scripts and information to push your notebooks directly from Kaggle into this GitHub repository.

Notebooks

A variety of notebooks is provided in the notebooks folder. All of them are adapted from Kaggle notebooks (links to the Kaggle notebooks are provided as well). The easiest way to run them is on Kaggle. If you want to run them locally or on another remote resource, you first have to set up a compatible environment and download the datasets they use. Kaggle docker images are available here, but they are very bulky (~45GB uncompressed) and are not compatible with Apple silicon, so creating your own Python venv is probably the better option, especially for a local setup.

Downloading datasets

  1. Install the kaggle API
pip install kaggle
  2. Go to your kaggle account page and click the "Create New Token" button
  3. Move the downloaded "kaggle.json" to ~/.kaggle/kaggle.json
  4. Download the dataset
# example for competition dataset
kaggle competitions download -c vesuvius-challenge-ink-detection

# example for other datasets
kaggle datasets download -d thenoodleninja/vesuvius-flattened
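The kaggle CLI saves each download as a zip archive in the current directory. The sketch below is one way to check for the API token and unpack an archive from Python. The helper names `kaggle_token_path` and `extract_archive` are hypothetical, and the archive name in the usage comment is an assumption based on the competition slug; check what the CLI actually wrote to disk.

```python
import zipfile
from pathlib import Path


def kaggle_token_path(home=None):
    """Return the location where the kaggle CLI looks for the API token."""
    base = Path(home) if home is not None else Path.home()
    return base / ".kaggle" / "kaggle.json"


def extract_archive(zip_path, out_dir):
    """Unpack a downloaded archive into out_dir and return the member names."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
        return sorted(zf.namelist())


if __name__ == "__main__":
    token = kaggle_token_path()
    if not token.exists():
        print(f"No API token found at {token}")
    # Assumed archive name; adjust to the file the CLI actually produced.
    archive = Path("vesuvius-challenge-ink-detection.zip")
    if archive.exists():
        members = extract_archive(archive, "data/vesuvius-challenge-ink-detection")
        print(f"Extracted {len(members)} files")
```

Keeping extraction in Python makes it easy to reuse the same path in a notebook running outside Kaggle.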

EDA notebook (kaggle notebook)

This notebook performs basic EDA on the vesuvius-challenge-ink-detection dataset. This includes basic visualizations and some statistical analysis.

Fragment flattening (kaggle notebook)

This notebook attempts to flatten the papyrus fragments from the vesuvius-challenge-ink-detection dataset. It compares several classical CV techniques and provides code to flatten the fragments based on a height map generated with the Sobel filter.
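To illustrate the idea of flattening from a Sobel height map, here is a minimal sketch and not the notebook's actual code. It assumes the scan is a NumPy array shaped (depth, height, width), that the papyrus surface is the depth index with the strongest Sobel response along the depth axis, and that a fixed number of slices below that surface is kept. The function names and the `depth` parameter are hypothetical.

```python
import numpy as np


def sobel_depth(volume):
    """Sobel response along axis 0 (depth) of a (z, y, x) volume."""
    v = np.pad(np.asarray(volume, dtype=np.float32), 1, mode="edge")
    # Central difference along depth: v[z + 1] - v[z - 1].
    d = v[2:, :, :] - v[:-2, :, :]
    # Smooth with the [1, 2, 1] kernel along y, then along x.
    d = d[:, :-2, :] + 2 * d[:, 1:-1, :] + d[:, 2:, :]
    d = d[:, :, :-2] + 2 * d[:, :, 1:-1] + d[:, :, 2:]
    return d


def flatten_fragment(volume, depth=4):
    """Return (height_map, flattened volume) with `depth` slices per pixel."""
    volume = np.asarray(volume, dtype=np.float32)
    # Height map: depth index of the strongest edge for every (y, x) pixel.
    height = sobel_depth(volume).argmax(axis=0)
    # For each pixel, gather `depth` slices starting at its surface height.
    offsets = np.arange(depth)[:, None, None]
    idx = np.clip(height[None, :, :] + offsets, 0, volume.shape[0] - 1)
    flat = np.take_along_axis(volume, idx, axis=0)
    return height, flat
```

After this shift, every pixel's surface sits at the same slice index of `flat`, so a model sees the papyrus at a consistent depth regardless of how the fragment is warped.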

Resnet18d training (kaggle notebook)

The notebook used to train the 3D U-Net. Basic configuration of the model and training is done in the ModelConfig class.

Resnet18d inference (kaggle notebook)

This notebook is used to perform inference on any data in the /kaggle/input/vesuvius-challenge-ink-detection/test folder and generate a run-length-encoded prediction. This can be used to submit a pretrained model.
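For readers unfamiliar with run-length encoding, the sketch below shows the commonly used Kaggle convention: a binary mask is flattened row by row and written as space-separated pairs of 1-indexed start position and run length. This is an assumption about the expected format, not code taken from the notebook, so check the competition's evaluation page before submitting. The function name `rle_encode` is hypothetical.

```python
import numpy as np


def rle_encode(mask):
    """Encode a binary mask as '<start> <length> ...' with 1-indexed starts."""
    pixels = np.asarray(mask, dtype=np.uint8).flatten()
    # Pad with zeros so runs touching either end of the mask are closed.
    padded = np.concatenate([[0], pixels, [0]])
    # 1-indexed positions where the value switches between 0 and 1.
    changes = np.where(padded[1:] != padded[:-1])[0] + 1
    starts = changes[0::2]
    lengths = changes[1::2] - starts
    return " ".join(f"{s} {n}" for s, n in zip(starts, lengths))


if __name__ == "__main__":
    example = np.array([[0, 1, 1],
                        [0, 0, 1]])
    print(rle_encode(example))  # prints "2 2 6 1"
```

Model outputs are usually probabilities, so they need to be thresholded into a binary mask before encoding.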
