Ben Darwin edited this page Jul 6, 2017 · 20 revisions

Overview

Pydpiper is a set of Python libraries and associated executables intended primarily for running image processing pipelines on compute grids. It provides a domain-specific language (DSL) for constructing pipelines out of smaller components, wrappers for numerous command-line tools (currently largely MINC-centric, but expanding to some NIfTI- and ITK-based tools), code for constructing common pipeline topologies, and command-line wrappers to run some core pipelines.

Conceptual overview

Pydpiper code can be used from within Python or packaged into an application and called from the shell. Roughly speaking, the process has two phases. First, executing Pydpiper code determines the overall topology of a pipeline and the filenames of the input and output files of each step, compiling a graph of "stages" to be scheduled for execution. Second, the Pydpiper server creates "executors" (either remote jobs on a compute grid or subprocesses on a local machine), which get stages (usually shell commands) from the server as those stages' dependencies are satisfied, and run them.
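
To make the two phases concrete, here is a minimal, self-contained sketch of the general idea: build a graph of stages, then hand out each stage only once its dependencies have finished. It does not use Pydpiper's API. The function `run_pipeline`, the stage names, and the placeholder commands are all hypothetical, and a single in-process loop stands in for the server and its executors.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(stages, dependencies, run_stage):
    """Run every stage after all of its dependencies have finished.

    stages:       dict of stage name -> command string (hypothetical format)
    dependencies: dict of stage name -> set of stage names it depends on
    run_stage:    callable(name, command) that executes a single stage
    Returns the stage names in the order they were run.
    """
    for name, deps in dependencies.items():
        unknown = ({name} | set(deps)) - set(stages)
        if unknown:
            raise ValueError(f"unknown stages: {sorted(unknown)}")

    # Phase 1: compile the graph of stages (node -> set of predecessors).
    graph = {name: set(dependencies.get(name, ())) for name in stages}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

    # Phase 2: repeatedly take the stages whose dependencies are satisfied.
    order = []
    while sorter.is_active():
        for name in sorter.get_ready():
            run_stage(name, stages[name])  # a real executor would run this remotely
            order.append(name)
            sorter.done(name)  # may make dependent stages ready
    return order


# Hypothetical three-stage pipeline; the commands are placeholders.
stages = {
    "blur": "echo blur",
    "register": "echo register",
    "resample": "echo resample",
}
dependencies = {"register": {"blur"}, "resample": {"register"}}

executed = []
order = run_pipeline(stages, dependencies, lambda name, cmd: executed.append(cmd))
print(order)
```

In this toy version the stages in each ready batch run one after another; in a real grid setting, each ready stage could instead be given to a separate executor and run concurrently.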

Monitoring an executing pipeline

Running the included check_pipeline_status.py script with a pipeline's <pipeline_name>_uri file as its argument will provide a summary of running and finished stages, the number of running executors, and other information.

An important source of truth is the pipeline.log file created in the pipeline's output directory. You can control the logging l
