Overall Design and Goals

Niassa/SeqWare started off as 3 things: a LIMS, a workflow engine, and a metadata tracking system. We have replaced the LIMS with our Pinery interface over MISO and, previously, GeoSpiza. We have outsourced the workflow engine with the Broad’s Cromwell. Víðarr is our replacement for the metadata tracking.

It schedules workflows using a workflow engine and collects the output from these workflows and stores the metadata about these files and connections to the Pinery LIMS information. It supports using Cromwell as a workflow engine, with the future goal to support other workflow engines. It has to support provisioning out other kinds of data and associate LIMS metadata, with the future goal to provision out logs, QC information, and database fragments (i.e., ETL data). It supports communicating with multiple workflow engines at once so it can schedule on a local HPC and a cloud instance.

Design Summary

The Víðarr analysis tracking system has:

Incrementally fetching analysis records - this allows getting new analysis results available for downstream workflows more quickly.
Support for multiple workflow execution engines - allowing new engines to be swapped in and laying the framework to extend compute capacity.
A simplified workflow packaging and installation process - import workflows via a definition file.
Updated software stack with minimal core dependencies and use of modules to limit dependency complexity - more easily update and maintain project while reducing breaking changes.

Components of Víðarr include:

a web service to track analysis
a command line interface for testing and development
a base workflow engine (Cromwell)

Additional components include:

Tools to generate workflow definitions
Cerberus create file provenance by integrating Víðarr output with Pinery

The goal for Víðarr web service is to:

Take workflow requests.
Check if this request is already completed.
Delegate the workflow to a workflow engine.
Provision the output and record the output in a database.
Provide an interface to this database similar to provide a history of completed workflows.

Shesmu Integration

Shesmu needs file provenance and can use Cerberus as a plugin. Shesmu also needs to run workflows and can use the /api/submit endpoint for that purpose. It needs to know what it can run and uses /api/workflows to get the known workflow and /api/targets to know where it can run them; it generates an action definition for every valid workflow version and target combination. The submit action in Shesmu has commands to re-run failed workflows (Reattempt) and unload-and-rerun (Reprocess). For more details, see the Víðarr plugin for Shesmu.

How a workflow runs

Víðarr takes a submission request that includes:

the workflow name and version
the target name
the parameters to the workflows
the external keys (LIMS keys) associated with the workflow
the output metadata, that indicates what output is attached to which output
any labels required by the workflow
any parameters to the workflow engine
consumable resource values

Once received:

The target and workflow name and version are checked. If not found or the workflow is not supported by the workflow engine, it is rejected.
The parameters, metadata, labels, consumable resource values, and engine parameters are type checked.
References to other Víðarr files used as input are resolved.
External keys are checked.
The identifier is calculated.
The database is checked for the identifier. If it is found and in a failed state and the request asked for reattempting, the workflow is started. If it is found, the identifier is returned to the client. If the identifier is not found, the workflow run is started.

Once a workflow is started (or restarted), it goes through phases:

the workflow waits for any consumable resource to allow it to continue
the workflow does a preflight check where it ensures that configuration in the output metadata is valid
it provisions in any input files
it runs the workflow
it provisions out the results of the workflow
it performs any clean up required

We keep running into issue of running out of temp disk space. Shesmu can throttle, but it’s often too late. The Víðarr server has a number of consumable resources (e.g., scratch disk) and then workflow runs can reserve a chunk of that resource. The reservation decreases the available count of that resource until the workflow is finished running. If there isn’t enough of the resource, it must wait until later. Consumable resources can also be used to block starting workflows in ways similar to Shesmu throttlers. The implementation is intentionally flexible.

vidarr

Analysis provenance tracking server

Overall Design and Goals

Design Summary

Shesmu Integration

How a workflow runs