DARE Platform Architecture

Platform Architecture

DARE Components

dispel4py

dispel4py is a Python library for describing abstract stream-based workflows for distributed data-intensive applications. It enables users to focus on their scientific methods, avoiding distracting details and retaining flexibility over the computing infrastructure they use.

It provides mappings to diverse computing infrastructures, including cloud technologies, HPC architectures and specialised data-intensive machines, so that workflows can move seamlessly into production with large-scale data loads.
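The separation between an abstract workflow and the mapping that enacts it can be illustrated with a minimal sketch in plain Python. This is not the dispel4py API: the `PE`, `Graph` and `run_sequential` names are hypothetical and exist only to show the idea that the graph describes what is connected to what, while a separate function decides how it is run.

```python
from collections import defaultdict, deque


class PE:
    """Hypothetical processing element: takes one item, returns a list of outputs."""

    def __init__(self, name, func):
        self.name = name
        self.func = func

    def process(self, item):
        return self.func(item)


class Graph:
    """Abstract workflow: records only which PE feeds which."""

    def __init__(self):
        self.edges = defaultdict(list)

    def connect(self, src, dst):
        self.edges[src].append(dst)


def run_sequential(graph, source, inputs):
    """One possible mapping: enact the whole graph in a single process."""
    results = defaultdict(list)
    queue = deque((source, item) for item in inputs)
    while queue:
        pe, item = queue.popleft()
        for out in pe.process(item):
            successors = graph.edges.get(pe, [])
            if not successors:
                # Terminal PE: collect its output as a workflow result.
                results[pe.name].append(out)
            for nxt in successors:
                queue.append((nxt, out))
    return dict(results)


# Describe the workflow once, independently of how it will be run.
double = PE("double", lambda x: [x * 2])
keep_big = PE("keep_big", lambda x: [x] if x > 4 else [])

graph = Graph()
graph.connect(double, keep_big)

output = run_sequential(graph, double, [1, 2, 3, 4])
print(output)  # {'keep_big': [6, 8]}
```

In this sketch, a different mapping (for example one that distributes PEs over several processes) could replace `run_sequential` without any change to the graph definition.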

s-ProvFlow

s-ProvFlow implements the P4 aspects of the DARE platform. It is a provenance framework for storage and access of data-intensive streaming lineage. It offers a web API and a range of dedicated visualisation tools based on the underlying provenance model, S-PROV, which utilises and extends the PROV and ProvONE models.

S-PROV describes how the logical representation of a workflow maps to its concrete implementation, up to its enactment on a target computational resource. The model captures the distribution of the computation and runtime changes, and it supports flexible metadata management and discovery for the data products generated by executing a data-intensive workflow.

Complete documentation for the component can be found in the relevant repository.
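The kind of lineage such a model records can be illustrated with a small sketch. It is not the S-PROV schema or the s-ProvFlow API: `LineageStore` and its fields are hypothetical, and only the two relations it stores (which component generated a data item, and which items it was derived from) are borrowed from the general PROV vocabulary.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LineageStore:
    """Hypothetical in-memory lineage store, for illustration only."""

    generated_by: dict = field(default_factory=dict)  # data id -> component name
    derived_from: dict = field(default_factory=dict)  # data id -> list of input ids
    metadata: dict = field(default_factory=dict)      # data id -> free-form metadata

    def record(self, output_id, component, input_ids=(), **meta):
        self.generated_by[output_id] = component
        self.derived_from[output_id] = list(input_ids)
        self.metadata[output_id] = meta

    def trace(self, data_id):
        """Return every ancestor of data_id, breadth-first (closest first)."""
        seen, order = set(), []
        queue = deque([data_id])
        while queue:
            current = queue.popleft()
            for parent in self.derived_from.get(current, []):
                if parent not in seen:
                    seen.add(parent)
                    order.append(parent)
                    queue.append(parent)
        return order


store = LineageStore()
store.record("raw", "reader", station="ST01")
store.record("filtered", "filter", ["raw"], band="0.1-1Hz")
store.record("spectrum", "fft", ["filtered"])

ancestors = store.trace("spectrum")
print(ancestors)  # ['filtered', 'raw']
```

Attaching free-form metadata to each data item, as `record` does here, is one simple way to support later discovery of data products by their properties.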

Processing Elements registry

The dispel4py Registry is a RESTful Web service providing functionality for registering workflow entities, such as processing elements (PEs), functions and literals, while encouraging sharing and collaboration via groups and workspaces.
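As a rough illustration of what registering a PE with a RESTful service might look like, the sketch below only builds an HTTP request and does not send it. Everything specific in it is an assumption: the base URL, the `/pes/` path, the payload fields and the token header are hypothetical and are not taken from the Registry's documentation, which should be consulted for the real endpoints.

```python
import json
from urllib.request import Request

# Hypothetical base URL; the real service address will differ.
REGISTRY_URL = "https://registry.example.org"


def build_register_pe_request(workspace, name, code, token):
    """Build (but do not send) a POST request registering a PE.

    The path and payload field names are illustrative assumptions.
    """
    payload = {"workspace": workspace, "name": name, "code": code}
    return Request(
        url=f"{REGISTRY_URL}/pes/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {token}",
        },
        method="POST",
    )


request = build_register_pe_request(
    workspace="demo-workspace",
    name="Double",
    code="class Double: ...",
    token="dummy-token",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is a placeholder.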

DARE Execution API

The DARE Execution API enables the distributed and scalable execution of numerical simulations (currently using the SPECFEM3D code) and of dispel4py workflows (used, for example, to describe the steps of RA other than the simulations). It can be extended to other execution contexts. The Execution API also offers services such as uploading, downloading and referencing data, as well as process monitoring. More information can be found on the relevant wiki pages.

Data catalogue - Semantic Data Discovery

The Data catalogue is part of the DARE Knowledge Base and manages information related to the data elements processed via the platform. It exposes a RESTful API for registering new data sources and for retrieving information on previously registered data or on results produced by processes executed over DARE. For more information, visit the respective repository.

Testing environment

The DARE platform provides a “playground”, a testing environment for research developers. During the testing phase of workflow development, users can use the playground module of the DARE platform to simulate a dispel4py workflow execution as well as a local dispel4py run. Additional information and an API description are available in the corresponding repository.