Architecture Overview

OpenECPDS is a distributed system composed of cooperating services that together acquire, store, and disseminate data. Unlike a conventional data store, OpenECPDS does not necessarily keep data physically in its persistent repository. Instead, it works like a search engine: it crawls and indexes metadata from data providers and optionally caches the content in its Data Store.

High-level design

[Figure: Example of OpenECPDS deployment]

Data can be fed into the Data Store via:

  • The Data Acquisition service, which discovers and fetches data from data providers.
  • Data providers actively pushing data through the Data Portal.
  • Data providers using the OpenECPDS API to register metadata, allowing asynchronous data retrieval.

Data products can be searched by name or by metadata. They are then either pushed to their destinations by the Data Dissemination service or pulled by users from the Data Portal. When a product is requested, OpenECPDS sends it from the Data Store if the content was previously fetched; otherwise it streams the data on the fly.
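The choice between sending cached content and streaming on the fly can be sketched as follows. This is a minimal illustration, not the OpenECPDS implementation: `DataStore`, `serve`, and the `fetch_from_provider` callback are hypothetical names, and the Data Store is modelled as an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, Tuple


@dataclass
class DataStore:
    """Hypothetical in-memory stand-in for the Data Store cache."""
    _content: Dict[str, bytes] = field(default_factory=dict)

    def has(self, product_id: str) -> bool:
        return product_id in self._content

    def put(self, product_id: str, data: bytes) -> None:
        self._content[product_id] = data

    def read(self, product_id: str, chunk_size: int = 4) -> Iterator[bytes]:
        # Yield the cached content in fixed-size chunks.
        data = self._content[product_id]
        for start in range(0, len(data), chunk_size):
            yield data[start:start + chunk_size]


def serve(
    product_id: str,
    store: DataStore,
    fetch_from_provider: Callable[[str], Iterator[bytes]],
) -> Tuple[str, Iterator[bytes]]:
    """Return (source, chunks) for a requested product.

    Cached content is read from the store; anything else is streamed
    straight from the provider (assumed behaviour for this sketch).
    """
    if store.has(product_id):
        return "data-store", store.read(product_id)
    return "streamed", fetch_from_provider(product_id)
```

A caller would consume the returned iterator chunk by chunk, so the requester sees the same interface whether or not the content was cached.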

Core components

Component                   Responsibility
Master Server               Central coordinator: authentication, metadata registration, scheduling, and Data Mover allocation.
Mover Server (Data Mover)   Connects to remote systems via transfer modules, stores and streams file content.
Monitor Server              Web-based monitoring and management interface.
Data Portal                 Passive, incoming access (FTP/HTTPS/S3) for remote sites.
Database                    Persists destinations, hosts, transfers, and history.

See Components for a detailed description of each.

Key cross-cutting mechanisms

[Figure: OpenECPDS data flows]

Modularity & protocols

The OpenECPDS software is modular, supporting new protocols through extensions. It interacts with a variety of environments and supports multiple standard protocols:

  • Outgoing connections (Data Acquisition & Dissemination): FTP, SFTP, FTPS, HTTP/S, Amazon S3, Azure and Google Cloud Storage.
  • Incoming connections (Data Portal): FTP, HTTPS, S3 (SFTP and SCP are available exclusively through a Commercial API).

See Protocols & Connections and the Transfer Modules reference for details.
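One common way to support new protocols through extensions is a registry that maps a URL scheme to a transfer module. The sketch below illustrates that general pattern only; `register`, `module_for`, and the module functions are hypothetical and do not reflect the actual OpenECPDS transfer module interface.

```python
from typing import Callable, Dict
from urllib.parse import urlparse

# A transfer module is modelled here as a function taking a URL.
TransferModule = Callable[[str], str]

# Hypothetical registry: URL scheme -> transfer module.
_MODULES: Dict[str, TransferModule] = {}


def register(scheme: str) -> Callable[[TransferModule], TransferModule]:
    """Decorator that registers a transfer module for a URL scheme."""
    def decorator(func: TransferModule) -> TransferModule:
        _MODULES[scheme.lower()] = func
        return func
    return decorator


@register("ftp")
def ftp_module(url: str) -> str:
    # Placeholder: a real module would open an FTP connection here.
    return f"ftp transfer for {urlparse(url).hostname}"


@register("sftp")
def sftp_module(url: str) -> str:
    # Placeholder: a real module would open an SFTP session here.
    return f"sftp transfer for {urlparse(url).hostname}"


def module_for(url: str) -> TransferModule:
    """Look up the transfer module responsible for the URL's scheme."""
    scheme = urlparse(url).scheme.lower()
    try:
        return _MODULES[scheme]
    except KeyError:
        raise ValueError(
            f"no transfer module registered for scheme {scheme!r}"
        ) from None
```

With this layout, adding a protocol means registering one more function; the lookup code stays unchanged.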

Object storage

OpenECPDS stores data as objects, combining data, metadata, and a globally unique identifier. It employs a file-system-based solution with replication across multiple locations to ensure continuous data availability. The object storage system is hierarchy-free but can emulate directory structures when necessary. See Object Storage.
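To illustrate how a hierarchy-free store can still emulate directories, the sketch below keeps objects under flat names and derives a directory listing from name prefixes. It is an assumption-laden toy: `StoredObject`, `FlatObjectStore`, and `list_dir` are hypothetical names, identifiers are generated with UUIDs purely for illustration, and replication is not modelled.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StoredObject:
    """An object bundles data, metadata, and a unique identifier."""
    object_id: str
    data: bytes
    metadata: Dict[str, str]


class FlatObjectStore:
    """Hypothetical flat store: names are plain strings, not paths."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}  # object_id -> object
        self._names: Dict[str, str] = {}             # name -> object_id

    def put(self, name: str, data: bytes,
            metadata: Optional[Dict[str, str]] = None) -> str:
        # UUIDs stand in for the globally unique identifier.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(
            object_id, data, dict(metadata or {})
        )
        self._names[name] = object_id
        return object_id

    def get(self, object_id: str) -> StoredObject:
        return self._objects[object_id]

    def list_dir(self, prefix: str) -> List[str]:
        """Emulate a directory listing: immediate children of a prefix."""
        if prefix and not prefix.endswith("/"):
            prefix += "/"
        children = set()
        for name in self._names:
            if name.startswith(prefix):
                rest = name[len(prefix):]
                # Keep only the first path segment; a trailing "/"
                # marks an emulated sub-directory.
                head, sep, _ = rest.partition("/")
                children.add(head + sep)
        return sorted(children)
```

The "directories" exist only as a view computed from names, so no hierarchy has to be stored or kept consistent.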