The Platform is a PaaS that lets your organization add synthetic data as an enterprise capability. You get data engineered for your problem sets, as often and in as much volume as you need, to support AI and ML workflows for training, tuning, bias detection, and innovation.

What is the Platform?

It is a Platform as a Service (PaaS) that enables data scientists, data engineers, and developers to create custom pipelines for generating synthetic data, called channels, and to run those pipelines to generate as much data as they want. Most of the data generated is imagery for Computer Vision (CV) AI workflows. The Platform can also generate other types of synthetic data; the output is fully customizable by the channel developer.

Benefits of a PaaS

A PaaS removes the need for a user to maintain their own hardware, infrastructure, and application execution environment to run software systems and custom code. In this case, we provide access to identity, cloud compute, data storage, and an SDK so that you can execute your custom synthetic data pipeline in a cloud-based, high-performance compute environment without managing and maintaining the underlying infrastructure and software.

Why do I need a PaaS for synthetic data?

Many organizations that first discover synthetic data believe they can generate or purchase single datasets for their AI workflows. However, they soon find that when they need more data for additional bias detection, need to train for unexpected rare entities or scenarios, or want to apply AI to a whole new product, they must acquire more one-off datasets. By then they may have lost the domain knowledge or artifacts from their original investigations.

The PaaS enables users to access synthetic data as a service. Users can create, use, and store their domain knowledge and techniques as content and channels in the Platform, then update or branch them as their needs change. If a user later needs a new dataset for an entirely different AI problem set, they can create one. They can also build dataset access into automated workflows and remote systems through the SDK, which connects to the cloud-hosted PaaS.

Major components of the PaaS

The Platform has three main experiences for data scientists, data engineers, and developers:

  • A development environment and samples to create synthetic data pipelines

  • A web-based experience for configuration and job execution

  • An SDK for remote or integrated job execution and automation

Development environment and example code

The Platform provides a Docker-based development environment and example code to help data engineers and data scientists with development experience build custom synthetic data channels of their own.

Web-based experience for configuration and job execution

The main interface for configuring and running jobs is a web-based user interface. It allows data scientists to run the provided sample channels, which emulate satellite RGB imagery, to configure graphs for synthetic data generation in a no-code experience, and to run jobs and manage the output datasets.

An SDK for job execution and automation

The Platform provides an SDK that lets developers remotely create, execute, and access the output of synthetic data channels, enabling batch processing and automated data provisioning to AI workflows.
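As a rough illustration of the batch pattern described above, the sketch below submits one dataset job per graph, polls until each job finishes, and collects the output file names. It does not use the real SDK: `FakeSyntheticDataClient`, its methods (`create_dataset_job`, `job_status`, `download_dataset`), the status strings, and the channel and graph names are all hypothetical stand-ins that run in memory, so the example is self-contained. Only the submit, poll, and download flow is the point.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeSyntheticDataClient:
    """Hypothetical in-memory stand-in for a remote SDK client (not the real SDK)."""

    _jobs: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def create_dataset_job(self, channel: str, graph: str, runs: int) -> int:
        # Register a job and hand back its identifier.
        job_id = next(self._ids)
        self._jobs[job_id] = {"channel": channel, "graph": graph, "runs": runs, "polls": 0}
        return job_id

    def job_status(self, job_id: int) -> str:
        # Pretend every job completes on its second status check.
        job = self._jobs[job_id]
        job["polls"] += 1
        return "success" if job["polls"] >= 2 else "running"

    def download_dataset(self, job_id: int) -> list[str]:
        # Return made-up output file names, one per run.
        job = self._jobs[job_id]
        return [f"{job['channel']}/{job['graph']}/image_{i:04d}.png" for i in range(job["runs"])]


def provision_datasets(client, channel, graphs, runs, max_polls=10):
    """Submit one job per graph, wait for each to finish, and collect its outputs."""
    job_ids = {graph: client.create_dataset_job(channel, graph, runs) for graph in graphs}
    outputs = {}
    for graph, job_id in job_ids.items():
        for _ in range(max_polls):
            if client.job_status(job_id) == "success":
                outputs[graph] = client.download_dataset(job_id)
                break
        else:
            # The polling budget ran out before the job reported success.
            raise TimeoutError(f"job {job_id} for graph {graph!r} did not finish")
    return outputs


if __name__ == "__main__":
    client = FakeSyntheticDataClient()
    result = provision_datasets(client, "example-channel", ["day", "night"], runs=3)
    for graph, files in result.items():
        print(graph, len(files), files[0])
```

In a real integration the fake client would be replaced by the SDK's own client, and the polling loop would normally sleep between status checks. Submitting all jobs before waiting on any of them lets the cloud environment run them concurrently.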

Billing, account information, and members

The same web interface used to configure graphs and run jobs also includes settings where users can set and monitor billing information, upgrade their account, and invite and manage collaborators, who may be members of the organization or guest users with more limited access.