Data Flow - Introduction

1. Description

The Data Flow module allows users to:

  • Create automated data ingestion streams from various formats
  • Monitor data import by asset and area of interest
  • Link unstructured data to the digital twin: automatic data stream transformation and contextualization into a unified state representation to be used for visualization, system state updates, and AI model input

The workflow consists of four steps:

  • Define the import and contextualization parameters.

Create a template defining the import and contextualization parameters to be applied to the data, such as the data source type or ingestion and transformation parameters.

  • Create a data stream to automate the data import process.

Create a data stream that implements your defined template to connect a data source to the Aether platform and automate the import process.

  • Monitor the ingestion, contextualization, and transformation process at the file level.

View your data streams and their processing status, and monitor the ingestion, contextualization, and transformation process at the file level.

Re-synchronize your data sources to aggregate new data using our API.

  • Visualize, aggregate, or analyze the data on Aether by leveraging Alteia business modules.

Using our business modules, explore your assets and their properties from 2D and 3D views with Aether Insight, or extract the dynamic properties of your assets and drive your operations thanks to the Analysis and Operations modules.

2. Workflow Overview

2.1 Configure the Ingestion Process

Define the import, contextualization, and transformation parameters to be applied to the data, such as the data source type or ingestion and transformation parameters.

Each template relates to a specific use case (for example, LiDAR data ingestion); create one or more data stream templates to cover your business needs.

Push your template to Aether using the Alteia CLI; the template will then be available at the data stream creation step to be used as data fusion requirements.
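As an illustration of what such a template might contain, here is a minimal sketch written as a Python dictionary. The key names (source, import, contextualization, transform) are assumptions chosen for readability, not the exact Alteia template schema; refer to the Alteia CLI documentation for the authoritative format.

  # Illustrative sketch only: key names are assumptions, not the exact
  # Alteia template schema.
  lidar_template = {
      "name": "lidar-ingestion",
      "source": {"type": "object-storage"},  # Amazon S3 or Azure Blob Storage
      "import": {
          "dataset_type": "pcl",             # point cloud tiles (.las/.laz)
          "crs": "EPSG:4326",                # CRS of the incoming data
      },
      "contextualization": {
          "type": "geographic",              # link tiles to assets by location
          "asset_schemas": ["feeder"],
      },
      "transform": {
          "analytic": "pcl-classification",  # applied after import
      },
  }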

2.2 Create a Data Stream

Create a new data stream on Aether: select the template to be applied to the batch of data, name your data stream, link it to a company and a project, and add a data source.

Linking a company and a project to a data stream allows you to manage user permissions and access to your data; the data is only available to users who have access to the project.

The module supports the ingestion of any type of file, from raster and mesh formats to any other file type.

Use the Data Flow module to convert any type of data into a format supported by Aether, allowing users to leverage the Aether platform's tooling on previously unsupported file formats.

See Section 4 for the list of supported file types and data sources.
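The sketch below shows what creating such a data stream could look like through an HTTP API. The endpoint path, payload fields, and authentication scheme are hypothetical placeholders for illustration, not the documented Aether API; in practice, use the platform UI or the Alteia tooling.

  import requests

  # Hypothetical request: endpoint, payload fields, and auth are
  # illustrative assumptions, not the documented Aether API.
  payload = {
      "name": "grid-lidar-stream",
      "template": "lidar-ingestion",   # template defined in step 2.1
      "company": "<company-id>",
      "project": "<project-id>",       # scopes access to project members
      "source": {"type": "s3", "bucket": "grid-lidar-tiles", "prefix": "deliveries/"},
  }
  resp = requests.post(
      "https://aether.example.com/dataflow/datastreams",
      json=payload,
      headers={"Authorization": "Bearer <token>"},
      timeout=30,
  )
  resp.raise_for_status()
  stream_id = resp.json()["id"]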

2.3 Monitor the Data Stream Process

Easily monitor the data stream process: view your data streams and their processing status on the dashboard, and open each data stream's detailed view.

Monitor the ingestion, contextualization, and transformation process at the file level, re-synchronize the data sources after adding new data, and aggregate the newly imported data using Alteia's APIs.

Close a data stream to end the connection between the data source and the platform; closing also triggers the resynchronization and aggregation of any data that has not yet been imported.
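For unattended pipelines, the file-level status can be polled programmatically. The loop below is a sketch against a hypothetical endpoint and status vocabulary; it is not the documented Aether API.

  import time
  import requests

  BASE = "https://aether.example.com/dataflow"   # hypothetical endpoint
  HEADERS = {"Authorization": "Bearer <token>"}

  def wait_for_stream(stream_id: str, poll_seconds: int = 60) -> None:
      """Poll a data stream until every file reaches a terminal status."""
      while True:
          # Hypothetical route and status values, for illustration only.
          files = requests.get(
              f"{BASE}/datastreams/{stream_id}/files",
              headers=HEADERS, timeout=30,
          ).json()
          pending = [f for f in files if f["status"] not in ("done", "failed")]
          print(f"{len(files) - len(pending)}/{len(files)} files processed")
          if not pending:
              return
          time.sleep(poll_seconds)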

The imported data is then linked to the digital twin of the assets for visualization, system state updates, and AI model input.

2.4 Visualize, Aggregate, or Analyze the Data on Aether

Visualize the data, compare datasets, and run different types of analysis such as photogrammetry or change maps.

Use 2D/3D measurement tools to calculate distance, height, area, or volume.

Explore the assets and their properties from 2D and 3D views with the Insight module, or extract the dynamic properties of the assets and drive your operations with the Analysis and Operations modules.
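As a back-of-the-envelope analogue of the 2D measurement tools, the snippet below computes the area and perimeter of a polygon with the shapely library, assuming coordinates in a projected CRS with metre units; on the platform itself these measurements are made interactively.

  from shapely.geometry import Polygon

  # A 40 m x 25 m footprint drawn in a projected CRS (units: metres).
  footprint = Polygon([(0, 0), (40, 0), (40, 25), (0, 25)])
  print(f"area: {footprint.area:.1f} m^2")       # area: 1000.0 m^2
  print(f"perimeter: {footprint.length:.1f} m")  # perimeter: 130.0 m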

3. Scope

Today, the Data Flow Module handles the following use cases: 

  • Vegetation encroachment analysis:
    • Import of LiDAR (.las) data from object storage (Amazon or Azure) or satellite images from SpatioTemporal Asset Catalogs
    • Contextualization: linking of the LiDAR data (point cloud) with the feeder ID (a sketch of this linking follows the list)
    • Transformation: point cloud classification
    • After processing, the data stream can be used within the Analysis module (Analysis - Introduction) and the Asset Viewer module (Insight - Introduction)
  • Bulk import of data to the Your Sites module:
    • Import of raster (e.g. satellite tiles), vector, mesh, and point cloud files into a specific site of the Data Studio module from object storage (Amazon or Azure)
    • No automatic contextualization or transformation
    • After the data import, the data can be visualized and used as input for analytics within the Data Studio module
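To make the contextualization step concrete, here is a sketch of geographic linking: each LiDAR tile footprint is matched to the feeder assets it intersects, using the shapely library. The input shapes and matching rule are illustrative assumptions; the platform's actual matching logic may differ.

  from shapely.geometry import box, shape

  def contextualize(tiles, feeders):
      """Link each tile to the feeders whose geometry intersects its footprint.

      tiles:   [{"id": ..., "bounds": (minx, miny, maxx, maxy)}, ...]
      feeders: [{"feeder_id": ..., "geometry": <GeoJSON dict>}, ...]
      Returns {tile_id: [feeder_id, ...]}.
      """
      links = {}
      for tile in tiles:
          footprint = box(*tile["bounds"])
          links[tile["id"]] = [
              f["feeder_id"]
              for f in feeders
              if shape(f["geometry"]).intersects(footprint)
          ]
      return links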

4. Compatible Inputs and Data Sources

Compatible data sources:

  • Amazon S3 object storage
  • Azure Blob Storage

Compatible file types:

TYPE          FORMATS
Vector        kml, json, geojson, topojson, zipped shapefile, dxf (single-layer)
Raster        tiff, tif, jp2
Point Cloud   las, laz
Mesh          obj (+ mtl), glb
File          all
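A minimal sketch of how an ingestion pipeline might route files by extension into the dataset types above; the mapping mirrors the table, and the catch-all File type accepts everything else.

  from pathlib import Path

  # Extension-to-type routing mirroring the table above.
  EXTENSION_TYPES = {
      ".kml": "vector", ".json": "vector", ".geojson": "vector",
      ".topojson": "vector", ".zip": "vector", ".dxf": "vector",
      ".tiff": "raster", ".tif": "raster", ".jp2": "raster",
      ".las": "pcl", ".laz": "pcl",
      ".obj": "mesh", ".glb": "mesh",
  }

  def dataset_type(filename: str) -> str:
      """Return the dataset type for a file, defaulting to the catch-all 'file'."""
      return EXTENSION_TYPES.get(Path(filename).suffix.lower(), "file")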

5. Use Case

Data Flow for power grid data ingestion at the state scale

5.1 Prerequisites

LiDAR data acquisition has been ordered for the entire network, and the tiles have been delivered to object storage (e.g. Amazon S3 or Azure Blob Storage).

5.2 Results

Reduced time spent on data ingestion, freeing time for high-value tasks: data visualization, data analysis, and operations management.

5.3 Workflow


Step 1 - The Data Ops team creates a template defining the data source type along with the import, contextualization, transformation, and aggregation parameters for the LiDAR tiles.


The template covers four parameter groups (sketched in code after the list):

  • Import - data type, CRS, etc.
  • Contextualization - type of contextualization (e.g. geographic) and asset schemas to be used (e.g. feeder or lateral assets)
  • Transformation - which analytic to apply to the imported data and its parameters, if any (e.g. PCL segmentation)
  • Aggregation - e.g. classified PCL aggregation
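Expressed as a Python dictionary, the template for this use case might look like the sketch below; the keys mirror the four parameter groups above and are assumptions, not the exact Alteia schema.

  # Illustrative only: key names are assumptions, not the exact schema.
  lidar_state_template = {
      "import": {"dataset_type": "pcl", "crs": "EPSG:2236"},  # e.g. NAD83 / Florida East
      "contextualization": {
          "type": "geographic",
          "asset_schemas": ["feeder", "lateral"],
      },
      "transformation": {"analytic": "pcl-segmentation", "parameters": {}},
      "aggregation": {"type": "classified-pcl-aggregation"},
  }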

Step 2 - The Data Ops team creates a data stream linked to the template created for the LiDAR tile ingestion on the platform. Each tile delivered to the object storage by the operators is automatically ingested into the platform according to the defined parameters. The Data Ops team can monitor the ingestion process on the data stream dashboard.

Step 3 - The Florida Grid Network LiDAR tiles are ingested into Aether and are available for analysis. Each tile is contextualized with the assets it refers to, allowing data analysis to be ordered for specific assets.

6. Benefits

First, the module streamlines and accelerates workflows, saving valuable time and reducing the human error associated with manual data entry.

By seamlessly integrating with various data sources and systems, it enhances data accuracy and consistency, thereby fostering informed decision-making. 

Additionally, the module enhances accessibility by providing a user-friendly interface that allows authorized personnel to retrieve, visualize, analyze, and share data effortlessly. 

This not only empowers users across different levels of expertise but also promotes collaboration and transparency within teams. 

Furthermore, the automation of data upload and accessibility ensures real-time updates, enabling organizations to respond promptly to dynamic situations and evolving requirements. 

Ultimately, this innovation optimizes operational efficiency, facilitates data-driven insights, and fortifies an organization's competitive edge in today's data-driven landscape.