Pipelines Manager

The Pipelines Manager is where you can build and manage pipelines.

Tip

A pipeline sends data through a machine learning model and outputs results for analysis.

Pipelines Manager


Streaming vs. Batch

There are two types of pipeline:

Streaming
  • Receives streaming data from Arundo Edge
  • Runs continuously to output real-time data

Batch
  • Receives data from Arundo Fabric warm storage or a comma-separated values (CSV) file
  • Runs on-demand or at a preset interval to output data for a defined period

These two pipeline types give you the flexibility to output data for analysis whenever you need it.


Getting There

Access Arundo Fabric and select Pipelines on the Navigation bar.


Pipelines List

The Pipelines list displays details for the pipelines currently in use.

Pipelines list

The total number of pipelines currently built displays in the top right corner of the Pipelines list.

Fields

The following information displays for each pipeline:

Field     Description
Name      Name of the pipeline
Creator   User who added the pipeline
Created   Date the pipeline was added
Status    Indicates whether the pipeline is active

Actions

The following actions are available for each pipeline:

Action    Description
Status    Toggle the switch to activate/deactivate the pipeline
Delete    Delete the pipeline

Pipeline Builder

The Pipeline Builder displays when you click a pipeline name in the Pipelines list. Use the Pipeline Builder to build or edit a pipeline.

Example

The following example shows a pipeline with four inputs, one machine learning model, and one output. This pipeline streams sensor data from a pump through a model to calculate pump efficiency.

Pipeline Builder

The color of the connector between an input/output and a model indicates the following:

  • A green connector indicates that data is flowing correctly.
  • A gray connector indicates that data is not flowing correctly and there may be an issue.
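
To make the example concrete, the calculation behind such a model might look like the following sketch. This is illustrative only: the input names, the default density, and the standard hydraulic-efficiency formula are assumptions, not details taken from the pipeline shown above.

  # Illustrative sketch only; input names and formula are hypothetical.
  def pump_efficiency(flow_rate, head, power_input, fluid_density=1000.0):
      """Return pump efficiency as hydraulic power divided by input power.

      Assumes SI units: flow_rate in m^3/s, head in m, power_input in W,
      fluid_density in kg/m^3.
      """
      GRAVITY = 9.81  # m/s^2
      hydraulic_power = fluid_density * GRAVITY * flow_rate * head
      return hydraulic_power / power_input

In a streaming pipeline like the one shown above, each of the four inputs would be supplied by an input tag connected to the model, and the returned value would feed the single output.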

Building Pipelines

The following guides explain how to build pipelines.

Building a Streaming Pipeline

A streaming pipeline receives streaming data from Arundo Edge. Streaming pipelines run continuously to output real-time data.

Prerequisites

  1. Build a model for use in the pipeline
  2. Publish the model in Arundo Composer
  3. Add input tags for the model in Arundo Edge or the Tags Manager

Instructions

  1. Click + Create New Pipeline.
  2. Select Streaming.
  3. Locate the model to use for the pipeline and click + Add.

    The model displays on the diagram, with links indicating required inputs and outputs.

    Model example

  4. Click the links on the model and specify inputs and outputs.

    • You can only use existing tags as inputs.
    • You can use existing tags, new tags, or CSV files as outputs.
  5. Click Rename, enter a name for the pipeline, and click Rename.

  6. Activate the pipeline:

    Activate pipeline

Building a Batch Pipeline

A batch pipeline receives data from Arundo Fabric warm storage or a CSV file. Batch pipelines run on-demand or at a preset interval to output data for a defined period.

Note

Depending on the location of the data sources for a batch pipeline, there are multiple ways to build the model used in the pipeline.

The following examples show how to add batch pipelines in common scenarios.

Example 1

The following example shows how to add a batch pipeline that receives data from Arundo Fabric warm storage and outputs data into a CSV file.

  1. Build a model workspace.
  2. Modify the model's app.py file:

    1. Import the packages the endpoint needs (pandas and StringIO are used by the function defined below):

      import csv
      import pandas as pd
      from io import StringIO

    2. Include the following @argument:

      @argument("tag_csv_file", type=fields.File)

    3. Include an @option decorator for each input tag:

      @option("input_1", type=float)

    4. Include the following @returns decorator, where "XXX" is the name of the CSV file:

      @returns("file_XXX", type=fields.File)

    5. Define a function like the following, where "XXX" is the name of the CSV file. (A standalone sketch for testing this logic locally appears after these instructions.)

      def TagCsvIn_FileOut(tag_csv_file, input_1=1, input_2=2):
          """
          This endpoint takes data from warm storage and returns summary
          information about the tags specified as inputs.
          """
          # Read the warm-storage data and summarize the Value column.
          csv_data = pd.read_csv(tag_csv_file)
          output_mean = float(csv_data["Value"].mean())
          output_min = float(csv_data["Value"].min())
          output_max = float(csv_data["Value"].max())
          # Write the summary rows to an in-memory CSV file and return it.
          mylist = [['min', output_min], ['mean', output_mean], ['max', output_max]]
          f = StringIO()
          csv.writer(f).writerows(mylist)
          file_XXX = f.getvalue()
          return file_XXX
      
  3. Publish the model in Arundo Composer.

  4. Access Arundo Fabric.
  5. Upload the CSV file to use as the input for the pipeline.
  6. Open the Pipelines Manager.
  7. Click + Create New Pipeline.
  8. Select Batch.
  9. Locate the model to use for the pipeline and click + Add.

    The model displays on the diagram.

  10. Select input tags for the model.

  11. Select the output CSV file for the model.
  12. Click Rename, enter a name for the pipeline, and click Rename.
  13. Select one of the following options:

    • To run the pipeline on-demand, proceed to Step 18.
    • To run the pipeline at a preset interval, proceed to Step 14.
  14. Click Trigger.

  15. Under Schedule, select when you want to run the pipeline.
  16. Under Data Set, select Warm Storage and then select the period for which to collect data.
  17. Click Save.
  18. Activate the pipeline:

    Activate pipeline
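
If you want to sanity-check the summary logic from Step 2 before publishing the model, the following standalone sketch runs the same pandas/csv/StringIO steps outside of Arundo Composer. The sample data, the columns other than "Value", and the file handling here are hypothetical.

  # Standalone check of the Step 2 summary logic (not an Arundo endpoint).
  import csv
  from io import StringIO

  import pandas as pd

  # Stand-in for the warm-storage export: any CSV with a "Value" column works.
  sample = StringIO(
      "Timestamp,Value\n"
      "2020-01-01T00:00:00Z,1.0\n"
      "2020-01-01T00:01:00Z,3.0\n"
      "2020-01-01T00:02:00Z,5.0\n"
  )

  csv_data = pd.read_csv(sample)
  output_mean = float(csv_data["Value"].mean())
  output_min = float(csv_data["Value"].min())
  output_max = float(csv_data["Value"].max())

  # Build the output file contents exactly as the endpoint does.
  rows = [["min", output_min], ["mean", output_mean], ["max", output_max]]
  f = StringIO()
  csv.writer(f).writerows(rows)
  print(f.getvalue())  # min,1.0 / mean,3.0 / max,5.0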

Example 2

The following example shows how to add a batch pipeline that receives data from a CSV file and outputs data into another CSV file.

  1. Build a model workspace.
  2. Modify the model's app.py file:

    1. Import the packages the endpoint needs (pandas and StringIO are used by the function defined below):

      import csv
      import pandas as pd
      from io import StringIO

    2. Include the following @argument, where "XXX" is the name of the input CSV file:

      @argument("file_XXX", type=fields.File)
      
    3. Include the following @returns decorator, where "XXX" is the name of the output CSV file:

      @returns("file_XXX", type=fields.File)
      
    4. Define a function like the following, where "XXX" is the name of the output CSV file:

      def fileIn_fileOut(file_in):
          """
          This endpoint takes a file and returns summary information about
          the tags specified as inputs.
          """
          # Read the uploaded CSV file and summarize the Value column.
          csv_data = pd.read_csv(file_in)
          output_mean = float(csv_data["Value"].mean())
          output_min = float(csv_data["Value"].min())
          output_max = float(csv_data["Value"].max())
          # Write the summary rows to an in-memory CSV file and return it.
          mylist = [['min', output_min], ['mean', output_mean], ['max', output_max]]
          f = StringIO()
          csv.writer(f).writerows(mylist)
          file_XXX = f.getvalue()
          return file_XXX
      
  3. Publish the model in Arundo Composer.

  4. Access Arundo Fabric.
  5. Upload the CSV files to use as the input and output for the pipeline. (See the note after these instructions for what the input file must contain.)
  6. Open the Pipelines Manager.
  7. Click + Create New Pipeline.
  8. Select Batch.
  9. Locate the model to use for the pipeline and click + Add.

    The model displays on the diagram.

  10. Select the input and output CSV files for the model.

  11. Click Rename, enter a name for the pipeline, and click Rename.
  12. Select one of the following options:

    • To run the pipeline on-demand, proceed to Step 17.
    • To run the pipeline at a preset interval, proceed to Step 13.
  13. Click Trigger.

  14. Under Schedule, select when you want to run the pipeline.
  15. Under Data Set, select Managed by the Model.
  16. Click Save.
  17. Activate the pipeline:

    Activate pipeline
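
Because the endpoint in Step 2 summarizes the "Value" column, the input CSV file you upload must contain a Value column. A minimal hypothetical example follows; the Timestamp column is shown for illustration only and is not required by this code.

  Timestamp,Value
  2020-01-01T00:00:00Z,1.0
  2020-01-01T00:01:00Z,3.0
  2020-01-01T00:02:00Z,5.0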


Managing Pipelines

The following guides explain how to manage pipelines.

Running a Batch Pipeline
  1. Locate the pipeline in the Pipelines list.
  2. Click the Name.
  3. Click Run.

    The Output panel displays a visualization of data output by the pipeline.

Renaming a Pipeline
  1. Locate the pipeline in the Pipelines list.
  2. Click the Name.
  3. Click Rename.
  4. Enter a new name.
  5. Click Rename.

Deleting a Model from a Pipeline
  1. Locate the pipeline in the Pipelines list.
  2. Click the Name.
  3. Click X in the top right corner of the model.

Deactivating a Pipeline
  1. Locate the pipeline in the Pipelines list.
  2. Click the Name.
  3. Deactivate the pipeline:

    Deactivate pipeline

Deleting a Pipeline
  1. Locate the pipeline in the Pipelines list.
  2. Deactivate the pipeline:

    Deactivate pipeline

  3. Click Delete next to the pipeline.

  4. Click Yes to confirm.