Overview
When a node receives file or folder output from an upstream node, you can distribute that input so the downstream node runs as multiple jobs instead of one. Each job gets a portion of the data (one file from a folder, one line from a file, or one batch of a file), and the platform runs these jobs in parallel. Distribution is available based on the types of the connected inputs and outputs; you do not need to run the workflow first to see or use it. The node that receives the distributable input is called the destination node. You trigger distribution from that node using the Distribute button. The options offered depend on the destination’s input type: a string input allows distribution by file lines, a file or folder input allows distribution by folder, and a file input allows distribution by file batches.
Where Distribution Applies
Distribution applies when:
- An upstream node produces file or folder output.
- That output is connected to a downstream node (the destination).
- The destination node has an input that can accept the distributed form (e.g. one file at a time, one string per line, or one batch of a file).
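The conditions above can be sketched as a small lookup. This is only an illustration of the rules described on this page, not the platform's implementation; the function name `available_modes` and the type labels (`"file"`, `"folder"`, `"string"`) are hypothetical.

```python
from __future__ import annotations


def available_modes(upstream_output: str, destination_inputs: set[str]) -> list[str]:
    """Return the distribution modes this page describes for one connection.

    upstream_output: "file" or "folder" (hypothetical labels).
    destination_inputs: input types on the destination, e.g. {"string", "file"}.
    """
    modes = []
    if upstream_output == "file" and "string" in destination_inputs:
        modes.append("lines")    # one job per line of the file
    if upstream_output == "folder" and destination_inputs & {"file", "folder"}:
        modes.append("files")    # one job per file in the folder
    if upstream_output == "file" and "file" in destination_inputs:
        modes.append("batches")  # one job per batch of the file
    return modes
```

For example, a file output connected to a destination with both a string input and a file input would offer line and batch distribution, while a folder output connected to a string-only destination would offer none.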
Distributing by File Lines (One Job per Line)
When the upstream node provides a file and the destination node has a string input, the platform can split the file line by line. Each line is passed as a separate string to the chosen parameter, and one job runs per line, each on a separate machine in parallel. In the distribution flow:
- The platform explains that the output will be split line by line and that each line runs on a separate machine in parallel.
- Choose which parameter on the destination node receives the distributed values (e.g. the string input).
- Confirm the distribution.
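As a rough mental model of line distribution, the sketch below turns the text of a file into one job per line, with each line bound to the chosen parameter. The function name `jobs_from_lines` and the parameter name `sample_id` are hypothetical, and skipping blank lines is an assumption; this page does not state how the platform treats them.

```python
from __future__ import annotations


def jobs_from_lines(text: str, parameter: str) -> list[dict[str, str]]:
    """Build one job per line, passing the line as a string to `parameter`."""
    jobs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # assumption: blank lines do not produce jobs
        jobs.append({parameter: line})
    return jobs


# Three non-empty lines produce three independent jobs.
jobs = jobs_from_lines("S001\nS002\nS003\n", "sample_id")
```

Each dictionary in `jobs` stands for the inputs of one job that the platform would run on its own machine.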
Distributing by Folder (One Job per File)
When the upstream node provides a folder and the destination node has a file or folder input, you can distribute so that one job runs per file in that folder. Each job receives a single file on the same input. In the Distribute dialog:
- Under Select input to distribute, choose the folder input from the upstream node. Only inputs that can be distributed are listed.
- Review the Preview to see what will be split (the files or items in the folder).
- Click Split and run to confirm.
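Conceptually, folder distribution maps each file in the folder to its own job on the same input. The sketch below illustrates that mapping locally with `pathlib`; the function name `jobs_from_folder` and the input name are hypothetical, and listing only top-level files in sorted order is an assumption rather than documented platform behavior.

```python
from __future__ import annotations

from pathlib import Path


def jobs_from_folder(folder: Path, input_name: str) -> list[dict[str, Path]]:
    """Build one job per file in `folder`, each receiving a single file."""
    # Assumption: only top-level files are distributed; subfolders are ignored.
    files = sorted(p for p in folder.iterdir() if p.is_file())
    return [{input_name: path} for path in files]
```

A folder with three files would yield three jobs, which matches what the Preview in the Distribute dialog is described as showing.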
Distributing by File Batches (Files in Groups)
When the upstream node provides a file (either a single file or file output from a previous node) and the destination node has a file input, you can distribute so that the file is split into relatively equal batches. The platform groups the file content into multiple batch files; the number of batches is at most the number of machines the workflow runs on, and that number is the scale of the jobs. One job runs per batch in parallel, each instance processing a portion of the file. In the Distribute dialog:
- Under Distribution mode, select Files in groups (“Group files into parallel batches”).
- Optionally choose which file input on the destination node receives the batched data if more than one is available.
- Click Split and run to confirm.
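The batching rule (relatively equal batches, never more batches than machines) can be illustrated with a short sketch. It assumes the file is split on line boundaries, which this page does not specify, and the function name `split_into_batches` is hypothetical.

```python
from __future__ import annotations


def split_into_batches(lines: list[str], machines: int) -> list[list[str]]:
    """Split `lines` into at most `machines` batches of near-equal size."""
    if machines < 1:
        raise ValueError("machines must be at least 1")
    count = min(machines, len(lines))  # never more batches than lines
    if count == 0:
        return []
    base, extra = divmod(len(lines), count)
    batches = []
    start = 0
    for index in range(count):
        # The first `extra` batches take one additional line each.
        size = base + (1 if index < extra else 0)
        batches.append(lines[start:start + size])
        start += size
    return batches


# Seven lines on three machines: batch sizes 3, 2, and 2.
batches = split_into_batches(list("abcdefg"), 3)
```

With fewer lines than machines, the sketch produces one single-line batch per line, so some machines would simply receive no job.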
After Distribution
Once distribution is enabled, the destination node shows Distributed on the workflow canvas. When you run the workflow, the execution view shows multiple tasks (one per distributed unit). You can inspect each task’s inputs, console output, and files separately. The same upstream output can feed several destination nodes, each with its own distribution mode: line distribution (one job per line), folder distribution (one job per file), or batch distribution (one job per batch of a file), depending on each destination node’s input types.
Enabling Distribution (Summary)
Connect an upstream node
Ensure the upstream node produces file or folder output and that this output is connected to the input of the destination node.
Open the Distribute dialog
Select the destination node and click Distribute. The dialog or configuration flow shows the distribution options available for that connection (by file lines, by folder/files, or by file batches), based on the destination’s input types.