Package com.hazelcast.jet.pipeline
Interface Pipeline
public interface Pipeline
Models a distributed computation job using an analogy with a system of
interconnected water pipes. The basic element is a stage which
can be attached to one or more other stages. A stage accepts the data
coming from its upstream stages, transforms it, and directs the
resulting data to its downstream stages.
The Pipeline
object is a container of all the stages defined on
a pipeline: the source stages obtained directly from it by calling readFrom(BatchSource)
as well as all the stages attached (directly or
indirectly) to them.
Note that there is no simple one-to-one correspondence between pipeline stages and Core API's DAG vertices. Some stages map to several vertices (e.g., grouping and co-grouping are implemented as a cascade of two vertices) and some stages may be merged with others into a single vertex (e.g., a cascade of map/filter/flatMap stages can be fused into one vertex).
- Since:
- 3.0
-
Method Summary
Modifier and Type Method Description static Pipeline
create()
Creates a new, empty pipeline.<T> BatchStage<T>
readFrom(BatchSource<? extends T> source)
Returns a pipeline stage that represents a bounded (batch) data source.<T> StreamSourceStage<T>
readFrom(StreamSource<? extends T> source)
Returns a pipeline stage that represents an unbounded data source (i.e., an event stream).DAG
toDag()
Transforms the pipeline into a Jet DAG, which can be submitted for execution to a Jet instance.String
toDotString()
Returns a DOT format (graphviz) representation of the Pipeline.<T> SinkStage
writeTo(Sink<? super T> sink, GeneralStage<? extends T> stage0, GeneralStage<? extends T> stage1, GeneralStage<? extends T>... moreStages)
Attaches the supplied sink to two or more pipeline stages.
-
Method Details
-
create
Creates a new, empty pipeline.- Since:
- 3.0
-
readFrom
Returns a pipeline stage that represents a bounded (batch) data source. It has no upstream stages and emits the data (typically coming from an outside source) to its downstream stages.- Type Parameters:
T
- the type of source data items- Parameters:
source
- the definition of the source from which the stage reads data
-
readFrom
Returns a pipeline stage that represents an unbounded data source (i.e., an event stream). It has no upstream stages and emits the data (typically coming from an outside source) to its downstream stages.- Type Parameters:
T
- the type of source data items- Parameters:
source
- the definition of the source from which the stage reads data
-
writeTo
@Nonnull <T> SinkStage writeTo(@Nonnull Sink<? super T> sink, @Nonnull GeneralStage<? extends T> stage0, @Nonnull GeneralStage<? extends T> stage1, @Nonnull GeneralStage<? extends T>... moreStages)Attaches the supplied sink to two or more pipeline stages. Returns theSinkStage
representing the sink. You need this method when you want to drain more than one stage to the same sink. In the typical case you'll useGeneralStage.writeTo(Sink)
instead.- Type Parameters:
T
- the type of data being drained to the sink
-
toDag
Transforms the pipeline into a Jet DAG, which can be submitted for execution to a Jet instance. -
toDotString
Returns a DOT format (graphviz) representation of the Pipeline.
-