public final class HdfsSources extends Object
Returns a source that reads records from Apache Hadoop HDFS and emits the results of transforming each record (a key-value pair) with the supplied mapping function.
@Nonnull public static <K,V,E> BatchSource<E> hdfs(@Nonnull org.apache.hadoop.mapred.JobConf jobConf, @Nonnull BiFunctionEx<K,V,E> projectionFn)
This source splits and balances the input data among Jet processors, doing its best to achieve data locality. To this end, the Jet cluster topology should be aligned with Hadoop's: each Hadoop member should also host a Jet member.
Default local parallelism for this processor is 2 (or fewer if fewer CPUs are available).
This source does not save any state to the snapshot. If the job is restarted, all entries will be emitted again.
K - key type of the records
V - value type of the records
E - the type of the emitted value
jobConf - JobConf for reading files with the appropriate input format and path
projectionFn - function to create output objects from key and value. If the projection returns null for an item, that item will be filtered out
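The null-filtering contract of projectionFn can be illustrated with a minimal plain-Java sketch. It uses no Jet or Hadoop classes: java.util.function.BiFunction stands in for BiFunctionEx, a Map stands in for the key-value records read from HDFS, and the names ProjectionSketch and project are hypothetical, introduced only for this illustration. It is not how the source is implemented.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical stand-in that models only the documented contract:
// apply the projection to each (key, value) record, drop null results.
class ProjectionSketch {

    static <K, V, E> List<E> project(Map<K, V> records, BiFunction<K, V, E> projectionFn) {
        List<E> out = new ArrayList<>();
        for (Map.Entry<K, V> record : records.entrySet()) {
            E item = projectionFn.apply(record.getKey(), record.getValue());
            if (item != null) {   // a null projection filters the record out
                out.add(item);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Keys mimic byte offsets and values mimic lines of a text file (assumed data).
        Map<Long, String> lines = new LinkedHashMap<>();
        lines.put(0L, "alpha");
        lines.put(6L, "");
        lines.put(7L, "beta");

        // Skip empty lines by returning null; upper-case the rest.
        List<String> result = project(lines,
                (offset, line) -> line.isEmpty() ? null : line.toUpperCase());
        System.out.println(result);
    }
}
```

Returning null from the projection therefore combines mapping and filtering in one step, which avoids a separate filter stage for records that should be discarded at the source.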
Copyright © 2019 Hazelcast, Inc. All rights reserved.