Interface Job
public interface Job
A Jet computation job created by submitting a DAG or Pipeline. Once submitted, Jet starts executing the job automatically.
Since: 3.0
Method Summary
Modifier and Type | Method | Description
void | cancel() | Makes a request to cancel this job and returns.
JobStateSnapshot | cancelAndExportSnapshot(String name) | Exports and saves a state snapshot with the given name, and then cancels the job without processing any more data after the barrier (graceful cancellation).
JobStateSnapshot | exportSnapshot(String name) | Exports a state snapshot and saves it under the given name.
JobConfig | getConfig() | Returns the configuration this job was submitted with.
CompletableFuture<Void> | getFuture() | Gets the future associated with the job.
long | getId() | Returns the ID of this job.
default String | getIdString() | Returns the string representation of this job's ID.
JobMetrics | getMetrics() | Returns a snapshot of the current values of all job-specific metrics.
default String | getName() | Returns the name of this job, or null if no name was supplied.
JobStatus | getStatus() | Returns the current status of this job.
long | getSubmissionTime() | Returns the time when the job was submitted to the cluster.
JobSuspensionCause | getSuspensionCause() | Returns a description of the cause that has led to the suspension of the job.
default void | join() | Waits for the job to complete and throws an exception if the job completes with an error.
void | restart() | Gracefully stops the current execution and schedules a new execution with the current member list of the Jet cluster.
void | resume() | Resumes a suspended job.
void | suspend() | Gracefully suspends the current execution of the job.
Method Details

getId
long getId()
Returns the ID of this job.
Throws:
IllegalStateException - if the job has not started yet, and thus has no ID.

getIdString
default String getIdString()
Returns the string representation of this job's ID.

getConfig
JobConfig getConfig()
Returns the configuration this job was submitted with. Changes made to the returned config object will not have any effect.

getName
default String getName()
Returns the name of this job, or null if no name was supplied.
Jobs can be named through JobConfig.setName(String) prior to submission.
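Because getName() may return null, code that displays jobs usually needs a fallback. The following is a minimal sketch of one way to do that, not part of the Jet API: the class JobDisplayNameSketch, the method displayName and the "job " prefix are hypothetical, and the two arguments stand in for the results of getName() and getIdString().

```java
import java.util.Optional;

final class JobDisplayNameSketch {

    // jobName stands in for getName() and may be null;
    // idString stands in for getIdString(). Both names are hypothetical.
    static String displayName(String jobName, String idString) {
        return Optional.ofNullable(jobName).orElse("job " + idString);
    }

    public static void main(String[] args) {
        // "placeholder-id" is an arbitrary value, not a real Jet ID format.
        System.out.println(displayName("word-count", "placeholder-id"));
        System.out.println(displayName(null, "placeholder-id"));
    }
}
```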

getSubmissionTime
long getSubmissionTime()
Returns the time when the job was submitted to the cluster.
The time is assigned by reading System.currentTimeMillis() of the master member that executes the job for the first time. It doesn't change on restart.
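Since the value comes from System.currentTimeMillis(), it can be read as milliseconds since the Unix epoch. The sketch below shows one way to turn it into java.time values; the class and method names are hypothetical helpers, and the long argument stands in for the result of getSubmissionTime(). Note that the timestamp is taken on the master member's clock, so an age computed against a local clock is only approximate.

```java
import java.time.Duration;
import java.time.Instant;

final class SubmissionTimeSketch {

    // Interprets the submission time as epoch milliseconds.
    static Instant toInstant(long submissionTimeMillis) {
        return Instant.ofEpochMilli(submissionTimeMillis);
    }

    // Elapsed time between submission and the given "now", both in epoch millis.
    static Duration age(long submissionTimeMillis, long nowMillis) {
        return Duration.ofMillis(nowMillis - submissionTimeMillis);
    }

    public static void main(String[] args) {
        // Stand-in for job.getSubmissionTime(): pretend the job was submitted 90 s ago.
        long now = System.currentTimeMillis();
        long submitted = now - 90_000L;
        System.out.println("submitted at " + toInstant(submitted));
        System.out.println("age in seconds: " + age(submitted, now).getSeconds());
    }
}
```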

getStatus
JobStatus getStatus()
Returns the current status of this job.

getSuspensionCause
JobSuspensionCause getSuspensionCause()
Returns a description of the cause that has led to the suspension of the job. Throws an IllegalStateException if the job is not currently suspended.
Since: 4.3

getMetrics
JobMetrics getMetrics()
Returns a snapshot of the current values of all job-specific metrics.
While the job is running, the metric values are updated periodically (see metrics collection frequency), assuming that both global metrics collection and per-job metrics collection are enabled. Otherwise empty metrics will be returned.
Keep in mind that collection may occur at different times on each member, so metrics from different members aren't from the same instant.
When a job is restarted (or resumed after being suspended), the metrics are reset too; their values will reflect only updates from the latest execution of the job.
Once a job stops executing (successfully, after a failure, cancellation, or temporarily while suspended), the metrics will have their most recent values (i.e. the last metric values from the moment before the job stopped executing), assuming that metrics storage was enabled. Otherwise empty metrics will be returned.
Since: 3.2

getFuture
CompletableFuture<Void> getFuture()
Gets the future associated with the job. The returned future is not cancellable. To cancel the job, use the cancel() method.
Throws:
IllegalStateException - if the job has not started yet.

join
default void join()
Waits for the job to complete and throws an exception if the job completes with an error. Does not return if the job gets suspended. Never returns for streaming (unbounded) jobs, unless they fail or are cancelled.
Shorthand for job.getFuture().join().
Throws:
CancellationException - if the job was cancelled
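Because join() is shorthand for getFuture().join(), a caller can tell the outcomes apart by the exception type. The sketch below illustrates this with plain java.util.concurrent.CompletableFuture instances standing in for the job's future, so it shows the behaviour of CompletableFuture.join() rather than anything verified against a running cluster. The class and method names are hypothetical, and the assumption that an ordinary job failure surfaces as a CompletionException comes from CompletableFuture, not from the text above.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class JoinOutcomeSketch {

    // Blocks like join() does and classifies how the future ended.
    static String awaitOutcome(CompletableFuture<Void> jobFuture) {
        try {
            jobFuture.join();
            return "completed";
        } catch (CancellationException e) {
            // Documented outcome of join() for a cancelled job.
            return "cancelled";
        } catch (CompletionException e) {
            // Assumption: other failures arrive wrapped, as with any CompletableFuture.
            Throwable cause = e.getCause() != null ? e.getCause() : e;
            return "failed: " + cause.getMessage();
        }
    }

    public static void main(String[] args) {
        CompletableFuture<Void> finished = CompletableFuture.completedFuture(null);

        // Stand-in for a cancelled job: completed with a CancellationException,
        // since the job's own future cannot be cancelled by the caller.
        CompletableFuture<Void> cancelled = new CompletableFuture<>();
        cancelled.completeExceptionally(new CancellationException());

        System.out.println(awaitOutcome(finished));
        System.out.println(awaitOutcome(cancelled));
    }
}
```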

restart
void restart()
Gracefully stops the current execution and schedules a new execution with the current member list of the Jet cluster. Can be called to manually make use of added members if auto-scaling is disabled. Only a running job can be restarted; a suspended job must be resumed.
Conceptually this call is equivalent to suspend() followed by resume().
Throws:
IllegalStateException - if the job is not running, for example if it has already completed, is not yet running, is already restarting, or is suspended.

suspend
void suspend()
Gracefully suspends the current execution of the job. The job's status will become JobStatus.SUSPENDED. To resume the job, call resume().
You can suspend a job even if it's not configured for snapshotting. Such a job will resume with empty state, as if it has just been started.
This call just initiates the suspension process and doesn't wait for it to complete. Suspension starts with creating a terminal state snapshot. Should the terminal snapshot fail, the job will suspend anyway, but the previous snapshot (if there was one) won't be deleted. When the job resumes, its processing starts from the point of the last snapshot.
NOTE: if the cluster becomes unstable (a member leaves or similar) while the job is in the process of being suspended, it may end up getting immediately restarted. Call getStatus() to find out and possibly try to suspend again.
Throws:
IllegalStateException - if the job is not running
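The advice in the NOTE (check getStatus() and possibly suspend again) can be sketched as a small retry loop. This is an illustration under stated assumptions, not Jet code: Status and JobControl are hypothetical, reduced stand-ins for JobStatus and Job, FlakyJob is a test double, and the attempt and polling limits are arbitrary.

```java
final class SuspendRetrySketch {

    /** Hypothetical, reduced stand-in for JobStatus. */
    enum Status { RUNNING, SUSPENDED }

    /** Hypothetical stand-in exposing only the two Job methods used here. */
    interface JobControl {
        void suspend();
        Status getStatus();
    }

    // Requests suspension, polls the status, and re-requests if it never shows up.
    static boolean suspendAndConfirm(JobControl job, int maxAttempts,
                                     int pollsPerAttempt, long pollMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                job.suspend(); // only initiates the suspension
            } catch (IllegalStateException notRunning) {
                // Job is not running right now; still poll, it may settle.
            }
            for (int poll = 0; poll < pollsPerAttempt; poll++) {
                if (job.getStatus() == Status.SUSPENDED) {
                    return true;
                }
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    /** Test double: the first suspend request is "lost", the second takes effect. */
    static final class FlakyJob implements JobControl {
        int suspendCalls;
        public void suspend() { suspendCalls++; }
        public Status getStatus() {
            return suspendCalls >= 2 ? Status.SUSPENDED : Status.RUNNING;
        }
    }

    public static void main(String[] args) {
        FlakyJob job = new FlakyJob();
        boolean suspended = suspendAndConfirm(job, 3, 2, 10L);
        System.out.println("suspended=" + suspended + ", requests=" + job.suspendCalls);
    }
}
```

Polling with a bounded number of attempts keeps the caller from waiting forever if the job fails or is cancelled in the meantime.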

resume
void resume()
Resumes a suspended job. The job will resume from the last known successful snapshot, if there is one.
If the job is not suspended, it does nothing.

cancel
void cancel()
Makes a request to cancel this job and returns. The job will complete after its execution has stopped on all the nodes. If the job is already suspended, Jet will delete its runtime resources and snapshots and it won't be able to resume again.
NOTE: if the cluster becomes unstable (a member leaves or similar) while the job is in the process of cancellation, it may end up getting restarted after the cluster has stabilized and won't be cancelled. Call getStatus() to find out and possibly try to cancel again.
After cancellation, the job status will be JobStatus.FAILED and join() will throw a CancellationException.
See cancelAndExportSnapshot(String) to cancel with a terminal snapshot.
Throws:
IllegalStateException - if the cluster is not in a state to restart the job, for example when the coordinator member has left and the new coordinator has not yet loaded the job's metadata.

cancelAndExportSnapshot
JobStateSnapshot cancelAndExportSnapshot(String name)
Exports and saves a state snapshot with the given name, and then cancels the job without processing any more data after the barrier (graceful cancellation). It's similar to suspend() followed by cancel(), except that it won't process any more data after the snapshot.
You can use the exported snapshot as a starting point for a new job. The new job doesn't need to execute the same Pipeline as the job that created the snapshot; it only needs to be compatible with its state data. To start a job from the snapshot, use JobConfig.setInitialSnapshotName(String).
Unlike the exportSnapshot(String) method, when a snapshot is created using this method, Jet will commit the external transactions, because this snapshot is the last one created for the job and it's safe to use it to continue the processing.
If the terminal snapshot fails, Jet will suspend this job instead of cancelling it.
You can call this method for a suspended job, too: in that case it will export the last successful snapshot and cancel the job.
The method call will block until it has fully exported the snapshot, but may return before the job has stopped executing.
For more information about "exported state", see exportSnapshot(String).
After cancellation, the job status will be JobStatus.FAILED and join() will throw a CancellationException.
Parameters:
name - name of the snapshot. If the name is already used, the existing snapshot will be overwritten.
Throws:
JetException - if the job is in an incorrect state: completed, cancelled, or in the process of restarting or suspending.

exportSnapshot
JobStateSnapshot exportSnapshot(String name)
Exports a state snapshot and saves it under the given name. You can start a new job from the exported state using JobConfig.setInitialSnapshotName(String).
The snapshot will be independent from the job that created it. Jet won't automatically delete the IMap it is exported into. You must manually call snapshot.destroy() to delete it. If your state is large, make sure you have enough memory to store it. The snapshot created using this method will also not be used for automatic restart: should the job fail, the previous automatically saved snapshot will be used.
For transactional sources or sinks (that is, those which use transactions to confirm reads or to commit writes), Jet will not commit the transactions when creating a snapshot using this method. The reason is that such connectors only achieve the exactly-once guarantee if the job restarts from the latest snapshot. If, for example, the job fails after exporting a snapshot but before it creates a new automatic one, the job would restart from the previous automatic snapshot; the stored internal state and the committed external state would then be from different points in time, and data loss would occur.
If a snapshot with the same name already exists, it will be overwritten. If a snapshot is already in progress for this job (either automatic or user-requested), the requested one will wait and start immediately after the previous one completes. If a snapshot with the same name is requested for two jobs at the same time, their data will likely be damaged (similar to two processes writing to the same file).
You can call this method on a suspended job: in that case it will export the last successful snapshot. You can also export the state of non-snapshotted jobs (those with ProcessingGuarantee.NONE).
If you issue any graceful job-control actions, such as a graceful member shutdown or suspending a snapshotted job, while Jet is exporting a snapshot, they will wait in a queue for this snapshot to complete. Forceful job-control actions will interrupt the export procedure.
You can access the exported state using JetInstance.getJobStateSnapshot(String).
The method call will block until it has fully exported the snapshot.
Parameters:
name - name of the snapshot. If the name is already used, the existing snapshot will be overwritten.
Throws:
JetException - if the job is in an incorrect state: completed, cancelled, or in the process of restarting or suspending.
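Since an existing snapshot with the same name is overwritten, and two jobs exporting under one name at the same time can damage each other's data, callers may want to derive distinct names. The sketch below shows one possible naming scheme; it is an assumption for illustration only, not a Jet convention. The class, the method snapshotName and the label-plus-UTC-timestamp format are all hypothetical, and the resulting string would be passed as the name argument.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

final class SnapshotNameSketch {

    // Compact UTC timestamp, e.g. 19700101T000000Z for the epoch.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd'T'HHmmss'Z'").withZone(ZoneOffset.UTC);

    // Combines a per-job label with the export time to keep names distinct.
    static String snapshotName(String jobLabel, Instant exportedAt) {
        return jobLabel + "-" + STAMP.format(exportedAt);
    }

    public static void main(String[] args) {
        // "orders" is a hypothetical per-job label, e.g. the job name.
        System.out.println(snapshotName("orders", Instant.now()));
    }
}
```

Names built this way are only unique to the second per label, so two exports for the same label within one second would still collide.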