Interface Job


public interface Job
A Jet computation job created by submitting a DAG or Pipeline. Once submitted, Jet starts executing the job automatically.
Since:
3.0
  • Method Details

    • getId

      long getId()
      Returns the ID of this job.
      Throws:
      IllegalStateException - if the job has not started yet, and thus has no ID.
    • getIdString

      @Nonnull default String getIdString()
      Returns the string representation of this job's ID.
    • getConfig

      @Nonnull JobConfig getConfig()
      Returns the configuration this job was submitted with. Changes made to the returned config object will not have any effect.
    • getName

      @Nullable default String getName()
      Returns the name of this job or null if no name was supplied.

      Jobs can be named through JobConfig.setName(String) prior to submission.

    • getSubmissionTime

      long getSubmissionTime()
      Returns the time when the job was submitted to the cluster.

      The time is assigned by reading System.currentTimeMillis() of the master member that executes the job for the first time. It doesn't change on restart.
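
      Because the value comes from System.currentTimeMillis(), it is a count of milliseconds since the Unix epoch and can be handed to java.time directly. The sketch below is a minimal illustration using only the JDK. The class name SubmissionTimeExample and the sample timestamp are made up for this example; in real code the long would come from job.getSubmissionTime().

```java
import java.time.Duration;
import java.time.Instant;

public class SubmissionTimeExample {

    /** Converts epoch milliseconds (the unit getSubmissionTime() is documented to use) to an Instant. */
    public static Instant toInstant(long submissionTimeMillis) {
        return Instant.ofEpochMilli(submissionTimeMillis);
    }

    /** Time elapsed between the submission and the given reference instant. */
    public static Duration age(long submissionTimeMillis, Instant now) {
        return Duration.between(toInstant(submissionTimeMillis), now);
    }

    public static void main(String[] args) {
        // Sample value standing in for job.getSubmissionTime() (hypothetical).
        long submitted = 1_600_000_000_000L;
        System.out.println(toInstant(submitted)); // prints 2020-09-13T12:26:40Z
        Instant ninetySecondsLater = Instant.ofEpochMilli(submitted + 90_000);
        System.out.println(age(submitted, ninetySecondsLater).getSeconds()); // prints 90
    }
}
```

      Note that the timestamp is read from the master member's clock, so an age computed against the local clock of a client is only as accurate as the clock synchronization between the two machines.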

    • getStatus

      @Nonnull JobStatus getStatus()
      Returns the current status of this job.
    • getSuspensionCause

      @Nonnull JobSuspensionCause getSuspensionCause()
      Returns a description of the cause that led to the suspension of the job. Throws an IllegalStateException if the job is not currently suspended.
      Since:
      4.3
    • getMetrics

      @Nonnull JobMetrics getMetrics()
      Returns a snapshot of the current values of all job-specific metrics.

      While the job is running the metric values are updated periodically (see metrics collection frequency), assuming that both global metrics collection and per-job metrics collection are enabled. Otherwise empty metrics will be returned.

      Keep in mind that the collections may occur at different times on each member, so metrics from different members don't reflect the same instant.

      When a job is restarted (or resumed after being suspended), the metrics are reset as well: their values will reflect only updates from the latest execution of the job.

      Once a job stops executing (successfully, after a failure, cancellation, or temporarily while suspended) the metrics will have their most recent values (i.e. the last metric values from the moment before the job completed), assuming that metrics storage was enabled. Otherwise empty metrics will be returned.

      Since:
      3.2
    • getFuture

      @Nonnull CompletableFuture<Void> getFuture()
      Gets the future associated with the job. The returned future is not cancellable. To cancel the job, use the cancel() method.
      Throws:
      IllegalStateException - if the job has not started yet.
    • join

      default void join()
      Waits for the job to complete and throws an exception if the job completes with an error. Does not return if the job gets suspended. Never returns for streaming (unbounded) jobs, unless they fail or are cancelled.

      Shorthand for job.getFuture().join().

      Throws:
      CancellationException - if the job was cancelled
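
      Since join() blocks indefinitely for streaming or suspended jobs, callers sometimes prefer a bounded wait on the future instead. The following is a minimal sketch using only java.util.concurrent. It assumes that the future returned by getFuture() is a CompletableFuture<Void>; the class name JobOutcome, the method name await and the returned labels are hypothetical and chosen for illustration.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class JobOutcome {

    /**
     * Waits at most timeoutMillis for the future to finish and reports how it ended.
     * In real code the argument would be job.getFuture() (assumed to be a CompletableFuture).
     */
    public static String await(CompletableFuture<Void> jobFuture, long timeoutMillis) {
        try {
            jobFuture.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return "COMPLETED";
        } catch (CancellationException e) {
            // Documented outcome of join() for a cancelled job.
            return "CANCELLED";
        } catch (ExecutionException e) {
            // get() wraps the failure; the original error is the cause.
            return "FAILED: " + e.getCause().getMessage();
        } catch (TimeoutException e) {
            // Still running or suspended: the future has not completed yet.
            return "NOT_FINISHED";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return "INTERRUPTED";
        }
    }

    public static void main(String[] args) {
        // Futures built by hand to stand in for job.getFuture().
        CompletableFuture<Void> cancelled = new CompletableFuture<>();
        cancelled.completeExceptionally(new CancellationException());
        System.out.println(await(cancelled, 100)); // prints CANCELLED

        CompletableFuture<Void> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("boom"));
        System.out.println(await(failed, 100)); // prints FAILED: boom

        System.out.println(await(new CompletableFuture<>(), 10)); // prints NOT_FINISHED
    }
}
```

      The timeout only bounds the wait; it does not affect the job itself, which keeps running on the cluster.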
    • restart

      void restart()
      Gracefully stops the current execution and schedules a new execution with the current member list of the Jet cluster. Can be called to manually make use of added members, if auto scaling is disabled. Only a running job can be restarted; a suspended job must be resumed.

      Conceptually, this call is equivalent to suspend() followed by resume().

      Throws:
      IllegalStateException - if the job is not running, for example because it has already completed, has not started yet, is already restarting, or is suspended.
    • suspend

      void suspend()
      Gracefully suspends the current execution of the job. The job's status will become JobStatus.SUSPENDED. To resume the job, call resume().

      You can suspend a job even if it's not configured for snapshotting. Such a job will resume with empty state, as if it has just been started.

      This call just initiates the suspension process and doesn't wait for it to complete. Suspension starts with creating a terminal state snapshot. Should the terminal snapshot fail, the job will suspend anyway, but the previous snapshot (if there was one) won't be deleted. When the job resumes, its processing starts from the point of the last snapshot.

      NOTE: if the cluster becomes unstable (a member leaves or similar) while the job is in the process of being suspended, it may end up getting immediately restarted. Call getStatus() to find out and possibly try to suspend again.

      Throws:
      IllegalStateException - if the job is not running
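
      The NOTE above suggests checking getStatus() and retrying when a suspension is lost to a restart. One way to structure that is sketched below. To keep the example compilable on its own, Job and JobStatus are minimal local stand-ins that mirror only the members used here; with Jet on the classpath you would use the real types instead. The method name suspendAndConfirm, its parameters and the FlakyJob test double are hypothetical.

```java
public class SuspendWithRetry {

    /** Local stand-in: only the statuses this sketch needs. */
    public enum JobStatus { RUNNING, SUSPENDED }

    /** Local stand-in: only the two Job methods this sketch calls. */
    public interface Job {
        void suspend();
        JobStatus getStatus();
    }

    /**
     * Requests suspension and polls the status until it is SUSPENDED.
     * If the status is not reached within pollsPerAttempt polls, suspension is requested again.
     * Returns true once the job is observed as SUSPENDED, false when all attempts are used up.
     */
    public static boolean suspendAndConfirm(Job job, int maxAttempts, int pollsPerAttempt, long pollMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                job.suspend(); // only initiates the suspension, does not wait for it
            } catch (IllegalStateException e) {
                // Not running: it may already be suspended or still suspending, so check the status below.
            }
            for (int poll = 0; poll < pollsPerAttempt; poll++) {
                if (job.getStatus() == JobStatus.SUSPENDED) {
                    return true;
                }
                try {
                    Thread.sleep(pollMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    return false;
                }
            }
        }
        return false;
    }

    /** Test double: the first suspend() is lost (as if the job restarted), the second succeeds. */
    public static class FlakyJob implements Job {
        private int suspendCalls;
        private JobStatus status = JobStatus.RUNNING;

        @Override
        public void suspend() {
            suspendCalls++;
            if (suspendCalls >= 2) {
                status = JobStatus.SUSPENDED;
            }
        }

        @Override
        public JobStatus getStatus() {
            return status;
        }

        public int suspendCalls() {
            return suspendCalls;
        }
    }

    public static void main(String[] args) {
        FlakyJob job = new FlakyJob();
        System.out.println(suspendAndConfirm(job, 3, 2, 1)); // prints true
        System.out.println(job.suspendCalls());              // prints 2
    }
}
```

      A fixed number of polls cannot distinguish a suspension that is still in progress from one that was lost, so the poll count and interval would need tuning to how long the terminal snapshot of the job usually takes.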
    • resume

      void resume()
      Resumes a suspended job. The job will resume from the last known successful snapshot, if there is one.

      If the job is not suspended, it does nothing.

    • cancel

      void cancel()
      Makes a request to cancel this job and returns. The job will complete after its execution has stopped on all the nodes. If the job is already suspended, Jet will delete its runtime resources and snapshots and it won't be able to resume again.

      NOTE: if the cluster becomes unstable (a member leaves or similar) while the job is in the process of cancellation, it may end up getting restarted after the cluster has stabilized and won't be cancelled. Call getStatus() to find out and possibly try to cancel again.

      After cancellation, the job status will be JobStatus.FAILED and join() will throw a CancellationException.

      See cancelAndExportSnapshot(String) to cancel with a terminal snapshot.

      Throws:
      IllegalStateException - if the cluster is not in a state to restart the job, for example when the coordinator member has left and the new coordinator has not yet loaded the job's metadata.
    • cancelAndExportSnapshot

      JobStateSnapshot cancelAndExportSnapshot(String name)
      Exports and saves a state snapshot with the given name, and then cancels the job without processing any more data after the barrier (graceful cancellation). It's similar to suspend() followed by a cancel(), except that it won't process any more data after the snapshot.

      You can use the exported snapshot as a starting point for a new job. The new job doesn't need to execute the same Pipeline as the job that created the snapshot; it only needs to be compatible with the snapshot's state data. To start a job from the snapshot, use JobConfig.setInitialSnapshotName(String).

      Unlike with the exportSnapshot(String) method, when a snapshot is created using this method, Jet will commit the external transactions. This is safe because the snapshot is the last one created for the job, so processing can continue from it.

      If the terminal snapshot fails, Jet will suspend this job instead of cancelling it.

      You can call this method for a suspended job, too: in that case it will export the last successful snapshot and cancel the job.

      The method call will block until it has fully exported the snapshot, but may return before the job has stopped executing.

      For more information about "exported state" see exportSnapshot(String).

      After cancellation, the job status will be JobStatus.FAILED and join() will throw a CancellationException.

      Parameters:
      name - name of the snapshot. If a snapshot with this name already exists, it will be overwritten
      Throws:
      JetException - if the job is in an incorrect state: completed, cancelled or is in the process of restarting or suspending.
    • exportSnapshot

      JobStateSnapshot exportSnapshot(String name)
      Exports a state snapshot and saves it under the given name. You can start a new job from the exported state by using JobConfig.setInitialSnapshotName(String).

      The snapshot will be independent from the job that created it. Jet won't automatically delete the IMap it is exported into. You must manually call snapshot.destroy() to delete it. If your state is large, make sure you have enough memory to store it. The snapshot created using this method will also not be used for automatic restart - should the job fail, the previous automatically saved snapshot will be used.

      For transactional sources or sinks (that is, those which use transactions to confirm reads or to commit writes), Jet will not commit the transactions when creating a snapshot using this method. The reason is that such connectors only achieve the exactly-once guarantee if the job restarts from the latest snapshot. If, for example, the job fails after exporting a snapshot but before it creates a new automatic one, the job would restart from the previous automatic snapshot. The stored internal state and the committed external state would then be from different points in time, and data loss would occur.

      If a snapshot with the same name already exists, it will be overwritten. If a snapshot is already in progress for this job (either automatic or user-requested), the requested one will wait and start immediately after the previous one completes. If a snapshot with the same name is requested for two jobs at the same time, their data will likely be damaged (similar to two processes writing to the same file).

      You can call this method on a suspended job: in that case it will export the last successful snapshot. You can also export the state of non-snapshotted jobs (those with ProcessingGuarantee.NONE).

      If you issue any graceful job-control actions such as a graceful member shutdown or suspending a snapshotted job while Jet is exporting a snapshot, they will wait in a queue for this snapshot to complete. Forceful job-control actions will interrupt the export procedure.

      You can access the exported state using JetInstance.getJobStateSnapshot(String).

      The method call will block until it has fully exported the snapshot.

      Parameters:
      name - name of the snapshot. If a snapshot with this name already exists, it will be overwritten
      Throws:
      JetException - if the job is in an incorrect state: completed, cancelled or is in the process of restarting or suspending.
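
      Because an existing snapshot with the same name is overwritten, and two jobs exporting under the same name at the same time can damage each other's data, it can help to derive the name from something unique to the job. The sketch below shows one possible naming scheme using only the JDK. The class SnapshotNames, the "export-" prefix and the sample ID string are hypothetical; in real code the ID would come from job.getIdString() and the result would be passed to exportSnapshot(String) or cancelAndExportSnapshot(String).

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class SnapshotNames {

    // Compact UTC timestamp with second resolution, for example 20200913T122640Z.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd'T'HHmmss'Z'").withZone(ZoneOffset.UTC);

    /**
     * Builds a snapshot name from the job's ID string and the export time,
     * so that different jobs never share a name.
     */
    public static String forJob(String jobIdString, Instant when) {
        return "export-" + jobIdString + "-" + STAMP.format(when);
    }

    public static void main(String[] args) {
        // Sample ID standing in for job.getIdString() (hypothetical value).
        String id = "0123-4567-89ab-cdef";
        System.out.println(forJob(id, Instant.ofEpochSecond(1_600_000_000L)));
        // prints export-0123-4567-89ab-cdef-20200913T122640Z
    }
}
```

      With second resolution, two exports of the same job within one second still map to the same name, in which case the later export overwrites the earlier one as described above.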