1.

Explain Oozie Coordinator?

Answer

Oozie Coordinator jobs are recurrent Oozie Workflow jobs that are triggered by time and data availability. An Oozie Coordinator can also manage multiple workflows that depend on the outcome of other workflows: the output of one workflow becomes the input to the next. This chain is called a 'data application pipeline'.

Oozie processes coordinator jobs in a fixed timezone with no DST (typically UTC); this timezone is referred to as the 'Oozie processing timezone'. The Oozie processing timezone is used to resolve coordinator job start/end times, job pause times, and the initial-instance of datasets. All coordinator dataset instance URI templates are also resolved to a datetime in the Oozie processing timezone.
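The two triggers can be seen in a minimal coordinator application definition. A sketch, with illustrative names and HDFS paths: the frequency attribute provides the time trigger, the input-events section provides the data-availability trigger, and all datetimes (start, end, initial-instance) are written in the Oozie processing timezone (UTC, with a trailing Z).

```xml
<!-- coordinator.xml: a sketch of a daily coordinator; app name, paths, and dataset name are illustrative -->
<coordinator-app name="daily-logs-coord" frequency="${coord:days(1)}"
                 start="2024-01-01T00:00Z" end="2024-12-31T00:00Z"
                 timezone="UTC" xmlns="uri:oozie:coordinator:0.4">
  <datasets>
    <!-- embedded dataset definition; initial-instance is resolved in the Oozie processing timezone -->
    <dataset name="logs" frequency="${coord:days(1)}"
             initial-instance="2024-01-01T00:00Z" timezone="UTC">
      <uri-template>hdfs://namenode/data/logs/${YEAR}/${MONTH}/${DAY}</uri-template>
    </dataset>
  </datasets>
  <input-events>
    <!-- the data trigger: each action waits until this dataset instance exists -->
    <data-in name="input" dataset="logs">
      <instance>${coord:current(0)}</instance>
    </data-in>
  </input-events>
  <action>
    <workflow>
      <app-path>hdfs://namenode/apps/daily-logs-wf</app-path>
      <configuration>
        <property>
          <name>inputDir</name>
          <!-- passes the resolved dataset URI into the workflow -->
          <value>${coord:dataIn('input')}</value>
        </property>
      </configuration>
    </workflow>
  </action>
</coordinator-app>
```

Each materialized action resolves ${coord:current(0)} against the dataset's initial-instance and frequency, so the workflow run for a given day consumes that day's directory.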

The usage of Oozie Coordinator can be categorized into three segments:

Small: a single coordinator application with embedded dataset definitions

Medium: a single shared dataset definition file and a few coordinator applications

Large: single or multiple shared dataset definition files and several coordinator applications
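For the Medium and Large segments, dataset definitions can live in a shared file that several coordinator applications reference instead of embedding their own copies. A sketch, assuming an illustrative shared file stored on HDFS:

```xml
<!-- shared datasets file, e.g. hdfs://namenode/apps/shared/datasets.xml (path is illustrative) -->
<datasets>
  <dataset name="logs" frequency="${coord:days(1)}"
           initial-instance="2024-01-01T00:00Z" timezone="UTC">
    <uri-template>hdfs://namenode/data/logs/${YEAR}/${MONTH}/${DAY}</uri-template>
  </dataset>
</datasets>
```

Each coordinator application then pulls the shared definitions in through an include element inside its own datasets section:

```xml
<datasets>
  <include>hdfs://namenode/apps/shared/datasets.xml</include>
</datasets>
```

This keeps one authoritative definition per dataset, so a change to a URI template or frequency propagates to every coordinator that includes the file.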



