What happens when a Spark Job is submitted?
The driver program first connects to the cluster manager and asks for resources. On behalf of the driver, the cluster manager launches executors on the worker nodes. The driver then sends tasks to the executors based on data placement. Before the executors start executing, they first register themselves with the driver program so that the driver has an overview of all the executors. The executors then start executing the various tasks assigned to them by the driver program. The driver program monitors the set of executors while they run, and it can also schedule future tasks based on data placement by checking the location of cached data. When the driver program's main() method exits, or when it calls the stop() method of the SparkContext, the executors are terminated and the resources are released from the cluster.
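The lifecycle above can be seen in a minimal Scala sketch. This is an illustration, not a definitive implementation: the app name is hypothetical, and local mode stands in for a real cluster manager.

```scala
import org.apache.spark.sql.SparkSession

object LifecycleExample {
  def main(args: Array[String]): Unit = {
    // Creating the SparkSession (and its underlying SparkContext) is what
    // makes the driver contact the cluster manager and request executors.
    val spark = SparkSession.builder()
      .appName("lifecycle-example") // hypothetical app name
      .master("local[*]")           // assumption: local mode for this sketch
      .getOrCreate()

    // An action such as count() makes the driver ship tasks to the
    // executors that have registered with it.
    val n = spark.sparkContext.parallelize(1 to 1000).count()
    println(s"count = $n")

    // stop() terminates the executors and releases the cluster resources,
    // as described above.
    spark.stop()
  }
}
```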
When the application code is submitted, the driver implicitly converts the code containing the transformations and actions into a logical directed acyclic graph (DAG). During this stage, the driver program performs optimizations such as pipelining transformations, and the logical DAG is then converted into a physical execution plan consisting of a set of stages. Once the plan is created, small physical execution units called tasks are created under each stage. The tasks are then bundled together and sent to the Spark cluster.
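A short sketch, again assuming local mode, can make the DAG construction visible. Narrow transformations such as map() are pipelined into a single stage, while a wide transformation such as reduceByKey() introduces a shuffle and hence a stage boundary; rdd.toDebugString prints this lineage.

```scala
import org.apache.spark.sql.SparkSession

object DagExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dag-example")   // hypothetical app name
      .master("local[*]")       // assumption: local mode for this sketch
      .getOrCreate()
    val sc = spark.sparkContext

    val words = sc.parallelize(Seq("a", "b", "a", "c", "b", "a"))

    val counts = words
      .map(w => (w, 1))   // narrow: pipelined into the same stage
      .reduceByKey(_ + _) // wide: shuffle creates a new stage

    // Nothing has executed yet; the transformations are lazy. The lineage
    // below shows the stage boundary introduced by the shuffle.
    println(counts.toDebugString)

    // collect() is the action that makes the driver build the DAG, split
    // it into stages, and schedule tasks on the executors.
    counts.collect().foreach(println)

    spark.stop()
  }
}
```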