Why do tasks sit idle for some time while running a Spark job?












[Screenshot captured from the Executors tab]



I'm running a Spark job and noticed that after a few stages completed, the tasks sat idle for some time and then started again.



Spark version: 2.2, Java 1.8
Total nodes: 3 (including master)
Total cores: 16 (8 per data node)
Total memory: 16 GB (8 GB per data node)



Below is the spark-submit command I used:



spark-submit --master yarn --deploy-mode cluster --executor-memory 1G --executor-cores 2 --num-executors 6 --jars jar1  --class wordcount wordcount.jar


Is there any reason why tasks go idle? If so, what could it be?



Please find the attached screenshot, which shows that no active tasks are running for some time.



Thanks.










      java scala apache-spark






      asked Nov 25 '18 at 12:44









      Sekhar

          1 Answer
You probably have a grouping operation somewhere, and what you see is its result: a stage with fewer partitions, and therefore fewer tasks, than the stages before it.
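To put rough numbers on this, here is a minimal sketch in plain Java (no Spark dependency). It assumes all 6 requested executors with 2 cores each were actually granted by YARN and that each task uses one core (the default `spark.task.cpus` of 1). The partition count of 4 is hypothetical and not taken from the question.

```java
// Sketch only: plain Java arithmetic, no Spark dependency.
final class TaskSlots {
    /** Concurrent task slots = executors * cores per executor (assumes one core per task). */
    static int totalSlots(int numExecutors, int coresPerExecutor) {
        return numExecutors * coresPerExecutor;
    }

    /** Slots left with nothing to run when a stage has fewer partitions than slots. */
    static int idleSlots(int totalSlots, int stagePartitions) {
        return Math.max(0, totalSlots - stagePartitions);
    }

    public static void main(String[] args) {
        int slots = totalSlots(6, 2);     // --num-executors 6 --executor-cores 2
        int partitionsAfterGroup = 4;     // hypothetical value, for illustration only
        System.out.println("slots=" + slots + ", idle=" + idleSlots(slots, partitionsAfterGroup));
    }
}
```

With these assumed values, 8 of the 12 slots would have no task to run during that stage, which would show up as idle executors in the UI.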



It can also be a badly distributed job (bad in terms of data: some nodes or partitions are heavier than others, and the next step cannot start until they have completed).
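The effect of such skew can be sketched in plain Java (no Spark dependency). The task durations below are made up, and the model is simplified: one task per slot, all starting at the same time.

```java
// Sketch only: models one stage whose tasks all start together, one per slot.
final class SkewIdleTime {
    /** A stage ends only when its slowest task ends. */
    static long stageSeconds(long[] taskSeconds) {
        long max = 0;
        for (long t : taskSeconds) {
            max = Math.max(max, t);
        }
        return max;
    }

    /** Total slot-seconds spent waiting for the slowest task to finish. */
    static long idleSlotSeconds(long[] taskSeconds) {
        long stage = stageSeconds(taskSeconds);
        long idle = 0;
        for (long t : taskSeconds) {
            idle += stage - t;
        }
        return idle;
    }

    public static void main(String[] args) {
        long[] skewed = {5, 5, 5, 5, 5, 60};   // hypothetical durations; one heavy partition
        System.out.println("stage=" + stageSeconds(skewed) + "s, idle=" + idleSlotSeconds(skewed) + " slot-seconds");
    }
}
```

In this made-up case five slots finish after 5 seconds and then wait 55 seconds each for the heavy task, so most of the stage looks idle even though the job is still running.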



Some code would help make sense of this UI screenshot. As a possible approach: walk through your code carefully and look for grouping and repartitioning operations, check your partitioning scheme (this may be expected behaviour in your case), and double-check I/O operations (you have probably checked this already, but it does happen sometimes).
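One way to check a partitioning scheme offline is to count how many keys would land in each partition. The sketch below is plain Java (no Spark dependency) and assumes hash partitioning by non-negative `hashCode` modulo, which, as far as I know, is how Spark's `HashPartitioner` assigns non-null keys. The keys are hypothetical.

```java
// Sketch only: plain Java, no Spark dependency; assumes non-null keys.
final class PartitionSkewCheck {
    /** Non-negative hash modulo, the rule assumed here for hash partitioning. */
    static int partitionFor(Object key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** Number of records that would land in each partition. */
    static int[] recordsPerPartition(Object[] keys, int numPartitions) {
        int[] counts = new int[numPartitions];
        for (Object key : keys) {
            counts[partitionFor(key, numPartitions)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical keys, chosen so that most of them share one partition.
        Object[] keys = {0, 4, 8, 12, 16, 1, 2, 3};
        int[] counts = recordsPerPartition(keys, 4);
        for (int p = 0; p < counts.length; p++) {
            System.out.println("partition " + p + ": " + counts[p] + " records");
        }
    }
}
```

If one partition's count is far above the others, the task for that partition is likely the one the rest of the stage waits on.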






                answered Nov 26 '18 at 9:30









                DaunnC
