PySpark raises Py4JNetworkError("Answer from Java side is empty") when Python exits



























Background:




  • spark standalone cluster mode on k8s

  • spark 2.2.1

  • hadoop 2.7.6

  • run code in python, not in pyspark

  • client mode, not cluster mode


The PySpark code is run with plain python, not inside the pyspark environment.
All of the code runs to completion. But sometimes, when the script finishes and exits, the error below shows up, even with time.sleep(10) after spark.stop().





{{py4j.java_gateway:1038}} INFO - Error while receiving.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 1035, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
Py4JNetworkError: Answer from Java side is empty
[2018-11-22 09:06:40,293] {{root:899}} ERROR - Exception while sending command.
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 883, in send_command
    response = connection.send_command(command)
  File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 1040, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
Py4JNetworkError: Error while receiving
[2018-11-22 09:06:40,293] {{py4j.java_gateway:443}} DEBUG - Exception while shutting down a socket
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/py4j-0.10.4-py2.7.egg/py4j/java_gateway.py", line 441, in quiet_shutdown
    socket_instance.shutdown(socket.SHUT_RDWR)
  File "/usr/lib64/python2.7/socket.py", line 224, in meth
    return getattr(self._sock,name)(*args)
  File "/usr/lib64/python2.7/socket.py", line 170, in _dummy
    raise error(EBADF, 'Bad file descriptor')
error: [Errno 9] Bad file descriptor




I guess the reason is that the parent Python process tries to read a log message from the terminated child JVM process. But the weird thing is that the error is not always raised.
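The general mechanism behind this guess can be illustrated with plain sockets. This is only a sketch using the standard library, not py4j's actual code: `client` and `server` are hypothetical stand-ins for the Python side and the JVM gateway. When the peer has already closed its end, reading a line yields an empty result, which is the kind of condition the "Answer from Java side is empty" message describes.

```python
import socket

# Two connected sockets standing in for the Python client and the JVM gateway
# (hypothetical stand-ins, not py4j objects).
client, server = socket.socketpair()

# The "JVM" side goes away without sending a reply.
server.close()

# Reading a line from the surviving side hits end-of-stream immediately,
# so readline() returns an empty bytes object instead of an answer.
stream = client.makefile("rb")
answer = stream.readline()
print(repr(answer))

stream.close()
client.close()
```

Whether the read happens before or after the peer closes depends on timing, which would be consistent with the error appearing only sometimes.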



Any suggestions?










      python apache-spark pyspark apache-spark-standalone
















asked Nov 23 '18 at 3:26

Jayce Li
























          1 Answer














The root cause is the 'py4j' log level.

I had set the Python log level to DEBUG, which lets the 'py4j' client and the Java side report connection errors when PySpark shuts down.

Setting the Python log level to INFO or higher resolves the problem.
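A minimal sketch of that fix, using only the standard `logging` module. It assumes py4j emits its records under loggers named `py4j` and `py4j.java_gateway`, as the log output in the question suggests; the `app` logger is a hypothetical name for the application's own logger. The threshold is raised for the py4j hierarchy only, so the rest of the application can stay at DEBUG.

```python
import logging

# Keep the root logger at DEBUG, as in the setup that showed the error.
root = logging.getLogger()
root.setLevel(logging.DEBUG)

# Raise the threshold only for the py4j logger hierarchy.
py4j_logger = logging.getLogger("py4j")
py4j_logger.setLevel(logging.INFO)

# Child loggers such as py4j.java_gateway inherit the level from "py4j".
gateway_logger = logging.getLogger("py4j.java_gateway")
print(gateway_logger.getEffectiveLevel() == logging.INFO)
print(gateway_logger.isEnabledFor(logging.DEBUG))

# A hypothetical application logger is unaffected and still emits DEBUG.
app_logger = logging.getLogger("app")
print(app_logger.isEnabledFor(logging.DEBUG))
```

This filters only records below INFO from the py4j loggers; whether that is enough for a given setup is worth checking against the actual log output, and a stricter level such as WARNING can be passed to `setLevel` in the same way.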



          ref: Gateway raises an exception when shut down



          ref: Tune down the logging level for callback server messages



          ref: PySpark Internals






answered Dec 3 '18 at 10:16

Jayce Li





























