Prepend header record (or a string / a file) to large file in Scala / Java












What is the most efficient (or recommended) way to prepend a string or a file to another large file in Scala, preferably without using external libraries? The large file can be binary.



E.g.



if the string to prepend is:
header_information|123.45|xyz\n



and the large file is:



abcdefghijklmnopqrstuvwxyz0123456789
abcdefghijklmnopqrstuvwxyz0123456789
abcdefghijklmnopqrstuvwxyz0123456789
...


I would expect to get:



header_information|123.45|xyz
abcdefghijklmnopqrstuvwxyz0123456789
abcdefghijklmnopqrstuvwxyz0123456789
abcdefghijklmnopqrstuvwxyz0123456789
...









  • Why not plain unix?
    – erip
    Nov 20 at 2:45










  • @erip First, because in this case it would be a workaround; second, because it will not necessarily always be a Unix filesystem. It can be AWS S3 or something else.
    – Andrey Dmitriev
    Nov 20 at 9:12















scala io prepend






asked Nov 20 at 1:09









Andrey Dmitriev









1 Answer
accepted










I came up with the following solution:




  1. Turn the string/file to prepend into an InputStream

  2. Turn the large file into an InputStream

  3. "Combine" the two InputStreams using java.io.SequenceInputStream

  4. Use java.nio.file.Files.copy to write the combined stream to the target file



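    import java.io.{ByteArrayInputStream, FileInputStream, SequenceInputStream}
    import java.nio.charset.StandardCharsets
    import java.nio.file.{Files, Paths, StandardCopyOption}

    object FileAppender {
      def main(args: Array[String]): Unit = {
        // Header bytes; the charset is explicit so the result does not depend on the platform default
        val stringToPrepend = new ByteArrayInputStream("header_information|123.45|xyz\n".getBytes(StandardCharsets.UTF_8))
        val largeFile = new FileInputStream("big_file.dat")
        // SequenceInputStream yields the header first, then the contents of the large file
        val combined = new SequenceInputStream(stringToPrepend, largeFile)
        try Files.copy(combined, Paths.get("output_file.dat"), StandardCopyOption.REPLACE_EXISTING)
        finally combined.close() // also closes any underlying stream that is still open
      }
    }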



Tested on a ~30 GB file; it took ~40 seconds on a MacBook Pro (3.3 GHz / 16 GB).



This approach can also be used, if necessary, to combine multiple partitioned files, such as those created by the Spark engine.
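
For the multi-file case, a minimal sketch could look like the following. It relies on the SequenceInputStream constructor that takes a java.util.Enumeration of streams. The names ConcatFiles and concat are hypothetical and chosen only for illustration, and the sketch assumes the parts are local files that should be joined in the order given.

```scala
import java.io.{InputStream, SequenceInputStream}
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical helper, not part of the original answer.
object ConcatFiles {
  // Concatenates `parts` in the given order into `target`, replacing it if it exists.
  // Returns the number of bytes written.
  def concat(parts: Seq[Path], target: Path): Long = {
    val it = parts.iterator
    // Each file is opened only when SequenceInputStream asks for the next stream.
    val streams = new java.util.Enumeration[InputStream] {
      def hasMoreElements: Boolean = it.hasNext
      def nextElement(): InputStream = Files.newInputStream(it.next())
    }
    val combined = new SequenceInputStream(streams)
    try Files.copy(combined, target, StandardCopyOption.REPLACE_EXISTING)
    finally combined.close()
  }
}
```

A header can be included by writing it to its own file and passing that file as the first element of `parts`.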






answered Nov 20 at 1:43

Andrey Dmitriev
