Can't Remove White Spaces from CSV Headers with Pandas











up vote
0
down vote

favorite












I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.



import pandas as pd

df = pd.read_csv("my.csv",low_memory=False)
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')









share|improve this question


















  • 1




    See discussion below, this is a Python 2 related question.
    – 576i
    Nov 19 at 21:31















up vote
0
down vote

favorite












I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.



import pandas as pd

df = pd.read_csv("my.csv",low_memory=False)
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')









share|improve this question


















  • 1




    See discussion below, this is a Python 2 related question.
    – 576i
    Nov 19 at 21:31













up vote
0
down vote

favorite









up vote
0
down vote

favorite











I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.



import pandas as pd

df = pd.read_csv("my.csv",low_memory=False)
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')









share|improve this question













I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.



import pandas as pd

df = pd.read_csv("my.csv",low_memory=False)
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')






python pandas csv






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked Nov 19 at 21:03









Bill Hambone

678




678








  • 1




    See discussion below, this is a Python 2 related question.
    – 576i
    Nov 19 at 21:31














  • 1




    See discussion below, this is a Python 2 related question.
    – 576i
    Nov 19 at 21:31








1




1




See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31




See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31












3 Answers
3






active

oldest

votes

















up vote
1
down vote













Try using a list comprehension.



df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]





share|improve this answer





















  • This also didn't work
    – Bill Hambone
    Nov 19 at 21:07










  • Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
    – 576i
    Nov 19 at 21:11










  • So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
    – Bill Hambone
    Nov 19 at 21:12












  • It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
    – Bill Hambone
    Nov 19 at 21:20










  • No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
    – 576i
    Nov 19 at 21:23


















up vote
0
down vote













Tried using rename?



df.rename(index=str, columns={"A space": "a", "B space ": "c"})





share|improve this answer





















  • I tried this as well, still no change.
    – Bill Hambone
    Nov 19 at 21:18










  • Could you provide a reproducible code?
    – Ken Dekalb
    Nov 19 at 21:19


















up vote
0
down vote













I ditched Pandas and just used the CSV module in Python 2.7.



import csv
import re
import tempfile
import sys
import os
if sys.version_info >= (3, 3):
from os import replace
elif sys.platform == "win32":
from osreplace import replace
else:
from os import rename as replace

newHeaderList =

with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp,
open('myFile.txt', 'rb') as f:
r = csv.reader(f, delimiter = 't')
w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)
header = next(r)
for h in header:
headerNoSpace = re.sub("s+", "_", h.strip())
newHeaderList.append(headerNoSpace)
w.writerow(newHeaderList)
for row in r:
w.writerow(row)

os.rename(tmp.name, new_text_filepath)


new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')
out_csv = csv.writer(open('myFile.csv', 'wb'))
out_csv.writerows(new_txt)





share|improve this answer





















    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53382584%2fcant-remove-white-spaces-from-csv-headers-with-pandas%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    3 Answers
    3






    active

    oldest

    votes








    3 Answers
    3






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    1
    down vote













    Try using a list comprehension.



    df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]





    share|improve this answer





















    • This also didn't work
      – Bill Hambone
      Nov 19 at 21:07










    • Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
      – 576i
      Nov 19 at 21:11










    • So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
      – Bill Hambone
      Nov 19 at 21:12












    • It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
      – Bill Hambone
      Nov 19 at 21:20










    • No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
      – 576i
      Nov 19 at 21:23















    up vote
    1
    down vote













    Try using a list comprehension.



    df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]





    share|improve this answer





















    • This also didn't work
      – Bill Hambone
      Nov 19 at 21:07










    • Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
      – 576i
      Nov 19 at 21:11










    • So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
      – Bill Hambone
      Nov 19 at 21:12












    • It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
      – Bill Hambone
      Nov 19 at 21:20










    • No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
      – 576i
      Nov 19 at 21:23













    up vote
    1
    down vote










    up vote
    1
    down vote









    Try using a list comprehension.



    df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]





    share|improve this answer












    Try using a list comprehension.



    df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]






    share|improve this answer












    share|improve this answer



    share|improve this answer










    answered Nov 19 at 21:05









    576i

    2,2091033




    2,2091033












    • This also didn't work
      – Bill Hambone
      Nov 19 at 21:07










    • Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
      – 576i
      Nov 19 at 21:11










    • So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
      – Bill Hambone
      Nov 19 at 21:12












    • It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
      – Bill Hambone
      Nov 19 at 21:20










    • No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
      – 576i
      Nov 19 at 21:23


















    • This also didn't work
      – Bill Hambone
      Nov 19 at 21:07










    • Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
      – 576i
      Nov 19 at 21:11










    • So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
      – Bill Hambone
      Nov 19 at 21:12












    • It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
      – Bill Hambone
      Nov 19 at 21:20










    • No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
      – 576i
      Nov 19 at 21:23
















    This also didn't work
    – Bill Hambone
    Nov 19 at 21:07




    This also didn't work
    – Bill Hambone
    Nov 19 at 21:07












    Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
    – 576i
    Nov 19 at 21:11




    Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
    – 576i
    Nov 19 at 21:11












    So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
    – Bill Hambone
    Nov 19 at 21:12






    So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
    – Bill Hambone
    Nov 19 at 21:12














    It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
    – Bill Hambone
    Nov 19 at 21:20




    It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
    – Bill Hambone
    Nov 19 at 21:20












    No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
    – 576i
    Nov 19 at 21:23




    No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
    – 576i
    Nov 19 at 21:23












    up vote
    0
    down vote













    Tried using rename?



    df.rename(index=str, columns={"A space": "a", "B space ": "c"})





    share|improve this answer





















    • I tried this as well, still no change.
      – Bill Hambone
      Nov 19 at 21:18










    • Could you provide a reproducible code?
      – Ken Dekalb
      Nov 19 at 21:19















    up vote
    0
    down vote













    Tried using rename?



    df.rename(index=str, columns={"A space": "a", "B space ": "c"})





    share|improve this answer





















    • I tried this as well, still no change.
      – Bill Hambone
      Nov 19 at 21:18










    • Could you provide a reproducible code?
      – Ken Dekalb
      Nov 19 at 21:19













    up vote
    0
    down vote










    up vote
    0
    down vote









    Tried using rename?



    df.rename(index=str, columns={"A space": "a", "B space ": "c"})





    share|improve this answer












    Tried using rename?



    df.rename(index=str, columns={"A space": "a", "B space ": "c"})






    share|improve this answer












    share|improve this answer



    share|improve this answer










    answered Nov 19 at 21:12









    Ken Dekalb

    15011




    15011












    • I tried this as well, still no change.
      – Bill Hambone
      Nov 19 at 21:18










    • Could you provide a reproducible code?
      – Ken Dekalb
      Nov 19 at 21:19


















    • I tried this as well, still no change.
      – Bill Hambone
      Nov 19 at 21:18










    • Could you provide a reproducible code?
      – Ken Dekalb
      Nov 19 at 21:19
















    I tried this as well, still no change.
    – Bill Hambone
    Nov 19 at 21:18




    I tried this as well, still no change.
    – Bill Hambone
    Nov 19 at 21:18












    Could you provide a reproducible code?
    – Ken Dekalb
    Nov 19 at 21:19




    Could you provide a reproducible code?
    – Ken Dekalb
    Nov 19 at 21:19










    up vote
    0
    down vote













    I ditched Pandas and just used the CSV module in Python 2.7.



    import csv
    import re
    import tempfile
    import sys
    import os
    if sys.version_info >= (3, 3):
    from os import replace
    elif sys.platform == "win32":
    from osreplace import replace
    else:
    from os import rename as replace

    newHeaderList =

    with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp,
    open('myFile.txt', 'rb') as f:
    r = csv.reader(f, delimiter = 't')
    w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)
    header = next(r)
    for h in header:
    headerNoSpace = re.sub("s+", "_", h.strip())
    newHeaderList.append(headerNoSpace)
    w.writerow(newHeaderList)
    for row in r:
    w.writerow(row)

    os.rename(tmp.name, new_text_filepath)


    new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')
    out_csv = csv.writer(open('myFile.csv', 'wb'))
    out_csv.writerows(new_txt)





    share|improve this answer

























      up vote
      0
      down vote













      I ditched Pandas and just used the CSV module in Python 2.7.



      import csv
      import re
      import tempfile
      import sys
      import os
      if sys.version_info >= (3, 3):
      from os import replace
      elif sys.platform == "win32":
      from osreplace import replace
      else:
      from os import rename as replace

      newHeaderList =

      with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp,
      open('myFile.txt', 'rb') as f:
      r = csv.reader(f, delimiter = 't')
      w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)
      header = next(r)
      for h in header:
      headerNoSpace = re.sub("s+", "_", h.strip())
      newHeaderList.append(headerNoSpace)
      w.writerow(newHeaderList)
      for row in r:
      w.writerow(row)

      os.rename(tmp.name, new_text_filepath)


      new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')
      out_csv = csv.writer(open('myFile.csv', 'wb'))
      out_csv.writerows(new_txt)





      share|improve this answer























        up vote
        0
        down vote










        up vote
        0
        down vote









        I ditched Pandas and just used the CSV module in Python 2.7.



        import csv
        import re
        import tempfile
        import sys
        import os
        if sys.version_info >= (3, 3):
        from os import replace
        elif sys.platform == "win32":
        from osreplace import replace
        else:
        from os import rename as replace

        newHeaderList =

        with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp,
        open('myFile.txt', 'rb') as f:
        r = csv.reader(f, delimiter = 't')
        w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)
        header = next(r)
        for h in header:
        headerNoSpace = re.sub("s+", "_", h.strip())
        newHeaderList.append(headerNoSpace)
        w.writerow(newHeaderList)
        for row in r:
        w.writerow(row)

        os.rename(tmp.name, new_text_filepath)


        new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')
        out_csv = csv.writer(open('myFile.csv', 'wb'))
        out_csv.writerows(new_txt)





        share|improve this answer












        I ditched Pandas and just used the CSV module in Python 2.7.



        import csv
        import re
        import tempfile
        import sys
        import os
        if sys.version_info >= (3, 3):
        from os import replace
        elif sys.platform == "win32":
        from osreplace import replace
        else:
        from os import rename as replace

        newHeaderList =

        with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp,
        open('myFile.txt', 'rb') as f:
        r = csv.reader(f, delimiter = 't')
        w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)
        header = next(r)
        for h in header:
        headerNoSpace = re.sub("s+", "_", h.strip())
        newHeaderList.append(headerNoSpace)
        w.writerow(newHeaderList)
        for row in r:
        w.writerow(row)

        os.rename(tmp.name, new_text_filepath)


        new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')
        out_csv = csv.writer(open('myFile.csv', 'wb'))
        out_csv.writerows(new_txt)






        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Nov 20 at 18:42









        Bill Hambone

        678




        678






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.





            Some of your past answers have not been well-received, and you're in danger of being blocked from answering.


            Please pay close attention to the following guidance:


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53382584%2fcant-remove-white-spaces-from-csv-headers-with-pandas%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Wiesbaden

            Marschland

            Dieringhausen