Can't Remove White Spaces from CSV Headers with Pandas

up vote
0
down vote

favorite

I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.

import pandas as pd



df = pd.read_csv("my.csv",low_memory=False)

df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')

asked Nov 19 at 21:03

Bill Hambone

678

1

See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31

add a comment |

up vote
0
down vote

favorite

I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.

import pandas as pd



df = pd.read_csv("my.csv",low_memory=False)

df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')

asked Nov 19 at 21:03

Bill Hambone

678

1

See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31

add a comment |

up vote
0
down vote

favorite

I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.

import pandas as pd



df = pd.read_csv("my.csv",low_memory=False)

df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')

asked Nov 19 at 21:03

Bill Hambone

678

I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.

import pandas as pd



df = pd.read_csv("my.csv",low_memory=False)

df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')

python pandas csv

asked Nov 19 at 21:03

Bill Hambone

678

asked Nov 19 at 21:03

Bill Hambone

678

asked Nov 19 at 21:03

Bill Hambone

678

asked Nov 19 at 21:03

Bill Hambone

678

asked Nov 19 at 21:03

Bill Hambone

678

1

See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31

add a comment |

1

See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31

See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31

add a comment |

3 Answers
3

active

oldest

votes

up vote
1
down vote

Try using a list comprehension.

df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]

answered Nov 19 at 21:05

576i

2,2091033

This also didn't work
– Bill Hambone
Nov 19 at 21:07

Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11

So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12

It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20

No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23

|
show 4 more comments

up vote
0
down vote

Tried using rename?

df.rename(index=str, columns={"A space": "a", "B space ": "c"})

answered Nov 19 at 21:12

Ken Dekalb

15011

I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18

Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19

add a comment |

up vote
0
down vote

I ditched Pandas and just used the CSV module in Python 2.7.

import csv

import re

import tempfile

import sys

import os

if sys.version_info >= (3, 3):

    from os import replace

elif sys.platform == "win32":

    from osreplace import replace

else:

    from os import rename as replace



newHeaderList = 



with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp, 

    open('myFile.txt', 'rb') as f:

    r = csv.reader(f, delimiter = 't')

    w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)

    header = next(r)

    for h in header:

        headerNoSpace = re.sub("s+", "_", h.strip())

        newHeaderList.append(headerNoSpace)

    w.writerow(newHeaderList)

    for row in r:

        w.writerow(row)



os.rename(tmp.name, new_text_filepath)





new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')

out_csv = csv.writer(open('myFile.csv', 'wb'))

out_csv.writerows(new_txt)

answered Nov 20 at 18:42

Bill Hambone

678

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53382584%2fcant-remove-white-spaces-from-csv-headers-with-pandas%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

3 Answers
3

active

oldest

votes

3 Answers
3

active

oldest

votes

up vote
1
down vote

Try using a list comprehension.

df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]

answered Nov 19 at 21:05

576i

2,2091033

This also didn't work
– Bill Hambone
Nov 19 at 21:07

Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11

So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12

It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20

No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23

|
show 4 more comments

up vote
1
down vote

Try using a list comprehension.

df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]

answered Nov 19 at 21:05

576i

2,2091033

This also didn't work
– Bill Hambone
Nov 19 at 21:07

Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11

So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12

It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20

No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23

|
show 4 more comments

up vote
1
down vote

Try using a list comprehension.

df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]

answered Nov 19 at 21:05

576i

2,2091033

Try using a list comprehension.

df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]

answered Nov 19 at 21:05

576i

2,2091033

answered Nov 19 at 21:05

576i

2,2091033

answered Nov 19 at 21:05

576i

2,2091033

answered Nov 19 at 21:05

576i

2,2091033

This also didn't work
– Bill Hambone
Nov 19 at 21:07

Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11

So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12

It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20

No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23

|
show 4 more comments

This also didn't work
– Bill Hambone
Nov 19 at 21:07

Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11

So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12

It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20

No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23

This also didn't work
– Bill Hambone
Nov 19 at 21:07

Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11

So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12

It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20

No. From the 'u'prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23

|
show 4 more comments

up vote
0
down vote

Tried using rename?

df.rename(index=str, columns={"A space": "a", "B space ": "c"})

answered Nov 19 at 21:12

Ken Dekalb

15011

I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18

Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19

add a comment |

up vote
0
down vote

Tried using rename?

df.rename(index=str, columns={"A space": "a", "B space ": "c"})

answered Nov 19 at 21:12

Ken Dekalb

15011

I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18

Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19

add a comment |

up vote
0
down vote

Tried using rename?

df.rename(index=str, columns={"A space": "a", "B space ": "c"})

answered Nov 19 at 21:12

Ken Dekalb

15011

Tried using rename?

df.rename(index=str, columns={"A space": "a", "B space ": "c"})

answered Nov 19 at 21:12

Ken Dekalb

15011

answered Nov 19 at 21:12

Ken Dekalb

15011

answered Nov 19 at 21:12

Ken Dekalb

15011

answered Nov 19 at 21:12

Ken Dekalb

15011

I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18

Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19

add a comment |

I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18

Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19

I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18

Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19

add a comment |

up vote
0
down vote

I ditched Pandas and just used the CSV module in Python 2.7.

import csv

import re

import tempfile

import sys

import os

if sys.version_info >= (3, 3):

    from os import replace

elif sys.platform == "win32":

    from osreplace import replace

else:

    from os import rename as replace



newHeaderList = 



with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp, 

    open('myFile.txt', 'rb') as f:

    r = csv.reader(f, delimiter = 't')

    w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)

    header = next(r)

    for h in header:

        headerNoSpace = re.sub("s+", "_", h.strip())

        newHeaderList.append(headerNoSpace)

    w.writerow(newHeaderList)

    for row in r:

        w.writerow(row)



os.rename(tmp.name, new_text_filepath)





new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')

out_csv = csv.writer(open('myFile.csv', 'wb'))

out_csv.writerows(new_txt)

answered Nov 20 at 18:42

Bill Hambone

678

add a comment |

up vote
0
down vote

I ditched Pandas and just used the CSV module in Python 2.7.

import csv

import re

import tempfile

import sys

import os

if sys.version_info >= (3, 3):

    from os import replace

elif sys.platform == "win32":

    from osreplace import replace

else:

    from os import rename as replace



newHeaderList = 



with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp, 

    open('myFile.txt', 'rb') as f:

    r = csv.reader(f, delimiter = 't')

    w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)

    header = next(r)

    for h in header:

        headerNoSpace = re.sub("s+", "_", h.strip())

        newHeaderList.append(headerNoSpace)

    w.writerow(newHeaderList)

    for row in r:

        w.writerow(row)



os.rename(tmp.name, new_text_filepath)





new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')

out_csv = csv.writer(open('myFile.csv', 'wb'))

out_csv.writerows(new_txt)

answered Nov 20 at 18:42

Bill Hambone

678

add a comment |

up vote
0
down vote

I ditched Pandas and just used the CSV module in Python 2.7.

import csv

import re

import tempfile

import sys

import os

if sys.version_info >= (3, 3):

    from os import replace

elif sys.platform == "win32":

    from osreplace import replace

else:

    from os import rename as replace



newHeaderList = 



with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp, 

    open('myFile.txt', 'rb') as f:

    r = csv.reader(f, delimiter = 't')

    w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)

    header = next(r)

    for h in header:

        headerNoSpace = re.sub("s+", "_", h.strip())

        newHeaderList.append(headerNoSpace)

    w.writerow(newHeaderList)

    for row in r:

        w.writerow(row)



os.rename(tmp.name, new_text_filepath)





new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')

out_csv = csv.writer(open('myFile.csv', 'wb'))

out_csv.writerows(new_txt)

answered Nov 20 at 18:42

Bill Hambone

678

I ditched Pandas and just used the CSV module in Python 2.7.

import csv

import re

import tempfile

import sys

import os

if sys.version_info >= (3, 3):

    from os import replace

elif sys.platform == "win32":

    from osreplace import replace

else:

    from os import rename as replace



newHeaderList = 



with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp, 

    open('myFile.txt', 'rb') as f:

    r = csv.reader(f, delimiter = 't')

    w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)

    header = next(r)

    for h in header:

        headerNoSpace = re.sub("s+", "_", h.strip())

        newHeaderList.append(headerNoSpace)

    w.writerow(newHeaderList)

    for row in r:

        w.writerow(row)



os.rename(tmp.name, new_text_filepath)





new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')

out_csv = csv.writer(open('myFile.csv', 'wb'))

out_csv.writerows(new_txt)

answered Nov 20 at 18:42

Bill Hambone

678

answered Nov 20 at 18:42

Bill Hambone

678

answered Nov 20 at 18:42

Bill Hambone

678

answered Nov 20 at 18:42

Bill Hambone

678

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Ytukyg