Can't Remove White Spaces from CSV Headers with Pandas
up vote
0
down vote
favorite
I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.
import pandas as pd
df = pd.read_csv("my.csv",low_memory=False)
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')
python pandas csv
add a comment |
up vote
0
down vote
favorite
I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.
import pandas as pd
df = pd.read_csv("my.csv",low_memory=False)
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')
python pandas csv
1
See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31
add a comment |
up vote
0
down vote
favorite
up vote
0
down vote
favorite
I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.
import pandas as pd
df = pd.read_csv("my.csv",low_memory=False)
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')
python pandas csv
I'm trying to rename headers in a csv that have white spaces. Using these lines from the Pandas API reference is not working. The headers still have white spaces instead of underscores.
import pandas as pd
df = pd.read_csv("my.csv",low_memory=False)
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')
python pandas csv
python pandas csv
asked Nov 19 at 21:03
Bill Hambone
678
678
1
See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31
add a comment |
1
See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31
1
1
See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31
See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31
add a comment |
3 Answers
3
active
oldest
votes
up vote
1
down vote
Try using a list comprehension.
df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]
This also didn't work
– Bill Hambone
Nov 19 at 21:07
Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11
So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12
It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20
No. From the'u'
prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23
|
show 4 more comments
up vote
0
down vote
Tried using rename?
df.rename(index=str, columns={"A space": "a", "B space ": "c"})
I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18
Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19
add a comment |
up vote
0
down vote
I ditched Pandas and just used the CSV module in Python 2.7.
import csv
import re
import tempfile
import sys
import os
if sys.version_info >= (3, 3):
from os import replace
elif sys.platform == "win32":
from osreplace import replace
else:
from os import rename as replace
newHeaderList =
with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp,
open('myFile.txt', 'rb') as f:
r = csv.reader(f, delimiter = 't')
w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)
header = next(r)
for h in header:
headerNoSpace = re.sub("s+", "_", h.strip())
newHeaderList.append(headerNoSpace)
w.writerow(newHeaderList)
for row in r:
w.writerow(row)
os.rename(tmp.name, new_text_filepath)
new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')
out_csv = csv.writer(open('myFile.csv', 'wb'))
out_csv.writerows(new_txt)
add a comment |
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
up vote
1
down vote
Try using a list comprehension.
df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]
This also didn't work
– Bill Hambone
Nov 19 at 21:07
Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11
So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12
It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20
No. From the'u'
prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23
|
show 4 more comments
up vote
1
down vote
Try using a list comprehension.
df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]
This also didn't work
– Bill Hambone
Nov 19 at 21:07
Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11
So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12
It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20
No. From the'u'
prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23
|
show 4 more comments
up vote
1
down vote
up vote
1
down vote
Try using a list comprehension.
df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]
Try using a list comprehension.
df.columns = [c.strip().lower().replace(' ', '_') for c in df.columns]
answered Nov 19 at 21:05
576i
2,2091033
2,2091033
This also didn't work
– Bill Hambone
Nov 19 at 21:07
Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11
So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12
It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20
No. From the'u'
prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23
|
show 4 more comments
This also didn't work
– Bill Hambone
Nov 19 at 21:07
Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11
So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12
It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20
No. From the'u'
prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.
– 576i
Nov 19 at 21:23
This also didn't work
– Bill Hambone
Nov 19 at 21:07
This also didn't work
– Bill Hambone
Nov 19 at 21:07
Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11
Strange, because I tried this with one of my dataframes and it's working here. Maybe you can re-check your code and if it still doesn't work, provide the version number of pandas & python you are using. I'm testing with python 3.6 & pandas 0.23.4
– 576i
Nov 19 at 21:11
So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12
So when I just print df.columns and don't try to replace anything, I get this list (I've shortened it). The first header already has an underscore in its name. Index([u'DWH_ID', u'ID', u'Actual Retirement', u'In Service', u'Gross Area (Imperial)', u'Gross Area', u'Area (Imperial)', u'Area (Metric)', u'Area', u'Area Units'])
– Bill Hambone
Nov 19 at 21:12
It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20
It also says dtype='object' at the end of the index. Could that be why string replace isn't working?
– Bill Hambone
Nov 19 at 21:20
No. From the
'u'
prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.– 576i
Nov 19 at 21:23
No. From the
'u'
prefix it seems you are using the outdated python2 instead of the current python3, so you may be using an outdated pandas as well. Please post your version numbers.– 576i
Nov 19 at 21:23
|
show 4 more comments
up vote
0
down vote
Tried using rename?
df.rename(index=str, columns={"A space": "a", "B space ": "c"})
I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18
Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19
add a comment |
up vote
0
down vote
Tried using rename?
df.rename(index=str, columns={"A space": "a", "B space ": "c"})
I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18
Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19
add a comment |
up vote
0
down vote
up vote
0
down vote
Tried using rename?
df.rename(index=str, columns={"A space": "a", "B space ": "c"})
Tried using rename?
df.rename(index=str, columns={"A space": "a", "B space ": "c"})
answered Nov 19 at 21:12
Ken Dekalb
15011
15011
I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18
Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19
add a comment |
I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18
Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19
I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18
I tried this as well, still no change.
– Bill Hambone
Nov 19 at 21:18
Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19
Could you provide a reproducible code?
– Ken Dekalb
Nov 19 at 21:19
add a comment |
up vote
0
down vote
I ditched Pandas and just used the CSV module in Python 2.7.
import csv
import re
import tempfile
import sys
import os
if sys.version_info >= (3, 3):
from os import replace
elif sys.platform == "win32":
from osreplace import replace
else:
from os import rename as replace
newHeaderList =
with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp,
open('myFile.txt', 'rb') as f:
r = csv.reader(f, delimiter = 't')
w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)
header = next(r)
for h in header:
headerNoSpace = re.sub("s+", "_", h.strip())
newHeaderList.append(headerNoSpace)
w.writerow(newHeaderList)
for row in r:
w.writerow(row)
os.rename(tmp.name, new_text_filepath)
new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')
out_csv = csv.writer(open('myFile.csv', 'wb'))
out_csv.writerows(new_txt)
add a comment |
up vote
0
down vote
I ditched Pandas and just used the CSV module in Python 2.7.
import csv
import re
import tempfile
import sys
import os
if sys.version_info >= (3, 3):
from os import replace
elif sys.platform == "win32":
from osreplace import replace
else:
from os import rename as replace
newHeaderList =
with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp,
open('myFile.txt', 'rb') as f:
r = csv.reader(f, delimiter = 't')
w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)
header = next(r)
for h in header:
headerNoSpace = re.sub("s+", "_", h.strip())
newHeaderList.append(headerNoSpace)
w.writerow(newHeaderList)
for row in r:
w.writerow(row)
os.rename(tmp.name, new_text_filepath)
new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')
out_csv = csv.writer(open('myFile.csv', 'wb'))
out_csv.writerows(new_txt)
add a comment |
up vote
0
down vote
up vote
0
down vote
I ditched Pandas and just used the CSV module in Python 2.7.
import csv
import re
import tempfile
import sys
import os
if sys.version_info >= (3, 3):
from os import replace
elif sys.platform == "win32":
from osreplace import replace
else:
from os import rename as replace
newHeaderList =
with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp,
open('myFile.txt', 'rb') as f:
r = csv.reader(f, delimiter = 't')
w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)
header = next(r)
for h in header:
headerNoSpace = re.sub("s+", "_", h.strip())
newHeaderList.append(headerNoSpace)
w.writerow(newHeaderList)
for row in r:
w.writerow(row)
os.rename(tmp.name, new_text_filepath)
new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')
out_csv = csv.writer(open('myFile.csv', 'wb'))
out_csv.writerows(new_txt)
I ditched Pandas and just used the CSV module in Python 2.7.
import csv
import re
import tempfile
import sys
import os
if sys.version_info >= (3, 3):
from os import replace
elif sys.platform == "win32":
from osreplace import replace
else:
from os import rename as replace
newHeaderList =
with tempfile.NamedTemporaryFile(dir='.', delete=False) as tmp,
open('myFile.txt', 'rb') as f:
r = csv.reader(f, delimiter = 't')
w = csv.writer(tmp, delimiter = 't', quoting=csv.QUOTE_NONNUMERIC)
header = next(r)
for h in header:
headerNoSpace = re.sub("s+", "_", h.strip())
newHeaderList.append(headerNoSpace)
w.writerow(newHeaderList)
for row in r:
w.writerow(row)
os.rename(tmp.name, new_text_filepath)
new_txt = csv.reader(open('newFile.txt', "rb"), delimiter = 't')
out_csv = csv.writer(open('myFile.csv', 'wb'))
out_csv.writerows(new_txt)
answered Nov 20 at 18:42
Bill Hambone
678
678
add a comment |
add a comment |
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Some of your past answers have not been well-received, and you're in danger of being blocked from answering.
Please pay close attention to the following guidance:
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53382584%2fcant-remove-white-spaces-from-csv-headers-with-pandas%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
1
See discussion below, this is a Python 2 related question.
– 576i
Nov 19 at 21:31