error using astype when NaN exists in a dataframe












df
      A        B
0  a=10  b=20.10
1  a=20      NaN
2   NaN  b=30.10
3  a=40  b=40.10


I tried:

df['A'] = df['A'].str.extract(r'(\d+)').astype(int)
df['B'] = df['B'].str.extract(r'(\d+)').astype(float)


But I get the following error:




ValueError: cannot convert float NaN to integer




And:




AttributeError: Can only use .str accessor with string values, which use np.object_ dtype in pandas




How do I fix this?











    Firstly, NaN can only be represented by float, so you can't cast to int in that case. Secondly, if you have mixed dtypes, for instance strings and some other type, then using `str.extract` will fail. Although mixed dtypes are supported, they are not a good idea because they lead to errors. You should decide what the final dtype should be and replace the missing values with something that makes sense to you.
    – EdChum
    Jan 9 '17 at 15:02
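
The second error can be reproduced in isolation. The following is a minimal sketch, assuming a reasonably recent pandas; the two Series are hypothetical and only illustrate the point. A column of strings with some NaN still supports `.str`, while a column that already holds floats (for example because the conversion line was run a second time) does not.

```python
import numpy as np
import pandas as pd

# Hypothetical columns, for illustration only.
strings_with_nan = pd.Series(["a=10", "a=20", np.nan, "a=40"])
floats_only = pd.Series([10.0, 20.0, np.nan, 40.0])

# Strings plus NaN: .str works, and the NaN row simply stays missing.
extracted = strings_with_nan.str.extract(r"(\d+)", expand=False)

# Float dtype: accessing .str raises AttributeError.
try:
    floats_only.str
    str_accessor_failed = False
except AttributeError:
    str_accessor_failed = True
```

If the AttributeError shows up, checking `df.dtypes` before calling `.str` is a quick way to see whether the column was already converted.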















pandas






edited Jan 9 '17 at 15:04 by IanS
asked Jan 9 '17 at 14:57 by Sun








1 Answer

accepted










If some values in a column are missing (NaN) and the column is then converted to numeric, the dtype is always float. You cannot convert the values to int, only to float, because the type of NaN is float.



import numpy as np

print(type(np.nan))
<class 'float'>


See the docs for how values are converted when a column contains at least one NaN:

integer: cast to float64
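
The upcasting rule is easy to check directly. This is a small sketch with made-up values, not part of the original answer:

```python
import numpy as np
import pandas as pd

ints = pd.Series([10, 20, 40])              # no missing values
with_nan = pd.Series([10, 20, np.nan, 40])  # one missing value

print(ints.dtype)      # an integer dtype
print(with_nan.dtype)  # float64, because NaN is a float

# Casting the NaN-containing column to int raises an error.
try:
    with_nan.astype(int)
    cast_failed = False
except (ValueError, TypeError):
    cast_failed = True
```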




If you need int values, you need to replace NaN with some int, e.g. 0, using fillna, and then it works perfectly:



df['A'] = df['A'].str.extract(r'(\d+)', expand=False)
df['B'] = df['B'].str.extract(r'(\d+)', expand=False)
print(df)
     A    B
0   10   20
1   20  NaN
2  NaN   30
3   40   40

df1 = df.fillna(0).astype(int)
print(df1)
    A   B
0  10  20
1  20   0
2   0  30
3  40  40

print(df1.dtypes)
A    int32
B    int32
dtype: object
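
If filling with 0 would hide the fact that a value was missing, newer pandas versions offer a nullable integer dtype, `Int64`, that can hold integers and missing values together. This postdates the answer above, so treat the following as a sketch under that assumption rather than as the answerer's method. The decimal-aware pattern for column B is also an addition here, since `(\d+)` keeps only the digits before the decimal point.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame({
    "A": ["a=10", "a=20", np.nan, "a=40"],
    "B": ["b=20.10", np.nan, "b=30.10", "b=40.10"],
})

# Extract the digits, convert to numbers (float because of NaN),
# then cast to the nullable integer dtype, which keeps the gap as <NA>.
a = pd.to_numeric(df["A"].str.extract(r"(\d+)", expand=False)).astype("Int64")

# For B, capture the decimal part as well and keep the column as float.
b = pd.to_numeric(df["B"].str.extract(r"(\d+\.\d+)", expand=False))

print(a)
print(b)
```

Note the capital `I` in `"Int64"`: the lowercase `"int64"` is the ordinary NumPy dtype and still cannot hold missing values.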





  • works. Thanks a lot for your help.
    – Sun
    Jan 10 '17 at 10:26











edited Jan 9 '17 at 15:09
answered Jan 9 '17 at 14:59 by jezrael











