Add column based on different conditions for different columns | python pandas
I have a dataframe with 4 columns:
c1        c2        c3      GName
0.221445  0.300534   5.689  KDD
0.001000  0.969000  15.140  ACC
1.000000  0.094000  -0.245  QETF
And a dataframe called file with one column:
GName
Abd
kkoew
KDD
pwqh
ACC
dsewf
I need to add a new column called label, based on checking the scores in c1, c2 and c3 and the value in GName.
If the majority of the 3 scores meet their conditions (2 out of 3, or all 3) and the value of GName exists in the dataframe file, then label = 1; otherwise label = 0.
The conditions are:
c1 should be > 0.95
c2 should be > 0.50
c3 should be > 15
The output should look like this:
c1        c2        c3      GName  label
0.221445  0.300534   5.689  KDD    0  (because 0 out of 3, although KDD is in file)
0.001000  0.969000  15.140  ACC    1  (because 2 out of 3 and ACC is in file)
1.000000  0.94060   -0.245  QETF   0  (because 2 out of 3 but QETF is not in file)
I'm struggling with these different conditions. Any help, please?
python pandas dataframe
asked Nov 20 at 23:45
Sara Wasl
817
1 Answer
The way I would do it is this:
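import pandas as pd

# Sample data from the question
df = pd.DataFrame({'c1': [0.221445, 0.001000, 1.000000],
                   'c2': [0.300534, 0.969000, 0.094000],
                   'c3': [5.689, 15.140, -0.245],
                   'GName': ['KDD', 'ACC', 'QETF']})
# Lookup frame (only the names relevant to the sample rows)
file = pd.DataFrame({'GName': ['KDD', 'ACC']})

# Count how many of the three score conditions each row satisfies
votes = (df['c1'] > 0.95).astype(int) + (df['c2'] > 0.5).astype(int) + (df['c3'] > 15).astype(int)
# A row qualifies when at least 2 conditions hold and its GName appears in file
mask = (votes >= 2) & (df['GName'].isin(file['GName']))

# Default to 0, then set 1 for the qualifying rows
df['label'] = 0
df.loc[mask, 'label'] = 1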
>>> df
         c1        c2      c3 GName  label
0  0.221445  0.300534   5.689   KDD      0
1  0.001000  0.969000  15.140   ACC      1
2  1.000000  0.094000  -0.245  QETF      0
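To see why each row ends up with its label, it can help to print the intermediate checks. The following is only a sketch for inspection; the names `checks` and `breakdown` and the `*_ok` column names are made up for illustration, and the data is the same sample as above.

```python
import pandas as pd

df = pd.DataFrame({'c1': [0.221445, 0.001000, 1.000000],
                   'c2': [0.300534, 0.969000, 0.094000],
                   'c3': [5.689, 15.140, -0.245],
                   'GName': ['KDD', 'ACC', 'QETF']})
file = pd.DataFrame({'GName': ['KDD', 'ACC']})

# One boolean column per condition; True counts as 1 when summed
checks = pd.DataFrame({
    'c1_ok': df['c1'] > 0.95,
    'c2_ok': df['c2'] > 0.5,
    'c3_ok': df['c3'] > 15,
})

# Add the number of satisfied conditions and the membership test
breakdown = checks.assign(
    votes=checks.sum(axis=1),
    in_file=df['GName'].isin(file['GName']),
)
print(breakdown)
```

With c2 = 0.094 for QETF, as in the question's input table, that row gets 1 vote rather than the 2 mentioned in the expected output (which shows 0.94060 instead). Its label is 0 either way, because QETF is not in file.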
It would be nice if you could include code to generate your dataframe in your question, as well.
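If the thresholds are likely to change, one possible variation is to keep them in a dict and wrap the logic in a small function. This is a sketch, not a replacement for the code above: the function name `add_majority_label` and its parameters are hypothetical, and it assumes every key in `thresholds` is a numeric column of `df`.

```python
import pandas as pd


def add_majority_label(df, allowed_names, thresholds, min_votes=2):
    """Return a copy of df with a 0/1 'label' column (hypothetical helper)."""
    # Compare each score column with its own threshold;
    # the Series index is aligned with the column names
    passed = df[list(thresholds)].gt(pd.Series(thresholds), axis='columns')
    # Number of conditions satisfied per row
    votes = passed.sum(axis=1)
    # True where the row's GName appears in the allowed names
    in_file = df['GName'].isin(allowed_names)
    out = df.copy()
    out['label'] = ((votes >= min_votes) & in_file).astype(int)
    return out


df = pd.DataFrame({'c1': [0.221445, 0.001000, 1.000000],
                   'c2': [0.300534, 0.969000, 0.094000],
                   'c3': [5.689, 15.140, -0.245],
                   'GName': ['KDD', 'ACC', 'QETF']})
file = pd.DataFrame({'GName': ['Abd', 'kkoew', 'KDD', 'pwqh', 'ACC', 'dsewf']})

labeled = add_majority_label(df, file['GName'],
                             {'c1': 0.95, 'c2': 0.50, 'c3': 15})
print(labeled)
```

Returning a copy leaves the original `df` untouched, which may or may not be what you want; assigning to `df['label']` directly, as above, works just as well.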
answered Nov 21 at 0:12
CJ59
1,2171214