Drop groups by number of occurrences
Hi, I want to delete the rows whose value in a given column occurs fewer than a certain number of times. For example:
import pandas as pd
df = pd.DataFrame({'a': [1,2,3,2], 'b':[4,5,6,7], 'c':[0,1,3,2]})
df
   a  b  c
0  1  4  0
1  2  5  1
2  3  6  3
3  2  7  2
Here I want to delete every row whose value in column 'a' occurs fewer than two times.
Wanted output:
   a  b  c
1  2  5  1
3  2  7  2
What I know:
I can flag the rare values with condition = df['a'].value_counts() < 2, which gives something like:
2    False
3     True
1     True
Name: a, dtype: bool
But I don't know how to get from here to actually deleting the rows.
Thanks in advance!
python pandas dataframe counter pandas-groupby
asked Nov 25 '18 at 20:06 by Louise Fan
edited Nov 25 '18 at 20:43 by jpp
3 Answers
groupby + size
res = df[df.groupby('a')['b'].transform('size') >= 2]
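transform('size') computes the size of each group, as df.groupby('a')['b'].size() would, and broadcasts it back to every row, so the result is aligned with df's index and can be compared and used directly as a boolean mask.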
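To make the difference between size() and transform('size') concrete, here is a minimal sketch on the question's DataFrame. The variable names group_sizes and row_sizes are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 2], 'b': [4, 5, 6, 7], 'c': [0, 1, 3, 2]})

# One entry per distinct value of 'a', indexed by the group key.
group_sizes = df.groupby('a')['b'].size()

# One entry per original row, indexed like df, holding that row's group size.
row_sizes = df.groupby('a')['b'].transform('size')

print(group_sizes.to_dict())  # {1: 1, 2: 2, 3: 1}
print(row_sizes.tolist())     # [1, 2, 1, 2]

# Because row_sizes shares df's index, the comparison is a valid row mask.
res = df[row_sizes >= 2]
print(res)
```

Only the second form can be used to index df directly, since the first has one entry per group rather than one per row.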
value_counts + map
s = df['a'].value_counts()
res = df[df['a'].map(s) >= 2]
print(res)
   a  b  c
1  2  5  1
3  2  7  2
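If the threshold or the column varies, the value_counts + map idea can be wrapped in a small helper. This is only a sketch: the function name keep_frequent and its parameters are hypothetical, not part of pandas.

```python
import pandas as pd

def keep_frequent(df, column, min_count):
    """Return the rows of df whose value in `column` occurs at least min_count times."""
    counts = df[column].value_counts()   # distinct value -> number of occurrences
    per_row = df[column].map(counts)     # look up that count for every row
    return df[per_row >= min_count]

df = pd.DataFrame({'a': [1, 2, 3, 2], 'b': [4, 5, 6, 7], 'c': [0, 1, 3, 2]})
res = keep_frequent(df, 'a', 2)
print(res)
```

Since map looks the count up by value, the mask is indexed like df and does not depend on what the row labels are.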
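You can use df.where followed by dropna. The mask has to be aligned with the rows of df, so map the counts back onto column 'a' first. On its own, df['a'].value_counts() < 2 is indexed by the distinct values of 'a' rather than by row labels, and it flags the rare values rather than the frequent ones, so it selects the right rows here only because the labels happen to line up.
df.where(df['a'].map(df['a'].value_counts()) >= 2).dropna()
     a    b    c
1  2.0  5.0  1.0
3  2.0  7.0  2.0
The values come back as floats because where inserts NaN into the rejected rows before dropna removes them.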
You could get the size of each group, transform it back onto the original index, and index the DataFrame with the result:
df[df.groupby("a").transform(len)["b"] >= 2]
   a  b  c
1  2  5  1
3  2  7  2
Breaking it into individual steps, you get:
df.groupby("a").transform(len)["b"]
0    1
1    2
2    1
3    2
Name: b, dtype: int64
These are the group sizes transformed back onto your original index.
df.groupby("a").transform(len)["b"] >= 2
0    False
1     True
2    False
3     True
Name: b, dtype: bool
This comparison gives a boolean mask, which we then use to index the original DataFrame.
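For comparison, the same selection can also be written with groupby(...).filter, a different technique that none of the answers here use. A minimal sketch, assuming the question's DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 2], 'b': [4, 5, 6, 7], 'c': [0, 1, 3, 2]})

# filter() calls the function once per group and keeps every row of the
# groups for which it returns True; here, groups with at least two rows.
res = df.groupby('a').filter(lambda group: len(group) >= 2)
print(res)
```

This reads closer to the wording of the question, but it calls a Python function once per group, so the mask-based versions may be preferable when there are many distinct values.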
answered Nov 25 '18 at 20:26 by jpp (groupby + size and value_counts + map answer)
answered Nov 25 '18 at 20:14 by Khalil Al Hooti (df.where and dropna answer)
answered Nov 25 '18 at 20:15, edited Nov 25 '18 at 20:20, by Sven Harris (groupby + transform(len) answer)