Pandas sorted ranking of column A sorted by column B

Currently I have the following python code

forumposts = pd.DataFrame({'UserId': [1,1,2,3,2,1,3], 'FirstPostDate': [2018,2018,2017,2019,2017,2018,2019], 'PostDate': [201801,201802,201701,201901,201801,201803,201902]})



data = forumposts.groupby(['UserId', 'PostDate','FirstPostDate']).size().reset_index()



rankedUserIdByFirstPostDate = data.groupby(['UserId', 'FirstPostDate']).size().reset_index().sort_values('FirstPostDate').reset_index(drop=True).reset_index()



data.loc[:,'Rank'] = data.merge(rankedUserIdByFirstPostDate , how='left', on='UserId')['index'].values

The code works as intended but its complicated is there a more pandas like way of doing this? The intent is the following:

Create a dense rank over the UserId column sorted by the FirstPostDate such that the user with the earliest posting gets rank 0 and the user with the second earliest first post gets rank 1 and so on.

Using forumposts.UserId.rank(method='dense') gives me a ranking but its sorted by the order of the UserId.

edited Nov 22 '18 at 8:12

asked Nov 22 '18 at 8:05

CodeMonkey

1,31211628

Can you add some sample data?

– jezrael
Nov 22 '18 at 8:06

@jezrael Done :) Output should rank user 2 first, user 1 second and user 3 third.

– CodeMonkey
Nov 22 '18 at 8:13

add a comment |

Currently I have the following python code

forumposts = pd.DataFrame({'UserId': [1,1,2,3,2,1,3], 'FirstPostDate': [2018,2018,2017,2019,2017,2018,2019], 'PostDate': [201801,201802,201701,201901,201801,201803,201902]})



data = forumposts.groupby(['UserId', 'PostDate','FirstPostDate']).size().reset_index()



rankedUserIdByFirstPostDate = data.groupby(['UserId', 'FirstPostDate']).size().reset_index().sort_values('FirstPostDate').reset_index(drop=True).reset_index()



data.loc[:,'Rank'] = data.merge(rankedUserIdByFirstPostDate , how='left', on='UserId')['index'].values

The code works as intended but its complicated is there a more pandas like way of doing this? The intent is the following:

Using forumposts.UserId.rank(method='dense') gives me a ranking but its sorted by the order of the UserId.

edited Nov 22 '18 at 8:12

asked Nov 22 '18 at 8:05

CodeMonkey

1,31211628

Can you add some sample data?

– jezrael
Nov 22 '18 at 8:06

@jezrael Done :) Output should rank user 2 first, user 1 second and user 3 third.

– CodeMonkey
Nov 22 '18 at 8:13

add a comment |

Currently I have the following python code

forumposts = pd.DataFrame({'UserId': [1,1,2,3,2,1,3], 'FirstPostDate': [2018,2018,2017,2019,2017,2018,2019], 'PostDate': [201801,201802,201701,201901,201801,201803,201902]})



data = forumposts.groupby(['UserId', 'PostDate','FirstPostDate']).size().reset_index()



rankedUserIdByFirstPostDate = data.groupby(['UserId', 'FirstPostDate']).size().reset_index().sort_values('FirstPostDate').reset_index(drop=True).reset_index()



data.loc[:,'Rank'] = data.merge(rankedUserIdByFirstPostDate , how='left', on='UserId')['index'].values

The code works as intended but its complicated is there a more pandas like way of doing this? The intent is the following:

Using forumposts.UserId.rank(method='dense') gives me a ranking but its sorted by the order of the UserId.

edited Nov 22 '18 at 8:12

asked Nov 22 '18 at 8:05

CodeMonkey

1,31211628

Currently I have the following python code

forumposts = pd.DataFrame({'UserId': [1,1,2,3,2,1,3], 'FirstPostDate': [2018,2018,2017,2019,2017,2018,2019], 'PostDate': [201801,201802,201701,201901,201801,201803,201902]})



data = forumposts.groupby(['UserId', 'PostDate','FirstPostDate']).size().reset_index()



rankedUserIdByFirstPostDate = data.groupby(['UserId', 'FirstPostDate']).size().reset_index().sort_values('FirstPostDate').reset_index(drop=True).reset_index()



data.loc[:,'Rank'] = data.merge(rankedUserIdByFirstPostDate , how='left', on='UserId')['index'].values

The code works as intended but its complicated is there a more pandas like way of doing this? The intent is the following:

Using forumposts.UserId.rank(method='dense') gives me a ranking but its sorted by the order of the UserId.

python pandas pandas-groupby

edited Nov 22 '18 at 8:12

asked Nov 22 '18 at 8:05

CodeMonkey

1,31211628

edited Nov 22 '18 at 8:12

asked Nov 22 '18 at 8:05

CodeMonkey

1,31211628

edited Nov 22 '18 at 8:12

asked Nov 22 '18 at 8:05

CodeMonkey

1,31211628

asked Nov 22 '18 at 8:05

CodeMonkey

1,31211628

asked Nov 22 '18 at 8:05

CodeMonkey

1,31211628

Can you add some sample data?

– jezrael
Nov 22 '18 at 8:06

@jezrael Done :) Output should rank user 2 first, user 1 second and user 3 third.

– CodeMonkey
Nov 22 '18 at 8:13

add a comment |

Can you add some sample data?

– jezrael
Nov 22 '18 at 8:06

@jezrael Done :) Output should rank user 2 first, user 1 second and user 3 third.

– CodeMonkey
Nov 22 '18 at 8:13

Can you add some sample data?

– jezrael
Nov 22 '18 at 8:06

@jezrael Done :) Output should rank user 2 first, user 1 second and user 3 third.

– CodeMonkey
Nov 22 '18 at 8:13

add a comment |

1 Answer
1

active

oldest

votes

Use map by dictionary created by sort_values with drop_duplicates for order zipped with np.arange:

data = (forumposts.groupby(['UserId', 'PostDate','FirstPostDate'])

                  .size()

                  .reset_index(name='count'))



users = data.sort_values('FirstPostDate').drop_duplicates('UserId')['UserId']

d = dict(zip(users, np.arange(len(users))))

data['Rank'] = data['UserId'].map(d)

print (data)

   UserId  PostDate  FirstPostDate  count  Rank

0       1    201801           2018      1     1

1       1    201802           2018      1     1

2       1    201803           2018      1     1

3       2    201701           2017      1     0

4       2    201801           2017      1     0

5       3    201901           2019      1     2

6       3    201902           2019      1     2

Another solution:

data['Rank'] = (data.groupby('UserId')['FirstPostDate']

                   .transform('min')

                   .rank(method='dense')

                   .sub(1)

                   .astype(int))

edited Nov 22 '18 at 8:30

answered Nov 22 '18 at 8:21

jezrael

327k23270348

1

Great, learned something new, clever use of dict/zip and map!

– CodeMonkey
Nov 22 '18 at 8:43

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53426350%2fpandas-sorted-ranking-of-column-a-sorted-by-column-b%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

Use map by dictionary created by sort_values with drop_duplicates for order zipped with np.arange:

data = (forumposts.groupby(['UserId', 'PostDate','FirstPostDate'])

                  .size()

                  .reset_index(name='count'))



users = data.sort_values('FirstPostDate').drop_duplicates('UserId')['UserId']

d = dict(zip(users, np.arange(len(users))))

data['Rank'] = data['UserId'].map(d)

print (data)

   UserId  PostDate  FirstPostDate  count  Rank

0       1    201801           2018      1     1

1       1    201802           2018      1     1

2       1    201803           2018      1     1

3       2    201701           2017      1     0

4       2    201801           2017      1     0

5       3    201901           2019      1     2

6       3    201902           2019      1     2

Another solution:

data['Rank'] = (data.groupby('UserId')['FirstPostDate']

                   .transform('min')

                   .rank(method='dense')

                   .sub(1)

                   .astype(int))

edited Nov 22 '18 at 8:30

answered Nov 22 '18 at 8:21

jezrael

327k23270348

1

Great, learned something new, clever use of dict/zip and map!

– CodeMonkey
Nov 22 '18 at 8:43

add a comment |

Use map by dictionary created by sort_values with drop_duplicates for order zipped with np.arange:

data = (forumposts.groupby(['UserId', 'PostDate','FirstPostDate'])

                  .size()

                  .reset_index(name='count'))



users = data.sort_values('FirstPostDate').drop_duplicates('UserId')['UserId']

d = dict(zip(users, np.arange(len(users))))

data['Rank'] = data['UserId'].map(d)

print (data)

   UserId  PostDate  FirstPostDate  count  Rank

0       1    201801           2018      1     1

1       1    201802           2018      1     1

2       1    201803           2018      1     1

3       2    201701           2017      1     0

4       2    201801           2017      1     0

5       3    201901           2019      1     2

6       3    201902           2019      1     2

Another solution:

data['Rank'] = (data.groupby('UserId')['FirstPostDate']

                   .transform('min')

                   .rank(method='dense')

                   .sub(1)

                   .astype(int))

edited Nov 22 '18 at 8:30

answered Nov 22 '18 at 8:21

jezrael

327k23270348

1

Great, learned something new, clever use of dict/zip and map!

– CodeMonkey
Nov 22 '18 at 8:43

add a comment |

Use map by dictionary created by sort_values with drop_duplicates for order zipped with np.arange:

data = (forumposts.groupby(['UserId', 'PostDate','FirstPostDate'])

                  .size()

                  .reset_index(name='count'))



users = data.sort_values('FirstPostDate').drop_duplicates('UserId')['UserId']

d = dict(zip(users, np.arange(len(users))))

data['Rank'] = data['UserId'].map(d)

print (data)

   UserId  PostDate  FirstPostDate  count  Rank

0       1    201801           2018      1     1

1       1    201802           2018      1     1

2       1    201803           2018      1     1

3       2    201701           2017      1     0

4       2    201801           2017      1     0

5       3    201901           2019      1     2

6       3    201902           2019      1     2

Another solution:

data['Rank'] = (data.groupby('UserId')['FirstPostDate']

                   .transform('min')

                   .rank(method='dense')

                   .sub(1)

                   .astype(int))

edited Nov 22 '18 at 8:30

answered Nov 22 '18 at 8:21

jezrael

327k23270348

Use map by dictionary created by sort_values with drop_duplicates for order zipped with np.arange:

data = (forumposts.groupby(['UserId', 'PostDate','FirstPostDate'])

                  .size()

                  .reset_index(name='count'))



users = data.sort_values('FirstPostDate').drop_duplicates('UserId')['UserId']

d = dict(zip(users, np.arange(len(users))))

data['Rank'] = data['UserId'].map(d)

print (data)

   UserId  PostDate  FirstPostDate  count  Rank

0       1    201801           2018      1     1

1       1    201802           2018      1     1

2       1    201803           2018      1     1

3       2    201701           2017      1     0

4       2    201801           2017      1     0

5       3    201901           2019      1     2

6       3    201902           2019      1     2

Another solution:

data['Rank'] = (data.groupby('UserId')['FirstPostDate']

                   .transform('min')

                   .rank(method='dense')

                   .sub(1)

                   .astype(int))

edited Nov 22 '18 at 8:30

answered Nov 22 '18 at 8:21

jezrael

327k23270348

edited Nov 22 '18 at 8:30

answered Nov 22 '18 at 8:21

jezrael

327k23270348

answered Nov 22 '18 at 8:21

jezrael

327k23270348

answered Nov 22 '18 at 8:21

jezrael

327k23270348

1

Great, learned something new, clever use of dict/zip and map!

– CodeMonkey
Nov 22 '18 at 8:43

add a comment |

1

Great, learned something new, clever use of dict/zip and map!

– CodeMonkey
Nov 22 '18 at 8:43

Great, learned something new, clever use of dict/zip and map!

– CodeMonkey
Nov 22 '18 at 8:43

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Ytukyg