python pandas: mean scores per hour per workday

.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty{ height:90px;width:728px;box-sizing:border-box;
}

I have a database that has votes in it. These vote records consist of:
"Timestamp ; score"

The score is an integer.

I want to create a heatmap, so i want to have a dataframe with values for each hour in every workday with a mean score of all scores in that timeframe.

If there are no values in the hour of that workday, set the mean to 0.

Thusfar i've come to this:

gdf = pd.read_sql("select * from scores where survey_id='{}';    ".format(survey_id), self.db_conn)

gdf = gdf.set_index(['time_stamp'])

gdf.index = pd.to_datetime(gdf.index, unit='s')

if len(gdf) == 0:

  return None

gdf['weekday'] = gdf.index.weekday

# gdf['hour'] = gdf.index.hour

gdf = gdf.groupby(by=[gdf['weekday'], pd.Grouper(freq='H')]).agg(['mean']).fillna(0)

The result of this is:

                                score weekday hour

                             mean    mean mean

weekday time_stamp                                

0       2018-10-22 17:00:00  1.600000       0   17

1       2018-10-23 09:00:00  2.666667       1    9

2       2018-10-31 14:00:00  3.000000       2   14

        2018-10-31 19:00:00  4.000000       2   19

This misses all the other hours of the week with the 0 value as mean.

Any suggestions to what i'm doing wrong?

Thanks !! :)

asked Nov 26 '18 at 13:44

socialb

233

1

I think I'd do mostly what you've done (breaking the time_stamp out to days/hours, then grouping and getting the mean). Then I'd generate a dataframe with every combination of days/hours and left join my mean data onto it, so that all the missed rows are NA and I can fill them with .fillna(0).

– CJR
Nov 26 '18 at 13:59

add a comment |

I have a database that has votes in it. These vote records consist of:
"Timestamp ; score"

The score is an integer.

I want to create a heatmap, so i want to have a dataframe with values for each hour in every workday with a mean score of all scores in that timeframe.

If there are no values in the hour of that workday, set the mean to 0.

Thusfar i've come to this:

gdf = pd.read_sql("select * from scores where survey_id='{}';    ".format(survey_id), self.db_conn)

gdf = gdf.set_index(['time_stamp'])

gdf.index = pd.to_datetime(gdf.index, unit='s')

if len(gdf) == 0:

  return None

gdf['weekday'] = gdf.index.weekday

# gdf['hour'] = gdf.index.hour

gdf = gdf.groupby(by=[gdf['weekday'], pd.Grouper(freq='H')]).agg(['mean']).fillna(0)

The result of this is:

                                score weekday hour

                             mean    mean mean

weekday time_stamp                                

0       2018-10-22 17:00:00  1.600000       0   17

1       2018-10-23 09:00:00  2.666667       1    9

2       2018-10-31 14:00:00  3.000000       2   14

        2018-10-31 19:00:00  4.000000       2   19

This misses all the other hours of the week with the 0 value as mean.

Any suggestions to what i'm doing wrong?

Thanks !! :)

asked Nov 26 '18 at 13:44

socialb

233

1

I think I'd do mostly what you've done (breaking the time_stamp out to days/hours, then grouping and getting the mean). Then I'd generate a dataframe with every combination of days/hours and left join my mean data onto it, so that all the missed rows are NA and I can fill them with .fillna(0).

– CJR
Nov 26 '18 at 13:59

add a comment |

I have a database that has votes in it. These vote records consist of:
"Timestamp ; score"

The score is an integer.

I want to create a heatmap, so i want to have a dataframe with values for each hour in every workday with a mean score of all scores in that timeframe.

If there are no values in the hour of that workday, set the mean to 0.

Thusfar i've come to this:

gdf = pd.read_sql("select * from scores where survey_id='{}';    ".format(survey_id), self.db_conn)

gdf = gdf.set_index(['time_stamp'])

gdf.index = pd.to_datetime(gdf.index, unit='s')

if len(gdf) == 0:

  return None

gdf['weekday'] = gdf.index.weekday

# gdf['hour'] = gdf.index.hour

gdf = gdf.groupby(by=[gdf['weekday'], pd.Grouper(freq='H')]).agg(['mean']).fillna(0)

The result of this is:

                                score weekday hour

                             mean    mean mean

weekday time_stamp                                

0       2018-10-22 17:00:00  1.600000       0   17

1       2018-10-23 09:00:00  2.666667       1    9

2       2018-10-31 14:00:00  3.000000       2   14

        2018-10-31 19:00:00  4.000000       2   19

This misses all the other hours of the week with the 0 value as mean.

Any suggestions to what i'm doing wrong?

Thanks !! :)

asked Nov 26 '18 at 13:44

socialb

233

I have a database that has votes in it. These vote records consist of:
"Timestamp ; score"

The score is an integer.

I want to create a heatmap, so i want to have a dataframe with values for each hour in every workday with a mean score of all scores in that timeframe.

If there are no values in the hour of that workday, set the mean to 0.

Thusfar i've come to this:

gdf = pd.read_sql("select * from scores where survey_id='{}';    ".format(survey_id), self.db_conn)

gdf = gdf.set_index(['time_stamp'])

gdf.index = pd.to_datetime(gdf.index, unit='s')

if len(gdf) == 0:

  return None

gdf['weekday'] = gdf.index.weekday

# gdf['hour'] = gdf.index.hour

gdf = gdf.groupby(by=[gdf['weekday'], pd.Grouper(freq='H')]).agg(['mean']).fillna(0)

The result of this is:

                                score weekday hour

                             mean    mean mean

weekday time_stamp                                

0       2018-10-22 17:00:00  1.600000       0   17

1       2018-10-23 09:00:00  2.666667       1    9

2       2018-10-31 14:00:00  3.000000       2   14

        2018-10-31 19:00:00  4.000000       2   19

This misses all the other hours of the week with the 0 value as mean.

Any suggestions to what i'm doing wrong?

Thanks !! :)

python pandas aggregate

asked Nov 26 '18 at 13:44

socialb

233

asked Nov 26 '18 at 13:44

socialb

233

asked Nov 26 '18 at 13:44

socialb

233

asked Nov 26 '18 at 13:44

socialb

233

asked Nov 26 '18 at 13:44

socialb

233

1

I think I'd do mostly what you've done (breaking the time_stamp out to days/hours, then grouping and getting the mean). Then I'd generate a dataframe with every combination of days/hours and left join my mean data onto it, so that all the missed rows are NA and I can fill them with .fillna(0).

– CJR
Nov 26 '18 at 13:59

add a comment |

1

I think I'd do mostly what you've done (breaking the time_stamp out to days/hours, then grouping and getting the mean). Then I'd generate a dataframe with every combination of days/hours and left join my mean data onto it, so that all the missed rows are NA and I can fill them with .fillna(0).

– CJR
Nov 26 '18 at 13:59

I think I'd do mostly what you've done (breaking the time_stamp out to days/hours, then grouping and getting the mean). Then I'd generate a dataframe with every combination of days/hours and left join my mean data onto it, so that all the missed rows are NA and I can fill them with .fillna(0).

– CJR
Nov 26 '18 at 13:59

add a comment |

1 Answer
1

active

oldest

votes

i got it:

This worked, don't know if it could have been shorter but this did the job:

make new dataframe with 0 values for each hour of each workday.

appended the values from the database

    todays_date = datetime.datetime.now().date()

    index = pd.date_range(todays_date - datetime.timedelta(7), periods=7*24, freq='H')

    columns = ['user', 'survey_id', 'score']

    df_ = pd.DataFrame(index=index, columns=columns)

    df_ = df_.fillna(0)  # with 0s rather than NaNs

    gdf = pd.read_sql("select * from scores where survey_id='{}'; ".format(survey_id), self.db_conn)

    gdf = gdf.set_index(['time_stamp'])

    gdf.index = pd.to_datetime(gdf.index, unit='s')

    df_ = df_.append(gdf, ignore_index=False)  # ignoring index is optional



    if len(gdf) == 0:

        return None

    df_['weekday'] = df_.index.weekday

    df_['hour'] = df_.index.hour

    df_ = df_.groupby(by=[df_['weekday'], df_['hour']]).agg(['mean']).fillna(0)

answered Nov 27 '18 at 12:30

socialb

233

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "1"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53482450%2fpython-pandas-mean-scores-per-hour-per-workday%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

i got it:

This worked, don't know if it could have been shorter but this did the job:

make new dataframe with 0 values for each hour of each workday.

appended the values from the database

    todays_date = datetime.datetime.now().date()

    index = pd.date_range(todays_date - datetime.timedelta(7), periods=7*24, freq='H')

    columns = ['user', 'survey_id', 'score']

    df_ = pd.DataFrame(index=index, columns=columns)

    df_ = df_.fillna(0)  # with 0s rather than NaNs

    gdf = pd.read_sql("select * from scores where survey_id='{}'; ".format(survey_id), self.db_conn)

    gdf = gdf.set_index(['time_stamp'])

    gdf.index = pd.to_datetime(gdf.index, unit='s')

    df_ = df_.append(gdf, ignore_index=False)  # ignoring index is optional



    if len(gdf) == 0:

        return None

    df_['weekday'] = df_.index.weekday

    df_['hour'] = df_.index.hour

    df_ = df_.groupby(by=[df_['weekday'], df_['hour']]).agg(['mean']).fillna(0)

answered Nov 27 '18 at 12:30

socialb

233

add a comment |

i got it:

This worked, don't know if it could have been shorter but this did the job:

make new dataframe with 0 values for each hour of each workday.

appended the values from the database

    todays_date = datetime.datetime.now().date()

    index = pd.date_range(todays_date - datetime.timedelta(7), periods=7*24, freq='H')

    columns = ['user', 'survey_id', 'score']

    df_ = pd.DataFrame(index=index, columns=columns)

    df_ = df_.fillna(0)  # with 0s rather than NaNs

    gdf = pd.read_sql("select * from scores where survey_id='{}'; ".format(survey_id), self.db_conn)

    gdf = gdf.set_index(['time_stamp'])

    gdf.index = pd.to_datetime(gdf.index, unit='s')

    df_ = df_.append(gdf, ignore_index=False)  # ignoring index is optional



    if len(gdf) == 0:

        return None

    df_['weekday'] = df_.index.weekday

    df_['hour'] = df_.index.hour

    df_ = df_.groupby(by=[df_['weekday'], df_['hour']]).agg(['mean']).fillna(0)

answered Nov 27 '18 at 12:30

socialb

233

add a comment |

i got it:

This worked, don't know if it could have been shorter but this did the job:

make new dataframe with 0 values for each hour of each workday.

appended the values from the database

    todays_date = datetime.datetime.now().date()

    index = pd.date_range(todays_date - datetime.timedelta(7), periods=7*24, freq='H')

    columns = ['user', 'survey_id', 'score']

    df_ = pd.DataFrame(index=index, columns=columns)

    df_ = df_.fillna(0)  # with 0s rather than NaNs

    gdf = pd.read_sql("select * from scores where survey_id='{}'; ".format(survey_id), self.db_conn)

    gdf = gdf.set_index(['time_stamp'])

    gdf.index = pd.to_datetime(gdf.index, unit='s')

    df_ = df_.append(gdf, ignore_index=False)  # ignoring index is optional



    if len(gdf) == 0:

        return None

    df_['weekday'] = df_.index.weekday

    df_['hour'] = df_.index.hour

    df_ = df_.groupby(by=[df_['weekday'], df_['hour']]).agg(['mean']).fillna(0)

answered Nov 27 '18 at 12:30

socialb

233

i got it:

This worked, don't know if it could have been shorter but this did the job:

make new dataframe with 0 values for each hour of each workday.

appended the values from the database

    todays_date = datetime.datetime.now().date()

    index = pd.date_range(todays_date - datetime.timedelta(7), periods=7*24, freq='H')

    columns = ['user', 'survey_id', 'score']

    df_ = pd.DataFrame(index=index, columns=columns)

    df_ = df_.fillna(0)  # with 0s rather than NaNs

    gdf = pd.read_sql("select * from scores where survey_id='{}'; ".format(survey_id), self.db_conn)

    gdf = gdf.set_index(['time_stamp'])

    gdf.index = pd.to_datetime(gdf.index, unit='s')

    df_ = df_.append(gdf, ignore_index=False)  # ignoring index is optional



    if len(gdf) == 0:

        return None

    df_['weekday'] = df_.index.weekday

    df_['hour'] = df_.index.hour

    df_ = df_.groupby(by=[df_['weekday'], df_['hour']]).agg(['mean']).fillna(0)

answered Nov 27 '18 at 12:30

socialb

233

answered Nov 27 '18 at 12:30

socialb

233

answered Nov 27 '18 at 12:30

socialb

233

answered Nov 27 '18 at 12:30

socialb

233

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Stack Overflow!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Ytukyg