Increase sampling rate on time-series data with Pandas
I have accelerometer data with a variable sampling rate. I am trying to increase it to a constant 50 Hz sampling rate through interpolation. The problem with the timestamps is that they don't have milliseconds.
How do I do this without losing the data I already have?
python pandas
asked Nov 19 at 16:01
subhash
1 Answer
You can first set the index to your datetime column using df.set_index('timestamp') and then use df.resample(). The offset alias to pass to resample for milliseconds is 'L' (or 'ms'); you can read more in the pandas offset-alias documentation. resample also lets you choose among a number of interpolation methods.
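A minimal sketch of this approach, assuming a DataFrame with a 'timestamp' column and a made-up 'accel_x' reading column (all data here is hypothetical):

```python
import pandas as pd

# Hypothetical accelerometer data: one reading per second, no milliseconds.
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2018-11-19 16:00:00", "2018-11-19 16:00:01", "2018-11-19 16:00:02"]
        ),
        "accel_x": [0.10, 0.30, 0.20],
    }
)

# Index by the datetime column, upsample to 20 ms (i.e. 50 Hz),
# and fill the newly created rows by linear interpolation.
upsampled = (
    df.set_index("timestamp")
      .resample("20ms")                 # 'L' is the legacy millisecond alias
      .interpolate(method="linear")
)
print(upsampled.head())
```

Note that this only works cleanly when the timestamps are unique.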
I tried df.resample('20ms', on='timestamp'). This only takes the first occurrence's value and increases the sampling rate, but all the rest of the data for that particular second is lost; it's just populated as NaN.
– subhash
Nov 19 at 16:10
I see. That's tricky because, to pandas, it's just a bunch of values with the same timestamp, and it doesn't know what to do with them. Moreover, it doesn't even seem that there are always the same number of repeated timestamps. It seems you may have to roll something manually: basically go second by second, assume all values within a given second are spaced evenly across that second, and interpolate using the mean of neighbors or something.
– rvd
Nov 19 at 16:14
Another possible way is to go second by second and change the timestamps so that they are spaced evenly according to how many times a given second repeats, and then use resample to fill in everything else. This way pandas will still do most of the work.
– rvd
Nov 19 at 16:15
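That second suggestion (spreading the repeated timestamps evenly within each second, then resampling) could be sketched as follows; the data, column names, and group sizes are all hypothetical:

```python
import pandas as pd

# Hypothetical example: the logger wrote whole-second timestamps, so several
# readings share the same timestamp (5 in the first second, 2 in the next).
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2018-11-19 16:00:00"] * 5 + ["2018-11-19 16:00:01"] * 2
        ),
        "accel_x": [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70],
    }
)

# Shift the k-th of n readings in a second by an offset of k/n seconds.
# Offsets are computed in integer milliseconds to avoid float rounding.
grp = df.groupby("timestamp")
k = grp.cumcount()                    # position of the reading within its second
n = grp["accel_x"].transform("size")  # number of readings in that second
df["timestamp"] = df["timestamp"] + pd.to_timedelta(k * 1000 // n, unit="ms")

# Timestamps are now unique, so resampling to 50 Hz and interpolating works.
out = (
    df.set_index("timestamp")
      .resample("20ms")
      .interpolate(method="linear")
)
print(out.head())
```

The integer-millisecond offsets land exactly on the 20 ms grid here; with group sizes that don't divide 1000 evenly, the offsets are approximate but the idea is the same.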
answered Nov 19 at 16:07
rvd