How to find the count of consecutive same string values in a pandas dataframe?
Assume that we have the following pandas dataframe:
df = pd.DataFrame({'col1': ['A>G','C>T','C>T','G>T','C>T','A>G','A>G','A>G'],
                   'col2': ['TCT','ACA','TCA','TCA','GCT','ACT','CTG','ATG'],
                   'start': [1000,2000,3000,4000,5000,6000,10000,20000]})
input:
col1 col2 start
0 A>G TCT 1000
1 C>T ACA 2000
2 C>T TCA 3000
3 G>T TCA 4000
4 C>T GCT 5000
5 A>G ACT 6000
6 A>G CTG 10000
7 A>G ATG 20000
8 C>A TCT 10000
9 C>T ACA 2000
10 C>T TCA 3000
11 C>T TCA 4000
What I want to get is each run of consecutive identical values in col1, the length of that run, and the difference between the start of the run's last element and the start of its first element (for example, the A>G run at rows 5-7 has length 3 and diff = 20000 - 6000 = 14000):
output:
type length diff
0 C>T 2 1000
1 A>G 3 14000
2 C>T 3 2000
python dataframe
asked Nov 19 at 21:56 by burcak
The data frame defined in df = ... is missing some rows compared to the example below. – Matthias Ossadnik, Nov 19 at 22:48
4 Answers
Accepted answer (2 votes), answered Nov 19 at 22:48 by coldspeed
With a little setup, you can 100% vectorise this using GroupBy.agg:
# map each input column to (output name, aggregation) pairs
aggfunc = {
    'col1': [('type', 'first'), ('length', 'count')],
    'start': [('diff', lambda x: abs(x.iat[-1] - x.iat[0]))]
}
# every change in col1 starts a new run label
grouper = df.col1.ne(df.col1.shift()).cumsum()
v = df.assign(key=grouper).groupby('key').agg(aggfunc)
v.columns = v.columns.droplevel(0)
# runs of length 1 have diff == 0, so drop them
v[v['diff'].ne(0)].reset_index(drop=True)
type length diff
0 C>T 2 1000
1 A>G 3 14000
2 C>T 3 2000
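As an aside, here is what the run labels produced by ne/shift/cumsum look like on the question's 8-row df (an editor's illustration, not part of the original answer); every change in col1 starts a new label, so equal labels mark one run:
print(df.col1.ne(df.col1.shift()).cumsum().tolist())
# -> [1, 2, 2, 3, 4, 5, 5, 5]  (e.g. label 5 is the A>G run at rows 5-7)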
up voted. imo, this is the most concise and optimized solution. – teng, Nov 19 at 22:54
@teng Thanks, returned :) – coldspeed, Nov 19 at 22:55
Thanks. How does this aggfunc work here? Could you please explain? – burcak, Nov 20 at 18:59
@burcak The keys are columns to aggregate. The values are a list of tuples. The first element is the column name of the output column, and the second is a function (or function name as a string) that does the aggregation. – coldspeed, Nov 20 at 21:17
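As a side note, on pandas 0.25 or newer the same renaming can also be written with named aggregation; a minimal sketch of that alternative (an editor's assumption, not part of coldspeed's answer), reusing the same run labels:
key = df.col1.ne(df.col1.shift()).cumsum()
out = (df.groupby(key)
         .agg(type=('col1', 'first'),
              length=('col1', 'count'),
              diff=('start', lambda x: abs(x.iat[-1] - x.iat[0]))))
out = out[out['diff'].ne(0)].reset_index(drop=True)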
Answer (2 votes), answered Nov 19 at 22:34 by teng
Probably something like the below:
import pandas as pd
from itertools import groupby

df = pd.DataFrame({
    'col1': ['A>G','C>T','C>T','G>T','C>T','A>G','A>G','A>G','C>T','C>T','C>T'],
    'col2': ['TCT','ACA','TCA','TCA','GCT','ACT','CTG','ATG','ACA','TCA','TCA'],
    'start': [1000,2000,3000,4000,5000,6000,10000,20000,2000,3000,4000]})

final = []   # one dict per run of length > 1
pos = 0
# itertools.groupby groups consecutive equal values of col1
for k, g in groupby([row.col1 for n, row in df.iterrows()]):
    glist = [x for x in g]
    first_pos = pos
    last_pos = pos + len(glist) - 1
    if len(glist) > 1:
        print(glist)
        val = df.iloc[first_pos].col1
        first = df.iloc[first_pos].start
        last = df.iloc[last_pos].start
        final.append({'type': val, 'length': len(glist), 'diff': last - first})
    pos = last_pos + 1

final = pd.DataFrame(final)
print(final)
output:
diff length type
0 1000 2 C>T
1 14000 3 A>G
2 2000 3 C>T
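For readers unfamiliar with itertools.groupby, a minimal illustration (the sample list is the editor's, not from the answer): it only groups adjacent equal items, which is exactly what a run is here.
from itertools import groupby
sample = ['A>G', 'C>T', 'C>T', 'A>G']
print([(k, len(list(g))) for k, g in groupby(sample)])
# -> [('A>G', 1), ('C>T', 2), ('A>G', 1)]  (non-adjacent repeats stay separate)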
Answer (0 votes), answered Nov 19 at 22:38 by Matthias Ossadnik
Here is a two-step solution: first create an auxiliary column that labels consecutive occurrences of the same string, then use a standard pandas groupby:
import numpy as np

# add a group variable
values = df['col1'].values
# get locations where the value changes
change = np.zeros(values.size, dtype=bool)
change[1:] = values[:-1] != values[1:]
df['group'] = change.cumsum()  # summing change points yields the label

# do the aggregation
res = (df
       .groupby('group')
       .agg({'start': lambda x: x.max() - x.min(), 'col1': 'first', 'col2': 'size'})
       .rename(columns={'col1': 'type', 'col2': 'length', 'start': 'diff'})
      )

# filter on more than one consecutive value
res = res[res['length'] > 1]
print(res)
diff type length
group
1 1000 C>T 2
4 14000 A>G 3
5 2000 C>T 3
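To match the layout asked for in the question, a small optional follow-up (an editor's sketch using the res frame above, not part of the original answer) reorders the columns and drops the group index:
out = res[['type', 'length', 'diff']].reset_index(drop=True)
print(out)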
Answer (0 votes), answered Nov 19 at 22:38 by Eric Wang (edited Nov 19 at 22:44)
You can use pandas groupby and more_itertools:
import pandas as pd
import more_itertools as mit

def f(g):
    result = pd.DataFrame(columns=['type', 'length', 'diff'])
    tp = g['col1'].iloc[0]
    # within one col1 value, consecutive index positions form one run
    for group in mit.consecutive_groups(g.index):
        group = list(group)
        if len(group) == 1:
            continue
        cur_df = pd.DataFrame({'type': [tp],
                               'length': [len(group)],
                               'diff': g.loc[group[-1]]['start'] - g.loc[group[0]]['start']})
        result = pd.concat([result, cur_df], ignore_index=True)
    return result

df.groupby('col1').apply(f).reset_index(drop=True)
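For context, a minimal illustration of more_itertools.consecutive_groups (the input list is the editor's, not from the answer): it starts a new group wherever the integer sequence jumps, which is why gaps in g.index end a run above.
import more_itertools as mit
print([list(g) for g in mit.consecutive_groups([1, 2, 3, 7, 8])])
# -> [[1, 2, 3], [7, 8]]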