Write in a column of a csv formatted file after n lines?

I'm new to Python.
I'm having trouble working with a CSV file.
The file has 12 lines of header, after which the data starts.
I have to read some data from its columns (that part works) and then, after some processing, add a column to the same file with a value in each row. The new column must start from the 13th line, not from the first, and no index should be added as a first column.

I've tried to use the pandas library, but it doesn't work:

df = pd.read_csv("./1540476113.gt.tie")

df["package"] = pd.Series(packages)

df.to_csv("./1540476113.gt.tie", sep="\t")

where package is the name of the column (I also know its index) and packages is the array of strings (the elements that I have to write).
This code runs, but it starts adding from the first line (I don't know how to set an offset), it adds the index to the file as the first column (unwanted), and it puts a ' character before each element.
sep is the separator between columns.

Sample input data:

# TIE output version: 1.0 (text format)

# generated by: . -a ndping_1.0 -r /home/giuseppe/Scrivania/gruppo30/1540476113/traffic.pcap



# Working Mode: off-line

# Session Type: biflow

# 1 plugins enabled: ndping



# begin trace interval: 1540476116.42434



# begin TIE Table

# id    src_ip          dst_ip          proto   sport   dport   dwpkts  uppkts  dwbytes upbytes t_start                 t_last                  app_id  sub_id  app_details     confidence

17      192.168.20.105  216.58.205.42   6       50854   443     8       9       1507    1728    1540476136.698920       1540476136.879543       501     0       Google  100

26      192.168.20.105  151.101.66.202  6       40107   443     15      18      5874    1882    1540476194.196948       1540476204.641949       501     0       SSL_with_certificate    100

27      192.168.20.105  31.13.90.2      6       48133   443     10      15      4991    1598    1540476194.218949       1540476196.358946       501     0       Facebook        100

Sample output data:

# TIE output version: 1.0 (text format)

# generated by: . -a ndping_1.0 -r           /home/giuseppe/Scrivania/gruppo30/1540476113/traffic.pcap 



# Working Mode: off-line

# Session Type: biflow 

# 1 plugins enabled: ndping 



# begin trace interval: 1540476116.42434



# begin TIE Table

# id    src_ip      dst_ip      proto   sport   dport   dwpkts  uppkts  dwbytes upbytes t_start         t_last          app_id  sub_id  app_details confidence  package

17  192.168.20.105  216.58.205.42   6   50854   443 8   9   1507    1728    1540476136.698920   1540476136.879543   501 0   Google  100  N/C    

26  192.168.20.105  151.101.66.202  6   40107   443 15  18  5874    1882    1540476194.196948   1540476204.641949   501 0   SSL_with_certificate    100 com.joelapenna.foursquared

27  192.168.20.105  31.13.90.2  6   48133   443 10  15  4991    1598    1540476194.218949   1540476196.358946   501 0   Facebook    100 com.joelapenna.foursquared  

38  192.168.20.105  13.32.71.69 6   52108   443 9   12  5297    2062    1540476195.492946   1540476308.604998   501 0   SSL_with_certificate    100 com.joelapenna.foursquared

0   34.246.212.92   192.168.20.105  6   443 37981   3   2   187 98  1540476116.042434   1540476189.868844   0   0   Other TCP   0   N/C

29  192.168.20.105  13.32.123.222   6   36481   443 11  15  6638    1914    1540476194.376945   1540476308.572998   501 0   SSL_with_certificate    100 com.joelapenna.foursquared  

31  192.168.20.105  8.8.8.8 17  1219    53  1   1   253 68  1540476194.898945   1540476194.931198   501 0   DNS 100

I don't care about the alignment; the delimiter between columns is a '\t' (tab).

edited Nov 25 '18 at 8:36

asked Nov 24 '18 at 17:26

Giuseppe Ferrara

103

Please show sample input data and desired output data. See Minimal, Complete, and Verifiable example guidelines.

– Mark Tolonen
Nov 24 '18 at 19:01

Done, updated the question

– Giuseppe Ferrara
Nov 25 '18 at 14:50

python pandas csv dataframe file-io


1 Answer

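You can skip to the data by passing some arguments to read_csv. Pass sep="\t" as well, since the file is tab-delimited, and index=False when writing so that the row index is not added as a first column.

df = pd.read_csv("./1540476113.gt.tie", sep="\t", header=None, skiprows=12)

df["package"] = pd.Series(packages)

Then explicitly name your columns before writing, where col_names is a list with one name per column:

df.columns = col_names

df.to_csv("./1540476113.gt.tie", sep="\t", index=False)

If the 13th row is a header row with the column names that you want, then do not pass the header=None argument.

See the pandas read_csv documentation for more options.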
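Writing the DataFrame back with to_csv does not carry over the lines that were skipped on reading. If the header block has to survive, one possible alternative is to skip pandas for the write and treat the file as plain text lines. The sketch below is only an illustration under stated assumptions: append_column is a hypothetical helper (not a pandas function), the last header line is assumed to hold the column names, and blank lines are assumed to be spacing that should stay untouched.

```python
def append_column(lines, values, header_rows=12, name="package", sep="\t"):
    """Return a new list of lines with one extra column after the header block.

    lines       -- the file's lines, without trailing newlines
    values      -- one string per non-blank data row
    header_rows -- number of leading lines copied as-is (12 in the question)
    """
    out = list(lines[:header_rows])
    # Assumption: the last header line lists the column names,
    # so the new label is appended there.
    if out:
        out[-1] = out[-1] + sep + name
    remaining = iter(values)
    for line in lines[header_rows:]:
        if line.strip():
            # Rows beyond the end of `values` get an empty cell.
            out.append(line + sep + next(remaining, ""))
        else:
            # Blank lines are kept unchanged.
            out.append(line)
    return out


# Small in-memory demo with a 2-line header instead of 12.
sample = [
    "# TIE output version: 1.0 (text format)",
    "# id\tsrc_ip\tapp_details",
    "17\t192.168.20.105\tGoogle",
    "26\t192.168.20.105\tSSL_with_certificate",
]
result = append_column(sample, ["N/C", "com.joelapenna.foursquared"], header_rows=2)
```

For a real file, the lines could be read with open(path).read().splitlines() and the result written back with "\n".join(result) plus a final newline.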

answered Nov 24 '18 at 18:13

Austin Mackillop

36527

I've tried this: df = pd.read_csv("./1540476113.gt.tie", header=None, skiprows=11); df["package"] = pd.Series(packages); df.columns = ["package"]; df.to_csv("./1540476113.gt.tie", sep="\t"). I used skiprows=11 because in the file the 12th row is still the header, but I have to insert the column label there. However, it doesn't work and raises this error: ValueError: Length mismatch: Expected axis has 2 elements, new values have 1 elements. If I delete df.columns = [col_names], it skips the header in the new file and deletes it!

– Giuseppe Ferrara
Nov 25 '18 at 8:09

You are writing df.columns = ['package'] instead of df.columns = ['id', ......, 'package'].

– foobarna
Nov 25 '18 at 8:21
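A likely reason for the "Expected axis has 2 elements" message is that read_csv was called without sep, so each tab-separated row was read as a single column, and the added package column made two. The sketch below reproduces this on a small in-memory sample; the three column names and the two rows are shortened stand-ins for the real file, chosen only for illustration.

```python
import io

import pandas as pd

raw = "17\t192.168.20.105\tGoogle\n26\t192.168.20.105\tFacebook\n"

# The default separator is a comma, so each tab-separated row
# ends up in a single column.
one_col = pd.read_csv(io.StringIO(raw), header=None)

# With sep="\t" the three fields are split as intended.
df = pd.read_csv(io.StringIO(raw), sep="\t", header=None)
df["package"] = ["N/C", "com.joelapenna.foursquared"]

# The list of names must cover every column, including the new one.
df.columns = ["id", "src_ip", "app_details", "package"]
```

Checking df.shape before assigning df.columns is a quick way to see how many names are expected.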

I've tried passing the array of all the column names, but the error is ValueError: Length mismatch: Expected 2 elements, new values have 23 elements. The 23 is wrong only because some columns have two '\t' as delimiter, which I can correct, but in any case the file does not have two columns!

– Giuseppe Ferrara
Nov 25 '18 at 8:40

@GiuseppeFerrara Try just reading in the data without skipping rows and header=None then slice just the data you need with df.iloc[data_start_row:, [col_indices]]. After that, make sure that the length of the list of column names matches the length of the column index list.

– Austin Mackillop
Nov 25 '18 at 20:59
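The slicing suggestion above could look roughly like the following sketch. It assumes the widest row has three fields and passes names so that shorter header lines are padded with NaN instead of raising a parsing error. The values of data_start_row and col_indices, and the sample text itself, are made up for illustration.

```python
import io

import pandas as pd

raw = (
    "# TIE output version: 1.0 (text format)\n"
    "# id\tsrc_ip\tapp_details\n"
    "17\t192.168.20.105\tGoogle\n"
    "26\t192.168.20.105\tFacebook\n"
)

# Read every line, header block included; names fixes the column count
# so that rows with fewer fields are padded with NaN.
full = pd.read_csv(
    io.StringIO(raw), sep="\t", header=None, names=[0, 1, 2], dtype=str
)

data_start_row = 2   # first row holding real data in this sample
col_indices = [0, 2]  # positions of the columns to keep

data = full.iloc[data_start_row:, col_indices].reset_index(drop=True)

# One name per selected column, matching len(col_indices).
data.columns = ["id", "app_details"]
```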
