Remove first N lines on a character column in a data frame
I have a data frame containing emails. There is a column named "message" that looks like this:
> > dataset$message[1]
>[1] Message-ID:...
>
> Date: ...
>
> From: ...
>
> To:...
>
> Subject: ...
>
> Mime-Version: ...
>
> Content-Type:...
>
> Content-Transfer-Encoding: ...
>
> X-From:...
>
> X-To: ...
>
> X-cc:...
>
> X-bcc: ...
>
> X-Folder: ...
>
> X-Origin: ...
>
> X-FileName: ...
>
> > Some message text
In other words, each entry contains 15 lines of headers and then the text. What I want is to remove these 15 lines from each row and be left only with the text, so that
>dataset$message[1]
looks like this:
> Some message text
r
2
Please provide a reproducible example along with expected output. Also don't forget to post your attempt that failed. Cheers
– Sotos
Nov 22 '18 at 14:11
Once the data is inside the data.frame it’s too late. You want to remove it before it gets read into the data.frame, e.g. by providing the appropriate arguments to read.table.
– Konrad Rudolph
Nov 22 '18 at 14:13
edited Nov 22 '18 at 14:40
Sotos
asked Nov 22 '18 at 14:05
Eduardo Javier Huerta Yero
1 Answer
1
Something like this would work:
sub("^(?:.*\n){15}", "", multiline_string_mail, perl = TRUE)
#[1] "Super secret message"
example data: (you should always provide usable example data)
multiline_string_mail =
"hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
hehe
Super secret message"
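Since the title asks about the first N lines in general, the same pattern can be built for any count. This is only a sketch: the helper name drop_first_lines is made up for illustration and is not part of the answer above. It relies on "." not matching a newline by default with perl = TRUE, so each repetition of the group consumes exactly one line.

```r
# Hypothetical helper (name invented for this sketch): drop the first n lines
# of every string in x.
drop_first_lines <- function(x, n) {
  # Builds the same pattern as in the answer, e.g. "^(?:.*\n){15}" for n = 15.
  pattern <- sprintf("^(?:.*\n){%d}", n)
  sub(pattern, "", x, perl = TRUE)
}

# Made-up example: 15 header lines followed by the message text.
mail <- paste(c(rep("Header: value", 15), "Super secret message"), collapse = "\n")
drop_first_lines(mail, 15)
#[1] "Super secret message"
```

A string with fewer than n lines does not match the pattern, so it comes back unchanged rather than being emptied.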
What's the ?: inside the regex?
– iod
Nov 22 '18 at 15:06
1
stackoverflow.com/questions/36524507/…
– Andre Elrico
Nov 22 '18 at 15:07
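For readers with the same question: (?: ... ) groups part of a pattern without storing what it matched for backreferences. A small sketch on made-up input, showing that with an empty replacement the capturing and non-capturing forms give the same result here:

```r
# Made-up input: two header lines followed by the body.
x <- "h1\nh2\nbody"

# Capturing group: the matched text is stored and could be reused as "\\1".
with_capture <- sub("^(.*\n){2}", "", x, perl = TRUE)

# Non-capturing group: same match, nothing is stored.
without_capture <- sub("^(?:.*\n){2}", "", x, perl = TRUE)

identical(with_capture, without_capture)
#[1] TRUE
```

The group is only needed so that {2} (or {15}) repeats the whole "line plus newline" unit; nothing is referred back to, so a non-capturing group is enough.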
This worked for a single entry of the data frame. I put it inside a loop and applied to every entry. Thanks a lot, I'm marking it as an answer.
– Eduardo Javier Huerta Yero
Nov 22 '18 at 17:40
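A loop should not be necessary, because sub() is vectorised over the strings it is given. A minimal sketch, assuming the column is a character vector; the toy data frame below is invented to stand in for the asker's dataset:

```r
# Made-up stand-in for `dataset`: two messages, each with 15 header lines.
headers <- paste(sprintf("Header-%d: ...", 1:15), collapse = "\n")
dataset <- data.frame(
  message = paste(headers, c("First body", "Second body"), sep = "\n"),
  stringsAsFactors = FALSE
)

# One call processes every row of the column.
dataset$message <- sub("^(?:.*\n){15}", "", dataset$message, perl = TRUE)
dataset$message
#[1] "First body"  "Second body"
```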
answered Nov 22 '18 at 14:29
Andre Elrico