Regular expressions: find substring in sentence

I have many sentences:

    1) the 3d line chart will show area in 3d.
    2) udcv123hi2ry32 the this line chart is useful.
    3) this chart.
    4) a chart.
    5) a line chart.
    6) this bar chart
    7) ...


And I have these conditions:

    1) A substring starts with 'a', 'the', 'this', or '[chart name]'.
    2) '[chart name] chart' is OK, but 'this chart' and 'a chart' are not accepted
       (e.g. 'bar chart', 'line chart', 'this line chart', 'a area chart': OK;
       'this chart', 'a chart': not accepted).
    3) A substring ends with '.' (dot).


Consequently, I need to find the substrings that meet these conditions.

In this case, the strings

    "this line chart is very useful.",
    "area chart is very useful."

are exactly what I want to receive.


I tried to do this with a regular expression like this (https://regex101.com/r/aX5htr/2):

    (a|the|this)* *((?!\bthis chart\b|\bwhich chart\b|\ba chart\b|\bthe chart\b|\bthat chart\b|\d+).+ chart) .+\.

but it does not match.

How can I solve this?

regex perl

asked Nov 26 '18 at 5:09
gukwon

  • Do you need something like this?
    – Wiktor Stribiżew
    Nov 26 '18 at 8:43

  • @Wiktor Stribiżew: What a perfect solution! But it is quite difficult. Can you explain it in more detail? Thank you.
    – gukwon
    Nov 26 '18 at 10:06

  • Added a demo and explanations.
    – Wiktor Stribiżew
    Nov 26 '18 at 10:37

1 Answer

You may use



    my $rx = qr/(?x)                   # enable formatting whitespace and comments
        (?(DEFINE)                     # Start DEFINE block
            (?<start>a|the|this|which) # Match start delimiters
        )                              # End DEFINE block
        (?<res>                        # Group res holding the match
            \b(?&start)\s+chart\b      # Match start delims, 1+ whitespace, chart
            (*SKIP)(*F)                # and skip the match
            |                          # or
            \b(?:(?&start)\s+)?        # Optional start delim and 1+ whitespace
            \w+\s+chart\b              # 1+ word chars, 1+ whitespace, chart, word boundary
            [^.]*                      # 0+ chars other than dot
        )                              # End of res group
    /;


See the regex demo.
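
The `(*SKIP)(*F)` step in the pattern above may be easier to follow in isolation. The following is a minimal sketch of the same idiom on a shorter pattern; the sample text and the reduced word list (`a|the`) are made up for illustration and are not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical sample text, not taken from the original post.
my $text = 'a chart, a pie chart, the chart, bar chart';

# The first alternative matches what should be discarded; (*SKIP)(*F) then
# fails the match and resumes scanning after the discarded text.
# The second alternative matches what should be kept.
my @found = $text =~ /\b(?:a|the)\s+chart\b(*SKIP)(*F)|\b\w+\s+chart\b/g;

print join(' | ', @found), "\n";   # prints "pie chart | bar chart"
```

Without `(*SKIP)(*F)`, the second alternative would be tried at the same position and `a chart` would be returned as a match, because `\w+` can match `a`.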



See the Perl demo online:



    use strict;
    use warnings;

    my $rx = qr/(?x)                   # enable formatting whitespace and comments
        (?(DEFINE)                     # Start DEFINE block
            (?<start>a|the|this|which) # Match start delimiters
        )                              # End DEFINE block
        (?<res>                        # Group res holding the match
            \b(?&start)\s+chart\b      # Match start delims, 1+ whitespace, chart
            (*SKIP)(*F)                # and skip the match
            |                          # or
            \b(?:(?&start)\s+)?        # Optional start delim and 1+ whitespace
            \w+\s+chart\b              # 1+ word chars, 1+ whitespace, chart, word boundary
            [^.]*                      # 0+ chars other than dot
        )                              # End of res group
    /;
    while (<DATA>) {
        if (/$rx/) {
            print "$+{res}\n";
        }
    }

    __DATA__
    this chart.
    this line chart.
    this bar chart.
    21684564523 this chart.
    556465465456 this a line chart.
    a chart.
    a line chart.
    which chart.
    all this chart.
    a chart.
    123123 this chart..
    123123 which chart.
    all this line chart.
    a line chart.
    the 3d line chart will show area in 3d.
    line chart.
    area chart.
    the chart.
    1221513513 line chart.
    1234125135 the chart.
    123123 this bar chart.
    udcvhi2ry32 the this line chart is useful.
    twl chart.


Output:



    this line chart
    this bar chart
    a line chart
    a line chart
    this line chart
    a line chart
    line chart will show area in 3d
    line chart
    area chart
    line chart
    this bar chart
    this line chart is useful
    twl chart
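
Note that the pattern above stops before the first dot and does not require one, so a line without a trailing dot (such as `this bar chart`) still matches. If the dot from the question's third condition should be mandatory and part of the result, one possible variant adds `\.` after `[^.]*`. This is only a sketch under that assumption, not part of the original answer; the sample lines are chosen for illustration.

```perl
use strict;
use warnings;

# Variant of the pattern above: the trailing dot is required and captured.
my $rx = qr/
    (?(DEFINE) (?<start> a|the|this|which ) )   # same start words as above
    \b(?&start)\s+chart\b (*SKIP)(*F)           # discard "this chart", "a chart" and similar
    |
    (?<res>
        \b(?:(?&start)\s+)? \w+\s+chart\b       # optional start word, chart name, "chart"
        [^.]* \.                                # rest of the sentence, then a mandatory dot
    )
/x;

my @lines = (
    'this bar chart',                   # no trailing dot: rejected
    'this chart.',                      # start word directly before "chart": rejected
    'this line chart is very useful.',
    'area chart is very useful.',
);

my @out;
for my $line (@lines) {
    push @out, $+{res} if $line =~ $rx;
}
print "$_\n" for @out;
```

With these four sample lines, the script prints the two strings the question lists as the desired result, each including its final dot.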





answered Nov 26 '18 at 10:37
Wiktor Stribiżew

  • Really appreciate it!
    – gukwon
    Nov 26 '18 at 12:45

  • Completed your request, thank you :)
    – gukwon
    Nov 26 '18 at 12:57

  • @user8509666 Please also consider upvoting if my answer proved helpful to you (see How to upvote on Stack Overflow?) as you are entitled to the upvoting privilege after reaching 15 rep points. Note you may upvote all the answers that turned out helpful.
    – Wiktor Stribiżew
    Nov 26 '18 at 15:32
