Nested conditional search and in-place substitution
I'm an awk newbie. I have a file that looks like:
beans and celery
beans and oatmeal
beans and beans
quinoa
<fo:external-graphic width="auto" height="auto" content-width="36pt" src="url(file:/C:/Users/xxx/images/tip.svg)"/>
<fo:external-graphic src="url(images/image1.png)" width="6.3in" height="auto" content-width="246px" content-height="322px"/>
I'm trying to perform a search and replace in-place for the "fo" tag. I want to capture the beginning of the tag, as well as the "src" parameter. Please note that the position of the src tag varies from line to line!
I've been able to get the fields I want using the following:
awk '/<fo:external-graphic.*/ {for (i=1; i<=NF; ++i) {if ($i ~ "src") print $1 " " $i}}' inventory.txt
How can I do an in-place substitution of this?
I also want to append a string to the new contents of the line. I've tried:
awk '/<fo:external-graphic.*/ {for (i=1; i<=NF; ++i) {if ($i ~ "src") print $1 " " $i "misc stuff here"}}' inventory.txt
But it completely messes up the order of the resulting string, which I want to be of the form:
<fo:external-graphic src="url(images/image1.png)" misc stuff here
PS1:
Further clarification about what result I want:
The file contains strings like:
<fo:external-graphic width="auto" height="auto" content-width="36pt" src="url(file:/C:/Users/xxx/images/tip.svg)"/>
<fo:external-graphic src="url(images/image1.png)" width="6.3in" height="auto" content-width="246px" content-height="322px"/>
I want to process these and get an output like:
<fo:external-graphic src="url(images/image1.png)" _completely new stuff here, till end of string_ />
for example:
<fo:external-graphic src="url(images/image1.png)" age="25" sex="M" />
I want the result to ALWAYS begin with:
<fo:external-graphic src="url(images/image1.png)"
then the extra stuff eg:
age="25" sex="M" />
No other part of the original string is needed in the final output.
PS2: Can I pack all this into a gsub? To the best of my knowledge, gsub only takes two arguments. I've tried to build a complex expression for the replacement argument, but it keeps failing, e.g.:
gawk '/<fo:external-graphic.*/ {for (i=1; i<=NF; ++i) {if ($i ~ "src") gsub($0, "boy band"); {print}}}' inventory.txt > testres
PS3: This is just a newbie observation, maybe I'm wrong. Consider a file with the following contents:
Donald Trump
Donald Duck
George Bush
Steve Austin
The regexp to search for all lines that begin with Donald is:
/^Donald/
If I want to replace all occurrences of "Donald" with "Barrack", I could do the following:
gawk -i inplace '{ gsub(/^Donald/, "Barrack"); { print } }' FILENAME
If I want to completely change all lines that contain "Donald" I would do:
gawk -i inplace '{ gsub(/^Donald.*/, "Barrack"); { print } }' FILENAME
gawk and gsub appear to only replace the span or whatever part of the string matches the given regexp. Thus if I want to completely change a whole line, my regexp should span the whole of that line.
PS4: Just to clear any ambiguities about the solution I expect. Given the following file:
<fo:external-graphic width="auto" height="auto" content-width="36pt" src="url(file:/C:/Users/xxx/images/tip.svg)"/>
<fo:external-graphic width="6.3in" height="auto" src="url(images/image1.png)" content-width="246px" content-height="322px"/>
<fo:external-graphic src="url(images/image1.png)" width="6.3in" content-width="246px" content-height="322px"/>
I'm looking for an awk/gawk solution that will replace this file with:
<fo:external-graphic src="url(file:/C:/Users/xxx/images/tip.svg)" age="25" sex="M" />
<fo:external-graphic src="url(images/image1.png)" age="25" sex="M"/>
<fo:external-graphic src="url(images/image1.png)" age="25" sex="M"/>
The target file must be changed.
awk
@Inian It doesn't appear to update the file. Please see my updated question for the form I expect for the final answer!
– user1801060
Nov 22 '18 at 9:49
@RavinderSingh13 Please see the latest update to my question! If you have any doubts, let me know. Thanks
– user1801060
Nov 22 '18 at 9:59
@Inian Please see the latest update to my question! If you have any doubts, let me know. Thanks
– user1801060
Nov 22 '18 at 10:00
My last update should fix your problem
– Inian
Nov 22 '18 at 16:43
edited Nov 22 '18 at 13:13
user1801060
asked Nov 22 '18 at 8:51
user1801060
2 Answers
Your attempt is right, but assuming your intention is to append only to the word starting with src, i.e. to $i, apply the action to that field alone and keep the other fields as-is:
awk '/<fo:external-graphic.*/ {for (i=1; i<=NF; ++i) {if ($i ~ "src") $i = $i " misc stuff here"}}1' inventory.txt
The part $i = $i " misc stuff here" appends the string only to the fields matching your regex condition. Notice that print is gone and that a 1 now follows the {..} block. The 1 is an always-true pattern whose default action prints the line, which awk has re-constructed from the modifications made inside {..}. Since we modify only certain fields, the others are kept intact.
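To see the field assignment and the trailing 1 in isolation, here is a minimal sketch on a one-line input. The sample line and the marker string NEW are made up for illustration and are not from the question's file.

```shell
# Hypothetical one-line input; NEW is an illustrative marker string.
line='<fo:x width="auto" src="url(a.png)" height="auto"/>'

# Assigning to $i makes awk rebuild $0 from its fields, joined by OFS
# (a single space by default). The trailing 1 is an always-true pattern
# whose default action prints the rebuilt line.
out=$(printf '%s\n' "$line" |
  awk '{ for (i = 1; i <= NF; ++i) if ($i ~ /^src=/) $i = $i " NEW" } 1')

printf '%s\n' "$out"
```

Because the line is rebuilt with OFS, any run of spaces or tabs between fields in the input would come out as a single space.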
If you want to re-write the entire field starting with src and append some string, use a proper regex match with gsub() and append the string after the matched text, which is denoted by &:
awk '/<fo:external-graphic.*/ {for (i=1; i<=NF; ++i) { if ($i ~ "src") gsub(/src="url([^"]*)"/, "& new string", $i ) }}1' inventory.txt
From OP's most recent edit, it seems OP just wants to keep the src field and append a new string after it; the rest of the fields can be dropped. On GNU awk, match() accepts a third parameter, an array that stores the captured groups:
gawk -v newstr='age="25" sex="M"' 'match($0, /^(<fo:external-graphic).*(src="url([^"]*)").*(\/>)$/, arr){ print arr[1]" "arr[2]" "newstr""arr[4] }' file
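For awks without the three-argument match(), roughly the same result can be sketched with the standard RSTART and RLENGTH variables. This variant is an illustration rather than the answer's own code: it assumes the src value contains no double quotes, and it ends with 1 so that unrelated lines (such as quinoa) are printed unchanged instead of being dropped.

```shell
# Sample input mirroring the question; newstr is the illustrative replacement text.
input='quinoa
<fo:external-graphic width="auto" height="auto" content-width="36pt" src="url(file:/C:/Users/xxx/images/tip.svg)"/>
<fo:external-graphic src="url(images/image1.png)" width="6.3in" height="auto"/>'

out=$(printf '%s\n' "$input" |
  awk -v newstr='age="25" sex="M"' '
    # On fo:external-graphic lines, locate the src attribute wherever it sits.
    /^<fo:external-graphic / && match($0, /src="[^"]*"/) {
      # RSTART and RLENGTH describe the span matched by match().
      $0 = "<fo:external-graphic " substr($0, RSTART, RLENGTH) " " newstr "/>"
    }
    # Print every line, rewritten or not, so unrelated lines survive.
    1
  ')

printf '%s\n' "$out"
```

Keeping the final 1 matters once the output replaces the original file: a program that prints only matching lines would silently delete everything else.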
Note that GNU awk has supported in-place modification (through its inplace extension) only since version 4.1.0; on that version or later you can just do
gawk -i inplace '{...}' inventory.txt
For earlier versions, or for other awks, use a temporary file:
awk '{...}' inventory.txt > tmpfile && mv tmpfile inventory.txt
Or, if you have moreutils installed, use sponge to soak up the output of the first command and then write it back to the file:
awk '{...}' inventory.txt | sponge inventory.txt
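The temporary-file route can be sketched end to end as follows. The scratch directory, the file contents and the awk program are illustrative stand-ins, not the question's real data.

```shell
# Work in a scratch directory so nothing outside it is touched.
dir=$(mktemp -d)
file="$dir/inventory.txt"
printf '%s\n' 'Donald Duck' 'George Bush' > "$file"

# Write the result to a temporary file in the same directory, then
# replace the original only if awk exited successfully.
tmp=$(mktemp "$dir/tmp.XXXXXX")
awk '/^Donald/ { $0 = "new_line_value" } 1' "$file" > "$tmp" && mv "$tmp" "$file"

cat "$file"
```

Creating the temporary file next to the target keeps the mv on one filesystem. Note that mv swaps in a new file, so the original's permissions and any hard links are not preserved.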
From the sample text in your question, one can't tell whether it is some markup language (XML, HTML). If it is a language with a proper syntax, you should use a parser that knows its grammar.
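EDIT3: Adding one more solution as per OP's new edit.
awk '
# src comes after the width attributes: drop them, then add the new attributes before />
/ width.*content-width.*src/{
  sub(/ width.*content-width.*src/," src")
  sub(/\/>$/," age=\"25\" sex=\"M\"&")
}
# src comes before a width attribute: save the src attribute, cut the rest, re-append
/src.*width/{
  match($0,/src[^)]*/)
  val=substr($0,RSTART,RLENGTH+2)
  sub(/src.*/,"")
  $0=$0 OFS val OFS "age=\"25\" sex=\"M\"/>"
}
1
' Input_file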
EDIT2: To replace a complete line, as in OP's PS3, please try the following. The output for the PS3 sample is shown below the command.
awk '/^Donald/{$0="new_line_value"} 1' Input_file
new_line_value
new_line_value
George Bush
Steve Austin
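The difference between substituting a match and replacing the record can be checked side by side. This is a small sketch on two lines of the PS3 sample; the variable names partial and whole are illustrative.

```shell
input='Donald Trump
George Bush'

# sub() replaces only the text the regex matched, so the rest of the line stays.
partial=$(printf '%s\n' "$input" | awk '{ sub(/^Donald/, "Barrack") } 1')

# Assigning to $0 replaces the whole record, however little the pattern matched.
whole=$(printf '%s\n' "$input" | awk '/^Donald/ { $0 = "Barrack" } 1')

printf '%s\n---\n%s\n' "$partial" "$whole"
```

So a regex does not need to span the whole line in order to replace it; the pattern only selects the line, and the assignment does the replacing.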
EDIT: Since OP has changed the expected output, adding a solution for that output too.
awk '/^<fo:external-graphic src=.*/ && match($0,/src=.*\)"/){$0=substr($0,1,RSTART+RLENGTH) " new_value_bla_bla_here.. />"} 1' Input_file
Could you please try the following (I haven't tested it thoroughly, since your expected output is not clear).
awk '
/^<fo:/ && match($0,/src=.*>/){
$0=substr($0,1,RSTART-1) OFS "new_value_here.." OFS substr($0,RSTART+RLENGTH+1)
}
1
' Input_file
This code checks for a line that starts with the string <fo:, uses match to capture the text from src= through the last >, and replaces that captured text with the new string.
In case you want to save the output into Input_file itself, append > temp_file && mv temp_file Input_file to the above command.
The second line is ok. The first line prints out: <fo:external-graphic width="auto" height="auto" content-width="36pt" src="url(file:/C:/Users/xxx/images/tip.svg)"/ new_value_bla_bla_here.. />. Please feel free to test your solution against the file I provided at the beginning of my question
– user1801060
Nov 22 '18 at 10:35
@user1801060, please check now and let me know. I was not checking the condition for the string ^<fo:external-graphic src= before.
– RavinderSingh13
Nov 22 '18 at 10:54
Perhaps the mistake is mine. Here's a screenshot of my result: imgur.com/aENVTyD
– user1801060
Nov 22 '18 at 11:41
@user1801060, do you want to match every src, or only some specific string?
– RavinderSingh13
Nov 22 '18 at 11:59
Great effort! The first one didn't work, the last two worked. I believe getting a regexp to work for all variants is the problem. I'll have a look at it tomorrow.
– user1801060
Nov 23 '18 at 18:56
edited Nov 22 '18 at 16:54
answered Nov 22 '18 at 8:56
Inian
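EDIT3: Adding one more solution as per the OP's new edit.
awk '
# src comes after the width attributes: drop them, then add the new attributes before "/>"
/ width.*content-width.*src/{
sub(/ width.*content-width.*src/," src")
sub(/\/>$/," age=\"25\" sex=\"M\"&")
}
# src comes first: keep the src attribute, drop the rest, then append the new attributes
/src.*width/{
match($0,/src[^)]*/)
val=substr($0,RSTART,RLENGTH+2)
sub(/ *src.*/,"")
$0=$0 OFS val OFS "age=\"25\" sex=\"M\"/>"
}
1
' Input_file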
EDIT2: To replace a complete line, as per the OP's PS3, could you please try the following.
awk '/^Donald/{$0="new_line_value"} 1' Input_file
new_line_value
new_line_value
George Bush
Steve Austin
EDIT: Since the OP has changed the expected output, adding a solution for that output too.
awk '/^<fo:external-graphic src=.*/ && match($0,/src=.*\)"/){$0=substr($0,1,RSTART+RLENGTH-1) " new_value_bla_bla_here.. />"} 1' Input_file
Could you please try the following (I haven't tested it thoroughly since your expected output is not clear).
awk '
/^<fo:/ && match($0,/src=.*>/){
$0=substr($0,1,RSTART-1) OFS "new_value_here.." OFS substr($0,RSTART+RLENGTH)
}
1
' Input_file
This code checks for a line that starts with the string <fo:, uses match to find the text from src= up to the last >, and replaces that matched text with the new string.
If you want to save the output into Input_file itself, append > temp_file && mv temp_file Input_file to the command above.
The second line is ok. The first line prints out: <fo:external-graphic width="auto" height="auto" content-width="36pt" src="url(file:/C:/Users/xxx/images/tip.svg)"/ new_value_bla_bla_here.. />. Please feel free to test your solution against the file I provided at the beginning of my question
– user1801060
Nov 22 '18 at 10:35
@user1801060, could you check now and let me know? I was not checking the condition for the string ^<fo:external-graphic src= before.
– RavinderSingh13
Nov 22 '18 at 10:54
Perhaps the mistake is mine. Here's a screenshot of my result: imgur.com/aENVTyD
– user1801060
Nov 22 '18 at 11:41
@user1801060, do you want to match every src, or only a specific string?
– RavinderSingh13
Nov 22 '18 at 11:59
Great effort! The first one didn't work, the last two worked. I believe getting a regexp to work for all variants is the problem. I'll have a look at it tomorrow.
– user1801060
Nov 23 '18 at 18:56
edited Nov 22 '18 at 13:45
answered Nov 22 '18 at 9:30
RavinderSingh13
@Inian It doesn't appear to update the file. Please see my updated question for the form I expect for the final answer!
– user1801060
Nov 22 '18 at 9:49
@RavinderSingh13 Please see the latest update to my question! If you have any doubts, let me know. Thanks
– user1801060
Nov 22 '18 at 9:59
@Inian Please see the latest update to my question! If you have any doubts, let me know. Thanks
– user1801060
Nov 22 '18 at 10:00
My last update should fix your problem
– Inian
Nov 22 '18 at 16:43