Edit an xml file - Find a string then Delete a block of text / Find a string then Insert a new block of text
Morning All,
I'm new to UNIX so looking to do something I can do in VB but don't have the experience to do in UNIX.
I have an XML spec on a share that needs regular deletions and updates as new Reuters RIC codes come online. There are two things I want to achieve:
A. Remove a RIC entry
- Open the file
- Find a specific string
- Delete this found line and the 21 lines below it
- Save file
I thought this might work:
sed -e '/<ric id="AUG03250639E=YBAU">/,+21d' a.xml >a.xml
B. Add a new RIC entry
- Open a file
- Find the last occurrence of the string </source>
- Move up 29 lines to the last RIC entry block
- Copy this line and the 21 lines below it (the ric block)
- Insert a new line 22 lines below and paste the copied block there, i.e. directly below the block you copied
- Change the ric on line 1 of the new block to a new RIC string, i.e. from
<ric id="AAAAA=YBAU"
to
<ric id="BBBBB=YBAU"
- Save file
How can I do this?
This is the last section of the file. Note the end of the ric blocks (which I want to manipulate) is when the following strings appear
.
.
.
<ric id="AUG03250639E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
<ric id="AUG03250640E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
</rics>
<topics>
<topic>
<id>default</id>
<type>rmds</type>
<value>IDN_SELECTFEED.ANY.%s.NaE</value>
</topic>
</topics>
</source>
<transformers>
<!-- Name of transformer -->
<transformer></transformer>
</transformers>
<processors>
<!-- Enricher to add additional fields from source query result while
publishing -->
<processor></processor>
</processors>
<endpoints>
<!-- Order of post processor is important. First topic, then mapper -->
<endpoint id="rmds" topic="FI.ANY.%s.YBAU" multicast="true">
<postprocessor>reuters-topic-builder</postprocessor>
<postprocessor>reuters-message-mapper</postprocessor>
<!-- <multitopic id="solace" topic="LN/FI/IP/APS/SSHEET/YIELD/BATS_%s"
/> -->
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/SSHEET/BATS_%s" />
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/YIELDBROKER/BATS_%s" />
</endpoint>
</endpoints>
<other-properties>
<!-- common formatting of price/yield -->
<property name="formattor-1">(math:pow(INPUT/100+1,0.5)-1)*200</property>
<property name="handle_negative_values">false</property>
<property name="handle_negative_values_output">0.001</property>
</other-properties>
</specification>
So for A. Remove a RIC entry where I want to remove AUG03250640E=YBAU, the file would show:
<ric id="AUG03250639E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
</rics>
<topics>
<topic>
<id>default</id>
<type>rmds</type>
<value>IDN_SELECTFEED.ANY.%s.NaE</value>
</topic>
</topics>
</source>
<transformers>
<!-- Name of transformer -->
<transformer></transformer>
</transformers>
<processors>
<!-- Enricher to add additional fields from source query result while
publishing -->
<processor></processor>
</processors>
<endpoints>
<!-- Order of post processor is important. First topic, then mapper -->
<endpoint id="rmds" topic="FI.ANY.%s.YBAU" multicast="true">
<postprocessor>reuters-topic-builder</postprocessor>
<postprocessor>reuters-message-mapper</postprocessor>
<!-- <multitopic id="solace" topic="LN/FI/IP/APS/SSHEET/YIELD/BATS_%s"
/> -->
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/SSHEET/BATS_%s" />
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/YIELDBROKER/BATS_%s" />
</endpoint>
</endpoints>
<other-properties>
<!-- common formatting of price/yield -->
<property name="formattor-1">(math:pow(INPUT/100+1,0.5)-1)*200</property>
<property name="handle_negative_values">false</property>
<property name="handle_negative_values_output">0.001</property>
</other-properties>
</specification>
For B. Add a new RIC entry assuming I want to add the new ric AUG03250641E=YBAU, the file would show:
<ric id="AUG03250639E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
<ric id="AUG03250640E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
<ric id="AUG03250641E=YBAU">
<securities>
<security>
<issueid>178117</issueid>
<quote-type>YIELD</quote-type>
<complex-logic>
<calculations>
<yield-type>
<type>BID_YIELD</type>
<calculation name="AB" field="RT_YIELD_1" />
</yield-type>
<yield-type>
<type>OFFER_YIELD</type>
<calculation name="AB" field="SEC_YLD_1" />
</yield-type>
</calculations>
</complex-logic>
<derived-type name="PRICE" baseValue="100.0" />
</security>
</securities>
</ric>
</rics>
<topics>
<topic>
<id>default</id>
<type>rmds</type>
<value>IDN_SELECTFEED.ANY.%s.NaE</value>
</topic>
</topics>
</source>
<transformers>
<!-- Name of transformer -->
<transformer></transformer>
</transformers>
<processors>
<!-- Enricher to add additional fields from source query result while
publishing -->
<processor></processor>
</processors>
<endpoints>
<!-- Order of post processor is important. First topic, then mapper -->
<endpoint id="rmds" topic="FI.ANY.%s.YBAU" multicast="true">
<postprocessor>reuters-topic-builder</postprocessor>
<postprocessor>reuters-message-mapper</postprocessor>
<!-- <multitopic id="solace" topic="LN/FI/IP/APS/SSHEET/YIELD/BATS_%s"
/> -->
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/SSHEET/BATS_%s" />
<multitopic id="solace-credit" topic="LN/FI/EP/CREDITBPS/YIELDBROKER/BATS_%s" />
</endpoint>
</endpoints>
<other-properties>
<!-- common formatting of price/yield -->
<property name="formattor-1">(math:pow(INPUT/100+1,0.5)-1)*200</property>
<property name="handle_negative_values">false</property>
<property name="handle_negative_values_output">0.001</property>
</other-properties>
</specification>
text-processing sed xml
Please edit your question and add a sample of your input file and the output you would like from that file.
– terdon♦
Nov 10 '15 at 23:30
edited Nov 11 '15 at 1:49
Pete
asked Nov 10 '15 at 23:17
3 Answers
For A, with the POSIX sed and head utilities and a regular file named in:
{ sed -ne'/^<ric id="AUG03250639E=YBAU">$/q;p'
head -n21 >/dev/null
cat
} <in >out
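The commands above cover A only. For B, a comparable line-oriented sketch is possible with awk, under the assumption that every block starts on a line containing <ric id=" and ends on a line containing </ric>, and that the list closes with </rics>. The function name add_ric and the sample ids are made up for illustration; the new id must not contain & or a backslash, because awk's sub() treats those specially in the replacement.

```shell
#!/bin/sh
# add_ric is a hypothetical helper: it copies the last <ric> block in the file,
# swaps in a new id, and prints the copy just before the closing </rics> line.
add_ric() {
    # $1 = new id, $2 = input file; the result goes to standard output
    awk -v newid="$1" '
        /<ric id="/ { block = ""; inric = 1 }      # a new block starts: reset the buffer
        inric       { block = block $0 "\n" }      # collect the lines of the current block
        /<\/ric>/   { inric = 0 }                  # the block is complete
        /<\/rics>/ && block != "" {
            # only the first id in the buffered block (its opening line) is changed
            sub(/<ric id="[^"]*"/, "<ric id=\"" newid "\"", block)
            printf "%s", block
        }
        { print }
    ' "$2"
}

# Small made-up sample to try it on.
sample=$(mktemp)
printf '%s\n' '<rics>' '<ric id="A">' '<x/>' '</ric>' \
              '<ric id="B">' '<y/>' '</ric>' '</rics>' > "$sample"
result=$(add_ric 'C' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

Because the block boundaries come from the tags rather than from a fixed count of 21 lines, the sketch keeps working if a block gains or loses a line, though it is still text matching and not real XML parsing.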
Except for the fact that parsing and/or manipulating xml with regular expressions is misguided at best (although simple things like this can work if you're careful), you almost had it right:
Using GNU sed:
sed -i -e '/<ric id="AUG03250639E=YBAU">/,+21d' a.xml
If your sed doesn't support the -i (--in-place) option, you can do it with a temporary file (which is how sed does it behind the scenes):
TF=$(mktemp)
sed -e '/<ric id="AUG03250639E=YBAU">/,+21d' a.xml > "$TF" && mv "$TF" a.xml
You can't read from a file and redirect output to it in the shell like you tried - the first thing the shell will do is overwrite the file so it is empty, and that happens before the sed script runs.
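The truncation is easy to reproduce in a scratch directory. The following is only a small demonstration; demo.xml and its three lines are made up, and mktemp -d is assumed to be available (it is on GNU and BSD systems).

```shell
#!/bin/sh
# Scratch directory so that no real file is touched.
workdir=$(mktemp -d)
printf '%s\n' '<a>' '<b/>' '</a>' > "$workdir/demo.xml"

# Unsafe: the shell truncates demo.xml before sed starts reading it,
# so sed sees an empty input and the file stays empty.
sed -e '/<b\/>/d' "$workdir/demo.xml" > "$workdir/demo.xml"
unsafe_lines=$(( $(wc -l < "$workdir/demo.xml") ))
echo "lines left after redirecting onto the input: $unsafe_lines"

# Safe: write to a temporary file and replace the original only if sed succeeded.
printf '%s\n' '<a>' '<b/>' '</a>' > "$workdir/demo.xml"
tmp=$(mktemp "$workdir/tmp.XXXXXX")
sed -e '/<b\/>/d' "$workdir/demo.xml" > "$tmp" && mv "$tmp" "$workdir/demo.xml"
safe_lines=$(( $(wc -l < "$workdir/demo.xml") ))
echo "lines left after the temporary-file edit: $safe_lines"

rm -rf "$workdir"
```

The first count is 0 and the second is 2, which is the difference between losing the file and deleting the one line that was meant to go.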
For more complex XML parsing tasks, use an XML parser like xmlstarlet in shell scripts or one of the many XML parsing libraries for perl or python (or almost any other language you can think of).
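As a rough idea of what the xmlstarlet route looks like for A, here is a minimal sketch. It assumes the tool is installed under the name xmlstarlet (some systems install it as xml), and the sample document and the ids A and B are made up. The script skips the edit when the tool is missing.

```shell
#!/bin/sh
# Made-up sample document with two ric elements.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<rics>
  <ric id="A"><issueid>1</issueid></ric>
  <ric id="B"><issueid>2</issueid></ric>
</rics>
EOF

if command -v xmlstarlet >/dev/null 2>&1; then
    # ed = edit mode, -d = delete every node matching the XPath expression.
    result=$(xmlstarlet ed -d '//ric[@id="B"]' "$sample")
    status=done
else
    result=
    status=skipped
fi
printf '%s\n' "$result"
rm -f "$sample"
```

The edited document goes to standard output, so the temporary-file approach shown earlier applies here as well. Unlike the sed version, the XPath match does not depend on how many lines the element spans.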
What sed doesn't support -i but does support /addr1/,+#addr2? Also, this doesn't do what I understand is being asked: it deletes 21 lines after every occurrence of the matched address, rather than only the first.
– mikeserv
Nov 12 '15 at 18:50
1. No idea. I don't expend any effort at all keeping track of features supported in different versions of sed and feel not the slightest regret for that life choice. I do know that some seds don't support -i. 2. The sample showed only one <ric id="AUG03250639E=YBAU"> and it's an id field, which tend to be unique.
– cas
Nov 12 '15 at 21:12
add a comment |
Don't use regular expressions to parse XML. It's dirty - prone to breaking and creates brittle code. There's a bunch of things that can trip you up trivially, like line counts - it is perfectly valid within XML to format an element:
<calculation name="AB" field="SEC_YLD_1" />
Or:
<calculation
field="SEC_YLD_1"
name="AB"
/>
And a variety of other options - all of which are semantically identical, but ... won't match the same regex.
For your sample, this is shockingly easy if you use a parser. perl
has XML::Twig
which can do this handily:
Delete:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig -> parsefile ( 'your_file.xml' );
$_ -> delete for $twig -> get_xpath('//ric[@id="AUG03250639E=YBAU"]');
$twig -> set_pretty_print('indented');
$twig -> print;
Note - does delete duplicates if any exist.
Now, creating a new one - it looks like you're trying to copy and amend - so:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig -> parsefile ( 'your_file.xml' );
#find one to copy - this will just get the first 'ric' element.
my $ric_to_copy = $twig -> get_xpath('//ric',0);
#copy it
my $new_ric = $ric_to_copy -> copy;
#alter the new one
$new_ric -> set_att('id', 'BBBBB=YBAU' );
#paste it
$new_ric -> paste ( 'last_child', $ric_to_copy->parent);
$twig -> set_pretty_print('indented');
$twig -> print;
Now this reads, and prints to STDOUT - you can print to a particular file:
my ( $output, '>', 'new.xml') or die $!;
print {$output} $twig -> sprint;
close ( $output );
add a comment |
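For A, with the POSIX sed and head utilities and a regular file named in:
{ sed -n -e '/^<ric id="AUG03250639E=YBAU">$/q;p'
  head -n 21 >/dev/null
  cat
} <in >out
sed prints every line up to the match and quits without printing the matching line, head discards the next 21 lines, and cat copies the rest. This relies on each utility leaving the shared file offset just past the last line it read, which is why in must be a regular (seekable) file.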
answered Nov 10 '15 at 23:50
mikeserv
Setting aside that parsing or manipulating XML with regular expressions is misguided at best (although simple things like this can work if you're careful), you almost had it right.
Using GNU sed:
sed -i -e '/<ric id="AUG03250639E=YBAU">/,+21d' a.xml
If your sed doesn't support the -i (--in-place) option, you can do it with a temporary file (which is how sed does it behind the scenes):
TF=$(mktemp)
sed -e '/<ric id="AUG03250639E=YBAU">/,+21d' a.xml > "$TF" && mv "$TF" a.xml
You can't read from a file and redirect output to the same file in the shell as you tried: the first thing the shell does is truncate the file so it is empty, and that happens before the sed script runs.
For more complex XML parsing tasks, use an XML parser like xmlstarlet in shell scripts, or one of the many XML parsing libraries for perl or python (or almost any other language you can think of).
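The truncation problem can be reproduced in isolation. This is only a minimal sketch: the scratch file from mktemp, its two lines of content, and the variable names demo and result are made up for illustration.

```shell
#!/bin/sh
# Scratch file with two lines of sample content (hypothetical data).
demo=$(mktemp)
printf '<a/>\n<b/>\n' > "$demo"

# The shell sets up the redirection first, truncating "$demo",
# and only then starts sed, which finds nothing left to read.
sed -e '/<a\/>/d' "$demo" > "$demo"

if [ -s "$demo" ]; then
    result="file still has content"
else
    result="file is now empty"
fi
echo "$result"
```

Writing to a temporary file and renaming it afterwards, as in the mktemp example, avoids this because the input is never opened for writing while sed is reading it.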
edited May 23 '17 at 12:40
Community♦
answered Nov 11 '15 at 3:55
cas
What sed doesn't support -i but does support /addr1/,+#addr2? Also, this doesn't do what I understand was asked: it deletes 21 lines after every occurrence of the matched address, rather than only the first.
– mikeserv
Nov 12 '15 at 18:50
1. No idea. I don't expend any effort at all keeping track of features supported in different versions of sed and feel not the slightest regret for that life-choice. I do know that some seds don't support -i. 2. The sample showed only one <ric id="AUG03250639E=YBAU"> and it's an id field, which tend to be unique.
– cas
Nov 12 '15 at 21:12
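If only the first occurrence should be removed, as raised in the comments, one possible sketch uses awk instead of a GNU-specific sed address. The helper name delete_block and its argument order are hypothetical, not taken from either answer, and this is still line-based text editing with all the caveats that implies for XML.

```shell
#!/bin/sh
# delete_block PATTERN N FILE  (hypothetical helper, not from the answers)
# Removes the first line containing the fixed string PATTERN and the N
# lines after it; later occurrences of PATTERN are left untouched.
delete_block() {
    tmp=$(mktemp) || return 1
    awk -v pat="$1" -v n="$2" '
        # Still inside the block being dropped: skip this line.
        skip > 0                { skip--; next }
        # First line containing pat: drop it and schedule n more skips.
        !done && index($0, pat) { done = 1; skip = n; next }
        # Everything else is copied through unchanged.
                                { print }
    ' "$3" > "$tmp" &&
        # Copy back rather than mv, so FILE keeps its own permissions.
        cat "$tmp" > "$3"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Under those assumptions, the deletion from the question would be written as delete_block '<ric id="AUG03250639E=YBAU">' 21 a.xml.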
Don't use regular expressions to parse XML. It's dirty: prone to breaking, and it creates brittle code. Plenty of things can trip you up, such as line counts, because it is perfectly valid XML to format an element as:
<calculation name="AB" field="SEC_YLD_1" />
Or:
<calculation
field="SEC_YLD_1"
name="AB"
/>
And a variety of other options - all of which are semantically identical, but ... won't match the same regex.
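A quick way to see the mismatch, as a minimal sketch: the two spellings are the ones shown above, while the variable names and the choice of grep -c as the line-oriented tool are assumptions made for illustration.

```shell
#!/bin/sh
# Two spellings of the same element (taken from the examples above).
one_line='<calculation name="AB" field="SEC_YLD_1" />'
multi_line='<calculation
field="SEC_YLD_1"
name="AB"
/>'

# A fixed-string pattern written against the first spelling.
pattern='<calculation name="AB"'

# grep -c prints the number of matching lines; "|| true" keeps a
# zero-match exit status from aborting scripts run with "set -e".
hits_one=$(printf '%s\n' "$one_line" | grep -c -F -e "$pattern" || true)
hits_multi=$(printf '%s\n' "$multi_line" | grep -c -F -e "$pattern" || true)

echo "single-line form: $hits_one matching line(s)"
echo "multi-line form:  $hits_multi matching line(s)"
```

The same pattern finds the element in one spelling and misses it in the other, and the element occupies one line in the first case and four in the second, so any fixed line count breaks as well.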
For your sample, this is shockingly easy if you use a parser. Perl has XML::Twig, which can do this handily:
Delete:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig -> parsefile ( 'your_file.xml' );
$_ -> delete for $twig -> get_xpath('//ric[@id="AUG03250639E=YBAU"]');
$twig -> set_pretty_print('indented');
$twig -> print;
Note: this deletes every matching ric element, so duplicates are removed too if any exist.
Now, creating a new one - it looks like you're trying to copy and amend - so:
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig -> parsefile ( 'your_file.xml' );
#find one to copy - this will just get the first 'ric' element.
my $ric_to_copy = $twig -> get_xpath('//ric',0);
#copy it
my $new_ric = $ric_to_copy -> copy;
#alter the new one
$new_ric -> set_att('id', 'BBBBB=YBAU' );
#paste it
$new_ric -> paste ( 'last_child', $ric_to_copy->parent);
$twig -> set_pretty_print('indented');
$twig -> print;
This reads the file and prints to STDOUT. To print to a particular file instead:
open ( my $output, '>', 'new.xml' ) or die $!;
print {$output} $twig -> sprint;
close ( $output );
edited Nov 13 '15 at 11:40
answered Nov 13 '15 at 11:33
Sobrique