python - remove everything between 2 tags that span branches of an xml tree -

I am trying to remove everything between two tags in an XML document, using Python and lxml. The problem is that the tags can be in different branches of the tree (but always at the same depth), as can be seen in a document such as this:

  <root><p>Hello World <start/>This is a paragraph</p><p>Goodbye World. <end/>I'm leaving now</p></root>

I want to remove everything between the start and end tags, which would leave a single p tag:

  <root><p>Hello World I'm leaving now</p></root>

Does anyone know how this can be done in Python?
  You could try a SAX-like approach, using a parser target:
    from lxml import etree

    class SkipStartEndTarget:
        def __init__(self, *args, **kwargs):
            self.builder = etree.TreeBuilder()
            self.skip = False

        def start(self, tag, attrib, nsmap=None):
            # Start skipping as soon as the <start/> marker opens.
            if tag == 'start':
                self.skip = True
            if not self.skip:
                self.builder.start(tag, attrib, nsmap)

        def data(self, data):
            if not self.skip:
                self.builder.data(data)

        def comment(self, comment):
            if not self.skip:
                self.builder.comment(comment)

        def pi(self, target, data):
            if not self.skip:
                self.builder.pi(target, data)

        def end(self, tag):
            if not self.skip:
                self.builder.end(tag)
            # Stop skipping once the <end/> marker has closed.
            if tag == 'end':
                self.skip = False

        def close(self):
            self.skip = False
            return self.builder.close()

  Use the SkipStartEndTarget class as a parser target when creating a custom XMLParser, like this:
    parser = etree.XMLParser(target=SkipStartEndTarget())
  If needed, you can still pass other parser options to the parser. You can then hand this parser to whichever parsing function you are using, for example:
    elem = etree.fromstring(xml_str, parser=parser)
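To see the pieces working together, here is a minimal end-to-end sketch run on the sample document from the question. It uses a trimmed copy of the target (no comment or PI handlers, and a two-argument start method), so treat it as an illustration under those assumptions rather than a drop-in replacement.

```python
from lxml import etree


class SkipStartEndTarget:
    """Trimmed copy of the parser target: drops everything from <start/> to <end/>."""

    def __init__(self):
        self.builder = etree.TreeBuilder()
        self.skip = False

    def start(self, tag, attrib):
        if tag == 'start':
            self.skip = True
        if not self.skip:
            self.builder.start(tag, attrib)

    def data(self, data):
        if not self.skip:
            self.builder.data(data)

    def end(self, tag):
        if not self.skip:
            self.builder.end(tag)
        if tag == 'end':
            self.skip = False

    def close(self):
        self.skip = False
        return self.builder.close()


xml_str = (
    "<root><p>Hello World <start/>This is a paragraph</p>"
    "<p>Goodbye World. <end/>I'm leaving now</p></root>"
)

parser = etree.XMLParser(target=SkipStartEndTarget())
elem = etree.fromstring(xml_str, parser=parser)

result = etree.tostring(elem, encoding="unicode")
print(result)
```

The first p element is the one that survives: its closing tag is skipped, and the closing tag of the second p ends up closing it instead.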
  This also works with etree.XML() and etree.parse(), and you can also set the parser as the default parser with etree.set_default_parser() (which is probably not a good idea). One thing that may trip you up: even with etree.parse(), this will not return an ElementTree, but always an Element (as etree.XML() and etree.fromstring() do). I do not think this can be changed (yet), so if this is an issue for you, you will have to work around it somehow.
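One possible workaround, sketched here as an assumption rather than something the answer prescribes, is to wrap the returned Element in a new ElementTree yourself. PassThroughTarget is a hypothetical stand-in target that skips nothing, used only to keep the example short.

```python
import io

from lxml import etree


class PassThroughTarget:
    """Hypothetical minimal target that forwards every event to a TreeBuilder."""

    def __init__(self):
        self.builder = etree.TreeBuilder()

    def start(self, tag, attrib):
        self.builder.start(tag, attrib)

    def data(self, data):
        self.builder.data(data)

    def end(self, tag):
        self.builder.end(tag)

    def close(self):
        return self.builder.close()


parser = etree.XMLParser(target=PassThroughTarget())

# With a target parser, parse() returns whatever close() returned: an Element.
result = etree.parse(io.BytesIO(b"<root><p>Hello</p></root>"), parser)
got_element = etree.iselement(result)

# Wrap the Element to get an ElementTree object back.
tree = etree.ElementTree(result)
root_tag = tree.getroot().tag
```

The wrapped tree is built around the new element, so document-level details of the original input (such as a DOCTYPE) should not be expected to show up in it.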
  Note that it is also possible to build the tree from SAX events with lxml.sax, which is probably somewhat more difficult and slower. Unlike the above example, it will return an ElementTree, but I think it will not provide the .docinfo you would get when using etree.parse(). I also believe that it (currently) does not support comments and PIs. (I have not used it yet, so I cannot be more precise at this time.)
  Also keep in mind that any SAX-like approach requires that the document is still well-formed after everything between <start/> and <end/> has been skipped. That is the case in your example, but it would not be if the second <p> were, for example, a <p2>, as you would end up with <p>....</p2>.
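If you want to check that condition before filtering, one option is to compare the chains of ancestor tag names at the two markers: the skip only leaves matching open and close tags when both chains are identical. The sketch below is an assumption of how such a pre-check could look. marker_paths and skip_is_safe are hypothetical helper names, and only the first occurrence of each marker is considered.

```python
import io

from lxml import etree


def marker_paths(xml_bytes, start_tag="start", end_tag="end"):
    """Return the ancestor tag names of the first start and end markers."""
    stack, paths = [], {}
    events = etree.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, el in events:
        if event == "start":
            if el.tag in (start_tag, end_tag) and el.tag not in paths:
                paths[el.tag] = list(stack)  # ancestors open at this point
            stack.append(el.tag)
        else:
            stack.pop()
    return paths.get(start_tag), paths.get(end_tag)


def skip_is_safe(xml_bytes):
    """True if both markers exist and sit under identically named ancestors."""
    start_path, end_path = marker_paths(xml_bytes)
    return start_path is not None and start_path == end_path


good = (b"<root><p>Hello World <start/>This is a paragraph</p>"
        b"<p>Goodbye World. <end/>I'm leaving now</p></root>")
bad = (b"<root><p>Hello World <start/>This is a paragraph</p>"
       b"<p2>Goodbye World. <end/>I'm leaving now</p2></root>")

good_ok = skip_is_safe(good)
bad_ok = skip_is_safe(bad)
```

This costs an extra pass over the document, so it is mainly useful when you cannot rely on the input always having the markers in matching positions.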





