python - Is there a way to force lxml to parse Unicode strings that specify an encoding in a tag? -


I have an XML file that specifies an encoding, and I want to convert it to Unicode for Unicode (Memory For reasons I can not store it as a string). I later pass it to LXML but it denies the encoding specified in the file and parsing it as a Unicode, and it raises an exception.

How can I force LXL to parse the document? This behavior seems very restrictive.

You can not parse the Unicode string and can declare an encoding declaration in the string. Therefore, either you make it an encoded string (as you can not apparently store it as a string, you will have to re-sign it before parsing it. Or you can use it as a tree in the form of Unicode You can serial yourself with Limbal: etree.tostring (tree, encoding = Unicode) , without XML declaration. You can re-purge the result with etree.fromunicode

View

Edit: Apparently, you can not control the Unicode string already, and how it can be controlled. You have to encode it again, and it will have to provide parsers with encoding which You use:

  utf8_parser = etree.xmlParser (encoding = 'UTF-8') DRF Paras_frame_unicode (Unicode_st): s = Unicode_State. Encoded ('UTF-8') Return Entry .framestring (s, Parser = UTF 8_parres).  

This will ensure that whatever is inside the XML declaration is ignored, because the parser will always use utf-8.


Comments

Popular posts from this blog

Eclipse CDT variable colors in editor -

AJAX doesn't send POST query -

wpf - Custom Message Box Advice -