[FIX] base: encoding guessing of html module descriptions
I missed a critical issue in #133708: various users had discovered they could already fix description issues by adding an XML declaration to their document which is very cool (though technically not really valid). What is a lot less cool is that lxml gets *extremely* unhappy when asked to parse *strings* with an encoding declaration, raising a ValueError, so the purported fix breaks on any module which does that, which seems to include a lot of OCA modules. Gate the encoding guessing by bailing if the document has an XML declaration, in which case we just assume the author knows what they're doing and we leave them alone. For extra safety, check the encoding declaration in ascii and utf16. Could also have checked for BOMs, but lxml seems to not care about them overly much (in fact it seems to prefer them decoded which is odd). closes odoo/odoo#133918 Reported-by: @rezak400 X-original-commit: 51d37560 Signed-off-by:Xavier Morel (xmo) <xmo@odoo.com>
Please register or sign in to comment