[FIX] base: encoding guessing of html module descriptions

I missed a critical issue in #133708: various users had already
discovered they could fix description issues by adding an XML
declaration to their document, which is very cool (though technically
not really valid).

What is a lot less cool is that lxml gets *extremely* unhappy when
asked to parse *strings* (str rather than bytes) with an encoding
declaration, raising a ValueError. The purported fix therefore breaks
on any module which does that, which seems to include a lot of OCA
modules.

Gate the encoding guessing by bailing if the document has an XML
declaration, in which case we just assume the author knows what
they're doing and leave them alone. For extra safety, check for the
declaration in both ASCII and UTF-16. Could also have checked for
BOMs, but lxml seems to not care about them overly much (in fact it
seems to prefer them decoded, which is odd).
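
A rough sketch of that gate, not the code from this commit:
`has_xml_declaration` is a hypothetical helper, and the exact set of
byte prefixes it looks for is an assumption made for illustration.

```python
import codecs

_DECL = '<?xml'

# Byte spellings of the declaration start: ASCII-compatible and UTF-16,
# each with and without a BOM (assumed set, for illustration only).
_BYTE_PREFIXES = (
    _DECL.encode('ascii'),
    codecs.BOM_UTF8 + _DECL.encode('ascii'),
    _DECL.encode('utf-16-le'),
    codecs.BOM_UTF16_LE + _DECL.encode('utf-16-le'),
    _DECL.encode('utf-16-be'),
    codecs.BOM_UTF16_BE + _DECL.encode('utf-16-be'),
)


def has_xml_declaration(doc):
    """Hypothetical helper: True if `doc` (str or bytes) starts with `<?xml`."""
    if isinstance(doc, str):
        # A decoded document may still carry a leading BOM character.
        return doc.lstrip('\ufeff').startswith(_DECL)
    return bytes(doc).startswith(_BYTE_PREFIXES)
```

A caller would skip the guessing and hand the document over untouched
whenever this returns True.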

closes odoo/odoo#133900

Reported-by: @rezak400
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>