Commit 10ad2506 authored by Xavier Morel

[FIX] base: encoding guessing of html module descriptions


I missed a critical issue in #133708: various users had discovered
they could already fix description issues by adding an XML declaration
to their document, which is very cool (though technically not really
valid).

What is a lot less cool is that lxml gets *extremely* unhappy when
asked to parse *strings* with an encoding declaration, raising a
ValueError, so the purported fix breaks on any module which does that,
which seems to include a lot of OCA modules.

Gate the encoding guessing by bailing if the document starts with an
XML declaration, in which case we just assume the author knows what
they're doing and leave the bytes alone. For extra safety, check for
the declaration in its ASCII-compatible (UTF-8) form as well as in
UTF-16 (both big- and little-endian). Could also have checked for
BOMs, but lxml seems not to care about them overly much (in fact it
seems to prefer them decoded, which is odd).
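
The gating described above can be sketched in isolation, without lxml. This is only an illustration: `prepare_description` is a hypothetical helper name (the actual change is inline in the module code), and the sample documents are made up. It shows which inputs would reach the parser as raw bytes and which as decoded text.

```python
# Byte prefixes of an XML declaration in the encodings the commit checks.
XML_DECLARATION = (
    '<?xml version='.encode('utf-8'),
    '<?xml version='.encode('utf-16-be'),
    '<?xml version='.encode('utf-16-le'),
)


def prepare_description(doc):
    """Return what would be handed to the HTML parser (hypothetical helper)."""
    if doc.startswith(XML_DECLARATION):
        # The author declared something: keep the raw bytes untouched, since
        # lxml rejects str input that carries an encoding declaration.
        return doc
    try:
        # No declaration: assume UTF-8 and hand over decoded text.
        return doc.decode('utf-8')
    except UnicodeDecodeError:
        # Not valid UTF-8: keep the bytes and let the parser guess.
        return doc


# Made-up sample inputs.
declared = b'<?xml version="1.0" encoding="iso-8859-1"?><p>caf\xe9</p>'
plain_utf8 = '<p>café</p>'.encode('utf-8')
plain_latin1 = '<p>café</p>'.encode('iso-8859-1')

for sample in (declared, plain_utf8, plain_latin1):
    print(type(prepare_description(sample)).__name__)
```

`bytes.startswith` accepts a tuple of prefixes, which is why a single call covers all three encodings. Note that a UTF-16 document starting with a BOM would not match any of these prefixes, consistent with the commit not checking for BOMs.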

closes odoo/odoo#133918

Reported-by: @rezak400
X-original-commit: 51d37560
Signed-off-by: Xavier Morel (xmo) <xmo@odoo.com>
parent 9aeae36e
@@ -147,6 +147,12 @@ STATES = [
     ('to install', 'To be installed'),
 ]
 
+XML_DECLARATION = (
+    '<?xml version='.encode('utf-8'),
+    '<?xml version='.encode('utf-16-be'),
+    '<?xml version='.encode('utf-16-le'),
+)
+
 class Module(models.Model):
     _name = "ir.module.module"
     _rec_name = "shortdesc"
@@ -180,11 +186,12 @@ class Module(models.Model):
             if path:
                 with tools.file_open(path, 'rb') as desc_file:
                     doc = desc_file.read()
-                    try:
-                        contents = doc.decode('utf-8')
-                    except UnicodeDecodeError:
-                        contents = doc
-                    html = lxml.html.document_fromstring(contents)
+                    if not doc.startswith(XML_DECLARATION):
+                        try:
+                            doc = doc.decode('utf-8')
+                        except UnicodeDecodeError:
+                            pass
+                    html = lxml.html.document_fromstring(doc)
                     for element, attribute, link, pos in html.iterlinks():
                         if element.get('src') and not '//' in element.get('src') and not 'static/' in element.get('src'):
                             element.set('src', "/%s/static/description/%s" % (module.name, element.get('src')))