[daisy] HtmlCleaner : strange case ?
christophe blin
cblin at tennaxia.com
Thu Nov 9 04:16:43 CST 2006
Hi,
I am looking for HTML cleaners and found that the one in daisy is
particularly good:
- no fucking regexps all over the place
- clean configuration via XML
- nice test cases
So I tried some cases and found that the following seems to behave
strangely:
cleaner = template.newHtmlCleaner();
result = cleaner.cleanToString("<html><body><p><ul><li><p>hello!</p></li></ul></p></html>");
I was expecting something like:
<ul>
<li>
hello!
</li>
</ul>
but the cleaner returns:
<ul>
<li/>
</ul>
<p>hello!</p>
What I find pretty strange is that the p is moved out of the li.
IMHO, the only mistake in the input is that p is forbidden inside a li (i.e. it
is unlikely that the user wants to end up with an empty li).
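To make the expected result concrete, here is a minimal sketch that uses only the JDK's DOM and transformer APIs, not daisy's cleaner. The class ListItemUnwrap and its unwrap method are hypothetical names made up for this illustration. It assumes the input is already well-formed XML, and it only shows the "drop the p but keep its content inside the li" behavior described above. It says nothing about how the daisy cleaner is implemented.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of daisy: illustrates the expected output only.
class ListItemUnwrap {

    // Replaces every <p> that is a direct child of an <li> with the <p>'s own children.
    static String unwrap(String wellFormedXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(wellFormedXml)));

            // Copy the live NodeList first, because the tree is modified below.
            NodeList found = doc.getElementsByTagName("p");
            List<Node> paragraphs = new ArrayList<>();
            for (int i = 0; i < found.getLength(); i++) {
                paragraphs.add(found.item(i));
            }

            for (Node p : paragraphs) {
                Node parent = p.getParentNode();
                if (parent == null || !"li".equals(parent.getNodeName())) {
                    continue; // leave paragraphs outside list items alone
                }
                // Move the children of <p> in front of it, then drop the empty <p>.
                while (p.getFirstChild() != null) {
                    parent.insertBefore(p.getFirstChild(), p);
                }
                parent.removeChild(p);
            }

            // Serialize the document back to a string without the XML declaration.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not process input", e);
        }
    }
}
```

With this sketch, ListItemUnwrap.unwrap("<ul><li><p>hello!</p></li></ul>") returns <ul><li>hello!</li></ul>, i.e. the text stays inside the li instead of leaving an empty li behind.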
I am currently searching for where this behavior comes from, but if you have
any hints, do not hesitate to post them here.
Best regards,
chris