[daisy] Importing existing static HTML sites into daisy

Bruno Dumon bruno at outerthought.org
Fri Jan 4 02:44:58 CST 2008


On Fri, 2008-01-04 at 09:22 +0100, Paul Focke wrote:
> Happy New Year
> 
> If I could give you one word of advice, sunscreen would be it. Well not
> really sunscreen but I'd remember to use the htmlcleaner on the html
> before dumping it in daisy. Have the cleaner go over all the html files
> first on a test run and have it report any errors since you can
> sometimes find some really weird html out there.
> If you feel comfortable doing this in javascript it is quite easy. I
> remember doing something vaguely similar for a project we worked on
> here. But it didn't rewrite links (<a href=""> & <img src="">) like you
> would need to (I'm assuming that there might be links between the
> pages). I guess that might be a tricky part of the exercise.
> 

Not so tricky, but a bit of extra work. The work has to be done in
two steps: first import all the documents into Daisy, remembering the
document ID assigned to each path. In a second step, translate the
links in the documents using this information. And to make it complete,
report broken links.
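
A minimal sketch of those two steps in Python, just to illustrate the
idea. Everything named here is an assumption for illustration:
create_document is a hypothetical callback standing in for whatever
actually stores a page in Daisy and returns its ID, and the
"daisy:{id}" link format is a parameter you would adjust to your setup.
A regex is used for brevity; a real importer would run on cleaned HTML
and use a proper parser.

```python
import posixpath
import re
from urllib.parse import urlsplit

# Matches double-quoted href="..." and src="..." attribute values.
LINK_RE = re.compile(r'(?P<attr>\b(?:href|src))="(?P<url>[^"]*)"')


def import_site(pages, create_document):
    """Step 1: import every page and remember path -> assigned document ID.

    `pages` maps a site-relative path to its HTML. `create_document` is a
    hypothetical callback (not a Daisy API) that stores one page and
    returns the ID assigned to it.
    """
    return {path: create_document(path, html) for path, html in pages.items()}


def rewrite_links(path, html, ids, link_format="daisy:{id}"):
    """Step 2: translate internal links in one page and collect broken ones.

    Returns (rewritten_html, broken_links). `link_format` is an assumption.
    """
    broken = []

    def replace(match):
        url = match.group("url")
        parts = urlsplit(url)
        # Leave external links, pure fragments, and empty values untouched.
        if parts.scheme or parts.netloc or not parts.path:
            return match.group(0)
        # Resolve the link relative to the directory of the current page.
        joined = posixpath.join(posixpath.dirname(path), parts.path)
        target = posixpath.normpath(joined).lstrip("/")
        if target not in ids:
            broken.append(url)
            return match.group(0)
        new_url = link_format.format(id=ids[target])
        if parts.fragment:
            new_url += "#" + parts.fragment
        return '{}="{}"'.format(match.group("attr"), new_url)

    return LINK_RE.sub(replace, html), broken
```

Keeping the two passes separate means a link to a page that is imported
later still resolves, since the path-to-ID map is complete before any
link is rewritten. Whatever ends up in the broken list is your report.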

This question about an import tool seems to come up often. Maybe a
nice project for someone?

BTW, the code Helma refers to is in applications/jspwiki_import. It's
quite specific to one particular JSPWiki installation though.

-- 
Bruno Dumon                             http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
bruno at outerthought.org                          bruno at apache.org

