[daisy] Starter question: move existing content into Daisy?
Kealy, John
John.Kealy at ucsf.edu
Wed Jun 25 18:00:36 CEST 2008
Geert,
We are looking at contributing a Java program that scrapes content from
an existing website into Daisy. It's not perfect, but it's easier than
having people cut and paste the HTML into Daisy.
We will complete the scraper this week and then I hope to check it in.
Of course it could use more refinement, but it's another step on the
iterative ladder to importing existing hand-coded HTML sites into Daisy.
Cheers,
John Kealy
-----Original Message-----
From: daisy-bounces at lists.cocoondev.org
[mailto:daisy-bounces at lists.cocoondev.org] On Behalf Of Paul Focke
Sent: Wednesday, June 25, 2008 6:56 AM
To: Daisy: open source CMS - general mailinglist
Subject: Re: [daisy] Starter question: move existing content into Daisy?
Hi Geert
I know of cases where a bunch of legacy HTML has been dumped into Daisy.
However, there is no application that reads in a bunch of HTML files and
imports them. Usually this is done by writing a little Java code or
JavaScript. A good place to start might be the Daisy source code. There
is an application there that was used to import content from JSPWiki,
which might be a good starting point for your case. It can be found in
<daisy_src>/applications/jspwiki_import
Things you might want to look out for:
- Daisy is a bit picky when it comes to the HTML it accepts. That's why
there is an HTML cleaner component that cleans up the HTML.
- If you use images you should give that some thought too. These can
also be stored in Daisy as documents and be referenced from the HTML
document.
- Links between documents will have to be rewritten.
- It's always a good idea to keep a reference to the old URLs of your
documents. A good place to store these in a Daisy document is in a
custom field (look in the misc tab of the editor).
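To illustrate the link rewriting point, here is a minimal sketch in plain
Java, not taken from the Daisy source. It assumes you have already created
one Daisy document per HTML file and collected a map from each old relative
path to its new link target. The class name LinkRewriter, the method
rewriteLinks, the file names, and the "daisy:101"-style targets are all
placeholders for illustration; creating the documents through the
repository API is not shown, and the regex only handles double-quoted href
attributes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LinkRewriter {
    // Matches href="path#fragment" (double quotes only); the fragment is optional.
    private static final Pattern HREF =
            Pattern.compile("href=\"([^\"#]*)(#[^\"]*)?\"");

    /**
     * Replaces every href whose path is a key in oldPathToNewTarget with the
     * mapped target, keeping any #fragment. Other links are left untouched.
     */
    public static String rewriteLinks(String html, Map<String, String> oldPathToNewTarget) {
        Matcher m = HREF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String path = m.group(1);
            String fragment = m.group(2) == null ? "" : m.group(2);
            String target = oldPathToNewTarget.get(path);
            String replacement = (target == null)
                    ? m.group()                                   // unknown link: keep as-is
                    : "href=\"" + target + fragment + "\"";       // known link: point at new target
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical mapping built during a first pass over the old files.
        Map<String, String> targets = new LinkedHashMap<>();
        targets.put("intro.html", "daisy:101");
        targets.put("setup.html", "daisy:102");

        String html = "<p>See <a href=\"setup.html#install\">setup</a> and "
                + "<a href=\"http://example.com/\">this site</a>.</p>";
        System.out.println(rewriteLinks(html, targets));
    }
}
```

The same map doubles as the record of old URLs: each key is the value you
would store in the custom field of the corresponding document. Doing the
import in two passes (first create all documents and fill the map, then
rewrite and save the HTML) avoids links to documents that do not exist yet.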
hth
Paul
On Wed, 2008-06-25 at 15:16 +0200, COELMONT, Geert wrote:
> Hi all,
>
> I am new to Daisy. I have installed and launched it without problems,
> and done some basic experimenting with new documents etc. All works
> nicely.
>
> However, the main reason for using Daisy was for us to better manage a
> pile of existing, hand-written HTML documents.
> They are about 300 HTML files, with frequent links referring from one
> to the other, and some basic markup.
> How do I get these into the Daisy repository?
>
> I found the export/import utility, but this is only suited for
> daisy-to-daisy importing.
> As far as I can see, no importing mechanism exists that allows a set
> of simple, plain HTML files to be imported quickly into Daisy,
> rewriting hyperlinks where appropriate. This would involve creating
> an empty document with corresponding document-ID for each of the
> files, importing the existing file into this new document, and
> rewriting all the links inside it to new urls, to reflect the new
> document-ID of the linked documents.
>
> Does anyone know whether such a feature exists somewhere?
> Probably nothing rocket science, but before I re-invent the wheel etc
> etc...
> If nothing exists, any ideas on how to best handle this are
> appreciated!
>
> Thanks in advance
>
> Geert
>
> _______________________________________________
> daisy community mailing list
> Professional Daisy support:
> http://outerthought.org/en/services/daisy/support.html
> mail to: daisy at lists.cocoondev.org
> list information: http://lists.cocoondev.org/mailman/listinfo/daisy