[daisy] Some feedback/questions

Mindaugas Idzelis idzelis at us.ibm.com
Tue Nov 14 15:28:33 CST 2006


I'm having growing pains with Daisy. Our database dump file is now 16 MB. 
Our blobstore directory is 1.8 GB and holds 33123 files. A couple of 
things come to mind. It would be nice if each version of a file was stored 
as a diff against the previous version. This may make things more space 
efficient. Storing the blobs in the database itself may also improve space 
utilization and file system access. Having this many small files around 
wastes a lot of slack space on the drive (each file's allocation is 
rounded up to the next 4K boundary, so a 1K file takes up 4K of space, 
etc.). It also takes a long time to copy that many files. It might also 
make the backup simpler to perform, since you would not have to back up 
the files separately. It might also make a "hot backup" possible: if you 
perform the mysqldump using "--single-transaction", it should not back up 
any new blob entries that were added after the start of the backup, so 
you get a consistent snapshot of the data. This is important to us 
because, using the backup tool, it takes over an hour to back up the 
database, during which time no new documents can be saved. 
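
To put a number on the slack-space point above, here is a minimal sketch 
in Python that rounds each file size up to a block boundary and sums the 
difference. The 4096-byte block size and the function names are my own 
assumptions; real file systems may differ (some store very small files 
inline), so treat the result as an estimate.

```python
import os

def allocated_size(file_size, block_size=4096):
    """Bytes reserved on disk if every file occupies whole blocks (assumed model)."""
    blocks = (file_size + block_size - 1) // block_size  # round up
    return blocks * block_size

def slack_bytes(file_sizes, block_size=4096):
    """Total space lost to rounding for the given file sizes."""
    return sum(allocated_size(size, block_size) - size for size in file_sizes)

def directory_slack(path, block_size=4096):
    """Walk a directory tree (e.g. the blobstore) and estimate its slack space."""
    sizes = []
    for root, _dirs, files in os.walk(path):
        for name in files:
            sizes.append(os.path.getsize(os.path.join(root, name)))
    return slack_bytes(sizes, block_size)
```

Calling directory_slack() on the blobstore directory would give a rough 
idea of how much space the rounding actually costs. 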

If the repository is locked, a new document can still be created - but not 
saved. This thoroughly confuses our users. 

The following message is also not very user-friendly. Users don't know 
what continuations, repository servers, or blobstores are. When the blob 
write lock is enabled, it should say that the site is currently undergoing 
a backup, and ask the user to go back and copy/paste the document into 
Notepad temporarily until the site is back online. (An ETA would also be 
helpful, maybe as simple as timing the last backup and using that value.) 

Sitemap: error calling continuation
Received exception from repository server.
Problem storing document.
Error storing part data to blobstore.
Write access to the blobstore is currently disabled. Try again later. 
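
The ETA idea could be as simple as the sketch below (Python, for 
illustration only). It assumes the application records when the current 
backup started and how long the previous one took; the function name and 
the message wording are hypothetical, not anything Daisy provides.

```python
from datetime import datetime, timedelta

def backup_notice(backup_started, last_backup_duration, now):
    """Build a user-facing message with an ETA based on the previous backup's duration."""
    remaining = (backup_started + last_backup_duration) - now
    if remaining <= timedelta(0):
        # The backup is running longer than last time; avoid promising a time.
        return ("The site is currently being backed up and should be back "
                "online shortly. Please keep a copy of your document.")
    minutes = max(1, round(remaining.total_seconds() / 60))
    return ("The site is currently being backed up. Saving should work again "
            f"in about {minutes} min. Please keep a copy of your document.")
```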

The backup tool does some unnecessary copying. First, it copies all the 
files to the target. Then it zips all those files up, then it deletes the 
copy. Copying and even deleting 30K+ files takes a long time, especially 
if you are copying these files to a remote filesystem. It also requires a 
lot of temporary space. Instead of copying, zipping the files directly 
would be the best idea. 
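
As a rough illustration of zipping directly, the following Python sketch 
streams each file from the source directory straight into the archive, 
with no intermediate copy. It is not how the Daisy backup tool works; the 
function name is hypothetical, and it assumes the zip file is written 
somewhere outside the directory being archived.

```python
import os
import zipfile

def zip_directory(source_dir, zip_path):
    """Add every file under source_dir to zip_path and return the file count."""
    count = 0
    # allowZip64 lets the archive grow beyond the classic zip size limits.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED, allowZip64=True) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                # Store paths relative to source_dir so the archive is relocatable.
                archive.write(full_path, os.path.relpath(full_path, source_dir))
                count += 1
    return count
```

The returned count could also feed a progress or summary message. 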

The backup tool doesn't provide a lot of feedback. It should be possible 
to determine progress, maybe controlled with a --verbose flag? Timing 
information would be nice to have. Emails on success would also be nice to 
have (as a command-line option?). 

Bug with display of queries: say you have a document type that contains a 
multi-valued field. Some documents have no values in this field, and some 
have more than one value. You create a query to display the multi-valued 
field. The table that is created by this query doesn't generate table 
cells for these "null" multi-valued fields, shifting all the following 
columns off by one and severely affecting the display. 
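
The behaviour I would expect is sketched below in Python (purely 
illustrative; the function name and the data layout are made up, not 
Daisy's): every column gets a cell, and a missing or empty multi-valued 
field yields an empty cell instead of none at all.

```python
from html import escape

def render_row(columns):
    """Render one result row; each column is a list of values, an empty list, or None."""
    cells = []
    for values in columns:
        # A "null" multi-valued field still produces an (empty) cell.
        text = ", ".join(escape(str(value)) for value in (values or []))
        cells.append(f"<td>{text}</td>")
    return "<tr>" + "".join(cells) + "</tr>"
```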

Weirdness with the publisherResponse in skins. In my custom 
document-to-html.xsl template, I inherit the base, and add some 
customization. One thing I want to do is display the last modified date of 
the document under the title. The only way I could generate an XPath to 
pick out the field I wanted was by doing something like this:

 <xsl:variable name="lastModified" 
select="/document/p:publisherResponse/d:document/@*[position()=13]"/>

I think this is because the attributes of the included document have no 
namespace (XSLT is wacky). The following did not work: 


 <xsl:variable name="lastModified" 
select="/document/p:publisherResponse/d:document/@variantlastmodified"/>
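
A possible explanation, offered with caution: unprefixed attributes are 
in no namespace, so an unprefixed @name test should normally match them, 
but XPath names are case-sensitive. If the attribute is actually spelled 
variantLastModified (an assumption, not checked against real publisher 
output), the all-lowercase name would select nothing. The Python sketch 
below shows the effect on a made-up document; the namespace URIs and the 
attribute value are placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real namespace URIs and attributes may differ.
SAMPLE = """<document xmlns:p="urn:example:publisher" xmlns:d="urn:example:daisy">
  <p:publisherResponse>
    <d:document id="1" variantLastModified="2006-11-14T15:28:33-06:00"/>
  </p:publisherResponse>
</document>"""

NAMESPACES = {"p": "urn:example:publisher", "d": "urn:example:daisy"}

def document_attribute(xml_text, attribute_name):
    """Return the named attribute of the embedded d:document, or None if absent."""
    root = ET.fromstring(xml_text)
    doc = root.find("p:publisherResponse/d:document", NAMESPACES)
    # Unprefixed attributes carry no namespace, so a plain name lookup is enough.
    return doc.get(attribute_name)
```

Here document_attribute(SAMPLE, "variantLastModified") finds the value, 
while the lowercase spelling returns None. 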

One more thing that would be nice for skinning: a utility to transform 
those pesky XSLT-formatted dates into normal date formats. I haven't tried 
it, but 
http://www-128.ibm.com/developerworks/java/library/x-xalanextensions.html 
looks promising. It turns out you can call Java methods directly from 
Xalan. A little trickery with SimpleDateFormat should be the way to go. 
This would make a good addition to util.xsl: a template to format an 
XSLT-format date into a locale-specific date. 
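
The conversion itself is small. The sketch below shows the idea in Python 
rather than through a Xalan extension and SimpleDateFormat, so it is only 
an illustration of the transformation, not a drop-in for util.xsl. It 
assumes the dates arrive in an ISO 8601 form such as 
2006-11-14T15:28:33-06:00; the function name and the default output 
pattern are my own choices.

```python
from datetime import datetime

def format_xml_datetime(value, pattern="%Y-%m-%d %H:%M"):
    """Parse an ISO 8601 date-time string and render it with a strftime pattern."""
    parsed = datetime.fromisoformat(value)
    # The time is shown as written in the input; no time zone conversion is done.
    return parsed.strftime(pattern)
```

A locale-specific result would come from passing a different pattern per 
locale. 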

One last comment. Keeping custom modifications (skins, config) away from 
the Daisy install location has come a LONG way since Daisy 1.2. But there 
are still 2 things that I have to change every time I upgrade. 

1) I need to make a "work" directory inside of /daisywiki/webapp/WEB-INF/ 
so that Jetty doesn't store temp files in /tmp. (Red Hat deletes them 
every 30 days, and after that file uploads don't work.) 

2) Drop my custom authentication scheme jar into /lib/daisy/jars/, then 
modify /repository-server/conf/block.xml and add 

<include name="ibmauth" id="daisy:my_auth" version="1.5"/>

to <container name="authentication">. 

It would be nice if I could somehow specify these in myconfig.xml, and 
maybe have a directory that is put on the classpath for additions like 
this. 



Thanks,

Mindaugas Idzelis

