[daisy] Fulltext search enhancements

Nick dos Remedios nick at cambia.org
Tue Aug 1 19:35:18 CDT 2006


On 02/08/2006, at 2:58 AM, Bruno Dumon wrote:
> Also in Daisy 2.0, there are now some important enhancements to the full
> text search (implemented by Paul, I'm just writing this since he seems
> to have forgotten it ;-) ):
>
>  * it is possible to retrieve relevant fragments from the found
> documents, with matching query words highlighted. This is simply done
> using a function in query language, taking as a parameter the number of
> fragments you want to retrieve.
>
>  * the score of the fulltext search is now accessible via an identifier
> in the query language
>
>  * it is possible to retrieve chunks from query results (to show
> paginated results, was previously only available for the faceted search)
>
>  * the fulltext search page in the wiki has been enhanced to make use of
> these new features
>
>  * upgraded the Lucene engine to version 2.0
>
>  * the problem with the "too many open files" mentioned recently on this
> list is also solved (and this also in the 1.5 branch)
>
> For people working on svn trunk: these changes mean you'll have to
> rebuild your fulltext index: delete the content of the indexstore
> directory and trigger rebuilding via the JMX console.
>
> While I'm at it, another (unrelated) change (which was needed for the
> import/export tools): using the remote Java API no longer requires that
> you have a user with the Administrator role (for the "cache user"),
> which is also a rather important improvement.
>
> -- 
> Bruno Dumon                             http://outerthought.org/

One feature I'd like to see is the ability to search pages for a
fragment of (Daisy) HTML.

I apologise if this is already possible; I have tried to find such a
feature in the past but was not successful. It seems (to me) that
Lucene only indexes the rendered text, not the HTML code(?).

My current workaround is to search directly against the repo server
blobstore directory.
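
For anyone wanting to try the same thing, here is a minimal sketch of
that kind of search in Python. It is only an illustration: the function
name find_fragment and the path in the usage note are made up, and it
assumes the blobs are stored as plain, uncompressed files, so that a raw
byte match against the HTML source is meaningful.

```python
import os


def find_fragment(root, fragment, encoding="utf-8"):
    """Return the sorted paths of files under root that contain fragment.

    Hypothetical helper: it does a raw byte search, so it only matches
    markup stored exactly as typed (same quoting, spacing and case).
    """
    needle = fragment.encode(encoding)
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Read as bytes so files in any encoding can be scanned.
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        matches.append(path)
            except OSError:
                # Skip files that cannot be read (permissions, races).
                continue
    return sorted(matches)
```

Something like find_fragment("/path/to/blobstore", '<table class="x">')
would then list the matching blob files. Note that this reads each file
fully into memory and returns file paths only; mapping a blob back to
its document still has to be done separately.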

Nick

