[daisy] [discussion] Paging of documents (without ever loading them all into memory) instead of maxClauseCount error

Geoffrey De Smet ge0ffrey.spam at gmail.com
Thu May 3 08:35:01 CDT 2007


In our application a user might enter "d*" as a query.
That matches 5000+ documents, and returning them could take minutes.

If the user sees the first 20 matches quickly,
and is able to move on to matches 20 to 40,
he'd probably never look at the 4960+ other documents.


Faceted browsing currently has some "paging" support,
but it looks like Daisy fetches all the results into memory.

I'd like to question whether it's necessary to fetch them all into memory:
- The MySQL JDBC driver has scrollable result set support (which is paging 
support at the MySQL level). If MySQL has the correct indexes, it doesn't 
need to load all results of a query into memory to deliver the first 20.
- Hibernate-JPA supports paging, if you're using a scrollable JDBC driver.
- Does Lucene have paging/scrolling support?

- Hibernate-Search (which combines JPA with Lucene) apparently already 
pulls off real scrollable paging:
   luceneSession.createLuceneQuery(luceneQuery).scroll()
Somehow they seem to have figured out how to combine Lucene WHERE clauses 
and MySQL WHERE clauses without loading all results into memory.

- Of course, Daisy would also need to work the ACL checks into it...
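
To make the idea concrete, here is a rough sketch in plain Java of what I mean. It is not Daisy code: LazyPager, its page() method and the mayRead predicate are names I made up for illustration. The Iterator stands in for whatever scrollable source is available (a JDBC result set, Lucene hits, ...), and the predicate stands in for the ACL check. The point is only that the loop stops pulling hits as soon as one page is full.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper (not part of Daisy): builds a single page of results
// from a lazily produced sequence of hits, checking access per hit.
final class LazyPager {

    private LazyPager() {}

    /**
     * Returns up to pageSize readable hits, after skipping the first
     * `offset` readable ones. Hits rejected by mayRead count neither
     * towards the offset nor towards the page.
     */
    static <T> List<T> page(Iterator<T> hits, Predicate<? super T> mayRead,
                            int offset, int pageSize) {
        if (offset < 0 || pageSize < 0) {
            throw new IllegalArgumentException("offset and pageSize must be >= 0");
        }
        List<T> result = new ArrayList<>(pageSize);
        int skipped = 0;
        // Stop as soon as the page is full: the rest is never fetched.
        while (result.size() < pageSize && hits.hasNext()) {
            T hit = hits.next();
            if (!mayRead.test(hit)) {
                continue; // assumed ACL check: user may not read this hit
            }
            if (skipped < offset) {
                skipped++; // readable, but belongs to an earlier page
                continue;
            }
            result.add(hit);
        }
        return result;
    }
}
```

With this, something like LazyPager.page(hits, acl, 20, 20) would give matches 20 to 40 while only touching the hits up to the last one on that page. The obvious drawback is that later pages have to re-scan (and re-check the ACL for) the earlier hits, so this only pays off if users rarely go past the first few pages.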

-- 
With kind regards,
Geoffrey De Smet


