[daisy] [JIRA] Commented: (DSY-376) Hot backup of repository

Bruno Dumon (JIRA) issues at cocoondev.org
Mon Jun 18 09:12:50 CDT 2007


    [ http://issues.cocoondev.org//browse/DSY-376?page=comments#action_13230 ] 

Bruno Dumon commented on DSY-376:
---------------------------------

This issue could be solved by a more intelligent lock mode for the blobstore, exploiting the fact that the blobstore never updates blobs: it only adds new blobs and sometimes removes existing ones.

From the point of view of the backup, the important thing is that no data is lost:
 - if new blobs are added during the backup, this is not really a problem. In case a backup is restored, a tool could be run to check for, and remove, redundant (= non-referenced) blobs. [to check: would copying the blobstore fail if files are concurrently being written?]
 - blobs which are requested to be deleted during backup could be added to a queue, to be processed after the backup lock is released.
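
The two points above can be sketched roughly as follows. This is only an illustration in Python with an in-memory queue; the class and method names (DeferredDeleteBlobStore, begin_backup, end_backup) are made up for the example and are not Daisy's blobstore API. It assumes a flat directory in which the file name is the blob key.

```python
import os
import threading


class DeferredDeleteBlobStore:
    """Hypothetical blobstore: blobs are never updated, deletes wait during a backup."""

    def __init__(self, directory):
        self.directory = directory
        self._lock = threading.Lock()
        self._backup_active = False
        # In-memory only: these entries are lost if the process is killed.
        self._pending_deletes = []

    def _path(self, key):
        return os.path.join(self.directory, key)

    def _remove_if_present(self, key):
        try:
            os.remove(self._path(key))
        except FileNotFoundError:
            pass  # already gone, e.g. the key was queued twice

    def store(self, key, data):
        # Adding new blobs is always allowed, also while a backup is running.
        with open(self._path(key), "wb") as f:
            f.write(data)

    def delete(self, key):
        """Delete now, or queue the delete if a backup is running.

        Returns True if the blob was removed immediately, False if deferred.
        """
        with self._lock:
            if self._backup_active:
                self._pending_deletes.append(key)
                return False
            self._remove_if_present(key)
            return True

    def begin_backup(self):
        with self._lock:
            self._backup_active = True

    def end_backup(self):
        """Release the backup lock and process the queued deletes."""
        with self._lock:
            self._backup_active = False
            pending, self._pending_deletes = self._pending_deletes, []
            for key in pending:
                self._remove_if_present(key)
        return pending
```

Between begin_backup() and end_backup() the set of files in the directory can only grow, so a plain file copy never loses a blob that the backed-up database still refers to.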

The main problem is how to keep track of this queue:
 - in-memory queue: would be lost when the server is killed unexpectedly. Could be solved with the aforementioned garbage-cleanup tool. [running such tools is extra admin effort/knowledge, so should be avoided if possible]
 - queue stored in the database or in a file in the blobstore: avoids the problem of queue entries being lost, but might be a problem when a backup is restored, since the queue would also be part of the backup and hence the state of the queue as it was during the backup would be restored. This could be solved by allowing the blobstore to check with the repository whether a key is still in use. [would introduce a two-way dependency between blobstore and repository, unless we check directly on the DB]
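
A rough sketch of the file-based variant with the "still in use" check, again in Python and purely illustrative. The queue file format (one key per line) and the function names are assumptions, and is_key_in_use is a hypothetical callback standing in for either the repository or a direct DB query.

```python
import os


def enqueue_delete(queue_path, key):
    """Append a key to the on-disk delete queue and flush it to disk."""
    with open(queue_path, "a", encoding="utf-8") as f:
        f.write(key + "\n")
        f.flush()
        os.fsync(f.fileno())  # survive an unexpected server kill


def process_delete_queue(queue_path, blob_dir, is_key_in_use):
    """Delete queued blobs, skipping keys that are still referenced.

    The check matters after a restore: the queue file comes from the backup
    and may list keys that the restored database still uses.
    Returns the list of keys whose blob files were actually removed.
    """
    if not os.path.exists(queue_path):
        return []
    with open(queue_path, encoding="utf-8") as f:
        keys = [line.strip() for line in f if line.strip()]
    removed = []
    for key in keys:
        if is_key_in_use(key):
            continue  # stale queue entry, keep the blob
        path = os.path.join(blob_dir, key)
        if os.path.exists(path):
            os.remove(path)
            removed.append(key)
    os.remove(queue_path)  # queue fully processed
    return removed
```

Passing the check in as a callback keeps the blobstore code free of a compile-time dependency on the repository, though at runtime the dependency is of course still there.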

Note about the blobstore-cleanup tool: care should be taken that, if the repository server is running, the tool doesn't remove blobs for documents that are just being added (= non-committed DB transactions). This could be solved, for example, by only considering blobs that are older than some threshold, say one day.
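
The selection step of such a cleanup tool could look something like the sketch below (Python, illustrative only). It assumes a flat blob directory where the file name is the blob key, uses the file modification time as the blob's age, and takes the set of referenced keys as input; how that set is obtained from the repository or the DB is left open. The function name find_orphaned_blobs is made up.

```python
import os
import time

ONE_DAY_SECONDS = 24 * 60 * 60


def find_orphaned_blobs(blob_dir, referenced_keys,
                        min_age_seconds=ONE_DAY_SECONDS, now=None):
    """Return keys of blobs that are unreferenced AND older than the threshold.

    Young blobs are skipped even if unreferenced, since they may belong to a
    document whose DB transaction has not been committed yet.
    """
    now = time.time() if now is None else now
    orphans = []
    for name in sorted(os.listdir(blob_dir)):
        path = os.path.join(blob_dir, name)
        if not os.path.isfile(path):
            continue
        if name in referenced_keys:
            continue
        age = now - os.path.getmtime(path)
        if age >= min_age_seconds:
            orphans.append(name)
    return orphans
```

The function only reports candidates; actually removing them is left to the caller, so the list can be reviewed first.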

> Hot backup of repository
> ------------------------
>
>          Key: DSY-376
>          URL: http://issues.cocoondev.org//browse/DSY-376
>      Project: Daisy
>         Type: Improvement
>   Components: Backup
>     Versions: 1.5
>     Reporter: Min Idzelis

>
> We have several distributed teams accessing our Daisy document repository from different timezones. Our repository is large, and takes about an hour to back up. During this time, new documents can't be created, changed, etc. because the repository is locked. Because someone may always want access to the repository, there is no good time to back up the repository. The best solution would be a way to back up the repository without locking it for writes.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.cocoondev.org//secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


