[daisy] Daisy Questions

Marc Portier mpo at outerthought.org
Mon Aug 7 02:04:01 CDT 2006



Paul VanWechel wrote:
> Hi,
> 
> I have three questions:
> 

only three? you lucky man!

> 1. Will the planned jbpm support allow the use of the import/export
> command line tool as a component in a workflow? e.g. other command line
> tools may optionally be invoked for additional processing on a batch of
> exported data, with processed data then roundtripped back into the
> repository. We were unsure if jBPM had a component that supported shell
> calls though.
> 

the new import/export tools will probably be called through the CLI 95%
of the time; however, that didn't prevent Bruno from giving them a solid
design: the logic is split into separate classes, so you can easily call
them from your own Java code.

anyway, I'd like to understand the workflow use case you're seeing. I
have to admit export/import was probably one of the last items I would
expect to be part of a workflow:

In general my own idea would be that typical workflows would be more
oriented towards coordinating activities amongst document authors, not
system administrators (import/export is more an administrative kind of
operation in my head)

Having said that, I'm not seeing any upfront limitations that would
prevent a more sysadmin-oriented workflow. Anyway, to get our thinking
straight, you might want to share what it is you want to do.


> 2. What is the timeline for jBPM support in Daisy?
> 

for now: roughly getting started on implementing it in the second half
of September

in the meantime we have some low-priority threads running on
information gathering and analysis. Don't be shy to step in and share
your views, comments, use cases...

Current working-doc is here:
http://cocoondev.org/daisyscratchpad/g5/292.html

We're actually hoping that people interested in the workflow would
already be creating their jBPM workflows and sharing them. Those would
be very formal and useful use cases that could guide our design.

Apart from that I'm obliged to point you to section 3 of the
backgrounder FAQ (http://cocoondev.org/daisy/index/110.html) which
indicates ways to ensure and influence those timings :-)

> 3. Were other workflow engines considered, and if so, what were the
> reasons for selecting against them? For example, we noticed that
> Magnolia used OpenWFE.
> 

and we noticed that Alfresco is using jBPM....

Bruno made the decision here, so maybe he can chime in.

Upfront I can tell you that we're the kind of people that loathe the
Syndrome of 'Life-Long Evaluation'. In other words, we believe that it's
often more efficient _not_ to spend a lot of time trying to evaluate and
find the *best* tool, but rather to learn a *right* or even *good
enough* tool in depth, so you can make it work for you in an optimal
way.

Dunno what Bruno will add, but I wouldn't expect a full comparison
matrix kind of document :-)


> We are currently evaluating open source CMS for our intranet, and the
> workflow component is important to us. Our files often require several
> processing steps by cmd line invoked tools, and files can be 100MB+ in

again, I think it would be nice if you could share some more on these
use cases.  What is the flow? Which actions are to be taken? Who are the
actors?

> size. We were unsure if jBPM was more suited to email task lists instead
> of larger batch processing problems (that have task lists in the process
> flow). Some of the commercial products we have considered, claimed that

interesting, which ones? :-)

> Daisy would not be able to handle large files using jBPM. We understand
> workflow support is not included yet, but our project will not ramp up
> for 1-2 months anyway.
> 

Well, I'm unsure why jBPM should actually be 'handling files'... As far
as I've been thinking about this workflow integration, I'm largely
seeing jBPM persisting workflow state that only _refers_ to documents in
daisy. Those *references* would probably be as small as a
daisy-document-id; ok, maybe add in the branch and language, a version
number, and a repository-url in the case of cross-repository workflows...
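
To make that a bit more concrete, here is a minimal Java sketch of how
small such a reference could be. Everything in it is an assumption for
illustration only: the class name DocumentRef, its fields and the string
format are hypothetical and are not existing Daisy or jBPM API.

```java
import java.util.Objects;

// Hypothetical value object; not part of the Daisy or jBPM APIs.
final class DocumentRef {
    // null means "the local repository" (no cross-repository workflow)
    private final String repositoryUrl;
    private final long documentId;
    private final String branch;
    private final String language;
    private final long version;

    DocumentRef(String repositoryUrl, long documentId,
                String branch, String language, long version) {
        this.repositoryUrl = repositoryUrl;
        this.documentId = documentId;
        this.branch = Objects.requireNonNull(branch, "branch");
        this.language = Objects.requireNonNull(language, "language");
        this.version = version;
    }

    // Compact string form (made-up format) that a workflow engine could
    // persist as an ordinary process variable instead of the content.
    String toKey() {
        String local = documentId + "@" + branch + ":" + language + ":" + version;
        return repositoryUrl == null ? local : repositoryUrl + "#" + local;
    }

    public static void main(String[] args) {
        DocumentRef ref = new DocumentRef(null, 292L, "main", "en", 3L);
        // The key stays a few bytes long, however large the document is.
        System.out.println(ref.toKey());
    }
}
```

The point of the sketch is only that the persisted workflow state stays
a handful of bytes per document, whether the document itself is 1KB or
100MB+.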

In any case, to my knowledge we're not thinking about including the
document-content in the workflow-state. If you have a case where that
would absolutely be required, we'll be glad to learn.

Also, from a document-management consulting perspective I'm unsure if
the overall efficiency of the content authors and automated processes
couldn't benefit from avoiding these large file sizes through a more
chunked and modularized approach?  Of course that might introduce more
process re-engineering than you are waiting for, and I honestly don't
have the insight into your requirements to actually make that a hard
claim, it's just an immediate association...

Again: the way to ensure we handle your use case properly regarding file
sizes (or whatever else) is to share what it is you're facing.  I think
I'm hearing intermediate document-processing results? I suppose those
could either be stored in the repo as well or just plainly on some
file-system location, no?

Apart from that: anybody claiming to know the upcoming pitfalls of our
yet-to-be-designed workflow integration is either a "Gifted Visionary"
with a "Crystal Ball" that we'd like to have a look at, or is just
spreading FUD :-)

HTH,
-marc=
-- 
Marc Portier                            http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at                http://blogs.cocoondev.org/mpo/
mpo at outerthought.org                              mpo at apache.org

