[daisy] advice on tagging and Categories.

Chris chrisdi at yahoo.com
Thu Sep 28 15:05:50 CDT 2006



--- Marc Portier <mpo at outerthought.org> wrote:
 
> Chris wrote:
> > 
> > Hi,
> > 
> > I am working on a new Daisy project
> > and have a question about tagging and Categories.
> > 
> > I see in the daisy scratchpad how a field like
> > $DaisyCommWikiCategory is used in the navigation 
> > document query.
> > 
> > I am looking for something similar, but more like tagging.
> > By tagging I mean multiple attributes such as:
> > 
> > Attributes:
> > White-Paper, Meeting-Planner, Tutorial, News.
> > Math, Science, Automobiles.
> > 
> > DocA - Science, White-Paper
> > DocB - Math, White-Paper
> > DocC - News
> > DocD - Science, News
> > 
> > I think I found a way to do this: I created a multi-value field
> > called Tag with this set of values.  Then I made a new document type
> > just like SimpleDocument but with the Tag field, and I called it
> > BasicDocument.  Why am I doing this:
> > 
> > Now I can do a query for these 'tags' for navigation 
> > or query includes.
> > 
> > I could also use these tags in a faceted browser? I hope.
> > 
> 
> yes, and IMHO, that's precisely where those come to practical use,
> however I think they become more useful when you actually define
> separate dimensions (or facets) upfront
> 
> (see more below)
> 
> > Or for a 'similar docs' list based on a custom publisher?
> > 
> 
> mm, yummy, nice!
> 
> although I'm unsure whether the current state of daisy's query language will
> allow us to practically express this notion of 'similar' (which to me
> sounds like a threshold applied to a calculated ratio of matching values
> over the total number of present or possible values?)
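
To make the 'similar' idea above concrete, here is a minimal sketch in plain Python. It assumes 'total' means the values present on either document (the Jaccard index), which is only one reading of the description. The function names and the threshold are hypothetical, and nothing here uses Daisy's query language or publisher API.

```python
# Example documents and tags taken from the original message.
DOCS = {
    "DocA": ["Science", "White-Paper"],
    "DocB": ["Math", "White-Paper"],
    "DocC": ["News"],
    "DocD": ["Science", "News"],
}


def tag_similarity(tags_a, tags_b):
    """Ratio of shared tags to all tags present on either document."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    if not union:
        return 0.0  # two untagged documents are treated as not similar
    return len(a & b) / len(union)


def similar_docs(target, docs, threshold=0.5):
    """Names of the other documents whose similarity meets the threshold."""
    return sorted(
        name
        for name, tags in docs.items()
        if name != target and tag_similarity(docs[target], tags) >= threshold
    )


print(similar_docs("DocA", DOCS, threshold=0.3))  # ['DocB', 'DocD']
```

With only two tags per document the ratios are coarse (0, 1/3 or 1), so the threshold would need tuning against real data.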

Not sure I will try this ('similar') first, but I know I want to add a faceted browser.
The first browser like this I used was when Yahoo email added a facet-like browser
to their search result.  So if my keyword search returns 1000+ hits it lets me
narrow the results by facets such as sender, folder, date etc.  This works 
very well.

Then I saw the faceted browser on http://cocoondev.org/main/facetedBrowser/default
and thought I want something like this.
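
As a rough illustration of the narrowing behaviour described above, the sketch below filters an in-memory set of documents by selected tags and then counts the remaining values to offer as the next facet choices. It is plain Python over the example documents from the original message. The function names are hypothetical and this is not how Daisy's faceted browser is implemented.

```python
from collections import Counter

# Example documents and tags taken from the original message.
DOCS = {
    "DocA": ["Science", "White-Paper"],
    "DocB": ["Math", "White-Paper"],
    "DocC": ["News"],
    "DocD": ["Science", "News"],
}


def narrow(docs, selected):
    """Keep only the documents that carry every selected tag."""
    wanted = set(selected)
    return {name: tags for name, tags in docs.items() if wanted <= set(tags)}


def facet_counts(docs):
    """Count how many of the given documents carry each tag."""
    return Counter(tag for tags in docs.values() for tag in set(tags))


hits = narrow(DOCS, ["Science"])
print(sorted(hits))               # ['DocA', 'DocD']
print(facet_counts(hits)["News"])  # 1
```

Each further selection calls `narrow` again on the current hits, so the counts always describe what is still reachable.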

> 
> anyways: I'm surely interested in where this might lead us to, do keep
> us informed of how things are going, and don't be shy to propose new
> features (or even better provide patches :-))

ok, I will.

> 
> > I am also thinking if I want this kind of tagging on attachments
> > I would make a new attachment document type with this new Tag 
> > field.
> > 
> > WDYT?
> > 
> 
> I've recently attended GovCamp in Brussels (which was a barcamp like
> unconference about e-Gov stuff) and touched upon the subject of using
> facet browsers as a guide to finding stuff in a cloud of tags
> 
> In the discussion afterwards someone pointed me to the more advanced
> use of 'semantic tags' and related it to how RDF triples work...
> (and the essence of semantic web approaches...)
> 
> 
> Now, looking at your proposal of having one field (multivalue) with the
>  catch-all-semantics name 'TAG', I'm getting the feeling you're losing
> some semantical value compared to a solution where you would allow for
> more separate fields, each with a more precise semantical meaning.
> 
> e.g. by introducing various fields:
>   * 'PublicationType': White-Paper, Meeting-Planner, Tutorial, News.
>   * 'Topics' (multivalue) : Math, Science, Automobiles

I think I will try something like this.  It gives you
a triple:
 * DocA - hasTopic - Math
 * DocB - hasPublicationType - White-Paper
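
A small sketch of how separate fields map onto triples of this shape, in plain Python. The predicate names follow the two examples above. The field-to-predicate table and the function names are assumptions made for illustration, and no RDF library or Daisy API is involved.

```python
# Hypothetical mapping from field name to the predicate used in a triple.
PREDICATES = {
    "PublicationType": "hasPublicationType",
    "Topics": "hasTopic",
}

# Field values follow the DocA/DocB examples from the original message.
DOCS = {
    "DocA": {"PublicationType": ["White-Paper"], "Topics": ["Science"]},
    "DocB": {"PublicationType": ["White-Paper"], "Topics": ["Math"]},
}


def to_triples(doc_name, fields):
    """Flatten a document's field values into (subject, predicate, object)."""
    triples = []
    for field_name, values in fields.items():
        for value in values:
            triples.append((doc_name, PREDICATES[field_name], value))
    return triples


def subjects(triples, predicate, obj):
    """Documents that have the given predicate/object pair."""
    return sorted({s for s, p, o in triples if p == predicate and o == obj})


ALL_TRIPLES = [t for name, fields in DOCS.items() for t in to_triples(name, fields)]
print(subjects(ALL_TRIPLES, "hasTopic", "Math"))  # ['DocB']
```

The point of the separate fields is visible in the predicate: a single catch-all Tag field would only ever yield something like "hasTag".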

> 
> And actually: on the subject of 'topics' you might want to have a look
> at the hierarchical values (see daisy 2.0-dev) that would allow you to
> provide a semantical taxonomy of keywords for people to pick from.  This
> would make it possible to identify sub-topics as members of larger topic
> groups, e.g. Algebra >is> Math >is> Science. With this new feature,
> tagging an article as being about 'Algebra' could make it pop up in a
> query searching for articles about 'Math' (or one of its descendants)

ok, I will look at this 2.0 stuff.  At some point it might be
interesting to look at actual semantic tools such as rdf and
how that could be integrated.
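
The hierarchical lookup discussed above can be sketched as follows, in plain Python. The taxonomy is the Algebra/Math/Science example from the quoted text. The article names and function names are hypothetical, and this does not reflect how the daisy 2.0-dev hierarchical values are stored or queried.

```python
# child -> parent, following "Algebra >is> Math >is> Science"
PARENT = {"Algebra": "Math", "Math": "Science"}

# Hypothetical articles, each tagged with a single topic.
ARTICLES = {
    "Article1": ["Algebra"],
    "Article2": ["Science"],
}


def descendants(topic, parent):
    """Return the topic itself plus every topic below it in the taxonomy."""
    found = {topic}
    changed = True
    while changed:
        changed = False
        for child, par in parent.items():
            if par in found and child not in found:
                found.add(child)
                changed = True
    return found


def docs_about(topic, doc_topics, parent):
    """Documents tagged with the topic or any of its descendants."""
    wanted = descendants(topic, parent)
    return sorted(name for name, topics in doc_topics.items() if wanted & set(topics))


print(docs_about("Math", ARTICLES, PARENT))     # ['Article1']
print(docs_about("Science", ARTICLES, PARENT))  # ['Article1', 'Article2']
```

An article tagged only 'Algebra' is found by a search for 'Math', while an article tagged 'Science' is not, since the match only runs downward.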

> 
> 
> 
> Now, I do understand how above suggestions bring in more formalism to
> this tagging (more than you'd want?). And making it worse: more
> formalism that is centrally controlled by the one admin or small group
> that is defining the document-types.
> 
> Also: introducing this upfront-formalism feels like breaking with IMHO
> the biggest advantages of the folksonomies and tagging movements that
> are now largely catching on everywhere:
>   - decentralisation and
>   - adding semantics as we go (like in natural human learning and
> interaction)

For my purpose the upfront formalism is ok, given what you say below about
the chaos of wide open tag-clouds.  Initially I want reliable tags
that come from an informed decision.  But it sounds like you could
also add a decentralized approach, maybe use both.

> 
> On the other hand: the vast resulting chaos in some tag-clouds out there
> and the growing number of disgruntled users with these free tagging
> systems seem to indicate some more formalism or at least guidance is
> required anyway...  I'm afraid neither the web 2.0 hype (nor academic work on
> the semantic web for that matter) has brought us to a decisive point in
> this debate yet...
> 
> 
> 
> Coming back to daisy I think we're in pretty good shape to answer both
> approaches with the structure we have in place: I think we provide quite
> some flexibility here and there to find a balance:
> 
> Note that
> 1/ document-types can pre-define more formal semantical fields (with
> strong types and/or selection lists), but documents still can have free
> added 'custom fields'.
> 2/ document-types can change over time, so there is a mechanism to
> respond to changing needs
> 3/ document-tasks provide an easy programmable way to do bulk operations
> on documents already in the repo.

document-task: I have not looked at this yet, but it sounds good.  If I come up with a
new document type or a new field type and I want to convert existing documents,
is this an API or process that would help?
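
For the conversion asked about here, the per-document logic might look roughly like the sketch below, which sorts the values of a catch-all Tag field into the PublicationType and Topics fields suggested earlier in the thread. This is plain Python over in-memory lists, not Daisy's document task API. The function name and the "Unmapped" bucket are hypothetical, and PublicationType is kept as a list here purely for simplicity.

```python
# Value sets taken from the field suggestions earlier in the thread.
PUBLICATION_TYPES = {"White-Paper", "Meeting-Planner", "Tutorial", "News"}
TOPICS = {"Math", "Science", "Automobiles"}


def split_tags(tags):
    """Sort the values of a catch-all Tag field into more specific fields."""
    fields = {"PublicationType": [], "Topics": [], "Unmapped": []}
    for tag in tags:
        if tag in PUBLICATION_TYPES:
            fields["PublicationType"].append(tag)
        elif tag in TOPICS:
            fields["Topics"].append(tag)
        else:
            # Keep unknown values visible instead of silently dropping them.
            fields["Unmapped"].append(tag)
    return fields


print(split_tags(["Science", "News"]))
# {'PublicationType': ['News'], 'Topics': ['Science'], 'Unmapped': []}
```

A bulk operation would apply the same function to every existing document and write the resulting fields back.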

> 
> > Thanks, Chris.
> > 
> 
> thanx for sharing, and excuse me for being more contemplative and
> philosopher-like than pragmatic and useful in my above 'advice'
> 
> 
> > p.s. The documentation was very good.  I recently installed Daisy 1.5.1
> > on Ubuntu 6 and had no trouble.  There were a few steps but easy
> > to follow.
> > 
> 
> cool! (being a bit of an ubuntu fan myself)
> 
> you might be interested in the fact that our svn-trunk has (since
> recently) a distro/debian section holding the code to build your own
> packages

sounds good!

Thanks for your reply, I'll let you know how it works out,

- Chris




