A Tagging System for ZWiki

Here are the details of a prototype tagging system I developed, which is in use at thinkubator.ccsp.sfu.ca. The system is very simple and uses nothing more than ZWiki's existing parent/subtopic page hierarchy. The difference is that where the existing system encourages users to maintain a clean hierarchy, this system encourages multiple parents per page, which results in a very tangled hierarchy.

The basic concept here is that a "tag" is a "parent." So, any time a tag is applied to a page, there is a wiki page by that name. There is no such thing as a tag without an associated wiki page. The relationship between a page and its tags is just page to parent: tags literally are parents. I've switched over to calling them tags because it ends up being less confusing than talking about a page with 12 parents.
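To make the tag-as-parent idea concrete, here is a minimal standalone sketch in plain Python. It is not ZWiki code: the parents_of dictionary, the page names in it, and both helper functions are made up for illustration. It only shows that "pages carrying a tag" is the same lookup as "children of a parent," and that every tag should have a page of its own.

```python
# Hypothetical data: each page id maps to its list of parents (its tags).
parents_of = {
    "MeetingNotes": ["ProjectX", "Minutes"],
    "BudgetDraft": ["ProjectX"],
    "ProjectX": ["TagPages"],
    "Minutes": ["TagPages"],
    "TagPages": [],
}


def pages_tagged(tag, parents_of):
    """Return the sorted ids of pages that list `tag` as a parent."""
    return sorted(page for page, parents in parents_of.items() if tag in parents)


def missing_tag_pages(parents_of):
    """Return tags that have no page of their own (should always be empty)."""
    used = set(tag for parents in parents_of.values() for tag in parents)
    return sorted(used - set(parents_of))
```

With this data, pages_tagged("ProjectX", parents_of) lists both BudgetDraft and MeetingNotes, and missing_tag_pages(parents_of) is empty because every tag is itself a page.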

So far this has presented no problems, except in a situation where two pages are tagged to one another, which can cause ZWiki's contents listing to throw up. This hasn't occurred in practice very often, though.
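One way to guard against this would be to check, before adding a tag, whether the new tag already leads back to the page being tagged. The following is a standalone Python sketch under that assumption, not something the system currently does. The creates_cycle function and the parents_of mapping (page id to list of parent ids) are hypothetical. It also catches longer loops, although only the two-page case is known to cause trouble.

```python
def creates_cycle(page, new_tag, parents_of):
    """Return True if `page` is reachable by walking parents upward from `new_tag`.

    `parents_of` is a hypothetical mapping of page id -> list of parent ids.
    """
    seen = set()
    stack = [new_tag]
    while stack:
        current = stack.pop()
        if current == page:
            return True          # tagging would loop back to the page itself
        if current in seen:
            continue             # already explored this branch
        seen.add(current)
        stack.extend(parents_of.get(current, []))
    return False
```

A caller could refuse the tag, or just warn the user, whenever creates_cycle returns True.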

Three sections follow:

Basic Functionality

The Tagcloud

Showing lists of tag relationships

I'll present the code and a little explanation in each case:

Basic Functionality

The basic functionality is very simple, composed of just a bit of UI (I wrote it in DTML 'cause I like DTML) and a Script that handles it.

tags is a little piece of UI that lists current tags for the page, and allows a user to add a new one. At one point, I had checkboxes next to the tags in the list, so this could also be used to remove tags (same functionality as ZWiki's backlinks interface), but the cleaner, simpler version won out.

<div id="tags">

<dtml-if "_.len(parents)!=0">
<form method="post" action="fuzzyReparent">

<input type="hidden" name="thePage" value="<dtml-var pageId>">

  <p>Page Tags:</p>
  <dtml-in "parents">
       <input type="hidden" name="myparents:list" value="<dtml-var sequence-item>">
       <a href="<dtml-var sequence-item>"><dtml-var sequence-item></a><br />
  </dtml-in>

<br />
New tag: <br />
<input type="text"
       name="newparent"
       class="formfield"
       size="16"
       maxlength="100" />
<br />
<input type="submit"
       value="Add Tags" />
</form>
</dtml-if>
</div> <!-- tags -->

fuzzyReparent is a Python script which evaluates the new tag name coming from the tags UI. It uses ZWiki's pageWithFuzzyName method to see whether what the user typed matches any existing page. If so, it simply adds that page as a tag. If not, two things happen: a new page by that name is created, parented to a placeholder page called TagPages (FrontPage would do as well), and then the new page is added as a tag to the current context page.

# fuzzyReparent
# Parameter List: thePage, myparents, newparent
request = container.REQUEST
RESPONSE =  request.RESPONSE

theExistingPage = context[thePage]

if newparent:
    # look the name up once and reuse the result
    matchingPage = theExistingPage.pageWithFuzzyName(newparent, url_quoted=1, allow_partial=0, ignore_case=0)
    if matchingPage:
        # the tag page already exists, so just add it
        myparents.append(matchingPage.getId())
    else:
        # no such page yet: create it under TagPages, then add it as a tag
        theNewPageName = context.TagPages.create(newparent,text='This page was created automatically. Feel free to edit it.',REQUEST=None)
        canonicalName = theExistingPage.canonicalIdFrom(theNewPageName)
        theNewPage = context[canonicalName]
        # new tag pages start out showing the one-line-per-item subtopic list
        theNewPage.manage_addProperty('showList', 3, 'int', REQUEST=None)
        myparents.append(theNewPage.getId())

    theExistingPage.reparent(myparents)

# redirect back to the page whether or not a tag was entered
RESPONSE.redirect(theExistingPage.absolute_url())

That's it... what follows are simply some mechanisms for seeing what's going on more clearly.

The Tagcloud

The tagcloud is just a filtered count of which pages have the most subtopics, which in my terms means which tag pages are referred to most often. Three methods:

<div id="tagcloud">
<div class="boxtitle" align="left">Site Tags:</div>
<dtml-var "tagCloudstats()">
</div>

Simple enough... just a wrapper for the tagCloudstats code, which does the work:

# tagCloudstats
# Parameter List:
request = container.REQUEST
RESPONSE =  request.RESPONSE

outline = request.PARENTS[0].outline
nodes = outline.nodes()
childmap = outline.childmap()
sortedmap = []

# sortedmap is a list of tuples (page, number of children)

for node in nodes:
    sortedmap.append((node, len(childmap[node])))

sortedmap.sort(lambda x,y: x[1]-y[1])  # sort by number of children
sortedmap.reverse()                    # descending
sortedmap = sortedmap[0:12]            # take the top 12
sortedmap.sort()                       # and sort alphabetically

filterList = ['FrontPage', 'TagPages', 'testing'] # filter these out

for tag in sortedmap:
    if tag[0] not in filterList:
        print '<span title="' + tag[0] + ': ' + str(tag[1]) + \
              ' pages" style="font-size: ' + \
               str(int(container.tagCloudTweak(tag[1]))) + \
              'px;"><a href="' + tag[0] + '">' + \
              tag[0] + '</a> </span>&nbsp;'

return printed
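The selection logic can be tried outside Zope with a small standalone Python sketch. The top_tags function and its parameter names are hypothetical, and childmap here is just a plain dictionary of tag to list of child pages. One deliberate variation from the script above: this sketch drops the filtered names before taking the top entries, so a filtered page never uses up one of the slots.

```python
def top_tags(childmap, limit=12, exclude=("FrontPage", "TagPages", "testing")):
    """Return up to `limit` (tag, count) pairs, ordered alphabetically by tag.

    `childmap` is a hypothetical dict of tag -> list of child page ids.
    """
    counts = [(tag, len(children))
              for tag, children in childmap.items()
              if tag not in exclude]
    # Largest counts first; ties broken by name so the result is deterministic.
    counts.sort(key=lambda pair: (-pair[1], pair[0]))
    # Keep the most-used tags, then put them back in alphabetical order.
    return sorted(counts[:limit])
```

Whether to filter before or after truncating is a design choice: filtering afterwards, as the script does, can leave the cloud with fewer than 12 entries.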

And finally, tagCloudTweak, which does some math to determine font sizes. I broke this out to its own method for easy access and continual tweaking. I'm sure a nice robust algorithm here would be better, but this is what I did:

# tagCloudTweak
# Parameter List: tagnumber
return ( (tagnumber/(3.8)) + 6 )
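For comparison, here is a standalone Python sketch of the formula above next to one possible alternative. linear_size reproduces tagCloudTweak with the truncation to whole pixels that tagCloudstats applies: a tag with 10 pages comes out at 8px and one with 100 pages at 32px. log_size is a swapped-in logarithmic scaling, not part of the system described here, and its name and min_px/max_px defaults are assumptions chosen for illustration.

```python
import math


def linear_size(count):
    """The tagCloudTweak formula, truncated to whole pixels."""
    return int(count / 3.8 + 6)


def log_size(count, max_count, min_px=8, max_px=24):
    """Scale font size logarithmically between min_px and max_px.

    Hypothetical alternative: keeps one very popular tag from dwarfing the rest.
    """
    if max_count <= 0 or count <= 0:
        return min_px
    # weight runs from just above 0 (rare tag) to 1.0 (the most-used tag)
    weight = math.log(count + 1) / math.log(max_count + 1)
    return int(round(min_px + weight * (max_px - min_px)))
```

The linear version grows without limit as a tag gets more pages, while the logarithmic version is bounded by max_px, at the cost of needing to know the largest count up front.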

Showing lists of tag relationships

I modified ZWiki's show_subtopics functionality to do this. Basically, what I wanted was to be able to control, on a page-by-page basis, whether it presented a regular wiki page (without subpages) or a list of subpages (handy for pages created through the tagging process, or any other categorizing pages).

The basic idea is that show_subtopics is on all the time, and then a simple 3-way toggle, a page property called showList on each wiki page, changes the display:

0 or 1 (or undefined): shows a regular wiki page with no subtopics
2: shows a reverse-chronological list of subtopics, including snippets of content, so that the result looks like a blog
3: shows a simple one-line-per-item list of subtopic pages

This is probably the ugliest bit of code in the system, as I haven't cleaned it up yet. But here it is, subtopics_listing, again in DTML:

<dtml-comment>THERE ARE TWO DIFFERENT INTERFACES HERE, DEPENDING ON showList
     BEING 2 or 3. THE SECOND VERSION IS HALFWAY DOWN THIS PAGE </dtml-comment>

<dtml-if showList>
  <dtml-if "showList==2"> <!-- BLOG VERSION -->
  <div id="showBlogList">

  <dtml-let subtopics=childrenAsList>
  <dtml-if subtopics>
    <dtml-call "REQUEST.RESPONSE.setHeader('Content-Type','text/html; charset=utf-8')">
    <dtml-in "[pageWithName(c) for c in subtopics]" prefix=x sort=creationTime size=12 reverse>

    <div class="entry">
    <dtml-let
      active="lastEditIntervalInDays() < 7"
      creationdate="'%s/%s/%s' % (creationTime().year(),creationTime().month(),creationTime().day())"
    >
      <div class="entryTitle">
        <dtml-var "wikilink('['+pageName()+']')">
      </div> <!--entryTitle -->

      <div class="entryByline">
    <!--     -->
       Posted by <a href="<dtml-var creator>"><dtml-var creator></a>,
       <a href="<dtml-var id>/diff"
          title="Last edited by <dtml-var last_editor> <dtml-var lastEditInterval> ago.">
          <dtml-var creationdate></a>
    </div> <!-- entryByline -->
    </dtml-let>

    <div class="entrySummary">
    <dtml-try>
    <dtml-var "renderedSummary(paragraphs=2)">
    <dtml-except>
    Summary unavailable
    </dtml-try>
    </div> <!-- entrySummary -->

    <div class="entryComments">
        <a href="<dtml-var absolute_url>"><dtml-var commentCount> comments</a>
    </div> <!-- entryComments -->

    </div> <!-- entry -->

    </dtml-in>
    <dtml-else>
      <p>No subtopics</p>
  </dtml-if> <!-- subtopics -->
  </dtml-let>
  </div> <!-- showBlogList -->

  <!--  END #2 BLOG STYLE LIST, BEGIN #3 SHORT SUMMARY TAGGED LIST -->

  <dtml-elif "showList==3"> <!-- TAGGED PAGES VERSION -->
    <div id="showTaggedList">
    <H4>Pages tagged to this page:</H4>

    <dtml-let subtopics=childrenAsList>
    <dtml-if subtopics>
      <dtml-call "REQUEST.RESPONSE.setHeader('Content-Type','text/html; charset=utf-8')">
      <dtml-in "[pageWithName(c) for c in subtopics]" prefix=x sort=creationTime reverse>

      <div class="briefentry">
      <dtml-let
        active="lastEditIntervalInDays() < 7"
        creationdate="'%s/%s/%s' % (creationTime().year(),creationTime().month(),creationTime().day())"
      >
       <span class="brieftitle"><dtml-var "wikilink('['+pageName()+']')"></span>
        by <a href="<dtml-var creator>"><dtml-var creator></a>,
         <a href="<dtml-var absolute_url>/diff"
            title="Last edited by <dtml-var last_editor> <dtml-var lastEditInterval> ago.">
            <dtml-var creationdate></a>
            <a href="<dtml-var absolute_url>"><dtml-var commentCount> comments</a>
      </dtml-let>
      </div> <!-- briefentry -->

      </dtml-in>
      <dtml-else>
        <p>No subtopics</p>
    </dtml-if> <!-- subtopics -->
    </dtml-let>
    </div> <!-- showTaggedList -->

  </dtml-if> <!-- showList == 3 -->
</dtml-if> <!-- showList -->

A little bit of code provides the means of toggling view modes:

# toggleView
# Parameter List: toggleValue=1
request = container.REQUEST
RESPONSE =  request.RESPONSE

if context.hasProperty('showList'):
    context.manage_changeProperties(showList=toggleValue)
else:
    context.manage_addProperty('showList', toggleValue, 'int', REQUEST=None)

RESPONSE.redirect(context.pageUrl())

That's it. Comments would really be appreciated.

We've had this in use for a couple of months. On the main site, which already has close to a thousand pages and an existing community, the uptake has been a little patchy... people had to figure out what to do with the tags, since they hadn't been using them before.

But on my subwikis, which are usually project-specific, research and note-taking environments, this tagging system is invaluable! Starting a wiki from scratch and being able to multiply categorize things on the fly is fantastic... I think it goes well with the original wiki concept of a quick hypertext writing environment.