Edit detail for #1249 search does not work as expected -- use another index revision 3 of 3

Editor: betabug
Time: 2008/05/17 10:08:08 GMT-7
Note: Unicode-aware splitter and adding of ZCTextIndex is in -unstable now

added:

Status: open => closed 


Submitted by: wlang at 2006-04-09T09:51:24+00:00

After upgrading from zwiki-0.45 to zwiki-0.52, and after following the zwiki-0.51 release notes to call SOMEPAGE/setupTracker (as my "changes" page wasn't sorted anymore), I found that the wiki search wasn't very usable anymore: the search was case sensitive, and there was no wildcard matching.

The reason for this is the use of the obsolete TextIndex (which is used by "setupCatalog" which in turn is used by "setupTracker"). Thus I propose the following changes:

  • patch to Admin.py, to use ZCTextIndex (if found):
      --- Admin.py.ori        2006-04-02 06:18:23.000000000 +0200
      +++ Admin.py    2006-04-11 00:08:59.000000000 +0200
      @@ -390,12 +390,40 @@
               catalog = self.catalog()
               catalogindexes, catalogmetadata = catalog.indexes(), catalog.schema()
               PluginIndexes = catalog.manage_addProduct['PluginIndexes']
      +        try:
      +            # do we have a ZCTextIndex?
      +            ZCTI = catalog.manage_addProduct['ZCTextIndex']
      +            # yes -- but do we have a lexicon?
      +            lexicon = getattr(catalog, 'lexicon', None)
      +            # define helper class
      +            class E:
      +                def __init__(self, **kw):
      +                    self.__dict__.update(kw)
      +            if lexicon is None:
      +                ZCTI.manage_addLexicon(
      +                    'lexicon',
      +                    elements=[
      +                    E(group='Case Normalizer', name='Case Normalizer'),
      +                    E(group='Stop Words', name=" Don't remove stop words"),
      +                    E(group='Word Splitter', name='HTML aware splitter'),
      +                    ])
      +            # extra info needed for ZCTI index creation
      +            extra=E()
      +            extra.lexicon_id = 'lexicon'
      +            extra.index_type = 'Okapi BM25 Rank'
      +        except AttributeError:
      +            # otherwise set ZCTI to None and use TextIndex from PluginIndexes
      +            ZCTI = None
               for i in TEXTINDEXES:
                   # XXX should choose a TING2 or ZCTI here and set up appropriately
                   # a TextIndex is case sensitive, exact word matches only
                   # a ZCTextIndex can be case insensitive and do right-side wildcards
                   # a TextIndexNG2 can be case insensitive and do both wildcards
      -            if not i in catalogindexes: PluginIndexes.manage_addTextIndex(i)
      +            if not i in catalogindexes:
      +                if ZCTI:
      +                    ZCTI.manage_addZCTextIndex(i, extra)
      +                else:
      +                    PluginIndexes.manage_addTextIndex(i)
               for i in FIELDINDEXES:
                   if not i in catalogindexes: PluginIndexes.manage_addFieldIndex(i)
               for i in KEYWORDINDEXES:
    
  • remove the manage_addTextIndex calls from ZWiki/plugins/tracker/tracker.py (they aren't needed anyway)
  • recommend "setupCatalog" instead of "setupTracker" in the release notes (it is confusing to call setupTracker if all I want is a catalog for searching)
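
To illustrate the difference in search behaviour that the patch is after, here is a toy sketch in plain Python 3. It is not how TextIndex or ZCTextIndex work internally; the function names and the word list are made up for the example, and the wildcard handling is borrowed from the standard `fnmatch` module rather than from any Zope lexicon.

```python
import fnmatch

def exact_match(query, words):
    """Case-sensitive, whole-word matching (the behaviour reported for the old TextIndex)."""
    return [w for w in words if w == query]

def normalized_match(query, words):
    """Case-insensitive matching that also accepts glob-style wildcards such as 'wiki*'."""
    pattern = query.lower()
    return [w for w in words if fnmatch.fnmatchcase(w.lower(), pattern)]

# Hypothetical indexed words, for illustration only.
words = ["ZWiki", "wikis", "Catalog", "search"]

print(exact_match("zwiki", words))       # no hit: the case differs
print(normalized_match("zwiki", words))  # finds "ZWiki"
print(normalized_match("wiki*", words))  # right-side wildcard finds "wikis"
```

The point is only that lower-casing both the query and the indexed words, which is roughly what a case normalizer in the lexicon pipeline is for, is what makes a search for "zwiki" find "ZWiki".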


comments:

... --wlang, Mon, 10 Apr 2006 15:28:37 -0700

adapted patch to handle setupCatalog calls if the catalog is already there

is this still open? Please investigate! --betabug, Wed, 21 Feb 2007 08:57:03 +0000
Name: #1249 julita => #1249 search does not work as expected -- use another index
Category: user-dtmlscripting => user-browsing
Severity: critical => normal

I'd prefer a choice of splitters in the add form --betabug, Sat, 19 May 2007 08:02:17 -0700
The problem for me is in this line:

  E(group='Word Splitter', name='HTML aware splitter'),

That splitter is not Unicode-aware. In fact, people probably need a choice of splitters, depending on what language they operate in.

Since we now require a Catalog, my choice would be to have people choose the Lexicon settings in the Zwiki add form - e.g. like COREBlog does it. It makes a lot of sense, as people might have a special CJK splitter installed and they might want to use that.
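
As a rough illustration of why the splitter choice matters, here is a small Python 3 sketch using only the standard `re` module. It is not the code of any ZCTextIndex splitter; `split_ascii` and `split_unicode` are hypothetical names, and the ASCII-only variant merely stands in for a splitter that does not know about non-ASCII letters.

```python
import re

# \w restricted to [a-zA-Z0-9_]: stands in for a splitter that is not Unicode-aware.
ASCII_WORD = re.compile(r"\w+", re.ASCII)
# For str patterns in Python 3, \w matches Unicode word characters by default.
UNICODE_WORD = re.compile(r"\w+")

def split_ascii(text):
    """Split text into words, treating every non-ASCII letter as a separator."""
    return ASCII_WORD.findall(text)

def split_unicode(text):
    """Split text into words, keeping accented and other non-ASCII letters."""
    return UNICODE_WORD.findall(text)

text = "Z\u00fcrich caf\u00e9 wiki"   # "Zürich café wiki"

print(split_ascii(text))    # the accented words are broken into fragments
print(split_unicode(text))  # the accented words stay whole
```

With the ASCII-only pattern the accented words fall apart into fragments, so a search for the full word would not match. A regex like this still does no real word segmentation for languages written without spaces, which is one reason a dedicated CJK splitter can be worth offering as a choice.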

Unicode-aware splitter and adding of ZCTextIndex is in -unstable now --betabug, Sat, 17 May 2008 10:08:08 -0700
Status: open => closed