INSTALLATION
- To enable CJKSplitter for your portal:
  - Unpack the product in your Products directory and restart Zope.
  - Create a new Lexicon in portal_catalog that uses CJKSplitter.
    Feel free to enable stop words and case normalizing; non-Latin
    letters will not be case normalized, and the only stop words are
    English.
  - Recreate the full-text indexes (Title, SearchableText) and have
    them use the new lexicon.
  - Recatalog the portal (portal_catalog -> Advanced -> Update Catalog).
    This may fail with UnicodeError if your objects don't return
    Unicode strings from the methods that get indexed using CJKSplitter,
    for example SearchableText and Title.
    To be safe, always return Unicode strings. If you want to be
    brave, set default_encoding in CJKSplitter.py to the encoding
    that your non-Unicode text is in (the default is ASCII);
    CJKSplitter will then assume that encoding when converting
    non-Unicode strings into Unicode.
  - RECOMMENDED: Add a string property called "management_character_set"
    with value "UTF-8" to the root of your Zope installation. This
    will enable you to query and view the CJK characters in the lexicon.
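
The UnicodeError caveat in the recatalog step can be illustrated with a
minimal sketch. This is not the code from CJKSplitter.py: the helper name
to_unicode and its default_encoding parameter are hypothetical and only
mirror the setting described above. The sketch assumes that byte strings
are decoded with one configured encoding and that Unicode strings pass
through unchanged.

```python
def to_unicode(value, default_encoding="ascii"):
    """Return Unicode text for value (hypothetical helper, not CJKSplitter's own)."""
    if isinstance(value, bytes):
        # Byte strings are decoded with the configured encoding. Bytes that
        # do not fit it raise UnicodeDecodeError, a subclass of UnicodeError.
        return value.decode(default_encoding)
    # Anything that is already Unicode is returned as-is.
    return value


# UTF-8 encoded Chinese text, as an object might return it from Title().
raw_title = u"\u4e2d\u6587".encode("utf-8")

try:
    to_unicode(raw_title)  # ASCII default: decoding fails
except UnicodeError as exc:
    print("indexing would fail: %s" % exc)

# Naming the real encoding of the text makes the conversion succeed.
print(repr(to_unicode(raw_title, default_encoding="utf-8")))
```

With the ASCII default, any non-ASCII byte string fails in this way, which
is why returning Unicode strings from the indexed methods is the safer
route.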
- To enable Chinese encodings in Unicode in Python (OPTIONAL):
  - Get and install the Chinese Python codecs from SourceForge.
  - Add these aliases to .../python2.X/encodings/aliases.py:

      'gb2312': 'eucgb2312_cn',
      'big5': 'big5_tw',
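
Whether the encoding names resolve can be checked from the interpreter.
The following is a minimal sketch that uses only the standard codecs
module; the helper name codec_available is hypothetical. Note that newer
Python versions ship their own gb2312 and big5 codecs, so the check may
succeed there even without the extra package and aliases.

```python
import codecs


def codec_available(name):
    """Return True if Python can find a codec registered under name."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        # Raised when no codec or alias matches the given name.
        return False


for name in ("gb2312", "big5"):
    print("%s available: %s" % (name, codec_available(name)))
```

If either name is reported as unavailable, the codecs package or the alias
entries are not being picked up by the interpreter that runs Zope.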
- To let you print Unicode strings to your terminal (OPTIONAL):
  - In Python's site.py, change "if 0" to "if 1" where it says
    "Enable to support locale aware default string encodings".
  - Set the appropriate LANG and LC_* environment variables to match
    your system's locale.
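
To see which encoding Python is likely to use for terminal output after
setting the locale variables, a small check can help. This is a sketch
under assumptions, not the code from site.py: the helper name
terminal_encoding is hypothetical, and it simply reports
sys.stdout.encoding, falling back to locale.getpreferredencoding() and
then to ASCII.

```python
import locale
import sys


def terminal_encoding():
    """Best guess at the encoding used when printing to the terminal."""
    # sys.stdout.encoding can be missing or None, for example when output
    # is piped or stdout has been replaced by another object.
    encoding = getattr(sys.stdout, "encoding", None)
    if not encoding:
        # Fall back to the locale's preferred encoding without changing
        # the process locale as a side effect.
        encoding = locale.getpreferredencoding(False)
    return encoding or "ascii"


print("terminal encoding: %s" % terminal_encoding())
```

If this reports an ASCII encoding while the terminal actually displays
UTF-8 or a Chinese encoding, the LANG and LC_* variables are probably not
set for the shell that starts Python.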