We have released [http://www.sai.msu.su/~megera/postgres/gist Gendict], a module that generates dictionary templates for the new tsearch v2. I tested it with Snowball-based stemmers. Here is how a dictionary can now be added:
0. cd PGSQL_SRC/contrib/gendict
1. Obtain stem.{c,h} files for Portuguese
wget http://snowball.tartarus.org/portuguese/stem.c
wget http://snowball.tartarus.org/portuguese/stem.h
2. Create template files for Portuguese
./config.sh -n pt -s -p portuguese -v -C'Snowball stemmer for Portuguese'
Note that the argument to the -p option must be *the same* as the name of the
stemming function in stem.c (without the _stem suffix).
A set of files will be generated and placed in the PGSQL_SRC/contrib/dict_pt
directory.
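One way to double-check the value passed to -p is to read the function name out of the downloaded header. The following is only a sketch: it assumes stem.h declares a single function named <language>_stem, and the helper name stem_prefix is hypothetical, not part of Gendict.

```shell
# stem_prefix FILE: print the first identifier ending in "_stem" found in FILE,
# with the "_stem" suffix removed. Hypothetical helper, not part of Gendict.
# Assumes the header declares exactly one <language>_stem function.
stem_prefix() {
    grep -o '[A-Za-z_][A-Za-z0-9_]*_stem' "$1" | head -n 1 | sed 's/_stem$//'
}
```

Running stem_prefix stem.h in the gendict directory should then print the string to use with -p, provided the header matches the assumption above.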
3. Compile and install dictionary
cd ../dict_pt
make
make install
4. Test it
Sample Portuguese words with their stemmed forms are available
from http://snowball.tartarus.org/portuguese/stemmer.html
createdb testdict
psql testdict < /usr/local/pgsql/share/contrib/tsearch.sql
psql testdict < /usr/local/pgsql/share/contrib/dict_pt.sql
psql -d testdict -c "select lexize('pt','bobagem');"
lexize
---------
{bobag}
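To check more than one word, the single lexize call above can be wrapped in a loop. This is a rough sketch under several assumptions: the 'pt' dictionary is installed in the testdict database as in step 4, the input file holds one "word expected_stem" pair per line (a hypothetical format, not one supplied by Snowball or Gendict), and the words contain no quote characters. The function name check_stems is also hypothetical.

```shell
# check_stems FILE: for each "word expected_stem" line in FILE, ask the 'pt'
# dictionary for the stem and report any mismatch. Prints nothing if all match.
# Assumes database "testdict" and dictionary "pt" from the steps above.
check_stems() {
    while read -r word expected; do
        # -A -t: unaligned, tuples-only output, e.g. {bobag}
        got=$(psql -A -t -d testdict -c "select lexize('pt','$word');" < /dev/null)
        if [ "$got" != "{$expected}" ]; then
            echo "MISMATCH: $word -> $got (expected {$expected})"
        fi
    done < "$1"
}
```

The word is interpolated directly into the SQL string, so this is suitable only for trusted word lists.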