PostgreSQL Fulltext Search
Forum rules
Discuss features as they are added to the new version. Give us your feedback. Don't post bug reports, feature requests, support questions or suggestions here. Feature requests are closed.
Discuss features as they are added to the new version. Give us your feedback. Don't post bug reports, feature requests, support questions or suggestions here. Feature requests are closed.
Re: PostgreSQL Fulltext Search
Thank you wagnerch for your detailed answer.
This is a very clean patch.
Only a class and a few SQL entries.
I will try to port your patch to PostgreSQL 8.3 integrated full-text engine.
Will report back soon. I hope that we can stay in touch.
Very kind regards,
Jean-Michel
This is a very clean patch.
Only a class and a few SQL entries.
I will try to port your patch to PostgreSQL 8.3 integrated full-text engine.
Will report back soon. I hope that we can stay in touch.
Very kind regards,
Jean-Michel
Last edited by jmpoure on Fri Nov 16, 2007 2:50 pm, edited 3 times in total.
Re: PostgreSQL Fulltext Search
Sounds good, if you have an ideas for the "exact phrase" search then I think that would be handy as well.
-
- Registered User
- Posts: 653
- Joined: Wed Sep 21, 2005 3:01 pm
Re: PostgreSQL Fulltext Search
my idea regarding the "exact phrase":
-- get the search string from the user
-- sanitize it by escaping single quotes only
-- pass it as is to the search engine through a query, using single quotes.
in other words: completely skip the "split keywords" step. this step only makes sense for the "native" search mode, which requires single-word search. mysql and postgres fulltext do the search expression parsing themselves, and it makes the most sense to let the user build the search expression using the rules of the method used (mysql or postgres), instead of some arbitrary and incomplete set of rules which lacks, for instance, phrases...
if you use this logic, you might also want to edit the language files and change the explanation shown to the user describing how to build the search expression.
this will make your code shorter (the "split keywords" function is long, ugly, and cumbersome), faster, and better.
-- get the search string from the user
-- sanitize it by escaping single quotes only
-- pass it as is to the search engine through a query, using single quotes.
in other words: completely skip the "split keywords" step. this step only makes sense for the "native" search mode, which requires single-word search. mysql and postgres fulltext do the search expression parsing themselves, and it makes the most sense to let the user build the search expression using the rules of the method used (mysql or postgres), instead of some arbitrary and incomplete set of rules which lacks, for instance, phrases...
if you use this logic, you might also want to edit the language files and change the explanation shown to the user describing how to build the search expression.
this will make your code shorter (the "split keywords" function is long, ugly, and cumbersome), faster, and better.
Re: PostgreSQL Fulltext Search
tsearch2 doesn't appear to support exact phrases as you suggest. See http://archives.postgresql.org/pgsql-sq ... g00165.php, where the author states it has not been implemented yet.code reader wrote:my idea regarding the "exact phrase":
-- get the search string from the user
-- sanitize it by escaping single quotes only
-- pass it as is to the search engine through a query, using single quotes.
in other words: completely skip the "split keywords" step. this step only makes sense for the "native" search mode, which requires single-word search. mysql and postgres fulltext do the search expression parsing themselves, and it makes the most sense to let the user build the search expression using the rules of the method used (mysql or postgres), instead of some arbitrary and incomplete set of rules which lacks, for instance, phrases...
I think it would make sense to maintain the same search operators that already exist in phpBB, they are fairly standard and most users are familiar with them. I can't see making a user use "&" for AND, "!" for NOT, etc. And there still needs to be some sort of post-processing to address the exact phrase matching for PostgreSQL. There is a piece of code out there that does all of this for me, I just didn't want to pull that relatively large chunk of code into the mix. See http://www.digitalstratum.com/oss/fts_parser.
Re: PostgreSQL Fulltext Search
Dear Wagnerch,wagnerch wrote:Attached is a version for RC4, the author_search method has changed.
As for the caching, it would be interesting to see how the caching performs compared against a huge board. Perhaps I should make this an option for fulltext_postgres? I haven't looked at that code too deeply, it is a possibility I broke it anyways I know I had to throw away some MySQL specific stuff that was used to determine whether a result was saved or not.
I am currently testing this code.
Any news or updates?
Re: PostgreSQL Fulltext Search
I could basically install your patch, thanks.
But some problems exist on non-C databases, when using locales:
I am trying to install the French dictionary and will get back.
But some problems exist on non-C databases, when using locales:
Code: Select all
SQL ERROR [ postgres ]
ERREUR: could not find tsearch config by locale []
Re: PostgreSQL Fulltext Search
Okay, problems resolved.
Phpbb 3.0 database is UTF-8
So there should be a dictionnary installed.
In phpBB 3.0 code, there should be a test whether tsearch2 dictionary is installed for the current locale.
I downloaded dictionnaries on tseach2 homepage.
You should only install UTF-8 dictionnaries.
Adapt the script to dictionnary name and location.
Then I installed them using this code:
Now, the only limitation is that I cannot seach values with accents (é).
is this a problem with the lexerize code in search?
In next PostgreSQL 8.3 version, tseach2 will be integrated in core and there will be no need to install dictionaries.
Phpbb 3.0 database is UTF-8
So there should be a dictionnary installed.
In phpBB 3.0 code, there should be a test whether tsearch2 dictionary is installed for the current locale.
I downloaded dictionnaries on tseach2 homepage.
You should only install UTF-8 dictionnaries.
Adapt the script to dictionnary name and location.
Code: Select all
/usr/share/postgresql/8.2/contrib/french_utf8.*
Code: Select all
INSERT INTO pg_ts_cfg VALUES ('default_french', 'default', 'fr_FR.UTF-8');
INSERT INTO pg_ts_dict VALUES ('default_french', 'spell_init(internal)', 'DictFile="/usr/share/postgresql/8.2/contrib/french_utf8.dict",AffFile="/usr/share/postgresql/8.2/contrib/french_utf8.aff",StopFile="/usr/share/postgresql/8.2/contrib/french_utf8.stop"', 'spell_lexize(internal,internal,integer)', NULL);
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'email', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'file', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'float', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'host', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'hword', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'int', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'lhword', '{default_french}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'lpart_hword', '{default_french}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'nlhword', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'nlpart_hword', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'part_hword', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'sfloat', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'uint', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'uri', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'url', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'version', '{simple}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'word', '{default_french}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'nlword', '{default_french}');
INSERT INTO pg_ts_cfgmap VALUES ('default_french', 'lword', '{default_french,simple}');
is this a problem with the lexerize code in search?
In next PostgreSQL 8.3 version, tseach2 will be integrated in core and there will be no need to install dictionaries.
Re: PostgreSQL Fulltext Search
Small bug : when clicking on "View your messages":
I will report later on.
Code: Select all
Erreur Générale
SQL ERROR [ postgres ]
ERREUR: La colonne « t.topic_last_post_time » doit apparaître dans la clause GROUP BY ou doit être utilisé dans une fonction d'aggrégat []
I will report later on.
Re: PostgreSQL Fulltext Search
fulltext_postgressearch.php should be, line 590 and below
Code: Select all
$sql = "SELECT t.topic_id
FROM " . $sql_sort_table . TOPICS_TABLE . ' t '. "
LEFT JOIN " . POSTS_TABLE . " p
ON t.topic_id = p.topic_id
WHERE $sql_author
$sql_topic_id
$sql_firstpost
$m_approve_fid_sql
$sql_fora
$sql_sort_join
$sql_time
ORDER BY $sql_sort";
$field = 'topic_id';
-
- Registered User
- Posts: 2
- Joined: Wed Sep 19, 2007 7:35 pm
Re: PostgreSQL Fulltext Search
Howdy,
Couple of questions and maybe some help from anyone.
First - currently running a phpBB 3.0 board against Postgres 8.2.4 - using the Native-Full text search ( not this patch ) - as I figured 8.3 would be out soon.
OK - installed 8.3 RC1 lastnight on a test server and before i could get to testing this patch I received an error report from one of our board users - specifically using an exclusion search string ( ie "form -close") generates the following SQL statment and error:
So - Tried a similar search on a forum I know uses this patch ( postgresqlforums.com ) with a search ( "vacuum -thresholds" )
Ok , no errors - but the exclusion part of the search is ignored
I take it this is by design - the exclusions function is just not part of the patch at the moment?
Drew
Sorry for being a little OT - now off to open a defect for native full text search.
Couple of questions and maybe some help from anyone.
First - currently running a phpBB 3.0 board against Postgres 8.2.4 - using the Native-Full text search ( not this patch ) - as I figured 8.3 would be out soon.
OK - installed 8.3 RC1 lastnight on a test server and before i could get to testing this patch I received an error report from one of our board users - specifically using an exclusion search string ( ie "form -close") generates the following SQL statment and error:
I get the same error against ps 8.3 RC1 with phpBB native search.SELECT COUNT(DISTINCT p.post_id) AS total_results FROM phpbb_en_search_wordmatch m0, phpbb_en_posts p LEFT JOIN phpbb_en_search_wordmatch m1 ON (m1.word_id = 17255 AND m1.post_id = m0.post_id) WHERE m0.post_id = p.post_id AND m0.word_id = 18017 AND m1.word_id IS NULL
Error message
ERROR: invalid reference to FROM-clause entry for table "m0" at character 182 HINT: There is an entry for table "m0", but it cannot be referenced from this part of the query. []
So - Tried a similar search on a forum I know uses this patch ( postgresqlforums.com ) with a search ( "vacuum -thresholds" )
Ok , no errors - but the exclusion part of the search is ignored
I take it this is by design - the exclusions function is just not part of the patch at the moment?
Drew
Sorry for being a little OT - now off to open a defect for native full text search.