Monday, March 26, 2012

Invalid results in FULLTEXTTABLE Search

Help.
Have the following:
SELECT
(ISNULL(CAST(RANKTBL1.RANK AS decimal),0)) +
(ISNULL(CAST(RANKTBL2.RANK AS decimal),0)/10) AS RANK,
RANKTBL1.RANK AS RANK1,
RANKTBL2.RANK AS RANK2,
items.itempkid,
items.keywords,
menu.menupkid,
menu.title,
items.content
FROM
(menu INNER JOIN items ON menu.itempkid = items.itempkid)
INNER JOIN
(FREETEXTTABLE(items,keywords,'paul drew') AS RANKTBL1 FULL OUTER
JOIN FREETEXTTABLE(items,content,'paul drew') AS RANKTBL2 ON
RANKTBL1.[KEY] = RANKTBL2.[KEY])
ON (items.itempkid = RANKTBL1.[KEY] OR items.itempkid =
RANKTBL2.[KEY])
WHERE
(ISNULL(CAST(RANKTBL1.RANK AS decimal),0)/10) +
(ISNULL(CAST(RANKTBL2.RANK AS decimal),0)/10) > 3
ORDER BY
RANK DESC
But the highest ranked record this returns from my database has
neither the word 'paul' or 'drew' occuring in any of the fields. The
closest I can find is that the top record has 'dre' near the beginning
of the 'content' field.
If I just search for 'drew' this excludes 2 records that contain the
word 'drew' and still returns the aforementioned record.
I've tried doing a full population, and rebuilding the catalog+full
population with exactly the same results returned.
Any ideas?
Andy,
Could you provide the full output of -- SELECT @.@.version -- as this would be
helpful in troubleshooting this SQL FTS issue.
Also, could you post the count(*) value for your FT-enabled table items?
Additionally, while I understand why you're doing the "math" on the returned
RANK values, but it will not get you a higher RANK value for specific words
or phrases that you expect. See SQL Server 2000 BOL title "Full-Text Search
Recommendations" for a more information. As an alternative, you may want to
try the below approach - assuming that your table as a statistically
significantly number of rows - and using the WEIGHT parameter:
SELECT e.LastName, e.FirstName, e.Title, e.Notes, B.[KEY], B.[RANK] as
B_RANK, A.[RANK] as A_RANK
from Employees AS e,
containstable(Employees, Notes, 'ISABOUT (BA weight (.2) )') as A,
containstable(Employees, Title, 'Sales') as B
where
A.[KEY] = e.EmployeeID and
B.[KEY] = e.EmployeeID
--> examples of using RANK and multipling RANK
use pubs
go
SELECT FT_TBL.au_lname, FT_TBL.au_fname, KEY_TBL.RANK, key_tbl1.Rank
FROM authors as FT_TBL,
CONTAINSTABLE (authors,au_lname, 'Ringer' ) AS KEY_TBL,
CONTAINSTABLE (authors,au_fname, 'Michael' ) AS KEY_TBL1
WHERE
FT_TBL.au_id = KEY_TBL.[KEY] or
FT_TBL.au_id = KEY_TBL1.[KEY] and
key_tbl1.Rank < 2* KEY_TBL.RANK
ORDER BY KEY_TBL.Rank, KEY_TBL1.RANK
Regards,
John
"Andy Clark" <andy@.workonmypc.co.uk> wrote in message
news:718c4cf7.0405050614.f103787@.posting.google.co m...
> Help.
> Have the following:
> ----
--
> SELECT
> (ISNULL(CAST(RANKTBL1.RANK AS decimal),0)) +
> (ISNULL(CAST(RANKTBL2.RANK AS decimal),0)/10) AS RANK,
> RANKTBL1.RANK AS RANK1,
> RANKTBL2.RANK AS RANK2,
> items.itempkid,
> items.keywords,
> menu.menupkid,
> menu.title,
> items.content
> FROM
> (menu INNER JOIN items ON menu.itempkid = items.itempkid)
> INNER JOIN
> (FREETEXTTABLE(items,keywords,'paul drew') AS RANKTBL1 FULL OUTER
> JOIN FREETEXTTABLE(items,content,'paul drew') AS RANKTBL2 ON
> RANKTBL1.[KEY] = RANKTBL2.[KEY])
> ON (items.itempkid = RANKTBL1.[KEY] OR items.itempkid =
> RANKTBL2.[KEY])
> WHERE
> (ISNULL(CAST(RANKTBL1.RANK AS decimal),0)/10) +
> (ISNULL(CAST(RANKTBL2.RANK AS decimal),0)/10) > 3
> ORDER BY
> RANK DESC
> ----
--
> But the highest ranked record this returns from my database has
> neither the word 'paul' or 'drew' occuring in any of the fields. The
> closest I can find is that the top record has 'dre' near the beginning
> of the 'content' field.
> If I just search for 'drew' this excludes 2 records that contain the
> word 'drew' and still returns the aforementioned record.
> I've tried doing a full population, and rebuilding the catalog+full
> population with exactly the same results returned.
> Any ideas?
|||SELECT @.@.version:
Microsoft SQL Server 7.00 - 7.00.623 (Intel X86) Nov 27 1998
22:20:07 Copyright (c) 1988-1998 Microsoft Corporation Standard
Edition on Windows NT 4.0 (Build 1381: Service Pack 6)
COUNT(*) of items:
353
The "math" on the RANK values is to weight the importance of different
fields, i.e. keywords are more important than content and therefore
have a greater influence in the overall RANK.
"John Kane" <jt-kane@.comcast.net> wrote in message news:<#cGlT9wMEHA.3944@.tk2msftngp13.phx.gbl>...[vbcol=seagreen]
> Andy,
> Could you provide the full output of -- SELECT @.@.version -- as this would be
> helpful in troubleshooting this SQL FTS issue.
> Also, could you post the count(*) value for your FT-enabled table items?
> Additionally, while I understand why you're doing the "math" on the returned
> RANK values, but it will not get you a higher RANK value for specific words
> or phrases that you expect. See SQL Server 2000 BOL title "Full-Text Search
> Recommendations" for a more information. As an alternative, you may want to
> try the below approach - assuming that your table as a statistically
> significantly number of rows - and using the WEIGHT parameter:
> SELECT e.LastName, e.FirstName, e.Title, e.Notes, B.[KEY], B.[RANK] as
> B_RANK, A.[RANK] as A_RANK
> from Employees AS e,
> containstable(Employees, Notes, 'ISABOUT (BA weight (.2) )') as A,
> containstable(Employees, Title, 'Sales') as B
> where
> A.[KEY] = e.EmployeeID and
> B.[KEY] = e.EmployeeID
> --> examples of using RANK and multipling RANK
> use pubs
> go
> SELECT FT_TBL.au_lname, FT_TBL.au_fname, KEY_TBL.RANK, key_tbl1.Rank
> FROM authors as FT_TBL,
> CONTAINSTABLE (authors,au_lname, 'Ringer' ) AS KEY_TBL,
> CONTAINSTABLE (authors,au_fname, 'Michael' ) AS KEY_TBL1
> WHERE
> FT_TBL.au_id = KEY_TBL.[KEY] or
> FT_TBL.au_id = KEY_TBL1.[KEY] and
> key_tbl1.Rank < 2* KEY_TBL.RANK
> ORDER BY KEY_TBL.Rank, KEY_TBL1.RANK
> Regards,
> John
>
> "Andy Clark" <andy@.workonmypc.co.uk> wrote in message
> news:718c4cf7.0405050614.f103787@.posting.google.co m...
> --
> --
|||Thanks, Andy,
Yes, I suspected that your query was to have on one column "rank" higher
than the other. However, you will never get the expected results with only
353 rows. You should re-test your query on a table with at least 50,000 to
100,000 rows in order to get to a statistically significantly number of
rows.
Regards,
John
"Andy Clark" <andy@.workonmypc.co.uk> wrote in message
news:718c4cf7.0405060223.f3a5d3b@.posting.google.co m...
> SELECT @.@.version:
> Microsoft SQL Server 7.00 - 7.00.623 (Intel X86) Nov 27 1998
> 22:20:07 Copyright (c) 1988-1998 Microsoft Corporation Standard
> Edition on Windows NT 4.0 (Build 1381: Service Pack 6)
> COUNT(*) of items:
> 353
>
>
> The "math" on the RANK values is to weight the importance of different
> fields, i.e. keywords are more important than content and therefore
> have a greater influence in the overall RANK.
>
> "John Kane" <jt-kane@.comcast.net> wrote in message
news:<#cGlT9wMEHA.3944@.tk2msftngp13.phx.gbl>...[vbcol=seagreen]
would be[vbcol=seagreen]
returned[vbcol=seagreen]
words[vbcol=seagreen]
Search[vbcol=seagreen]
to[vbcol=seagreen]
> ----
> ----
|||But surely this still doesn't explain the search coming up with a high
rank result that doesn't contain any of the search terms. Would
applying SQL Server 7 SP4 help?
"John Kane" <jt-kane@.comcast.net> wrote in message news:<uQFHNq3MEHA.3052@.TK2MSFTNGP12.phx.gbl>...[vbcol=seagreen]
> Thanks, Andy,
> Yes, I suspected that your query was to have on one column "rank" higher
> than the other. However, you will never get the expected results with only
> 353 rows. You should re-test your query on a table with at least 50,000 to
> 100,000 rows in order to get to a statistically significantly number of
> rows.
> Regards,
> John
>
> "Andy Clark" <andy@.workonmypc.co.uk> wrote in message
> news:718c4cf7.0405060223.f3a5d3b@.posting.google.co m...
> news:<#cGlT9wMEHA.3944@.tk2msftngp13.phx.gbl>...
> would be
> returned
> words
> Search
> to
> ----
> ----
|||Andy,
While you may want to apply SQL Server 7.0 service pack SP4 for other
reasons, this is not one of them.
You should re-test your query using a much larger table. Additionally, if
you plan to use SQL 7.0 FTS in a production environment, I'd highly
recommend that you consider upgrading to SQL Server 2000 as there are a
number of fixes, feature enhancements and performance improvements in SQL
2000 that are not in SQL 7.0
Regards,
John
"Andy Clark" <andy@.workonmypc.co.uk> wrote in message
news:718c4cf7.0405070446.678531b9@.posting.google.c om...
> But surely this still doesn't explain the search coming up with a high
> rank result that doesn't contain any of the search terms. Would
> applying SQL Server 7 SP4 help?
> "John Kane" <jt-kane@.comcast.net> wrote in message
news:<uQFHNq3MEHA.3052@.TK2MSFTNGP12.phx.gbl>...[vbcol=seagreen]
only[vbcol=seagreen]
to[vbcol=seagreen]
items?[vbcol=seagreen]
specific[vbcol=seagreen]
want[vbcol=seagreen]
as[vbcol=seagreen]
A,
>
----
>
----[vbcol=seagreen]
The[vbcol=seagreen]
beginning[vbcol=seagreen]
the[vbcol=seagreen]
catalog+full[vbcol=seagreen]

No comments:

Post a Comment