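This article shows, by example, how to use SCWS from PHP to implement full-text search in MySQL. It is shared here for reference. The details are as follows:
SCWS is a solid Chinese word segmentation extension. After a brief look at it: it ships with rule sets covering proper nouns, personal names, place names, numbers and dates, and it can split a sentence directly into individual keywords according to those rules, with an accuracy between 90% and 95%. Following the installation instructions, place the scws extension in PHP's extension directory, download the rule file and the dictionary file, and reference them in the PHP configuration file. After that, scws can be used for segmentation.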
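Before running any segmentation code, it can help to confirm that PHP actually sees the extension and its defaults. The following is a minimal sketch, assuming the extension registers under the name `scws` and exposes the ini settings `scws.default.charset` and `scws.default.fpath` as described in the SCWS PHP documentation. The helper name `scws_environment()` is hypothetical and not part of SCWS.

```php
<?php
// Hypothetical helper (not part of SCWS): report whether the extension is usable.
function scws_environment(): array
{
    $loaded = extension_loaded('scws');
    return [
        'loaded'  => $loaded,
        // Assumed ini names; ini_get() returns false for settings PHP does not know.
        'charset' => $loaded ? ini_get('scws.default.charset') : false,
        'fpath'   => $loaded ? ini_get('scws.default.fpath') : false,
    ];
}

$env = scws_environment();
if (!$env['loaded']) {
    echo "scws extension not loaded; check the extension line in php.ini\n";
} else {
    printf("charset=%s, dictionary/rule path=%s\n", $env['charset'], $env['fpath']);
}
```

If the extension is reported as missing, the usual suspects are a wrong extension directory or a build that does not match the PHP version.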
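1) The PHP extension code was modified to be compatible with PHP 5.4.x
2) Fixed the issue where the limit parameter of scws_get_tops in the PHP extension could not be less than 10
3) libscws adds scws_fork(), which creates a branch from an existing scws instance and shares its dictionary/rule set; it is mainly intended for multi-threaded development.
4) Added win32 DLL builds of the extension for some versions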
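The PHP example code is as follows:
<?php
// Instantiate the core class of the segmentation extension
$so = scws_new();
// Set the character encoding used for segmentation
$so->set_charset('utf-8');
// Set the dictionary used for segmentation (the utf8 dictionary here)
$so->set_dict('/path/dict.utf8.xdb');
// Set the rule file used for segmentation
$so->set_rule('/path/rules.utf8.ini');
// Strip punctuation before segmenting
$so->set_ignore(true);
// Whether to use compound segmentation, e.g. "中國人" yields three words: "中國", "人" and "中國人"
$so->set_multi(true);
// Automatically aggregate leftover single characters into two-character words
$so->set_duality(true);
// The sentence to segment
$so->send_text("歡迎來到火星時代IT開發");
// Fetch the segmentation results; to extract high-frequency words, use get_tops instead
while ($tmp = $so->get_result()) {
    print_r($tmp);
}
$so->close();
?>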
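For search purposes, the raw result rows usually need to be reduced to a clean keyword list. The sketch below assumes, as in the SCWS PHP documentation, that each row returned by `get_result()` or `get_tops()` is an array containing a `word` key. The helper `collect_words()` is hypothetical, and the SCWS calls are guarded so the script still runs where the extension is not installed.

```php
<?php
// Hypothetical helper: reduce get_result()/get_tops() rows to a unique word list.
// Each row is assumed to be an array with at least a 'word' key.
function collect_words(array $rows, int $minBytes = 2): array
{
    $seen = [];
    foreach ($rows as $row) {
        $word = trim((string) ($row['word'] ?? ''));
        // Skip empty tokens and tokens shorter than $minBytes bytes
        // (a Chinese character is 3 bytes in UTF-8, so this only drops single ASCII characters).
        if ($word === '' || strlen($word) < $minBytes) {
            continue;
        }
        $seen[$word] = true; // array keys de-duplicate
    }
    // PHP turns numeric string keys into integers, so cast back to strings.
    return array_map('strval', array_keys($seen));
}

if (extension_loaded('scws')) {
    $so = scws_new();
    $so->set_charset('utf8');
    $so->set_dict('/path/dict.utf8.xdb');
    $so->set_rule('/path/rules.utf8.ini');
    $so->set_ignore(true);
    $so->send_text('歡迎來到火星時代IT開發');
    $rows = $so->get_tops(5); // at most 5 top-weighted words
    $so->close();
    print_r(collect_words($rows ?: []));
}
```

Keeping the filtering in a plain function makes it easy to test without the extension and to reuse for both indexing and query parsing.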
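Note: as in the example above, the input text, the dictionary, and the rule file must all use the same character set. In addition, some MySQL 4.x versions do not support Chinese full-text search; in that case you can store the GB2312 row-cell codes (區位碼) of the keywords instead, so that full-text search still works.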
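The row-cell workaround mentioned in the note can be sketched as follows. This is an illustration under stated assumptions, not the original author's implementation: it assumes PHP 7.4+ with the iconv extension, words that are representable in GB2312, and a hypothetical table `articles(id, title, keywords)` with a FULLTEXT index on `keywords`. All function names are made up for the example.

```php
<?php
// Convert one UTF-8 word to GB2312 row-cell digits: 4 decimal digits per Chinese character.
function quwei_encode(string $word): string
{
    $gb = @iconv('UTF-8', 'GB2312', $word);
    if ($gb === false) {
        throw new InvalidArgumentException("not representable in GB2312: $word");
    }
    $out = '';
    $len = strlen($gb);
    for ($i = 0; $i < $len; $i++) {
        $hi = ord($gb[$i]);
        if ($hi >= 0xA1 && $i + 1 < $len) {
            // Two-byte GB2312 character: row = high byte - 0xA0, cell = low byte - 0xA0.
            $lo = ord($gb[++$i]);
            $out .= sprintf('%02d%02d', $hi - 0xA0, $lo - 0xA0);
        } else {
            // ASCII passes through, lower-cased.
            $out .= strtolower($gb[$i]);
        }
    }
    return $out;
}

// Text to store in the FULLTEXT-indexed column: encoded words joined by spaces.
function build_index_text(array $words): string
{
    return implode(' ', array_map('quwei_encode', $words));
}

// Boolean-mode search expression that requires every word to be present.
function build_match_expr(array $words): string
{
    return implode(' ', array_map(fn($w) => '+' . quwei_encode($w), $words));
}

// Hypothetical query; bind build_match_expr(...) to the placeholder, e.g. with PDO.
$sql = 'SELECT id, title FROM articles WHERE MATCH(keywords) AGAINST (? IN BOOLEAN MODE)';

echo build_index_text(['中国', '人']), "\n"; // 54482590 4043
echo build_match_expr(['中国', '人']), "\n"; // +54482590 +4043
```

Because every Chinese character becomes four ASCII digits, the stored tokens look like ordinary words to MySQL. Short ASCII words may still fall below the server's minimum indexed word length, so that setting is worth checking.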
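Version list
Version | Type | Platform | Performance
SCWS-1.1.x | C code | *Unix*/*PHP* | Accuracy: 95%, recall: 91%, speed: 1.2MB/sec; PHP extension segmentation speed: 250KB/sec
php_scws.dll(1) | PHP extension library | Windows/PHP 4.4.x | Accuracy: 95%, recall: 91%
php_scws.dll(2) | PHP extension library | Windows/PHP 5.2.x | Accuracy: 95%, recall: 91%
php_scws.dll(3) | PHP extension library | Windows/PHP 5.3.x | Accuracy: 95%, recall: 91%
php_scws.dll(4) | PHP extension library | Windows/PHP 5.4.x | Accuracy: 95%, recall: 91%
PSCWS23 | PHP source code | Any (UTF-8 not supported) | Accuracy: 93%, recall: 89%
PSCWS4 | PHP source code | Any | Accuracy: 95%, recall: 91%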
