How does PHP distinguish between Simplified Chinese, Traditional Chinese, Japanese, and Korean?
The methods posted online seem able to tell Chinese, Japanese, and Korean apart, but how can Simplified and Traditional Chinese be distinguished?
$s = <<<'EOF'
"memolov 愛書 愛書 あいしょ ?? ??? ?? ??",
EOF;
echo $s.PHP_EOL;
if(preg_match_all('/([\x{4e00}-\x{9fa5}]+)/u',$s,$m)){ // Chinese characters (Simplified and Traditional share this range)
echo "<pre>";
print_r($m[1]);
echo "</pre>";
}
if(preg_match_all('/([\x{3040}-\x{30ff}]+)/u',$s,$m)){ // Japanese kana (hiragana and katakana)
echo "<pre>";
print_r($m[1]);
echo "</pre>";
}
if(preg_match_all('/([\x{AC00}-\x{D7A3}]+)/u',$s,$m)){ // Korean (Hangul syllables)
echo "<pre>";
print_r($m[1]);
echo "</pre>";
}
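As an alternative to hard-coded code point ranges, the same check can be sketched with PCRE's Unicode script properties, which PHP supports under the `/u` modifier. The helper names `detectScripts` and `guessLanguage` are made up for this illustration, and the guess is only a heuristic: a string containing Han characters alone could be Chinese or kanji-only Japanese, and Simplified versus Traditional is not addressed here at all.

```php
<?php
// Hypothetical helper: report which CJK scripts occur in a UTF-8 string.
function detectScripts(string $text): array
{
    return [
        'han'    => preg_match('/\p{Han}/u', $text) === 1,
        'kana'   => preg_match('/[\p{Hiragana}\p{Katakana}]/u', $text) === 1,
        'hangul' => preg_match('/\p{Hangul}/u', $text) === 1,
    ];
}

// Hypothetical heuristic for a string written in a single language.
// Hangul implies Korean, kana implies Japanese; Han alone stays ambiguous.
function guessLanguage(string $text): string
{
    $scripts = detectScripts($text);
    if ($scripts['hangul']) {
        return 'korean';
    }
    if ($scripts['kana']) {
        return 'japanese';
    }
    if ($scripts['han']) {
        return 'chinese-or-kanji-only-japanese';
    }
    return 'other';
}

print_r(detectScripts('memolov 愛書 あいしょ'));
echo guessLanguage('あいしょ'), PHP_EOL;
```

For a mixed string like the sample above, `detectScripts` is the more useful of the two, since several scripts can be present at once.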
Then here comes the problem: 小
This character has no separate Traditional form. So does it count as Simplified or Traditional?
It is both Simplified and Traditional, which is why this is hard to distinguish. Could you build a lookup table that maps Simplified characters to Traditional ones?
I have a simple idea:
First convert the text to Simplified Chinese. If the string is unchanged by the conversion, treat it as Simplified Chinese; otherwise count it as Traditional Chinese. Text made up only of characters shared by both forms, such as 小, comes out as Simplified under this rule.
https://github.com/BYVoid/OpenCC
The OpenCC library handles the conversion and is very easy to use. Other conversion libraries would work too.
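The convert-and-compare idea can be sketched roughly as follows. To keep the example self-contained, the converter is passed in as a callable, and `toSimplifiedSample` with its four-character table is a made-up stand-in for illustration only. In real use the callable would presumably wrap OpenCC or another full Traditional-to-Simplified converter; how that binding is called is not shown here.

```php
<?php
// Hypothetical, deliberately tiny Traditional -> Simplified table.
// A real converter such as OpenCC covers thousands of characters and phrases.
const T2S_SAMPLE = ['愛' => '爱', '書' => '书', '體' => '体', '簡' => '简'];

// Stand-in converter for this sketch only.
function toSimplifiedSample(string $text): string
{
    // strtr() with an array replaces whole keys, so multi-byte UTF-8
    // characters are substituted safely.
    return strtr($text, T2S_SAMPLE);
}

// If converting to Simplified leaves the text unchanged, call it Simplified;
// otherwise call it Traditional. Shared characters count as Simplified.
function classifyChinese(string $text, callable $toSimplified): string
{
    return $toSimplified($text) === $text ? 'simplified' : 'traditional';
}

echo classifyChinese('愛書', 'toSimplifiedSample'), PHP_EOL; // traditional
echo classifyChinese('爱书', 'toSimplifiedSample'), PHP_EOL; // simplified
echo classifyChinese('小', 'toSimplifiedSample'), PHP_EOL;   // simplified
```

Passing the converter as a parameter keeps the comparison rule separate from whichever conversion library ends up being used.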