Truncating strings that contain numbers, English letters, and Chinese characters

When building a website, you will often need to truncate strings that mix digits, English letters, and Chinese characters. In UTF-8, a digit or an English letter occupies one byte, while a Chinese character usually occupies three. Cutting such a string by byte count can therefore split a character in the middle and produce garbled output. The following function truncates by character rather than by byte, which avoids this problem.

function cc_msubstr($str, $length, $start=0, $charset="utf-8", $suffix=true){
    // Prefer the multibyte-aware extensions when they are available.
    if(function_exists("mb_substr")){
        $slice = mb_substr($str, $start, $length, $charset);
    }elseif(function_exists('iconv_substr')){
        $slice = iconv_substr($str, $start, $length, $charset);
    }else{
        // Fallback: match one whole character at a time for each supported charset.
        $re['utf-8']  = '/[\x01-\x7f]|[\xc2-\xdf][\x80-\xbf]|[\xe0-\xef][\x80-\xbf]{2}|[\xf0-\xff][\x80-\xbf]{3}/';
        $re['gb2312'] = '/[\x01-\x7f]|[\xb0-\xf7][\xa0-\xfe]/';
        $re['gbk']    = '/[\x01-\x7f]|[\x81-\xfe][\x40-\xfe]/';
        $re['big5']   = '/[\x01-\x7f]|[\x81-\xfe]([\x40-\x7e]|[\xa1-\xfe])/';
        preg_match_all($re[strtolower($charset)], $str, $match);
        $slice = join("", array_slice($match[0], $start, $length));
    }
    // Append the suffix on every path so the result does not depend on which extension is loaded.
    return $suffix ? $slice.".." : $slice;
}

