It has been a long time since my last blog post. Recently I have been working on a Windows exe program.
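// Returns the length of value, counting each Chinese character as 2 and every other character as 1.
public static int String_length(String value) {
    int valueLength = 0;
    // Matches a single character in the basic CJK ideograph range U+4E00 to U+9FA5.
    String chinese = "[\u4e00-\u9fa5]";
    for (int i = 0; i < value.length(); i++) {
        String temp = value.substring(i, i + 1);
        if (temp.matches(chinese)) {
            valueLength += 2; // Chinese character
        } else {
            valueLength += 1; // anything else
        }
    }
    return valueLength;
}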
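// Run these statements inside a method declared with "throws UnsupportedEncodingException",
// because getBytes(String charsetName) can throw it.
String s1 = "abcd我们";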
String s2 = "abcdef";
String s3 = "啊波次得我们";
System.out.println("s1 default " + s1.length() + " s.byte " + s1.getBytes().length);
System.out.println("s1 gbk " + s1.length() + " s.byte " + s1.getBytes("GBK").length);
System.out.println("s1 utf-8 " + s1.length() + " s.byte " + s1.getBytes("UTF-8").length);
System.out.println("s2 " + s2.length() + " s.byte " + s2.getBytes().length);
System.out.println("s3 " + s3.length() + " s.byte " + s3.getBytes().length);
System.out.println("func s1 " + String_length(s1));
System.out.println("func s2 " + String_length(s2));
System.out.println("func s3 " + String_length(s3));
The output is:
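s1 default 6 s.byte 10 // the default charset on this machine is UTF-8
s1 gbk 6 s.byte 8 // GBK: 2 bytes per Chinese character, 1 byte per ASCII character
s1 utf-8 6 s.byte 10 // UTF-8 is variable-length: each of these Chinese characters takes 3 bytes, each ASCII character 1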
s2 6 s.byte 6
s3 6 s.byte 18
func s1 8
func s2 6
func s3 12
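The byte counts above can be checked without depending on the platform default by passing an explicit Charset. The following is only a minimal sketch: the class name EncodingBytes and the helper utf8Length are made up for illustration, and the string is written with \u escapes so the result does not depend on the source file encoding. Using StandardCharsets.UTF_8 also avoids the checked UnsupportedEncodingException that getBytes(String) declares.

```java
import java.nio.charset.StandardCharsets;

public class EncodingBytes {
    // Hypothetical helper: number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String s1 = "abcd\u6211\u4eec"; // same text as s1 above: "abcd" plus two Chinese characters
        System.out.println("chars " + s1.length());       // 6 UTF-16 code units
        System.out.println("utf-8 " + utf8Length(s1));     // 4 * 1 byte + 2 * 3 bytes = 10
    }
}
```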
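So String.length() returns the number of characters (strictly speaking, UTF-16 code units, which is the same thing for the strings here), while String.getBytes().length returns the number of bytes, which depends on the encoding.
If Chinese characters should count as two, a helper function that detects them by their Unicode range works best.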
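As a variation on the helper, the detection can also be done per code point instead of per char. This is just a sketch under my own assumptions: the names DisplayWidth and displayWidth are hypothetical, and it swaps the fixed U+4E00 to U+9FA5 range for Character.UnicodeScript.HAN, so it also counts Han characters outside that range (including ones stored as surrogate pairs) as 2. Full-width punctuation is still counted as 1, same as the original function.

```java
public class DisplayWidth {
    // Counts each Han ideograph as 2 and everything else as 1.
    // Iterates by code point so a surrogate pair is counted once, not twice.
    static int displayWidth(String value) {
        int width = 0;
        for (int i = 0; i < value.length(); ) {
            int cp = value.codePointAt(i);
            boolean han = Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN;
            width += han ? 2 : 1;
            i += Character.charCount(cp); // advance by 1 or 2 chars
        }
        return width;
    }

    public static void main(String[] args) {
        // "abcd" plus two Chinese characters, written with \u escapes.
        System.out.println(displayWidth("abcd\u6211\u4eec")); // 4 + 2 * 2 = 8
    }
}
```

For the three test strings in this post it should give the same numbers as String_length.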