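In a CSV file, if a value contains the separator, that separator has to be escaped, otherwise it is treated as a field boundary. For example: Hello\,world,who are you?,Zhang San. Here "Hello,world" needs to be parsed as a single value. I wrote a small method for this, so I'm noting it down.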
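import java.util.ArrayList;
import java.util.List;

public static List<String> parseCSV(String s, char separator) {
    char[] cs = s.toCharArray();
    List<String> list = new ArrayList<String>();
    StringBuilder sb = new StringBuilder();
    int pi = -1; // index of the most recent backslash, -1 if none is pending
    for (int i = 0; i < cs.length; i++) {
        char c = cs[i];
        if (c == '\\') {
            sb.append(c);
            pi = i;
            continue;
        }
        if (c != separator) {
            sb.append(c);
            continue;
        }
        // pi >= 0 guards against a line that starts with the separator
        if (pi >= 0 && pi + 1 == i) {
            // Escaped separator: replace the backslash with the separator itself
            sb.deleteCharAt(sb.length() - 1);
            sb.append(c);
            pi = -1;
        } else {
            // Real separator: close the current field
            list.add(sb.toString());
            sb.delete(0, sb.length());
        }
    }
    list.add(sb.toString()); // last field
    return list;
}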
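String.split(",") treats its argument as a regular expression and has no notion of the "\" escape, so it also splits on escaped separators. The method above scans the input character by character instead. In my test it was also somewhat faster than split (about 3-4 times).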
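To make the difference concrete, here is a minimal sketch that runs both approaches on the sample line from the beginning of the post. The class name `EscapedSplitDemo` and the helper `splitEscaped` are made up for this illustration. The helper is a compact variant of the same character scan rather than the exact method above, and it assumes the separator is never the backslash itself.

```java
import java.util.ArrayList;
import java.util.List;

class EscapedSplitDemo {
    // Hypothetical helper: character scan in which a separator that directly
    // follows a backslash is kept as data. Assumes separator != '\\'.
    static List<String> splitEscaped(String line, char separator) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c != separator) {
                field.append(c);
            } else if (i > 0 && line.charAt(i - 1) == '\\') {
                // Escaped separator: overwrite the backslash just appended
                field.setCharAt(field.length() - 1, c);
            } else {
                // Real separator: close the current field
                fields.add(field.toString());
                field.setLength(0);
            }
        }
        fields.add(field.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = "Hello\\,world,who are you?,Zhang San";

        String[] bySplit = line.split(",");
        System.out.println(bySplit.length + " fields, first = " + bySplit[0]);
        // prints: 4 fields, first = Hello\

        List<String> byScan = splitEscaped(line, ',');
        System.out.println(byScan.size() + " fields, first = " + byScan.get(0));
        // prints: 3 fields, first = Hello,world
    }
}
```

The plain split cuts the first value in two and leaves the backslash behind, while the scan returns it as one field with the backslash removed.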
// content: the CSV line under test, defined elsewhere
int i = 0;
long start = System.currentTimeMillis();
do {
    parseCSV(content, ',');
} while (i++ < 10000);
long end = System.currentTimeMillis();
System.out.println("" + (end - start));

long start2 = System.currentTimeMillis();
i = 0;
do {
    content.split(",");
} while (i++ < 10000);
long end2 = System.currentTimeMillis();
System.out.println("" + (end2 - start2));
Another small utility method, for parsing an HTTP query string:
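import java.util.HashMap;
import java.util.Map;

public static Map<String, String> parseQueryString(String queryString) {
    Map<String, String> result = new HashMap<String, String>();
    // Pairs are separated by ',' or '&'
    String[] eles = queryString.split(",|&");
    for (String ele : eles) {
        if (ele.isEmpty()) {
            continue; // skip empty segments such as the one in "a=1&&b=2"
        }
        // Key and value are separated by the first ':' or '='
        String[] kv = ele.split(":|=", 2);
        // A key without a value maps to the empty string
        result.put(kv[0], kv.length > 1 ? kv[1] : "");
    }
    return result;
}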
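Real query strings are usually percent-encoded, which the method above does not handle. The sketch below shows one way the same idea could be extended with the standard `java.net.URLDecoder`. The class name `QueryStringDemo` and the helpers `parseDecoded` and `decode` are hypothetical. Unlike the method above, this variant splits only on `&` and `=`, and assumes UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

class QueryStringDemo {
    // Hypothetical helper: split on '&' and the first '=', then decode each part.
    // Splitting happens before decoding, so an encoded "%26" stays inside its value.
    static Map<String, String> parseDecoded(String queryString) {
        Map<String, String> result = new LinkedHashMap<>(); // keeps parameter order
        for (String pair : queryString.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            String[] kv = pair.split("=", 2);
            String value = kv.length > 1 ? kv[1] : "";
            result.put(decode(kv[0]), decode(value));
        }
        return result;
    }

    // URLDecoder turns '+' into a space and "%XX" sequences into characters.
    // It throws IllegalArgumentException on malformed input such as "%zz".
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = parseDecoded("q=a%26b&name=Zhang+San&debug");
        System.out.println(params.get("q"));    // prints: a&b
        System.out.println(params.get("name")); // prints: Zhang San
        System.out.println(params.get("debug").isEmpty()); // prints: true
    }
}
```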