A Brief Look at Java's split() Method

Copyright notice: writing takes effort, so please credit the source when reposting: https://blog.csdn.net/m0_37190495/article/details/83383562
Originally published at: http://www.zhangruibin.com
This article is from RebornChang's blog.


Kinds of split methods

The split methods discussed here fall roughly into three kinds: JavaScript's split, Java's split(String regex), and Java's split(String regex, int limit).
Take the string "1,2,,,,," and split it on the comma with each of them; the results are shown below.

The split method in JavaScript

Usage:

var string = "1,2,,,,,";
var arr = [];
arr = string.split(",");
alert(arr.length);

The resulting array arr has length 7: the string contains six commas, so it splits into seven pieces, and the empty strings produced by the trailing commas are not trimmed off.

Usage in Java

split(String regex)

String offerCodes = "1,2,,,,,";
String[] offerCodeString = offerCodes.split(",");
System.out.println("offerCodeString.length: " + offerCodeString.length);

The console prints an array length of 2.
As you can see, with a single argument the string is split on the given delimiter and the trailing empty strings are removed by default. What if you want those empty strings back? See the next split method.
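To make clear which empty strings the single-argument form drops, here is a small sketch. The class name TrailingEmptyDemo, the helper splitDefault, and the extra sample inputs are made up for this illustration; only String.split comes from the JDK.

```java
import java.util.Arrays;

class TrailingEmptyDemo {
    // Thin wrapper around the single-argument split, using "," as the delimiter.
    static String[] splitDefault(String input) {
        return input.split(",");
    }

    public static void main(String[] args) {
        // Trailing empty strings are removed: prints [1, 2]
        System.out.println(Arrays.toString(splitDefault("1,2,,,,,")));
        // Leading and interior empty strings are kept: prints [, 1, , 2]
        System.out.println(Arrays.toString(splitDefault(",1,,2,,")));
        // An input made only of delimiters yields an empty array: prints 0
        System.out.println(splitDefault(",,,").length);
    }
}
```

Only the empty strings at the end of the result are discarded; an empty string at the start or in the middle still occupies a slot in the array.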

split(String regex, int limit)

 String offerCodes = "1,2,,,,,";
 String[] offerCodeString = offerCodes.split(",", -1);
 System.out.println("offerCodeString.length: " + offerCodeString.length);

The console prints an array length of 7.
That is, the trailing empty strings are no longer removed. But why pass -1 as the second argument? Could other values be used instead? To answer that, it is worth taking a look at the source.
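As a quick experiment before reading the source, the sketch below tries several limit values on the same string. The class name LimitDemo and the helper splitWithLimit are hypothetical names chosen for this example; only String.split(String, int) comes from the JDK.

```java
import java.util.Arrays;

class LimitDemo {
    // Splits on "," with the given limit.
    static String[] splitWithLimit(String input, int limit) {
        return input.split(",", limit);
    }

    public static void main(String[] args) {
        String offerCodes = "1,2,,,,,";
        // Two negative limits, zero, and one positive limit.
        for (int limit : new int[] {-1, -5, 0, 3}) {
            String[] parts = splitWithLimit(offerCodes, limit);
            System.out.println("limit=" + limit
                    + " length=" + parts.length
                    + " parts=" + Arrays.toString(parts));
        }
    }
}
```

Running it suggests that any negative limit behaves like -1, that 0 gives the same result as the single-argument form, and that a positive limit caps the number of pieces, leaving the unsplit remainder in the last one.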

A brief look at the split source in Java JDK 1.8

Notes from the source

String's split method has been available since JDK 1.4. The documentation comment in the source includes a table of examples.
The sample string used there is "boo:and:foo"; passing different arguments gives the following results:

Regex   Limit   Result
:       2       "boo", "and:foo"
:       5       "boo", "and", "foo"
:       -2      "boo", "and", "foo"
o       5       "b", "", ":and:f", "", ""
o       -2      "b", "", ":and:f", "", ""
o       0       "b", "", ":and:f"
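The rows of this table can be checked directly. The sketch below is only a convenience for doing so; the class name JavadocTableDemo and the helper splitSample are invented for the example.

```java
import java.util.Arrays;

class JavadocTableDemo {
    // Splits the sample string from the table with the given regex and limit.
    static String[] splitSample(String regex, int limit) {
        return "boo:and:foo".split(regex, limit);
    }

    public static void main(String[] args) {
        String[] regexes = {":", ":", ":", "o", "o", "o"};
        int[] limits = {2, 5, -2, 5, -2, 0};
        // Print one line per table row: regex, limit, resulting array.
        for (int i = 0; i < regexes.length; i++) {
            System.out.println(regexes[i] + "\t" + limits[i] + "\t"
                    + Arrays.toString(splitSample(regexes[i], limits[i])));
        }
    }
}
```

Note that the regex in the last three rows is a lowercase "o"; an uppercase "O" would not match anything in the sample string and split would return it whole.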

The split source code

public String[] split(String regex, int limit) {
      /* fastpath if the regex is a
       (1)one-char String and this character is not one of the
          RegEx's meta characters ".$|()[{^?*+\\", or
       (2)two-char String and the first char is the backslash and
          the second is not the ascii digit or ascii letter.
       */
      char ch = 0;
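      // A series of checks on the regex: the fast path below runs only when the delimiter is a single literal character (or an escaped one); otherwise the method falls through to return Pattern.compile(regex).split(this, limit), which is shown at the end of this article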
      if (((regex.value.length == 1 &&
           ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
           (regex.length() == 2 &&
            regex.charAt(0) == '\\' &&
            (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
            ((ch-'a')|('z'-ch)) < 0 &&
            ((ch-'A')|('Z'-ch)) < 0)) &&
          (ch < Character.MIN_HIGH_SURROGATE ||
           ch > Character.MAX_LOW_SURROGATE))
      {
          int off = 0;
          int next = 0;
          // The handling of limit and the actual splitting logic start here
          // limited is true only when a positive limit was passed in (limit > 0)
          boolean limited = limit > 0;
          // List that collects the pieces; it is copied into the String[] result at the end
          ArrayList<String> list = new ArrayList<>();
          // Loop as long as another occurrence of the delimiter is found at or after off
          while ((next = indexOf(ch, off)) != -1) {
              // True when there is no positive limit (limit <= 0), or when fewer than limit - 1 pieces have been collected
              if (!limited || list.size() < limit - 1) {
                  // If so, cut out substring(off, next) and add it to the list
                  list.add(substring(off, next));
                  // The next search starts just past the delimiter
                  off = next + 1;
              } else {    // last one
                  //assert (list.size() == limit - 1);
                  // Otherwise the limit has been reached: add everything from off to the end of the string (value.length) as the last piece
                  list.add(substring(off, value.length));
                  off = value.length;
                  break;
              }
          }
          // If no match was found, return this
          if (off == 0)
              return new String[]{this};

          // Add remaining segment
          if (!limited || list.size() < limit)
              list.add(substring(off, value.length));

          // Construct result
          int resultSize = list.size();
          if (limit == 0) {
              while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                  resultSize--;
              }
          }
          // Take the first resultSize entries of the list and copy them into String[] result with toArray(); seen this way, split is not especially efficient
          String[] result = new String[resultSize];
          return list.subList(0, resultSize).toArray(result);
      }
      return Pattern.compile(regex).split(this, limit);
  } 
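To make the control flow of the fast path easier to follow, here is a simplified re-implementation for a single literal delimiter character. It is a sketch written for this article, not the JDK code: the class name FastPathSketch and the method splitOnChar are hypothetical, and it uses only public String methods instead of the internal value array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FastPathSketch {
    // Mirrors the fast-path loop for one literal delimiter character.
    static String[] splitOnChar(String s, char ch, int limit) {
        int off = 0;
        int next;
        boolean limited = limit > 0;
        List<String> list = new ArrayList<>();
        while ((next = s.indexOf(ch, off)) != -1) {
            if (!limited || list.size() < limit - 1) {
                list.add(s.substring(off, next));
                off = next + 1;
            } else {
                // Limit reached: the rest of the string becomes the last piece.
                list.add(s.substring(off));
                off = s.length();
                break;
            }
        }
        // No delimiter found at all: return the input unchanged.
        if (off == 0) {
            return new String[] {s};
        }
        // Add the remaining segment after the last delimiter.
        if (!limited || list.size() < limit) {
            list.add(s.substring(off));
        }
        // Only limit == 0 strips trailing empty strings.
        int resultSize = list.size();
        if (limit == 0) {
            while (resultSize > 0 && list.get(resultSize - 1).isEmpty()) {
                resultSize--;
            }
        }
        return list.subList(0, resultSize).toArray(new String[0]);
    }

    public static void main(String[] args) {
        String input = "1,2,,,,,";
        for (int limit : new int[] {-1, 0, 3}) {
            boolean same = Arrays.equals(splitOnChar(input, ',', limit), input.split(",", limit));
            System.out.println("limit=" + limit + " matches String.split: " + same);
        }
    }
}
```

The trailing-empty-string cleanup sits inside the limit == 0 branch, which is why -1 (or any other negative value) keeps those entries while the single-argument form, which passes 0, drops them.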

In summary:
the limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than 0, the pattern is applied at most n - 1 times, the array's length is no greater than n, and the array's last entry contains all input beyond the last matched delimiter. If n is non-positive, the pattern is applied as many times as possible and the array can have any length. If n is 0, the pattern is applied as many times as possible, the array can have any length, and trailing empty strings are discarded.
The source below, Pattern's split method, is not analyzed here; interested readers can go through it on their own:

public String[] split(CharSequence input, int limit) {
      int index = 0;
      boolean matchLimited = limit > 0;
      ArrayList<String> matchList = new ArrayList<>();
      Matcher m = matcher(input);

      // Add segments before each match found
      while(m.find()) {
          if (!matchLimited || matchList.size() < limit - 1) {
              if (index == 0 && index == m.start() && m.start() == m.end()) {
                  // no empty leading substring included for zero-width match
                  // at the beginning of the input char sequence.
                  continue;
              }
              String match = input.subSequence(index, m.start()).toString();
              matchList.add(match);
              index = m.end();
          } else if (matchList.size() == limit - 1) { // last one
              String match = input.subSequence(index,
                                               input.length()).toString();
              matchList.add(match);
              index = m.end();
          }
      }

      // If no match was found, return this
      if (index == 0)
          return new String[] {input.toString()};

      // Add remaining segment
      if (!matchLimited || matchList.size() < limit)
          matchList.add(input.subSequence(index, input.length()).toString());

      // Construct result
      int resultSize = matchList.size();
      if (limit == 0)
          while (resultSize > 0 && matchList.get(resultSize-1).equals(""))
              resultSize--;
      String[] result = new String[resultSize];
      return matchList.subList(0, resultSize).toArray(result);
  }
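For readers who want to poke at this method, the sketch below calls Pattern.split directly. The class name PatternSplitDemo, the helper splitWithPattern, and the sample inputs are assumptions made for this example. The results described are those of JDK 1.8 and later, where the zero-width check shown in the source above is present.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PatternSplitDemo {
    // Compiles the regex and splits the input, as String.split does when it leaves the fast path.
    static String[] splitWithPattern(String regex, String input, int limit) {
        return Pattern.compile(regex).split(input, limit);
    }

    public static void main(String[] args) {
        // A multi-character regex does not qualify for the fast path in String.split.
        System.out.println(Arrays.toString(splitWithPattern("\\s*,\\s*", "1 , 2,3", 0)));
        // A zero-width match at index 0 is skipped, so there is no leading empty string.
        System.out.println(Arrays.toString(splitWithPattern("", "abc", 0)));
        // With a negative limit the trailing empty string is kept.
        System.out.println(Arrays.toString(splitWithPattern("", "abc", -1)));
    }
}
```

When the same delimiter regex is used on many strings, compiling the Pattern once and reusing it avoids recompiling on every call.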
