Reading the Source: How String.format Is Implemented

Preface

In Java, strings are commonly assembled with String.format("xx", arg), as in the following example:

[Image: example call to String.format]
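As a minimal illustration of such a call, here is a small sketch. The class name FormatUsageDemo, the method describe, and the sample values are assumptions made for this example; only String.format itself comes from the JDK.

```java
final class FormatUsageDemo {
    // Builds a sentence from a format string with two %s placeholders.
    // %s accepts any object and renders it through its string form.
    static String describe(String name, int age) {
        return String.format("My name: %s, age: %s", name, age);
    }

    public static void main(String[] args) {
        System.out.println(describe("Zhang San", 18));
    }
}
```

Running main prints "My name: Zhang San, age: 18".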

So how is this simple, handy feature implemented under the hood?

Source

Starting from String.java, we can see that the call actually delegates to:

return new Formatter().format(format, args).toString();
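To check that the two routes really are equivalent, the sketch below formats the same input both ways. DelegationDemo and its method names are hypothetical; the Formatter call mirrors the line quoted from String.java, which may differ in other JDK versions.

```java
import java.util.Formatter;

final class DelegationDemo {
    // The public convenience method.
    static String viaString(String fmt, Object... args) {
        return String.format(fmt, args);
    }

    // The same call chain as the line quoted from String.java:
    // a fresh Formatter writes into its own internal buffer,
    // and toString() returns what was written.
    static String viaFormatter(String fmt, Object... args) {
        return new Formatter().format(fmt, args).toString();
    }

    public static void main(String[] args) {
        String a = viaString("My name: %s, age: %s", "Zhang San", 18);
        String b = viaFormatter("My name: %s, age: %s", "Zhang San", 18);
        System.out.println(a.equals(b));
    }
}
```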

Step into Formatter.java.

In java.util.Formatter#format(java.util.Locale, java.lang.String, java.lang.Object...), at line 2493 of the JDK source read here, the method is:

public Formatter format(Locale l, String format, Object ... args) {
       ensureOpen();

       // index of last argument referenced
       int last = -1;
       // last ordinary index
       int lasto = -1;

       FormatString[] fsa = parse(format);
       for (int i = 0; i < fsa.length; i++) {
           FormatString fs = fsa[i];
           int index = fs.index();
           try {
               switch (index) {
               case -2: // fixed string, "%n", or "%%"
                   fs.print(null, l);
                   break;
               case -1: // relative index
                   if (last < 0 || (args != null && last > args.length - 1))
                       throw new MissingFormatArgumentException(fs.toString());
                   fs.print((args == null ? null : args[last]), l);
                   break;
               case 0: // ordinary index
                   lasto++;
                   last = lasto;
                   if (args != null && lasto > args.length - 1)
                       throw new MissingFormatArgumentException(fs.toString());
                   fs.print((args == null ? null : args[lasto]), l);
                   break;
               default: // explicit index
                   last = index - 1;
                   if (args != null && last > args.length - 1)
                       throw new MissingFormatArgumentException(fs.toString());
                   fs.print((args == null ? null : args[last]), l);
                   break;
               }
           } catch (IOException x) {
               lastException = x;
           }
       }
       return this;
   }
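The three index kinds handled by the switch can be observed from the outside through String.format. The following is only a small sketch; the class IndexKindsDemo and its method names are made up for this illustration.

```java
import java.util.MissingFormatArgumentException;

final class IndexKindsDemo {
    static String mixed() {
        // %s   -> ordinary index: takes the first argument, "a"
        // %<s  -> relative index: reuses the previously referenced argument, "a"
        // %2$s -> explicit index: takes the second argument, "b"
        // %s   -> ordinary index again: the ordinary counter (lasto) advances
        //         independently of explicit indices, so it takes "b"
        return String.format("%s %<s %2$s %s", "a", "b");
    }

    static boolean throwsWhenArgumentMissing() {
        try {
            // Two ordinary placeholders but only one argument.
            String.format("%s %s", "only-one");
            return false;
        } catch (MissingFormatArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(mixed());
        System.out.println(throwsWhenArgumentMissing());
    }
}
```

Here mixed() returns "a a b b", and the second call reports that a MissingFormatArgumentException was thrown.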

The ninth line, FormatString[] fsa = parse(format);, splits the format string into several pieces according to fixed rules. For the example above, it returns:

My name: 
%s
, age: 
%s

The splitting logic lives in java.util.Formatter#parse. Stepping into it gives:

private FormatString[] parse(String s) {
       ArrayList<FormatString> al = new ArrayList<>();
       Matcher m = fsPattern.matcher(s);
       for (int i = 0, len = s.length(); i < len; ) {
           if (m.find(i)) {
               // Anything between the start of the string and the beginning
               // of the format specifier is either fixed text or contains
               // an invalid format string.
               if (m.start() != i) {
                   // Make sure we didn't miss any invalid format specifiers
                   checkText(s, i, m.start());
                   // Assume previous characters were fixed text
                   al.add(new FixedString(s.substring(i, m.start())));
               }

               al.add(new FormatSpecifier(m));
               i = m.end();
           } else {
               // No more valid format specifiers. Check for possible invalid
               // format specifiers.
               checkText(s, i, len);
               // The rest of the string is fixed text
               al.add(new FixedString(s.substring(i)));
               break;
           }
       }
       return al.toArray(new FormatString[al.size()]);
   }

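The third line shows that the splitting is driven by a regular expression. Around line 2537 of Formatter.java, next to java.util.Formatter#parse, the regular expression is defined as: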

"%(\\d+\\$)?([-#+ 0,(\\<]*)?(\\d+)?(\\.\\d+)?([tT])?([a-zA-Z%])";

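To see what each capture group of this expression picks up, the sketch below compiles the same pattern and lists the six groups for a single specifier. SpecifierRegexDemo, FS_PATTERN, and groups are names chosen for this example, and the group labels in the comments are a reading of the pattern rather than text quoted from the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SpecifierRegexDemo {
    // Same expression as quoted above.
    static final Pattern FS_PATTERN = Pattern.compile(
        "%(\\d+\\$)?([-#+ 0,(\\<]*)?(\\d+)?(\\.\\d+)?([tT])?([a-zA-Z%])");

    // Returns the six capture groups for one complete specifier:
    // [0] argument index, [1] flags, [2] width,
    // [3] precision, [4] date/time prefix, [5] conversion.
    static String[] groups(String spec) {
        Matcher m = FS_PATTERN.matcher(spec);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a format specifier: " + spec);
        }
        String[] out = new String[6];
        for (int g = 1; g <= 6; g++) {
            out[g - 1] = m.group(g); // null when an optional group did not take part
        }
        return out;
    }

    public static void main(String[] args) {
        for (String part : groups("%2$-10.3f")) {
            System.out.println(part);
        }
    }
}
```

For "%2$-10.3f" the groups are "2$", "-", "10", ".3", null, and "f".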
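The logic is as follows.

Create a "pointer" i that points at the first character of the string. For example, with the format string "My name: %s, age: %d" and "Zhang San" as the first argument, i starts at 0 and points at the character 'M'.
1. Starting at position i, try to find a match for the regular expression.
2. If a match is found:
  1. Add the characters between i and m.start() as fixed text (this is the part the regular expression did not match), for example "My name: ".
  2. Add the matched specifier, for example %s, and move i to m.end(), the index just past the last matched character.
3. If no match is found, add everything from index i onward as fixed text and leave the loop.
4. Otherwise go back to step 1, until i reaches the end of the string.
5. The split is complete.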
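The steps above can be sketched as a standalone method that returns the pieces as plain strings. This is a simplified illustration, not the JDK code: ParseSketch and split are hypothetical names, and the checkText validation of stray '%' characters is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ParseSketch {
    static final Pattern FS_PATTERN = Pattern.compile(
        "%(\\d+\\$)?([-#+ 0,(\\<]*)?(\\d+)?(\\.\\d+)?([tT])?([a-zA-Z%])");

    // Splits a format string into fixed text and format specifiers.
    // Assumption: no validation of invalid specifiers (the real parse
    // calls checkText for that).
    static List<String> split(String s) {
        List<String> pieces = new ArrayList<>();
        Matcher m = FS_PATTERN.matcher(s);
        int i = 0;
        int len = s.length();
        while (i < len) {
            if (m.find(i)) {
                if (m.start() != i) {
                    // Text before the specifier is fixed text.
                    pieces.add(s.substring(i, m.start()));
                }
                pieces.add(m.group()); // the specifier itself, e.g. "%s"
                i = m.end();           // index just past the match
            } else {
                // No more specifiers: the rest is fixed text.
                pieces.add(s.substring(i));
                break;
            }
        }
        return pieces;
    }

    public static void main(String[] args) {
        System.out.println(split("My name: %s, age: %s"));
    }
}
```

For "My name: %s, age: %s" this yields four pieces: "My name: ", "%s", ", age: ", and "%s".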

With the string split, the next step is to fill in the arguments. The relevant code is:

for (int i = 0; i < fsa.length; i++) {
           FormatString fs = fsa[i];
           int index = fs.index();
           try {
               switch (index) {
               case -2: // fixed string, "%n", or "%%"
                   fs.print(null, l);
                   break;
               case -1: // relative index
                   if (last < 0 || (args != null && last > args.length - 1))
                       throw new MissingFormatArgumentException(fs.toString());
                   fs.print((args == null ? null : args[last]), l);
                   break;
               case 0: // ordinary index
                   lasto++;
                   last = lasto;
                   if (args != null && lasto > args.length - 1)
                       throw new MissingFormatArgumentException(fs.toString());
                   fs.print((args == null ? null : args[lasto]), l);
                   break;
               default: // explicit index
                   last = index - 1;
                   if (args != null && last > args.length - 1)
                       throw new MissingFormatArgumentException(fs.toString());
                   fs.print((args == null ? null : args[last]), l);
                   break;
               }
           } catch (IOException x) {
               lastException = x;
           }
       }

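As for the implementations of the FormatString interface: from the code above, plain text is a java.util.Formatter.FixedString, while a placeholder waiting to be substituted is a java.util.Formatter.FormatSpecifier.
The interface's java.util.Formatter.FormatString#index method selects the branch of the switch above, that is, the handling strategy.
For plain text, index is -2; the text is not processed and is appended as-is.
For a placeholder, a call such as fs.print((args == null ? null : args[last]), l); fills in the argument at the corresponding position.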
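The division of labour between the two implementations can be modelled in a few lines. The sketch below is a deliberately reduced stand-in: the Piece interface, the fixed and placeholder factories, and fill are all hypothetical, only a plain %s with an ordinary index is modelled, and the exception type is not the one the JDK throws.

```java
import java.util.Arrays;
import java.util.List;

final class FillSketch {
    // Hypothetical, simplified stand-in for Formatter's private FormatString.
    interface Piece {
        int index();
        void print(StringBuilder out, Object arg);
    }

    // Plays the role of FixedString: index -2, ignores the argument.
    static Piece fixed(String text) {
        return new Piece() {
            public int index() { return -2; }
            public void print(StringBuilder out, Object arg) { out.append(text); }
        };
    }

    // Plays the role of a FormatSpecifier for a bare %s: ordinary index 0.
    static Piece placeholder() {
        return new Piece() {
            public int index() { return 0; }
            public void print(StringBuilder out, Object arg) { out.append(String.valueOf(arg)); }
        };
    }

    static String fill(List<Piece> pieces, Object... args) {
        StringBuilder out = new StringBuilder();
        int lasto = -1; // last ordinary index, as in the loop shown earlier
        for (Piece p : pieces) {
            switch (p.index()) {
                case -2: // fixed text: appended unchanged
                    p.print(out, null);
                    break;
                case 0: // ordinary index: consume the next argument
                    lasto++;
                    if (lasto > args.length - 1) {
                        throw new IllegalArgumentException("missing argument " + lasto);
                    }
                    p.print(out, args[lasto]);
                    break;
                default: // relative and explicit indices are not modelled here
                    throw new UnsupportedOperationException("not modelled");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<Piece> pieces = Arrays.asList(
            fixed("My name: "), placeholder(), fixed(", age: "), placeholder());
        System.out.println(fill(pieces, "Zhang San", 18));
    }
}
```

With the four pieces from the earlier example and the arguments "Zhang San" and 18, fill returns "My name: Zhang San, age: 18".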
