JDK 1.8 Source Code: java.lang.String

Preface

In the previous chapter we read through the java.lang.Integer class. In this chapter we read through the java.lang.String class.


Main Text


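Overview
  • Main member fields
// Backing array that stores the characters
private final char value[];
// Cached hash code
private int hash; // Default to 0
// Serialization version UID
private static final long serialVersionUID = -6849794470754667710L;
// Declares the serializable fields (empty here: String is special-cased by the serialization stream protocol)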
private static final ObjectStreamField[] serialPersistentFields =
        new ObjectStreamField[0];

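As the member fields show, the data of a String is actually stored in a char[]. The value field is private and final, which is the starting point for String's immutability.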

  • Core methods
    • Constructors, more than a dozen overloads, including public String()/public String(String original)/public String(char value[])/public String(StringBuffer buffer)/public String(StringBuilder builder)
    • public int length() - length of the string
    • public boolean isEmpty() - whether the string is empty
    • public boolean equals(Object anObject) - whether the contents are equal
    • public int compareTo(String anotherString) - lexicographic comparison
    • public int hashCode() - hash code
    • public int indexOf(int ch, int fromIndex)/public int indexOf(int ch) - index of the first occurrence of a character.
    • public int lastIndexOf(int ch, int fromIndex)/public int lastIndexOf(int ch) - index of the last occurrence of a character.
    • public String substring(int beginIndex) - substring.
    • public String replace(char oldChar, char newChar)/ public String replaceFirst(String regex, String replacement) / public String replaceAll(String regex, String replacement) - replacement
    • public String[] split(String regex) - splitting
    • public static String valueOf(int i) - utility methods that convert a value to a String
    • public native String intern() - returns the canonical instance from the string constant pool
  • Other common methods
    • public char charAt(int index) - the character at a given index
    • public void getChars(int srcBegin, int srcEnd, char dst[], int dstBegin) - copies characters into dst[]
    • public String concat(String str) - concatenation
    • public byte[] getBytes(String charsetName)/public byte[] getBytes(Charset charset)/public byte[] getBytes() - encodes the string into a byte array
    • public String trim() - removes leading and trailing whitespace
    • public String toLowerCase()/public String toUpperCase() - converts to lower case or upper case
    • public char[] toCharArray() - converts to a char[]
    • public boolean matches(String regex) - regex matching
    • public static String format(String format, Object... args) - formatted string

String class declaration
public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {}
  • The Serializable interface
public interface Serializable {
}
  • The Comparable interface - declares a single method, compareTo(T o).
public interface Comparable<T> {
    public int compareTo(T o);
}
  • The CharSequence interface
public interface CharSequence {
    int length();
    char charAt(int index);
    CharSequence subSequence(int start, int end);
    public String toString();
    // JDK 1.8 allows default methods to be declared in interfaces
    public default IntStream chars() {
        // omitted
    }
    public default IntStream codePoints() {
        // omitted
    }
}

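Detailed method implementations

  • Constructors
    There are more than a dozen overloaded constructors. Only a few of them are covered in detail here.
 // Default constructor - the empty string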
 public String() {
     this.value = "".value;
 }
// Copy from another String (the backing array is shared, not copied)
public String(String original) {
      this.value = original.value;
      this.hash = original.hash;
 }
	// From a char[]
	public String(char value[]) {
        this.value = Arrays.copyOf(value, value.length);
    }
	// From a char[] with an offset and a count
    public String(char value[], int offset, int count) {
        if (offset < 0) {
            throw new StringIndexOutOfBoundsException(offset);
        }
        if (count <= 0) {
            if (count < 0) {
                throw new StringIndexOutOfBoundsException(count);
            }
            if (offset <= value.length) {
                this.value = "".value;
                return;
            }
        }
        // Note: offset or count might be near -1>>>1.
        if (offset > value.length - count) {
            throw new StringIndexOutOfBoundsException(offset + count);
        }
        this.value = Arrays.copyOfRange(value, offset, offset+count);
    }
	// From a byte[], decoded with the platform default charset
    public String(byte bytes[], int offset, int length) {
        checkBounds(bytes, offset, length);
        this.value = StringCoding.decode(bytes, offset, length);
    }
	// From a StringBuffer or a StringBuilder
    public String(StringBuffer buffer) {
        synchronized(buffer) {
            this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
        }
    }
   public String(StringBuilder builder) {
        this.value = Arrays.copyOf(builder.getValue(), builder.length());
    }
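  • Note: the details of StringBuffer and StringBuilder are covered later.

The constructors above show that creating a String mostly comes down to assigning its char[]. The constructors that take a char[], a StringBuffer, or a StringBuilder make a defensive copy with Arrays.copyOf() or Arrays.copyOfRange(), so later changes to the caller's data cannot affect the String. The no-argument constructor and String(String original) simply reuse an existing array, which is safe because a String never modifies it, and the byte[] constructor obtains its array from StringCoding.decode().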
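To make the defensive copy concrete, here is a minimal sketch. The class name ConstructorCopyDemo and its method are made up for illustration; only new String(char[]) comes from the JDK.

```java
class ConstructorCopyDemo {
    // Builds a String from a char[] and then mutates the source array.
    static String buildThenMutate() {
        char[] source = {'a', 'b', 'c'};
        String s = new String(source); // the constructor copies the array
        source[0] = 'X';               // changes only the caller's array
        return s;
    }

    public static void main(String[] args) {
        System.out.println(buildThenMutate()); // prints "abc"
    }
}
```

Because the String holds its own copy, changing source afterwards has no visible effect on s.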


  • public int length()
	// Returns the length of the backing array
    public int length() {
        return value.length;
    }

Returns the length of the char[], that is, the number of UTF-16 code units in the string.

  • public boolean isEmpty()
  public boolean isEmpty() {
  // True when the backing array has length 0
        return value.length == 0;
    }

Checks whether the char[] is empty.

  • charAt() - the character at a given index
   public char charAt(int index) {
        if ((index < 0) || (index >= value.length)) {
            throw new StringIndexOutOfBoundsException(index);
        }
        return value[index];
    }

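Returns the element of the char[] at the given index, after a bounds check that throws StringIndexOutOfBoundsException for an invalid index.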


  • getChars & getBytes
   void getChars(char dst[], int dstBegin) {
        System.arraycopy(value, 0, dst, dstBegin, value.length);
    }
     public byte[] getBytes(String charsetName)
            throws UnsupportedEncodingException {
        if (charsetName == null) throw new NullPointerException();
        return StringCoding.encode(charsetName, value, 0, value.length);
    }

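Note that getChars does not hand out the String's internal array. It copies the characters into the caller's dst array with System.arraycopy. Arrays.copyOf() is itself built on top of System.arraycopy, which is a native method: its implementation is provided by the JVM rather than written in Java.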
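As a small illustration of the copy-out behavior, the sketch below uses the public getChars overload and getBytes with an explicit charset. The class CopyOutDemo and its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class CopyOutDemo {
    // Copies "abc" into the middle of a larger, caller-owned array.
    static char[] copyInto() {
        String s = "abc";
        char[] dst = new char[5];
        s.getChars(0, s.length(), dst, 1); // fills dst[1], dst[2], dst[3]
        return dst;
    }

    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        char[] dst = copyInto();
        System.out.println(new String(dst, 1, 3)); // prints "abc"
        System.out.println(utf8Length("abc"));     // prints 3
        System.out.println(utf8Length("\u4e2d"));  // prints 3: one char, three UTF-8 bytes
    }
}
```

Passing a Charset instead of a charset name avoids the checked UnsupportedEncodingException declared by getBytes(String charsetName).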


  • The equals method
    public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = value.length;
            if (n == anotherString.value.length) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = 0;
                while (n-- != 0) {
                    if (v1[i] != v2[i])
                        return false;
                    i++;
                }
                return true;
            }
        }
        return false;
    }

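The equals method performs the following checks:

  • Whether both references point to the same object. this == anObject.
  • Whether the other object is a String. anObject instanceof String.
  • Whether the other string has the same length as this one. if (n == anotherString.value.length)
  • Whether the characters are equal, compared one by one.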
	while (n-- != 0) {
       if (v1[i] != v2[i])
      	 return false;
         i++;
     }
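The same sequence of checks can be sketched on plain char[] arrays. This is not the JDK code, only a hypothetical helper (EqualsSketch.sameContent) that mirrors the steps listed above.

```java
class EqualsSketch {
    // Mirrors the checks in String.equals on two char arrays.
    static boolean sameContent(char[] v1, char[] v2) {
        if (v1 == v2) {
            return true;                 // same array, nothing to compare
        }
        if (v1.length != v2.length) {
            return false;                // different lengths can never be equal
        }
        for (int i = 0; i < v1.length; i++) {
            if (v1[i] != v2[i]) {
                return false;            // first mismatch decides
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameContent("abc".toCharArray(), "abc".toCharArray())); // true
        System.out.println(sameContent("abc".toCharArray(), "abd".toCharArray())); // false
        System.out.println(sameContent("abc".toCharArray(), "ab".toCharArray()));  // false
    }
}
```

The cheap checks come first, so the character loop only runs when the lengths already match.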

  • The compareTo method
   public int compareTo(String anotherString) {
        int len1 = value.length;
        int len2 = anotherString.value.length;
        int lim = Math.min(len1, len2);
        char v1[] = value;
        char v2[] = anotherString.value;

        int k = 0;
        while (k < lim) {
            char c1 = v1[k];
            char c2 = v2[k];
            if (c1 != c2) {
                return c1 - c2;
            }
            k++;
        }
        return len1 - len2;
    }

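compareTo works in the following steps:

  • Take the length of the shorter of the two strings as the comparison limit;
  • Compare the two strings character by character up to that limit.
    • Stop at the first position where they differ and return the difference of the two characters: if (c1 != c2) {return c1 - c2;}.
      This is why "12345" compares as less than "2345": the result is decided by the first character alone, and '1' < '2'.
    • If every character within the limit is equal, return the difference of the lengths, so the longer string is the greater one.

Based on this, we can write a small test to refer to when comparing and sorting strings during development: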

	public static void testEqualsAndCompare() {
		String str1 = "12345";
		String str2 = "1234567";
		System.out.println("str1: 12345"+" - "+"str2:1234567:"+ str1.equals(str2));
		System.out.println("str1: 12345"+" - "+"str2:1234567:"+ str1.compareTo(str2));
		
		String str3="12346";
		String str4="2345";
		System.out.println("str3:12346"+" - "+"str4:2345:"+str3.compareTo(str4));
	}
	// Output
	// str1: 12345 - str2:1234567:false
	// str1: 12345 - str2:1234567:-2
	// str3:12346 - str4:2345:-1
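Since Arrays.sort on a String[] relies on compareTo, the same rule decides the order of sorted strings. The sketch below contrasts that with a numeric sort; the class SortDemo and the sample values are chosen for illustration only.

```java
import java.util.Arrays;

class SortDemo {
    // Natural ordering of String: character by character, as in compareTo.
    static String[] lexicographic() {
        String[] a = {"12345", "2345", "100"};
        Arrays.sort(a);
        return a;
    }

    // Parses each string so that the values are ordered as numbers.
    static String[] numeric() {
        String[] a = {"12345", "2345", "100"};
        Arrays.sort(a, (x, y) -> Integer.compare(Integer.parseInt(x), Integer.parseInt(y)));
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(lexicographic())); // [100, 12345, 2345]
        System.out.println(Arrays.toString(numeric()));       // [100, 2345, 12345]
    }
}
```

When strings hold numbers, sorting them without a comparator is a common source of surprising orderings.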
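  • Note: when comparing String values we still need to keep the difference between == and equals in mind. == compares references, while equals compares contents, and the string constant pool decides whether two strings with equal contents happen to be the same object.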
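A minimal sketch of that difference follows. The class IdentityVsEquality is made up for this example; the behavior of literals follows from the constant pool, and new String always creates a separate object.

```java
class IdentityVsEquality {
    static boolean[] compare() {
        String literal1 = "hello";
        String literal2 = "hello";            // same pooled constant as literal1
        String created = new String("hello"); // a distinct object with equal contents
        return new boolean[] {
            literal1 == literal2,             // true: both refer to the pooled instance
            literal1 == created,              // false: different objects
            literal1.equals(created)          // true: same characters
        };
    }

    public static void main(String[] args) {
        boolean[] r = compare();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // true false true
    }
}
```

For content comparison, equals is the safe choice; == only answers whether two variables refer to the same object.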


  • indexOf & lastIndexOf
// indexOf()
 public int indexOf(int ch, int fromIndex) {
        final int max = value.length;
        if (fromIndex < 0) {
            fromIndex = 0;
        } else if (fromIndex >= max) {
            // Note: fromIndex might be near -1>>>1.
            return -1;
        }

        if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // handle most cases here (ch is a BMP code point or a
            // negative value (invalid code point))
            final char[] value = this.value;
            for (int i = fromIndex; i < max; i++) {
                if (value[i] == ch) {
                    return i;
                }
            }
            return -1;
        } else {
            return indexOfSupplementary(ch, fromIndex);
        }
    }

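The search starts at the front of the char[] (at fromIndex) and returns the index of the first matching character, or -1 if there is none.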

   public int lastIndexOf(int ch) {
        return lastIndexOf(ch, value.length - 1);
    }
    public int lastIndexOf(int ch, int fromIndex) {
        if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
            // handle most cases here (ch is a BMP code point or a
            // negative value (invalid code point))
            final char[] value = this.value;
            int i = Math.min(fromIndex, value.length - 1);
            for (; i >= 0; i--) {
                if (value[i] == ch) {
                    return i;
                }
            }
            return -1;
        } else {
            return lastIndexOfSupplementary(ch, fromIndex);
        }
    }

lastIndexOf starts at the last character (or at fromIndex) and searches backwards; otherwise it works the same way.
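A short usage sketch of both directions, using the sample word "banana" (the class IndexOfDemo is hypothetical):

```java
class IndexOfDemo {
    static int[] positions() {
        String s = "banana"; // indexes: b0 a1 n2 a3 n4 a5
        return new int[] {
            s.indexOf('a'),        // 1: first 'a' from the front
            s.indexOf('a', 2),     // 3: first 'a' at or after index 2
            s.lastIndexOf('a'),    // 5: first 'a' from the back
            s.lastIndexOf('a', 4), // 3: first 'a' at or before index 4
            s.indexOf('z')         // -1: not found
        };
    }

    public static void main(String[] args) {
        for (int p : positions()) {
            System.out.println(p);
        }
    }
}
```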


  • substring
    public String substring(int beginIndex, int endIndex) {
        if (beginIndex < 0) {
            throw new StringIndexOutOfBoundsException(beginIndex);
        }
        if (endIndex > value.length) {
            throw new StringIndexOutOfBoundsException(endIndex);
        }
        int subLen = endIndex - beginIndex;
        if (subLen < 0) {
            throw new StringIndexOutOfBoundsException(subLen);
        }
        return ((beginIndex == 0) && (endIndex == value.length)) ? this
                : new String(value, beginIndex, subLen);
    }

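After the bounds checks, a new String object is returned through the new String(value, beginIndex, subLen) constructor, which copies the requested range. The only exception is a range that covers the whole string, in which case this is returned.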
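The behavior described above can be observed with a few calls. The class SubstringDemo is an illustrative name, and the identity check reflects the source shown above rather than a documented guarantee.

```java
class SubstringDemo {
    // A proper sub-range produces a new String with the copied characters.
    static String middle() {
        return "hello".substring(1, 3); // "el"
    }

    // A range covering the whole string returns the same object.
    static boolean wholeRangeReturnsSame() {
        String s = "hello";
        return s.substring(0, s.length()) == s;
    }

    // beginIndex greater than endIndex is rejected.
    static boolean rejectsInvertedRange() {
        try {
            "hello".substring(3, 1);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(middle());                // el
        System.out.println(wholeRangeReturnsSame()); // true
        System.out.println(rejectsInvertedRange());  // true
    }
}
```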


  • concat
    public String concat(String str) {
        int otherLen = str.length();
        if (otherLen == 0) {
            return this;
        }
        int len = value.length;
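        // Copy into a new char[] large enough for both strings
        char buf[] = Arrays.copyOf(value, len + otherLen);
        // Write the characters of str after the existing ones
        str.getChars(buf, len);
        // Wrap buf in a new String through the package-private constructor, without copying again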
        return new String(buf, true);
    }
     void getChars(char dst[], int dstBegin) {
        System.arraycopy(value, 0, dst, dstBegin, value.length);
    }
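A small sketch of what callers see from concat. The class ConcatDemo is hypothetical, and the identity check relies on the early return for an empty argument in the code above.

```java
class ConcatDemo {
    static String joined() {
        return "abc".concat("def"); // a new String "abcdef"
    }

    // Concatenating the empty string returns the receiver itself.
    static boolean emptyArgReturnsSame() {
        String s = "abc";
        return s.concat("") == s;
    }

    public static void main(String[] args) {
        System.out.println(joined());              // abcdef
        System.out.println(emptyArgReturnsSame()); // true
    }
}
```

In both cases the original string is left unchanged; concat never modifies value in place.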

  • replace
    public String replace(char oldChar, char newChar) {
        if (oldChar != newChar) {
            int len = value.length;
            int i = -1;
            char[] val = value; /* avoid getfield opcode */

            while (++i < len) {
                if (val[i] == oldChar) {
                    break;
                }
            }
            if (i < len) {
                char buf[] = new char[len];
                for (int j = 0; j < i; j++) {
                    buf[j] = val[j];
                }
                while (i < len) {
                    char c = val[i];
                    buf[i] = (c == oldChar) ? newChar : c;
                    i++;
                }
                return new String(buf, true);
            }
        }
        return this;
    }

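replace(char, char) replaces every occurrence of a single character. If the character does not occur at all, no new array is allocated and this is returned.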

  • replaceFirst() & replaceAll()
    public String replaceFirst(String regex, String replacement) {
        return Pattern.compile(regex).matcher(this).replaceFirst(replacement);
    }
    public String replaceAll(String regex, String replacement) {
        return Pattern.compile(regex).matcher(this).replaceAll(replacement);
    }

replaceFirst() and replaceAll() rely on the Pattern class: the regex argument is compiled, matched against the string, and the matches are then replaced.
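The practical consequence is that replace treats its arguments literally, while replaceFirst and replaceAll interpret the first argument as a regular expression. A minimal sketch (the class ReplaceDemo is made up for this example):

```java
class ReplaceDemo {
    static String literal() {
        return "a.b.c".replace('.', '-');        // "a-b-c": '.' is just a character
    }

    static String regexUnescaped() {
        return "a.b.c".replaceAll(".", "-");     // "-----": '.' matches any character
    }

    static String regexEscaped() {
        return "a.b.c".replaceAll("\\.", "-");   // "a-b-c": escaped dot matches only '.'
    }

    static String firstOnly() {
        return "a.b.c".replaceFirst("\\.", "-"); // "a-b.c": only the first match
    }

    public static void main(String[] args) {
        System.out.println(literal());
        System.out.println(regexUnescaped());
        System.out.println(regexEscaped());
        System.out.println(firstOnly());
    }
}
```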


  • split
    public String[] split(String regex, int limit) {
        /* fastpath if the regex is a
         (1)one-char String and this character is not one of the
            RegEx's meta characters ".$|()[{^?*+\\", or
         (2)two-char String and the first char is the backslash and
            the second is not the ascii digit or ascii letter.
         */
        char ch = 0;
        if (((regex.value.length == 1 &&
             ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
             (regex.length() == 2 &&
              regex.charAt(0) == '\\' &&
              (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
              ((ch-'a')|('z'-ch)) < 0 &&
              ((ch-'A')|('Z'-ch)) < 0)) &&
            (ch < Character.MIN_HIGH_SURROGATE ||
             ch > Character.MAX_LOW_SURROGATE))
        {
            int off = 0;
            int next = 0;
            boolean limited = limit > 0;
            ArrayList<String> list = new ArrayList<>();
            while ((next = indexOf(ch, off)) != -1) {
                if (!limited || list.size() < limit - 1) {
                    list.add(substring(off, next));
                    off = next + 1;
                } else {    // last one
                    //assert (list.size() == limit - 1);
                    list.add(substring(off, value.length));
                    off = value.length;
                    break;
                }
            }
            // If no match was found, return this
            if (off == 0)
                return new String[]{this};

            // Add remaining segment
            if (!limited || list.size() < limit)
                list.add(substring(off, value.length));

            // Construct result
            int resultSize = list.size();
            if (limit == 0) {
                while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                    resultSize--;
                }
            }
            String[] result = new String[resultSize];
            return list.subList(0, resultSize).toArray(result);
        }
        return Pattern.compile(regex).split(this, limit);
    }
  public String[] split(String regex) {
        return split(regex, 0);
    }    

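The limit parameter of split(String regex, int limit) controls how many times the pattern is applied. A positive limit applies it at most limit - 1 times, so the result has at most limit elements. A limit of 0 applies it as many times as possible and discards trailing empty strings. A negative limit applies it as many times as possible and keeps trailing empty strings.

Splitting falls into two cases:

    1. The regex is a single character that is not a regex metacharacter, or a backslash followed by a character that is not an ASCII letter or digit. The fast path searches with indexOf and collects the pieces in an ArrayList.
    2. Any other regex. The work is delegated to the Pattern class.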
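The effect of limit is easiest to see on an input with trailing separators. The class SplitDemo and the sample input "a,b,," are chosen for illustration.

```java
import java.util.Arrays;

class SplitDemo {
    static String[] defaultLimit() {
        return "a,b,,".split(",");     // limit 0: trailing empty strings are dropped
    }

    static String[] negativeLimit() {
        return "a,b,,".split(",", -1); // all pieces are kept, including empty ones
    }

    static String[] limitTwo() {
        return "a,b,,".split(",", 2);  // the separator is applied at most once
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(defaultLimit()));  // [a, b]
        System.out.println(Arrays.toString(negativeLimit())); // [a, b, , ]
        System.out.println(Arrays.toString(limitTwo()));      // [a, b,,]
    }
}
```

A negative limit is the usual choice when empty trailing fields carry meaning, for example in delimited data files.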

  • trim
	public String trim() {
        int len = value.length;
        int st = 0;
        char[] val = value;    /* avoid getfield opcode */

        while ((st < len) && (val[st] <= ' ')) {
            st++;
        }
        while ((st < len) && (val[len - 1] <= ' ')) {
            len--;
        }
        return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
    }

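The two while loops show that:

  • The first while loop skips leading characters whose code is less than or equal to ' ' (U+0020), which covers the space as well as control characters such as tab and newline.
  • The second while loop skips trailing characters of the same kind.
    For " ABC ", the leading space is skipped first, then the trailing one, and substring(st, len) returns "ABC".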

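Because trim only tests val[i] <= ' ', whitespace characters above U+0020 are left in place. A short sketch (the class TrimDemo is hypothetical; U+3000 is the ideographic space):

```java
class TrimDemo {
    // Spaces, tabs and newlines all have codes <= ' ' and are removed.
    static String basic() {
        return " \t ABC \n".trim(); // "ABC"
    }

    // U+3000 is greater than ' ', so trim leaves it in place.
    static int ideographicSpaceLength() {
        return "\u3000ABC\u3000".trim().length(); // 5
    }

    public static void main(String[] args) {
        System.out.println(basic());
        System.out.println(ideographicSpaceLength());
    }
}
```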
  • toString()
   public String toString() {
        return this;
    }

  • toCharArray()
    public char[] toCharArray() {
        // Cannot use Arrays.copyOf because of class initialization order issues
        char result[] = new char[value.length];
        System.arraycopy(value, 0, result, 0, value.length);
        return result;
    }
  • format - formatted strings
    public static String format(String format, Object... args) {
        return new Formatter().format(format, args).toString();
    }
    public static String format(Locale l, String format, Object... args) {
        return new Formatter(l).format(format, args).toString();
    }   

  • valueOf() - conversion utilities
   public static String valueOf(Object obj) {
        return (obj == null) ? "null" : obj.toString();
    }
    public static String valueOf(char data[]) {
        return new String(data);
    }
    public static String valueOf(boolean b) {
        return b ? "true" : "false";
    }
    public static String valueOf(int i) {
        return Integer.toString(i);
    }
    public static String valueOf(double d) {
        return Double.toString(d);
    }    
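A few calls show how the overloads behave, including the null case handled by valueOf(Object). The class ValueOfDemo is an illustrative name.

```java
class ValueOfDemo {
    // A null reference typed as Object yields the four-character string "null".
    static String ofNull() {
        Object o = null;
        return String.valueOf(o);
    }

    static String ofInt() {
        return String.valueOf(42);                    // "42"
    }

    static String ofBoolean() {
        return String.valueOf(true);                  // "true"
    }

    static String ofChars() {
        return String.valueOf(new char[] {'h', 'i'}); // "hi"
    }

    public static void main(String[] args) {
        System.out.println(ofNull());
        System.out.println(ofInt());
        System.out.println(ofBoolean());
        System.out.println(ofChars());
    }
}
```

Declaring the variable as Object matters here: it selects valueOf(Object), which performs the null check, instead of valueOf(char[]).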

  • intern()
 public native String intern();

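The intern() method is tied to the string constant pool. If the pool already contains a string equal to this one, that pooled string is returned; otherwise this string is added to the pool and a reference to it is returned.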
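The effect can be checked with reference comparisons. The class InternDemo is made up for this sketch.

```java
class InternDemo {
    static boolean[] check() {
        String literal = "hello";              // already in the constant pool
        String created = new String("hello");  // a separate object with equal contents
        return new boolean[] {
            created == literal,                // false: different objects
            created.intern() == literal        // true: intern() returns the pooled instance
        };
    }

    public static void main(String[] args) {
        boolean[] r = check();
        System.out.println(r[0] + " " + r[1]); // false true
    }
}
```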


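Questions

Q1: Can a String be modified?

Q2: What is the difference between equals and == for strings, and why?

Q3: What is the difference between replace() and replaceFirst()/replaceAll()?

Q4: What is the difference between StringBuffer and StringBuilder?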


