简介
在前面的文章学习了NIO的理论知识 socket&io模型 ,Java NIO与epoll的关联 NIO server到epoll源码解析。 这次要来学习Java NIO了。众所周知,Java NIO的三大组件分别是Selector,Channel,Buffer。Selector是多路复用器,用来管理多个连接;Channel负责数据传输,类似火车轨道;Buffer负责数据读写,相当于火车。这次先学习Buffer,在网络中,我们一般都是用ByteBuffer,其他buffer也类似。
详解
1. 创建&分类
创建bytebuffer,我们有以下两种方式。创建出来的byteBuffer分为HeapByteBuffer和DirectByteBuffer。buffer的创建相对都是比较复杂的,尤其是directBuffer,所以尽量复用,减少重复创建和回收
HeapByteBuffer:占用堆内内存,通过malloc申请,属于native memory,属于用户空间 DirectByteBuffer:占用堆外内存
ByteBuffer heapBuffer = ByteBuffer.allocate(1024);
ByteBuffer directBuffer = ByteBuffer.allocateDirect(1024);
复制代码
2. directBuffer的优势
我们可以先看下使用buffer读取网络io数据的源码
static int read(FileDescriptor fd, ByteBuffer dst, long position,
NativeDispatcher nd)
throws IOException
{
if (dst.isReadOnly())
throw new IllegalArgumentException("Read-only buffer");
//如果是DirectBuffer,直接读取
if (dst instanceof DirectBuffer)
return readIntoNativeBuffer(fd, dst, position, nd);
// Substitute a native buffer
//申请directBuffer,先将数据写到directBuffer,再写到hepBuffer
ByteBuffer bb = Util.getTemporaryDirectBuffer(dst.remaining());
try {
int n = readIntoNativeBuffer(fd, bb, position, nd);
bb.flip();
if (n > 0)
dst.put(bb);
return n;
} finally {
Util.offerFirstTemporaryDirectBuffer(bb);
}
}
复制代码
看源码,我们就能很清楚了解directBuffer的优势,当我们读取数据时,会先判断当前buffer是否是directBuffer,是的话就直接写入当前buffer,反之则先申请directBuffer,写到directBuffer,然后再复制到heapBuffer。 简而言之:使用directBuffer可以减少一次数据的复制。
3. 为何使用heapBuffer需要先复制到directBuffer
-
当我们需要和操作系统/内核打交道,调用read,write等方法时需要传入buffer的起始地址和count。 若我们使用heapBuffer,当JVM发生gc时,buffer在堆中的地址可能会发生变化,这样的话内核读取数据的地址就会出错
-
Java中存储对象都是逻辑连续的,而系统调用都是需要物理地址连续的 这也就是为什么要用directBuffer去和内核交互的原因。哪怕上面两个问题可以通过某种方式解决,也是很复杂,并且存在隐患。
4. directBuffer的内存管理
directBuffer虽然是堆外内存,但是也是通过Java代码创建的,所以JVM还是很负责任的把清理堆外内存的任务承担下来了。 我们来看下JVM是如何清理directBuffer的。这时候就得上源码了,
- 我们先看看DirectByteBuffer的构造函数,关注到cleaner = Cleaner.create(this, new Deallocator(base, size, cap))。
DirectByteBuffer(int cap) { // package-private
super(-1, 0, cap, cap);
boolean pa = VM.isDirectMemoryPageAligned();
int ps = Bits.pageSize();
long size = Math.max(1L, (long)cap + (pa ? ps : 0));
Bits.reserveMemory(size, cap);
long base = 0;
try {
// 申请堆外内存空间
base = unsafe.allocateMemory(size);
} catch (OutOfMemoryError x) {
Bits.unreserveMemory(size, cap);
throw x;
}
unsafe.setMemory(base, size, (byte) 0);
if (pa && (base % ps != 0)) {
// Round up to page boundary
//堆外内存的地址
address = base + ps - (base & (ps - 1));
} else {
address = base;
}
//创建了Cleaner
cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
att = null;
}
复制代码
- 我们来看下Cleaner,可以发现它就是一个虚引用。
public class Cleaner
extends PhantomReference<Object>
{
private Cleaner(Object referent, Runnable thunk) {
super(referent, dummyQueue);
this.thunk = thunk;
}
}
复制代码
- PhantomReference的父类是Reference,reference有个static构造方法,启动了ReferenceHandler线程,并设置为守护线程
- ReferenceHandler在while(true)中执行tryHandlePending方法,在此方法中若pending不为null,且instance of cleaner的话,就会赋值给c。
- 在后面就会调用c.clean()。
如果仔细看代码的话,会发现pending没有地方显示赋值,是因为pending是gc赋值的
public abstract class Reference<T> {
/* List of References waiting to be enqueued. The collector adds
* References to this list, while the Reference-handler thread removes
* them. This list is protected by the above lock object. The
* list uses the discovered field to link its elements.
*/
private static Reference<Object> pending = null;
static {
ThreadGroup tg = Thread.currentThread().getThreadGroup();
for (ThreadGroup tgn = tg;
tgn != null;
tg = tgn, tgn = tg.getParent());
//创建referenceHandler,并设置守护线程
Thread handler = new ReferenceHandler(tg, "Reference Handler");
handler.setPriority(Thread.MAX_PRIORITY);
handler.setDaemon(true);
handler.start();
// provide access in SharedSecrets
SharedSecrets.setJavaLangRefAccess(new JavaLangRefAccess() {
@Override
public boolean tryHandlePendingReference() {
return tryHandlePending(false);
}
});
}
private static class ReferenceHandler extends Thread {
private static void ensureClassInitialized(Class<?> clazz) {
try {
Class.forName(clazz.getName(), true, clazz.getClassLoader());
} catch (ClassNotFoundException e) {
throw (Error) new NoClassDefFoundError(e.getMessage()).initCause(e);
}
}
static {
ensureClassInitialized(InterruptedException.class);
ensureClassInitialized(Cleaner.class);
}
ReferenceHandler(ThreadGroup g, String name) {
super(g, name);
}
public void run() {
while (true) {
tryHandlePending(true);
}
}
}
static boolean tryHandlePending(boolean waitForNotify) {
Reference<Object> r;
Cleaner c;
try {
synchronized (lock) {
if (pending != null) {
//pending是引用
r = pending;
//如果pending 是cleaner的话,就赋值给c
c = r instanceof Cleaner ? (Cleaner) r : null;
// unlink 'r' from 'pending' chain
pending = r.discovered;
r.discovered = null;
} else {
if (waitForNotify) {
lock.wait();
}
// retry if waited
return waitForNotify;
}
}
} catch (OutOfMemoryError x) {
Thread.yield();
// retry
return true;
} catch (InterruptedException x) {
// retry
return true;
}
// Fast path for cleaners
if (c != null) {
//如果cleaner不为空,则调用cleaner.clean()方法
c.clean();
return true;
}
ReferenceQueue<? super Object> q = r.queue;
if (q != ReferenceQueue.NULL) q.enqueue(r);
return true;
}
}
复制代码
- 那我们再回头来看创建directBuffer时cleaner.clean(),会调用thunk.run(),此thunk线程就是Deallocator,查看此线程实现,就能看到unsafe.freeMemory(address);这就释放了堆外内存
public void clean() {
if (!remove(this))
return;
try {
//调用Deallocator.run()
thunk.run();
} catch (final Throwable x) {
AccessController.doPrivileged(new PrivilegedAction<Void>() {
public Void run() {
if (System.err != null)
new Error("Cleaner terminated abnormally", x)
.printStackTrace();
System.exit(1);
return null;
}});
}
}
private static class Deallocator implements Runnable
{
private static Unsafe unsafe = Unsafe.getUnsafe();
private long address;
private long size;
private int capacity;
private Deallocator(long address, long size, int capacity) {
assert (address != 0);
this.address = address;
this.size = size;
this.capacity = capacity;
}
public void run() {
if (address == 0) {
// Paranoia
return;
}
//释放堆外内存
unsafe.freeMemory(address);
address = 0;
Bits.unreserveMemory(size, capacity);
}
}
复制代码
总结:jvm负责堆外内存的回收,创建directBuffer时会同步创建虚引用(PhantomReferenceh)cleaner,当directBuffer对象没有引用时,利用Reference的ReferenceHandler,会调用cleaner.clean方法,在clean方法中回收堆外内存
5. byteBuffer常用api
相关api我画了个简图,加强记忆
public abstract class Buffer {
// Invariants: mark <= position <= limit <= capacity
private int mark = -1;
private int position = 0;
private int limit;
private int capacity;
Buffer(int mark, int pos, int lim, int cap) { // package-private
if (cap < 0)
throw new IllegalArgumentException("Negative capacity: " + cap);
this.capacity = cap;
limit(lim);
position(pos);
if (mark >= 0) {
if (mark > pos)
throw new IllegalArgumentException("mark > position: ("
+ mark + " > " + pos + ")");
this.mark = mark;
}
}
//切换到读模式
public final Buffer flip() {
limit = position;
position = 0;
mark = -1;
return this;
}
//与flip的区别在于limit没变,根据场景可能需要访问没有添加元素的位置
public final Buffer rewind() {
position = 0;
mark = -1;
return this;
}
public final Buffer mark() {
mark = position;
return this;
}
public final Buffer reset() {
int m = mark;
if (m < 0)
throw new InvalidMarkException();
position = m;
return this;
}
public final Buffer clear() {
position = 0;
limit = capacity;
mark = -1;
return this;
}
}
public abstract class ByteBuffer extends Buffer implements Comparable<ByteBuffer> {
public abstract byte get();
public abstract ByteBuffer put(byte b);
}
复制代码
byteBuffer有所谓的读和写模式,其实只是方便理解,底层实现就是array,利用mark,limit,position方便读写数据。当写数据之后,想要读取数据,可以直接调用flip/rewind(有所区别,根据具体需求)来切换为读模式,position会置为0。
limit : 写模式下limit=capacity,读模式下limit=position
mark:手动调用,标记位置,用于之后回到该位置
position:读/写模式下的当前位置
总结
主要学习了directBuffer的相关原理,由于gc,在堆内存创建的heapBuffer的物理地址可能发生变化,所以在调用系统指令时,都采用directBuffer。哪怕创建的是heapBuffer,也会先传到directBuffer,再复制到heapBuffer。jvm也负责堆外内存的回收,在创建heapBuffer时,会创建虚引用Cleaner,利用Reference的ReferenceHandler调用Cleaner.clean方法,进行回收堆外内存。Buffer的相关api只要理解了,还是比较清晰的
参考文章
blog.csdn.net/gdutxiaoxu/… juejin.cn/post/684490… www.disheng.tech/blog/java-%…