Netty Source Code Analysis: NioEventLoop

Part 1: Creating the NioEventLoop

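A NioEventLoop is managed by a NioEventLoopGroup, so the place to look for its creation is NioEventLoopGroup. Client code, for example, starts by creating a NioEventLoopGroup object, so we begin with its constructor. If you call the no-arg constructor and follow the chain step by step, you eventually arrive at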

    public NioEventLoopGroup(int nThreads, Executor executor, 
    						final SelectorProvider selectorProvider,
                           final SelectStrategyFactory selectStrategyFactory) {
        super(nThreads, executor, selectorProvider, selectStrategyFactory, 
        	RejectedExecutionHandlers.reject());
    }

Here nThreads = 0 and executor = null, while selectorProvider, selectStrategyFactory and the RejectedExecutionHandler are defaults chosen for you.
Following the super call upward:

MultithreadEventLoopGroup：
    protected MultithreadEventLoopGroup(int nThreads, Executor executor, Object... args) {
        super(nThreads == 0 ? DEFAULT_EVENT_LOOP_THREADS : nThreads, executor, args);
    }
    private static final int DEFAULT_EVENT_LOOP_THREADS;

    static {
        DEFAULT_EVENT_LOOP_THREADS = Math.max(1, SystemPropertyUtil.getInt(
                "io.netty.eventLoopThreads", NettyRuntime.availableProcessors() * 2));
    }

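If you don't specify nThreads, it defaults to twice the number of available CPU cores, unless the io.netty.eventLoopThreads system property says otherwise.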
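The rule above can be tried outside Netty. The following is only a sketch under assumptions: DefaultThreadCount, defaultThreads and resolve are made-up names, and plain JDK calls (Integer.getInteger, Runtime.availableProcessors) stand in for SystemPropertyUtil and NettyRuntime.

```java
final class DefaultThreadCount {
    // Hypothetical helper: same rule as the static block above, using only the JDK.
    static int defaultThreads() {
        int fallback = Runtime.getRuntime().availableProcessors() * 2;
        // The system property wins if set; the result is never below 1.
        return Math.max(1, Integer.getInteger("io.netty.eventLoopThreads", fallback));
    }

    // nThreads == 0 means "use the default", exactly as in the constructor above.
    static int resolve(int nThreads) {
        return nThreads == 0 ? defaultThreads() : nThreads;
    }

    public static void main(String[] args) {
        System.out.println("nThreads=0 resolves to " + resolve(0));
        System.out.println("nThreads=4 resolves to " + resolve(4));
    }
}
```

Running it with -Dio.netty.eventLoopThreads=3 should make the first line report 3, since the property takes precedence over the core count.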
Continuing:

MultithreadEventExecutorGroup：
    protected MultithreadEventExecutorGroup(int nThreads, Executor executor, Object... args) {
        this(nThreads, executor, DefaultEventExecutorChooserFactory.INSTANCE, args);
    }

public static final DefaultEventExecutorChooserFactory INSTANCE = new DefaultEventExecutorChooserFactory();

This adds a chooser factory for you; the group uses it to pick a NioEventLoop.
Following further:

MultithreadEventExecutorGroup：
    protected MultithreadEventExecutorGroup(int nThreads, Executor executor,
                                            EventExecutorChooserFactory chooserFactory, Object... args) {
        ......
        if (executor == null) { // true here, since no executor was passed in
            executor = new ThreadPerTaskExecutor(newDefaultThreadFactory());
        }
        
        children = new EventExecutor[nThreads];

        // Create the NioEventLoop instances in a loop
        for (int i = 0; i < nThreads; i ++) {
            boolean success = false;
            try {
                children[i] = newChild(executor, args);
                success = true;
            } catch (Exception e) {
                // TODO: Think about if this is a good exception type
                throw new IllegalStateException("failed to create a child event loop", e);
            } 
            ......
        }

        chooser = chooserFactory.newChooser(children);
		......
    }

Three things matter here: ThreadPerTaskExecutor, newChild, and chooserFactory.newChooser.

Point 1: ThreadPerTaskExecutor

public final class ThreadPerTaskExecutor implements Executor {
    private final ThreadFactory threadFactory;

    public ThreadPerTaskExecutor(ThreadFactory threadFactory) {
        if (threadFactory == null) {
            throw new NullPointerException("threadFactory");
        }
        this.threadFactory = threadFactory;
    }

    @Override
    public void execute(Runnable command) {
        threadFactory.newThread(command).start();
    }
}

The threadFactory is a DefaultThreadFactory:

    public Thread newThread(Runnable r) {
        Thread t = newThread(FastThreadLocalRunnable.wrap(r), prefix + nextId.incrementAndGet());
		......
        return t;
    }

    protected Thread newThread(Runnable r, String name) {
        return new FastThreadLocalThread(threadGroup, r, name);
    }

Every call to ThreadPerTaskExecutor's execute starts a new thread to run the task.
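To see the "one new thread per call" behavior in isolation, here is a minimal JDK-only sketch. PerTaskExecutor is a simplified stand-in written for this example, not Netty's class, and it uses Executors.defaultThreadFactory() instead of DefaultThreadFactory, so the threads are ordinary Threads rather than FastThreadLocalThreads.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

final class ThreadPerTaskDemo {
    // Simplified stand-in: no pooling, every execute() starts a fresh thread.
    static final class PerTaskExecutor implements Executor {
        private final ThreadFactory factory;

        PerTaskExecutor(ThreadFactory factory) {
            this.factory = factory;
        }

        @Override
        public void execute(Runnable command) {
            factory.newThread(command).start();
        }
    }

    // Submits `tasks` tasks and returns how many distinct threads ran them.
    static int distinctThreads(int tasks) {
        Executor executor = new PerTaskExecutor(Executors.defaultThreadFactory());
        Set<Thread> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            executor.execute(() -> {
                seen.add(Thread.currentThread());
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("3 tasks ran on " + distinctThreads(3) + " distinct threads");
    }
}
```

Because nothing is pooled, this executor is only reasonable when execute is called rarely. That fits its use here: each NioEventLoop calls it once to start its own long-lived thread.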

Point 2: newChild
This is implemented in NioEventLoopGroup:

    protected EventLoop newChild(Executor executor, Object... args) throws Exception {
        return new NioEventLoop(this, executor, (SelectorProvider) args[0],
            ((SelectStrategyFactory) args[1]).newSelectStrategy(), (RejectedExecutionHandler) args[2]);
    }

This is where the NioEventLoop is created. The arguments are the NioEventLoopGroup that owns it and executor = ThreadPerTaskExecutor; the other three were built when the NioEventLoopGroup was initialized.
Stepping in:

    NioEventLoop(NioEventLoopGroup parent, Executor executor, SelectorProvider selectorProvider,
                 SelectStrategy strategy, RejectedExecutionHandler rejectedExecutionHandler) {
        super(parent, executor, false, DEFAULT_MAX_PENDING_TASKS, rejectedExecutionHandler);
		......
        provider = selectorProvider;
        final SelectorTuple selectorTuple = openSelector();
        selector = selectorTuple.selector;
        unwrappedSelector = selectorTuple.unwrappedSelector;
        selectStrategy = strategy;
    }

The Selector is created here: one Selector per NioEventLoop.

    private SelectorTuple openSelector() {
        final Selector unwrappedSelector;
        try {
            unwrappedSelector = provider.openSelector();

This calls into the JDK to create a Selector.
DEFAULT_MAX_PENDING_TASKS is defined in SingleThreadEventLoop:

protected static final int DEFAULT_MAX_PENDING_TASKS = Math.max(16,
            SystemPropertyUtil.getInt("io.netty.eventLoop.maxPendingTasks", Integer.MAX_VALUE));

In other words the minimum is 16, and if the io.netty.eventLoop.maxPendingTasks property is not set, the value is Integer.MAX_VALUE.

Following super leads to SingleThreadEventLoop:

SingleThreadEventLoop：
    protected SingleThreadEventLoop(EventLoopGroup parent, Executor executor,
                                    boolean addTaskWakesUp, int maxPendingTasks,
                                    RejectedExecutionHandler rejectedExecutionHandler) {
        super(parent, executor, addTaskWakesUp, maxPendingTasks, rejectedExecutionHandler);
        tailTasks = newTaskQueue(maxPendingTasks);
    }
    protected Queue<Runnable> newTaskQueue(int maxPendingTasks) {
        return new LinkedBlockingQueue<Runnable>(maxPendingTasks);
    }

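This creates the tailTasks queue with a capacity of maxPendingTasks. The newTaskQueue shown here is the default implementation and returns a linked blocking queue; NioEventLoop overrides newTaskQueue in Netty 4.1 to return an MPSC queue instead, so the concrete queue type depends on the subclass and version.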

Following further:

    protected SingleThreadEventExecutor(EventExecutorGroup parent, Executor executor,
                                        boolean addTaskWakesUp, int maxPendingTasks,
                                        RejectedExecutionHandler rejectedHandler) {
        super(parent);
        this.addTaskWakesUp = addTaskWakesUp; // false
        this.maxPendingTasks = Math.max(16, maxPendingTasks);
        this.executor = ObjectUtil.checkNotNull(executor, "executor");
        taskQueue = newTaskQueue(this.maxPendingTasks);
        rejectedExecutionHandler = ObjectUtil.checkNotNull(rejectedHandler, "rejectedHandler");
    }

This stores the ThreadPerTaskExecutor and creates the taskQueue through newTaskQueue. super(parent) ultimately assigns the NioEventLoopGroup to the parent field of AbstractEventExecutor.

Point 3: chooserFactory.newChooser
The group manages an array of EventLoops, so how does it pick one? Through a chooser, created by DefaultEventExecutorChooserFactory:

DefaultEventExecutorChooserFactory：
    public EventExecutorChooser newChooser(EventExecutor[] executors) {
        if (isPowerOfTwo(executors.length)) {
            return new PowerOfTwoEventExecutorChooser(executors);
        } else {
            return new GenericEventExecutorChooser(executors);
        }
    }

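Why branch on whether the length of the executors array is a power of two? It is an optimization by Netty. An optimization of what? In section 3 (registration) of the previous article on Netty server startup, a channel has to be registered with an EventLoop. The group picks that EventLoop through the chooser by calling chooser.next(). Here are the two implementations: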

PowerOfTwoEventExecutorChooser：
public EventExecutor next() {
    return executors[idx.getAndIncrement() & executors.length - 1];
}

GenericEventExecutorChooser：
public EventExecutor next() {
 return executors[Math.abs(idx.getAndIncrement() % executors.length)];
}

The group picks an EventLoop by round-robin. The optimization is the same one HashMap uses: when the array size is a power of two, & replaces %, which is cheaper.
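The equivalence of the two index computations can be checked directly. The sketch below is an illustration, not Netty's code: ChooserSketch, maskIndex and moduloIndex are names invented here, and the power-of-two test is one common bit trick for it.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ChooserSketch {
    // A positive int is a power of two when it has exactly one bit set.
    static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }

    // Power-of-two length: masking gives the same result as modulo for
    // non-negative counters and stays in range even after the counter overflows.
    static int maskIndex(int counter, int length) {
        return counter & (length - 1);
    }

    // Any length: modulo, with abs() so a negative counter still yields a valid index.
    static int moduloIndex(int counter, int length) {
        return Math.abs(counter % length);
    }

    public static void main(String[] args) {
        AtomicInteger idx = new AtomicInteger();
        for (int i = 0; i < 6; i++) {
            int c = idx.getAndIncrement();
            System.out.println(c + " -> mask(4): " + maskIndex(c, 4)
                    + ", modulo(3): " + moduloIndex(c, 3));
        }
    }
}
```

The mask only works because length - 1 is all ones in binary when length is a power of two, which is why the factory has to check the length before choosing an implementation.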

Here is the inheritance diagram from "Netty in Action":
[Figure: NioEventLoop class hierarchy]
NioEventLoop extends SingleThreadEventLoop.

Part 2: Starting the NioEventLoop

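After the steps above, the group has created its NioEventLoop array. What is it for, and where is it used? As the previous article on server startup showed, channel registration first binds the channel to an EventLoop. Later, in the bind phase, the actual bind operation is wrapped in a Runnable and handed to the eventLoop to run asynchronously. The entry point is AbstractBootstrap.doBind0():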

    private static void doBind0(
            final ChannelFuture regFuture, final Channel channel,
            final SocketAddress localAddress, final ChannelPromise promise) {
            
        channel.eventLoop().execute(new Runnable() {
            @Override
            public void run() {
                if (regFuture.isSuccess()) {
                    channel.bind(localAddress, promise).addListener(ChannelFutureListener.CLOSE_ON_FAILURE);
                } else {
                    promise.setFailure(regFuture.cause());
                }
            }
        });
    }

Now look at NioEventLoop's execute method, which is implemented in SingleThreadEventExecutor:

SingleThreadEventExecutor：
    public void execute(Runnable task) {
        if (task == null) {
            throw new NullPointerException("task");
        }

        boolean inEventLoop = inEventLoop();
        if (inEventLoop) {
            addTask(task);
        } else {
            startThread();
            addTask(task);
            if (isShutdown() && removeTask(task)) {
                reject();
            }
        }

        if (!addTaskWakesUp && wakesUpForTask(task)) {
            wakeup(inEventLoop);
        }
    }

Start with the inEventLoop method:

    public boolean inEventLoop() {
        return inEventLoop(Thread.currentThread());
    }

SingleThreadEventExecutor：
    public boolean inEventLoop(Thread thread) {
        return thread == this.thread;
    }

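This checks whether the calling thread is the NioEventLoop's own thread. If it is, addTask puts the task on the queue directly. Otherwise the caller is an external thread: startThread() is called first, which starts the NioEventLoop's thread if it has not been started yet, and then addTask queues the task.
To recap the flow so far:
1. During initialization, the group creates a ThreadPerTaskExecutor and passes it to every EventLoop it creates. Its execute starts a new thread, a FastThreadLocalThread, to run the task.
2. When a channel is bound, the actual bind operation is wrapped in a Runnable and handed to the NioEventLoop the channel is registered with.
So when is the NioEventLoop's thread created? On the first call to its execute. On that call inEventLoop returns false, so startThread is called; it creates the thread, which is then stored in the thread field of SingleThreadEventExecutor.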

SingleThreadEventExecutor：
    private void startThread() {
        if (state == ST_NOT_STARTED) {
            if (STATE_UPDATER.compareAndSet(this, ST_NOT_STARTED, ST_STARTED)) {
                try {
                    doStartThread();
                } catch (Throwable cause) {
                    STATE_UPDATER.set(this, ST_NOT_STARTED);
                    PlatformDependent.throwException(cause);
                }
            }
        }
    }
private volatile int state = ST_NOT_STARTED;
private static final AtomicIntegerFieldUpdater<SingleThreadEventExecutor> STATE_UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(SingleThreadEventExecutor.class, "state");

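CAS keeps the thread from being created more than once under concurrency: only one caller wins the compare-and-set and goes on to call doStartThread, which creates the NioEventLoop's thread.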
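The start-once guarantee is easy to reproduce outside Netty. This sketch is written under assumptions: StartOnceSketch and raceToStart are names made up for the example, the state constants are arbitrary, and a counter increment stands in for doStartThread().

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

final class StartOnceSketch {
    private static final int ST_NOT_STARTED = 1;
    private static final int ST_STARTED = 2;
    private static final AtomicIntegerFieldUpdater<StartOnceSketch> STATE_UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(StartOnceSketch.class, "state");

    private volatile int state = ST_NOT_STARTED;
    private final AtomicInteger starts = new AtomicInteger();

    void startThread() {
        // Cheap volatile read first, then a CAS that only one caller can win.
        if (state == ST_NOT_STARTED
                && STATE_UPDATER.compareAndSet(this, ST_NOT_STARTED, ST_STARTED)) {
            starts.incrementAndGet(); // stands in for doStartThread()
        }
    }

    // Lets `callers` threads call startThread() at the same moment and
    // returns how many times the start action actually ran.
    static int raceToStart(int callers) {
        StartOnceSketch loop = new StartOnceSketch();
        CountDownLatch go = new CountDownLatch(1);
        Thread[] threads = new Thread[callers];
        for (int i = 0; i < callers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    go.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                loop.startThread();
            });
            threads[i].start();
        }
        go.countDown();
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return loop.starts.get();
    }

    public static void main(String[] args) {
        System.out.println("start action ran " + raceToStart(8) + " time(s)");
    }
}
```

An AtomicIntegerFieldUpdater on a volatile int does the same job as an AtomicInteger field but without a separate object per instance, which is presumably why the listing above uses it.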

    private void doStartThread() {
        assert thread == null; // the NioEventLoop's thread must still be null here
        executor.execute(new Runnable() {
            @Override
            public void run() {
                thread = Thread.currentThread();
                if (interrupted) {
                    thread.interrupt();
                }

                boolean success = false;
                updateLastExecutionTime();
                try {
                    SingleThreadEventExecutor.this.run();
                ......

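Here executor is the ThreadPerTaskExecutor created while the NioEventLoopGroup was initialized; its execute starts a thread (a FastThreadLocalThread). Once that thread is actually running, it first assigns itself to the thread field of SingleThreadEventExecutor, marking itself as the only thread of this NioEventLoop, and then calls NioEventLoop's run method, which is analyzed next.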

Part 3: Running the NioEventLoop

    protected void run() {
        for (;;) {
            try {
                switch (selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())) {
                    case SelectStrategy.CONTINUE: // -2
                        continue;
                    case SelectStrategy.SELECT: // -1
                        select(wakenUp.getAndSet(false));
                        if (wakenUp.get()) {
                            selector.wakeup();
                        }
                        // fall through
                    default:
                }

                cancelledKeys = 0;
                needsToSelectAgain = false;
                final int ioRatio = this.ioRatio;
                if (ioRatio == 100) {
                    try {
                        processSelectedKeys();
                    } finally {
                        // Ensure we always run tasks.
                        runAllTasks();
                    }
                } else {
                    final long ioStartTime = System.nanoTime();
                    try {
                        processSelectedKeys();
                    } finally {
                        // Ensure we always run tasks.
                        final long ioTime = System.nanoTime() - ioStartTime;
                        runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
                    }
                }
            } 
            ......

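Inside the infinite for (;;) loop, each iteration consists of roughly three steps:

1. If the task queues (tailTasks, taskQueue) are empty, call the blocking select method, which polls for I/O events on the channels registered with this selector: select(wakenUp.getAndSet(false));
2. Handle the Channels that have I/O events: processSelectedKeys();
3. Run the queued tasks: runAllTasks();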

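Start with hasTasks(): it returns false when both tailTasks and taskQueue are empty. Both queues are created when the NioEventLoop is initialized; what they are for is covered later.
The switch expression selectStrategy.calculateStrategy(selectNowSupplier, hasTasks()) boils down to
hasTasks ? selector.selectNow() : SelectStrategy.SELECT; that is, if tasks are queued there is no reason to block, so the non-blocking selectNow is run. Otherwise it returns SelectStrategy.SELECT, and the blocking select method is called next.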
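The one-line rule above can be written as a small standalone function. This is a sketch only: SelectStrategySketch is a made-up class, and java.util.function.IntSupplier stands in for the supplier type Netty passes as selectNowSupplier.

```java
import java.util.function.IntSupplier;

final class SelectStrategySketch {
    static final int SELECT = -1;   // same values as noted in the run() listing above
    static final int CONTINUE = -2;

    // With pending tasks: do a non-blocking select and return its result (>= 0).
    // Without pending tasks: ask the caller to do a blocking select.
    static int calculateStrategy(IntSupplier selectNow, boolean hasTasks) {
        return hasTasks ? selectNow.getAsInt() : SELECT;
    }

    public static void main(String[] args) {
        System.out.println("tasks pending  -> " + calculateStrategy(() -> 2, true));
        System.out.println("queue is empty -> " + calculateStrategy(() -> 2, false));
    }
}
```

Since selectNow never returns a negative number, a result of 0 or more falls through to the default branch of the switch, and the loop goes straight on to processing keys and tasks.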

Step 1: select

select(wakenUp.getAndSet(false));
// wakenUp being true means the thread blocked in select needs to be woken up
if (wakenUp.get()) {
   selector.wakeup();
}

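wakenUp is declared as private final AtomicBoolean wakenUp = new AtomicBoolean();. What is it for? It controls whether a thread blocked in selector.select should return. The select used here blocks for a bounded time.
select()
This method keeps the thread polling in a loop: has a task been added to the queue? Is a scheduled task about to come due? It also works around the JDK epoll bug that drives CPU usage to 100%.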

    private void select(boolean oldWakenUp) throws IOException {
        Selector selector = this.selector;
        try {
            int selectCnt = 0;
            long currentTimeNanos = System.nanoTime();
            long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
            for (;;) {
                ......

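Netty's scheduled task queue is ordered by delay, smallest first. delayNanos(currentTimeNanos) returns the time remaining until the first task is due, and from it selectDeadLineNanos is computed: the deadline of the scheduled task that comes due soonest.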

First part of the for (;;) loop:
1) A scheduled task is due, so stop polling

long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
if (timeoutMillis <= 0) {
    if (selectCnt == 0) {
        selector.selectNow();
        selectCnt = 1;
    }
    break;
}

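If the first scheduled task is already due, or will be due in less than 0.5 ms (so that timeoutMillis rounds to 0 or below), the loop exits. Why is selectNow called before exiting when selectCnt == 0? selectCnt counts the select attempts made in this call, and it is also what is used to detect the empty-polling bug.
Some background first: when Selector's select() runs and an event has occurred for a SelectionKey, that key is added to the selected-keys set (it is already in the all-keys set from registration).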

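About the three sets in a Selector:

A Selector object holds three kinds of SelectionKey sets:
all-keys set: every SelectionKey currently registered with the Selector; returned by the Selector's keys() method
selected-keys set: the SelectionKeys whose events have been detected by the Selector; returned by the Selector's selectedKeys() method
cancelled-keys set: the SelectionKeys that have been cancelled; the Selector provides no method to access this set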

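So the selectNow call here is there to populate the selected-keys set. Why only when selectCnt == 0? Because a value of 0 means execution never got further down the loop; otherwise it could not be 0. A scheduled task was found to be due right at the start, so one non-blocking selectNow is run and any ready SelectionKeys are put into the set.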
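The rounding in the timeout computation of this first part is worth checking with concrete numbers. The sketch below copies only that one expression into a hypothetical helper (SelectTimeoutSketch.timeoutMillis is not a Netty method).

```java
final class SelectTimeoutSketch {
    // Remaining time until the deadline, converted from nanoseconds to
    // milliseconds and rounded to the nearest millisecond (+0.5 ms, then truncate).
    static long timeoutMillis(long selectDeadLineNanos, long currentTimeNanos) {
        return (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
    }

    public static void main(String[] args) {
        long now = 0L;
        long[] remainingNanos = {400_000L, 600_000L, 1_499_999L, 1_500_000L, -2_000_000L};
        for (long remaining : remainingNanos) {
            System.out.println(remaining + " ns remaining -> timeoutMillis = "
                    + timeoutMillis(now + remaining, now));
        }
    }
}
```

Anything under 0.5 ms of remaining time, including a deadline that has already passed, gives a result of 0 or less, which is exactly the case in which the loop skips the blocking select.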

A short digression.
From the Javadoc of selector.selectNow(): Invoking this method clears the effect of any previous invocations of the {@link #wakeup wakeup} method.
From the Javadoc of selector.wakeup():

Causes the first selection operation that has not yet returned to return
immediately.

If another thread is currently blocked in an invocation of the
{@link #select()} or {@link #select(long)} methods then that invocation
will return immediately.  If no selection operation is currently in
progress then the next invocation of one of these methods will return
immediately unless the {@link #selectNow()} method is invoked in the
meantime.  In any case the value returned by that invocation may be
non-zero.  Subsequent invocations of the {@link #select()} or {@link
#select(long)} methods will block as usual unless this method is invoked
again in the meantime.

Invoking this method more than once between two successive selection
operations has the same effect as invoking it just once.

In other words, selectNow cancels the effect of an earlier wakeup call.
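The Javadoc behavior quoted above can be observed with a bare JDK Selector. The following is a small experiment rather than anything from Netty; WakeupSemanticsDemo and blockedMillis are names invented for it, and the timings it prints depend on the machine.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

final class WakeupSemanticsDemo {
    // Calls wakeup() while no select is in progress, optionally calls selectNow(),
    // then measures how long select(timeoutMillis) blocks, in milliseconds.
    static long blockedMillis(boolean selectNowInBetween, long timeoutMillis) {
        try (Selector selector = Selector.open()) {
            selector.wakeup();            // takes effect on the next selection operation
            if (selectNowInBetween) {
                selector.selectNow();     // this selection operation consumes the wakeup
            }
            long start = System.nanoTime();
            selector.select(timeoutMillis);
            return (System.nanoTime() - start) / 1_000_000L;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("wakeup, then select(2000):            "
                + blockedMillis(false, 2000) + " ms");
        System.out.println("wakeup, selectNow, then select(300):  "
                + blockedMillis(true, 300) + " ms");
    }
}
```

The first call should return almost immediately because the pending wakeup is still in effect; the second should block for roughly the full 300 ms because selectNow already used it up.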

Second part of the for (;;) loop:
2) A task was added to the queue, so stop polling

        if (hasTasks() && wakenUp.compareAndSet(false, true)) {
            selector.selectNow();
            selectCnt = 1;
            break;
        }

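Here wakenUp is set to true, a step the first part does not have. Why? wakenUp is a flag: true means the thread blocked in select should be woken up. With tasks already waiting, the loop sets the flag and exits instead of blocking,
which ensures that queued tasks are run promptly.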

Third part of the for (;;) loop:
3) Blocking select

int selectedKeys = selector.select(timeoutMillis);
selectCnt ++;

if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
       // - Selected something,
       // - waken up by user, or
       // - the task queue has a pending task.
       // - a scheduled task is ready for processing
       break;
}

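At this point the thread blocks until the deadline of the first scheduled task. What if that is a long way off? A newly added task interrupts the wait: when an external thread calls this NioEventLoop's execute, the wakeup operation is performed.
When the blocking select returns, a series of checks decides whether to stop polling: an I/O event was selected (selectedKeys != 0); oldWakenUp is true, meaning a wakeup had already been requested before this select call; wakenUp.get() is true, meaning a wakeup was requested by the user; a task was added to the queue (hasTasks()); or the first scheduled task is due (hasScheduledTasks()).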

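Fourth part of the for (;;) loop:
4) Working around the JDK NIO bug
https://bugs.java.com/bugdatabase/view_bug.do?bug_id=6595055
This bug makes the Selector spin on empty polls, eventually driving the CPU to 100% and making the NIO server unusable.
Netty does not fix the bug; it detects it and sidesteps it.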

long time = System.nanoTime();
if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
     // timeoutMillis elapsed without anything selected.
     selectCnt = 1;
} else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
             selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
      // The selector returned prematurely many times in a row.
      // Rebuild the selector to work around the problem.
      logger.warn(
            "Selector.select() returned prematurely {} times in a row; rebuilding Selector {}.",
            selectCnt, selector);

      rebuildSelector();
      selector = this.selector;

      // Select again to populate selectedKeys.
      selector.selectNow();
      selectCnt = 1;
      break;
}

currentTimeNanos = time;

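This measures how long one selector.select(timeoutMillis) call took. If it took at least timeoutMillis, the select was a normal one and selectCnt is reset to 1. Otherwise the blocking call did not block for the full timeout, which may mean the JDK empty-polling bug was hit. Once the number of such premature returns reaches a threshold, 512 by default, the selector is rebuilt.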
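The counting logic can be pulled out and exercised with made-up timestamps. This is a sketch under assumptions: PrematureSelectCounter is not a Netty class, and it models only the case where select returned nothing and none of the exit conditions from the third part applied.

```java
import java.util.concurrent.TimeUnit;

final class PrematureSelectCounter {
    private final int rebuildThreshold; // plays the role of SELECTOR_AUTO_REBUILD_THRESHOLD
    private int selectCnt;

    PrematureSelectCounter(int rebuildThreshold) {
        this.rebuildThreshold = rebuildThreshold;
    }

    // Records one return of select(timeoutMillis) that selected nothing.
    // Returns true when the selector should be rebuilt.
    boolean onSelectReturned(long startNanos, long endNanos, long timeoutMillis) {
        selectCnt++;
        if (endNanos - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= startNanos) {
            selectCnt = 1;   // the full timeout elapsed: a normal select
            return false;
        }
        if (rebuildThreshold > 0 && selectCnt >= rebuildThreshold) {
            selectCnt = 1;   // too many premature returns in a row
            return true;
        }
        return false;
    }

    int count() {
        return selectCnt;
    }

    public static void main(String[] args) {
        PrematureSelectCounter counter = new PrematureSelectCounter(3);
        for (int i = 1; i <= 3; i++) {
            // Each simulated select returns after 10 ns although the timeout was 1000 ms.
            System.out.println("premature return " + i + " -> rebuild? "
                    + counter.onSelectReturned(0L, 10L, 1000L));
        }
    }
}
```

A threshold of 3 is used here only to keep the output short. A single select that lasts the full timeout resets the counter, so only an unbroken run of premature returns triggers a rebuild.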

The rebuild logic lives in rebuildSelector(): it creates a new selector with openSelector() and then moves the channels registered on the old selector over to the new one.
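The idea of moving registrations to a fresh selector can be shown with plain JDK NIO. This is a minimal sketch, not Netty's rebuildSelector: RebuildSelectorSketch, rebuild and keysAfterRebuild are hypothetical names, and error handling for individual channels is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

final class RebuildSelectorSketch {
    // Re-registers every valid key of oldSelector on a new selector,
    // keeping interest ops and attachment, then closes the old selector.
    static Selector rebuild(Selector oldSelector) {
        try {
            Selector newSelector = Selector.open();
            for (SelectionKey key : oldSelector.keys()) {
                if (!key.isValid()) {
                    continue;
                }
                SelectableChannel channel = key.channel();
                int interestOps = key.interestOps(); // read before cancelling the key
                Object attachment = key.attachment();
                key.cancel();
                channel.register(newSelector, interestOps, attachment);
            }
            oldSelector.close();
            return newSelector;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Registers one non-blocking server channel, rebuilds, and returns the number
    // of keys on the new selector, or -1 if the registration was not carried over.
    static int keysAfterRebuild() {
        try (ServerSocketChannel channel = ServerSocketChannel.open()) {
            channel.configureBlocking(false);
            Selector old = Selector.open();
            channel.register(old, SelectionKey.OP_ACCEPT, "attachment");
            Selector rebuilt = rebuild(old);
            try {
                SelectionKey moved = channel.keyFor(rebuilt);
                boolean carriedOver = !old.isOpen() && moved != null
                        && moved.interestOps() == SelectionKey.OP_ACCEPT
                        && "attachment".equals(moved.attachment());
                return carriedOver ? rebuilt.keys().size() : -1;
            } finally {
                rebuilt.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("keys on the rebuilt selector: " + keysAfterRebuild());
    }
}
```

The channels themselves stay open throughout; only their registrations move, so connections are not dropped by a rebuild.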

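To sum up NioEventLoop's select method: it keeps polling for I/O events, and while polling it checks for scheduled tasks and ordinary tasks so that the tasks in Netty's queues are run in good time. Along the way it uses a counter to sidestep the JDK empty-polling bug.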

The remaining two steps are covered in the next article.