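Earlier posts covered the core classes ByteBuf, Channel, Unsafe, ChannelPipeline, and ChannelHandler. This post looks at EventLoop (and EventLoopGroup), which are Netty's threads. Netty's threading model is carefully designed: it improves the framework's concurrent performance, largely avoids deadlocks, and is lock-free in places. It is well worth studying.
I. The Reactor threading model: Netty's threading model is essentially the classic Reactor model; see the article reposted earlier, "Reactor Pattern Explained (repost)". It covers the single-threaded Reactor model, the multithreaded Reactor model, and the master-slave multithreaded Reactor model. The Reactor diagrams in that article are useful for comparison.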
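Type | Characteristics | Drawbacks |
---|---|---|
Single-threaded Reactor model | 1. All I/O operations run on the same NIO thread: (a) accepting TCP connections from clients; (b) initiating TCP connections to a server; (c) reading requests or responses from the peer; (d) sending requests or responses to the peer. 2. Simple, easy to understand and implement. | 1. A single NIO thread cannot sustain hundreds or thousands of connections. 2. When the NIO thread is overloaded it slows down, clients time out and resend, the load grows further, and messages pile up and time out until the thread becomes the system bottleneck. 3. Reliability: if the NIO thread fails unexpectedly or enters an infinite loop, the whole communication module becomes unavailable and the node fails. |
Multithreaded Reactor model | 1. A dedicated NIO thread, the acceptor, listens on the server socket and accepts client TCP connections. 2. Network I/O (reads, writes, and so on) is handled by an NIO thread pool made up of a task queue and N threads, which read, decode, encode, and send messages. 3. One NIO thread can serve N connections, but each connection belongs to exactly one NIO thread, which prevents concurrent access to it. | 1. A single NIO thread that listens for and accepts all client connections can become a performance problem, for example with a million client connections, or when the server authenticates clients during the handshake (which is expensive). |
Master-slave multithreaded Reactor model | 1. Client connections are accepted by a separate NIO thread pool rather than by a single NIO thread. 2. The sub-reactor thread pool handles reads, writes, encoding, and decoding on the SocketChannel. | 1. More complex than the two models above. |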
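II. Netty's threading model: by configuring different startup parameters, Netty can support each of the models above, as the schematic and the following code show: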
EventLoopGroup bossGroup = new NioEventLoopGroup();
EventLoopGroup workerGroup = new NioEventLoopGroup();
try {
    ServerBootstrap b = new ServerBootstrap();
    b.group(bossGroup, workerGroup)
     .channel(NioServerSocketChannel.class)
     .option(ChannelOption.SO_BACKLOG, 100)
     .handler(new LoggingHandler(LogLevel.INFO))
     .childHandler(new ChannelInitializer<SocketChannel>() {
         @Override
         protected void initChannel(SocketChannel socketChannel) throws Exception {
             // Decoder
             socketChannel.pipeline().addLast(MarshallingCodeFactory.buildMarshallingDecoder());
             // Encoder
             socketChannel.pipeline().addLast(MarshallingCodeFactory.buildMarshallingEncoder());
             socketChannel.pipeline().addLast(new SubReqServerHandler());
         }
     });
    // Bind the port and wait until the bind completes
    ChannelFuture f = b.bind(port).sync();
    // Block until the server channel is closed
    f.channel().closeFuture().sync();
} finally {
    // Release the threads of both groups
    bossGroup.shutdownGracefully();
    workerGroup.shutdownGracefully();
}
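Category | Responsibilities |
---|---|
Thread pool that accepts client requests | 1. Accepts client TCP connections and initializes Channel parameters. 2. Notifies the ChannelPipeline of connection state change events. |
Thread pool that handles I/O operations | 1. Asynchronously reads data from the peer and fires read events into the ChannelPipeline. 2. Asynchronously sends messages to the peer by calling the ChannelPipeline's send interface. 3. Runs system-call tasks. 4. Runs scheduled tasks, such as the idle-connection detection task. |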
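Serialized execution (lock-free design): as the previous posts, Netty (12): A View of ChannelPipeline and Netty (13): The Meaning of ChannelHandler, showed, an event travels through the ChannelPipeline like a chain of responsibility and is processed by each ChannelHandler it passes. All handlers of one Channel run on the same NIO thread, so they need no locks among themselves. Running several such serialized threads in parallel avoids lock contention and performs better than a model in which multiple worker threads share one queue.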
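The serialized design can be illustrated without Netty. The sketch below is a simplified stand-in, not Netty's implementation: `SerialLoopsDemo`, `ChannelState`, and `run` are hypothetical names, and each "event loop" is just a single-threaded `ExecutorService`. Every channel is always dispatched to the same loop, so its state can be a plain field with no lock.

```java
import java.util.Arrays;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: each "channel" is pinned to one single-threaded loop,
// so its state is only ever touched by that thread and needs no lock.
final class SerialLoopsDemo {

    // Plain per-channel state, deliberately neither volatile nor atomic.
    static final class ChannelState {
        long messagesHandled;
    }

    // Fires eventsPerChannel events at every channel and returns the per-channel counts.
    static long[] run(int loops, int channels, int eventsPerChannel) {
        ExecutorService[] eventLoops = new ExecutorService[loops];
        for (int i = 0; i < loops; i++) {
            eventLoops[i] = Executors.newSingleThreadExecutor();
        }
        ChannelState[] states = new ChannelState[channels];
        for (int c = 0; c < channels; c++) {
            states[c] = new ChannelState();
        }
        for (int e = 0; e < eventsPerChannel; e++) {
            for (int c = 0; c < channels; c++) {
                final ChannelState state = states[c];
                // The same channel always goes to the same loop.
                eventLoops[c % loops].execute(() -> state.messagesHandled++);
            }
        }
        try {
            for (ExecutorService loop : eventLoops) {
                loop.shutdown();
                if (!loop.awaitTermination(10, TimeUnit.SECONDS)) {
                    throw new IllegalStateException("loop did not terminate in time");
                }
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        long[] counts = new long[channels];
        for (int c = 0; c < channels; c++) {
            counts[c] = states[c].messagesHandled;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run(4, 16, 10_000)));
    }
}
```

Although four threads run in parallel, no update is lost, because no two threads ever touch the same `ChannelState`.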
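III. Netty practice advice:
1. Create two NioEventLoopGroups to logically separate the NIO acceptor threads from the NIO I/O threads.
2. Avoid starting user threads inside a ChannelHandler (the exception is dispatching decoded POJO messages to back-end business threads).
3. Decode in the decoder handler that the NIO thread calls; do not switch to a user thread to decode messages.
4. If the business logic is simple, with no heavy computation and no disk, database, or network operations that could block the thread, run it directly on the NIO thread; there is no need to switch to a user thread.
5. If the business logic is complex, do not run it on the NIO thread. Wrap the decoded POJO message in a task and dispatch it to a business thread pool, so that the NIO thread is released quickly to handle other I/O.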
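Points 3 and 5 can be sketched with plain JDK executors. This is an illustration under assumptions, not Netty code: `BusinessOffloadDemo`, `Request`, `slowBusinessLogic`, and `handle` are hypothetical names, the "I/O thread" is a single-threaded executor, and the 50 ms sleep stands in for a blocking database call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BusinessOffloadDemo {

    // Stand-in for a decoded POJO message (hypothetical).
    static final class Request {
        final String payload;
        Request(String payload) { this.payload = payload; }
    }

    // Stand-in for slow, blocking business logic such as a database call.
    static String slowBusinessLogic(Request request) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "handled:" + request.payload;
    }

    // Returns {thread that decoded, thread that ran the business logic, reply}.
    static String[] handle(String rawMessage) {
        ExecutorService ioLoop = Executors.newSingleThreadExecutor(r -> new Thread(r, "io-loop"));
        ExecutorService businessPool = Executors.newFixedThreadPool(2, r -> new Thread(r, "business"));
        try {
            CompletableFuture<String[]> result = new CompletableFuture<>();
            ioLoop.execute(() -> {
                // Decoding stays on the I/O thread (point 3).
                final Request request = new Request(rawMessage.trim());
                final String decodeThread = Thread.currentThread().getName();
                // Only the decoded POJO is handed over, wrapped in a task (point 5),
                // so the I/O thread returns immediately.
                businessPool.execute(() -> {
                    String reply = slowBusinessLogic(request);
                    result.complete(new String[] {decodeThread, Thread.currentThread().getName(), reply});
                });
            });
            return result.join();
        } finally {
            ioLoop.shutdown();
            businessPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(String.join(", ", handle(" ping ")));
    }
}
```

In a real Netty application the business task would typically write its reply back through the Channel, which hands the write over to the Channel's own I/O thread.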
IV. NioEventLoop source code analysis:
1. First, the class hierarchy diagram of NioEventLoop:
2. Now the NioEventLoop source code:
2.1. First, how the multiplexer (Selector) is initialized in NioEventLoop.
/**
* 1. The aggregated multiplexer (Selector)
*/
Selector selector;
private SelectedSelectionKeySet selectedKeys;
private final SelectorProvider provider;
/**
* 2. Initialize NioEventLoop: selector = openSelector();
*/
NioEventLoop(NioEventLoopGroup parent, Executor executor, SelectorProvider selectorProvider) {
super(parent, executor, false);
if (selectorProvider == null) {
throw new NullPointerException("selectorProvider");
}
provider = selectorProvider;
selector = openSelector();
}
/**
* 2-1. selector = openSelector();
*/
private Selector openSelector() {
final Selector selector;
try {
selector = provider.openSelector();
} catch (IOException e) {
throw new ChannelException("failed to open a new selector", e);
}
// If the selectedKeys optimization is disabled, return the Selector from provider.openSelector() unchanged.
if (DISABLE_KEYSET_OPTIMIZATION) {
return selector;
}
// If it is enabled, use reflection to make the selector's selectedKeys and publicSelectedKeys fields accessible and replace the JDK sets with Netty's own SelectedSelectionKeySet.
try {
SelectedSelectionKeySet selectedKeySet = new SelectedSelectionKeySet();
Class<?> selectorImplClass =
Class.forName("sun.nio.ch.SelectorImpl", false, PlatformDependent.getSystemClassLoader());
// Ensure the current selector implementation is what we can instrument.
if (!selectorImplClass.isAssignableFrom(selector.getClass())) {
return selector;
}
Field selectedKeysField = selectorImplClass.getDeclaredField("selectedKeys");
Field publicSelectedKeysField = selectorImplClass.getDeclaredField("publicSelectedKeys");
selectedKeysField.setAccessible(true);
publicSelectedKeysField.setAccessible(true);
selectedKeysField.set(selector, selectedKeySet);
publicSelectedKeysField.set(selector, selectedKeySet);
selectedKeys = selectedKeySet;
logger.trace("Instrumented an optimized java.util.Set into: {}", selector);
} catch (Throwable t) {
selectedKeys = null;
logger.trace("Failed to instrument an optimized java.util.Set into: {}", selector, t);
}
return selector;
}
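The point of swapping in `SelectedSelectionKeySet` is that adding a ready key becomes a plain array append instead of a hash insertion. The sketch below shows that idea only. It is a simplified, hypothetical `ArrayBackedSet`, not Netty's class (the version used in the listing above exposes `flip()` and differs in detail), and it assumes the caller never adds the same element twice.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical, simplified array-backed set: add() is an append, iteration is an array walk.
// It does not check for duplicates; the caller is assumed to add each element at most once.
final class ArrayBackedSet<E> extends AbstractSet<E> {
    private Object[] elements = new Object[16];
    private int size;

    @Override
    public boolean add(E e) {
        if (e == null) {
            return false;
        }
        if (size == elements.length) {
            // Double the capacity when the array is full.
            elements = Arrays.copyOf(elements, size << 1);
        }
        elements[size++] = e;
        return true;
    }

    @Override
    public int size() {
        return size;
    }

    // Null out the used slots so entries can be garbage collected, then reuse the array.
    void reset() {
        Arrays.fill(elements, 0, size, null);
        size = 0;
    }

    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            private int index;

            @Override
            public boolean hasNext() {
                return index < size;
            }

            @Override
            @SuppressWarnings("unchecked")
            public E next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return (E) elements[index++];
            }
        };
    }

    public static void main(String[] args) {
        ArrayBackedSet<String> keys = new ArrayBackedSet<>();
        keys.add("key-1");
        keys.add("key-2");
        for (String k : keys) {
            System.out.println(k);
        }
        keys.reset();
    }
}
```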
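2.2. The run method: it calls, directly or indirectly, nearly every helper method in NioEventLoop. Some call chains run deep, but following them patiently makes the overall flow clear. (The numbers in my comments trace the call levels; for example, 4-1-1 marks a third-level call.)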
@Override
protected void run() {
// 1. Reset wakenUp to false and keep its previous value in oldWakenUp
boolean oldWakenUp = wakenUp.getAndSet(false);
try {
// 2. If the task queue has pending tasks, do one non-blocking selectNow() and return immediately
if (hasTasks()) {
selectNow();
} else {
// 3. Otherwise poll (blocking) for Channels that are ready
select(oldWakenUp);
if (wakenUp.get()) {
selector.wakeup();
}
}
cancelledKeys = 0;
needsToSelectAgain = false;
final int ioRatio = this.ioRatio;
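// Run I/O work first, then non-I/O tasks. ioRatio controls the time split between the two;
// by default, non-I/O tasks may run for as long as the I/O work took.
// processSelectedKeys() handles the SocketChannels that are ready.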
if (ioRatio == 100) {
//4
processSelectedKeys();
//5
runAllTasks();
} else {
final long ioStartTime = System.nanoTime();
processSelectedKeys();
final long ioTime = System.nanoTime() - ioStartTime;
runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
}
// If shutting down, stop gracefully: call closeAll() to release resources
if (isShuttingDown()) {
// 6
closeAll();
if (confirmShutdown()) {
cleanupAndTerminate(true);
return;
}
}
} catch (Throwable t) {
logger.warn("Unexpected exception in the selector loop.", t);
try {
Thread.sleep(1000);
} catch (InterruptedException e) {
// Ignore.
}
}
scheduleExecution();
}
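The time budget passed to `runAllTasks(long)` in the `else` branch of `run()` is plain integer arithmetic. The sketch below isolates that expression so it can be tried with concrete numbers. `IoRatioBudget` and `taskBudgetNanos` are hypothetical names; the `ioRatio == 100` case is excluded because `run()` handles it separately, without a budget.

```java
final class IoRatioBudget {

    // Time budget for non-I/O tasks, given how long the I/O work took and the
    // I/O share in percent. Same expression as in run(): ioTime * (100 - ioRatio) / ioRatio.
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio <= 0 || ioRatio >= 100) {
            throw new IllegalArgumentException("ioRatio must be in 1..99 here: " + ioRatio);
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }

    public static void main(String[] args) {
        long ioTime = 8_000_000L; // assume the I/O work took 8 ms
        for (int ratio : new int[] {20, 50, 80}) {
            System.out.println("ioRatio=" + ratio + " -> " + taskBudgetNanos(ioTime, ratio) + " ns");
        }
    }
}
```

With 8 ms of I/O time, an `ioRatio` of 50 gives tasks 8 ms, 80 gives them 2 ms, and 20 gives them 32 ms.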
/**
* 2-1@see {@link Queue#isEmpty()}
*/
protected boolean hasTasks() {
assert inEventLoop();
return !taskQueue.isEmpty();
}
/**
* 2-2 selectNow
*/
void selectNow() throws IOException {
try {
selector.selectNow();
} finally {
// restore wakup state if needed
if (wakenUp.get()) {
selector.wakeup();
}
}
}
/**
* 3-1 select
*/
private void select(boolean oldWakenUp) throws IOException {
Selector selector = this.selector;
try {
int selectCnt = 0;
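// Current time in nanoseconds
long currentTimeNanos = System.nanoTime();
// Use delayNanos to compute the deadline: now plus the delay until the next scheduled task
long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
// Loop until one of the exit conditions below is met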
for (;;) {
long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
// If a task is due now or the deadline has passed, do a selectNow() and leave the loop
if (timeoutMillis <= 0) {
if (selectCnt == 0) {
selector.selectNow();
selectCnt = 1;
}
break;
}
// Block in select() for the time left until the next scheduled task; increment selectCnt after every select()
int selectedKeys = selector.select(timeoutMillis);
selectCnt ++;
// Leave the loop if any of the following holds
if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
// - Selected something,
// - waken up by user, or
// - the task queue has a pending task.
// - a scheduled task is ready for processing
break;
}
// Also leave the loop if the thread was interrupted
if (Thread.interrupted()) {
// Thread was interrupted so reset selected keys and break so we not run into a busy loop.
// As this is most likely a bug in the handler of the user or it's client library we will
// also log it.
//
// See https://github.com/netty/netty/issues/2426
if (logger.isDebugEnabled()) {
logger.debug("Selector.select() returned prematurely because " +
"Thread.currentThread().interrupt() was called. Use " +
"NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
}
selectCnt = 1;
break;
}
long time = System.nanoTime();
if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
// timeoutMillis elapsed without anything selected.
selectCnt = 1;
// select() keeps returning early (a busy loop): rebuild the Selector so the loop recovers
} else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
// The selector returned prematurely many times in a row.
// Rebuild the selector to work around the problem.
logger.warn(
"Selector.select() returned prematurely {} times in a row; rebuilding selector.",
selectCnt);
//3-1-1
rebuildSelector();
selector = this.selector;
// Select again to populate selectedKeys.
selector.selectNow();
selectCnt = 1;
break;
}
currentTimeNanos = time;
}
if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS) {
if (logger.isDebugEnabled()) {
logger.debug("Selector.select() returned prematurely {} times in a row.", selectCnt - 1);
}
}
} catch (CancelledKeyException e) {
if (logger.isDebugEnabled()) {
logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector - JDK bug?", e);
}
// Harmless exception - log anyway
}
}
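Two details of `select()` are easy to miss in the listing: the timeout is rounded to the nearest millisecond, and a counter of early returns decides when to rebuild the Selector. The sketch below models both in isolation. It is a simplified model with hypothetical names (`SelectLoopModel`, `SpinDetector`, `onEmptySelect`), covering only the case in which `select()` returned with nothing selected and none of the other exit conditions applied.

```java
import java.util.concurrent.TimeUnit;

final class SelectLoopModel {

    // Rounds the remaining time to the nearest millisecond, as in
    // (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L.
    static long timeoutMillis(long deadlineNanos, long nowNanos) {
        return (deadlineNanos - nowNanos + 500_000L) / 1_000_000L;
    }

    // Counts select() calls that returned early with nothing selected.
    static final class SpinDetector {
        private final int threshold;
        private int selectCnt;

        SpinDetector(int threshold) {
            this.threshold = threshold;
        }

        // Returns true when the selector should be rebuilt.
        boolean onEmptySelect(long timeoutMillis, long elapsedNanos) {
            selectCnt++;
            if (elapsedNanos >= TimeUnit.MILLISECONDS.toNanos(timeoutMillis)) {
                // The full timeout elapsed: a normal wake-up, so restart the count at 1.
                selectCnt = 1;
                return false;
            }
            if (threshold > 0 && selectCnt >= threshold) {
                // Too many early returns in a row. In this model the counter restarts at 0,
                // standing in for the next select() call beginning with a fresh counter.
                selectCnt = 0;
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(timeoutMillis(1_500_000L, 0L)); // 1.5 ms left rounds up
        SpinDetector detector = new SpinDetector(3);
        for (int i = 0; i < 3; i++) {
            System.out.println(detector.onEmptySelect(10, 0L));
        }
    }
}
```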
/**
* 3-1-1
* Replaces the current {@link Selector} of this event loop with newly created {@link Selector}s to work
* around the infamous epoll 100% CPU bug.
*/
public void rebuildSelector() {
// If called from another thread, wrap rebuildSelector() in a task and put it on the task queue so that the NioEventLoop thread runs it; this avoids concurrent access.
if (!inEventLoop()) {
execute(new Runnable() {
@Override
public void run() {
rebuildSelector();
}
});
return;
}
final Selector oldSelector = selector;
final Selector newSelector;
if (oldSelector == null) {
return;
}
try {
// Create the new Selector
newSelector = openSelector();
} catch (Exception e) {
logger.warn("Failed to create a new Selector.", e);
return;
}
// Register all channels to the new Selector (this moves each channel from the old Selector to the new one).
int nChannels = 0;
for (;;) {
try {
for (SelectionKey key: oldSelector.keys()) {
Object a = key.attachment();
try {
if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
continue;
}
int interestOps = key.interestOps();
key.cancel();
SelectionKey newKey = key.channel().register(newSelector, interestOps, a);
if (a instanceof AbstractNioChannel) {
// Update SelectionKey
((AbstractNioChannel) a).selectionKey = newKey;
}
nChannels ++;
} catch (Exception e) {
logger.warn("Failed to re-register a Channel to the new Selector.", e);
if (a instanceof AbstractNioChannel) {
AbstractNioChannel ch = (AbstractNioChannel) a;
ch.unsafe().close(ch.unsafe().voidPromise());
} else {
@SuppressWarnings("unchecked")
NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
invokeChannelUnregistered(task, key, e);
}
}
}
} catch (ConcurrentModificationException e) {
// Probably due to concurrent modification of the key set.
continue;
}
break;
}
selector = newSelector;
try {
// time to close the old selector as everything else is registered to the new one
oldSelector.close();
} catch (Throwable t) {
if (logger.isWarnEnabled()) {
logger.warn("Failed to close the old Selector.", t);
}
}
logger.info("Migrated " + nChannels + " channel(s) to the new Selector.");
}
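The guard at the top of `rebuildSelector()` is a general pattern: a method that must run on the loop thread checks `inEventLoop()` and, if called from elsewhere, resubmits itself as a task. The sketch below reproduces the pattern with a JDK single-threaded executor. `OwnerThreadDemo`, `rebuild`, and `shutdownAndGetLog` are hypothetical names, and the log list stands in for the state the real method mutates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class OwnerThreadDemo {
    private volatile Thread loopThread;
    private final ExecutorService loop;
    // Only ever touched on the loop thread, so it needs no synchronization.
    private final List<String> log = new ArrayList<>();

    OwnerThreadDemo() {
        loop = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "event-loop");
            loopThread = t; // remember which thread owns the loop
            return t;
        });
    }

    boolean inEventLoop() {
        return Thread.currentThread() == loopThread;
    }

    // Hypothetical stand-in for rebuildSelector().
    void rebuild() {
        if (!inEventLoop()) {
            // Called from a foreign thread: queue the call and let the loop thread run it.
            loop.execute(this::rebuild);
            return;
        }
        log.add("rebuilt on " + Thread.currentThread().getName());
    }

    // Stops the loop, waits for queued tasks to finish, and returns what they recorded.
    List<String> shutdownAndGetLog() {
        loop.shutdown();
        try {
            if (!loop.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("loop did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        OwnerThreadDemo demo = new OwnerThreadDemo();
        demo.rebuild(); // called from main, executed on the loop thread
        System.out.println(demo.shutdownAndGetLog());
    }
}
```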
/**
* 4-1
*/
private void processSelectedKeys() {
// Use processSelectedKeysOptimized if the selectedKeys optimization is enabled, otherwise processSelectedKeysPlain
if (selectedKeys != null) {
//4-1-1
processSelectedKeysOptimized(selectedKeys.flip());
} else {
//4-1-2
processSelectedKeysPlain(selector.selectedKeys());
}
}
/**
*4-1-1
**/
private void processSelectedKeysOptimized(SelectionKey[] selectedKeys) {
for (int i = 0;; i ++) {
final SelectionKey k = selectedKeys[i];
if (k == null) {
break;
}
// null out entry in the array to allow to have it GC'ed once the Channel close
// See https://github.com/netty/netty/issues/2363
selectedKeys[i] = null;
final Object a = k.attachment();
if (a instanceof AbstractNioChannel) {
processSelectedKey(k, (AbstractNioChannel) a);
} else {
@SuppressWarnings("unchecked")
NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
processSelectedKey(k, task);
}
if (needsToSelectAgain) {
// null out entries in the array to allow to have it GC'ed once the Channel close
// See https://github.com/netty/netty/issues/2363
for (;;) {
if (selectedKeys[i] == null) {
break;
}
selectedKeys[i] = null;
i++;
}
selectAgain();
// Need to flip the optimized selectedKeys to get the right reference to the array
// and reset the index to -1 which will then set to 0 on the for loop
// to start over again.
//
// See https://github.com/netty/netty/issues/1523
selectedKeys = this.selectedKeys.flip();
i = -1;
}
}
}
/**
*4-1-2
**/
private void processSelectedKeysPlain(Set<SelectionKey> selectedKeys) {
// check if the set is empty and if so just return to not create garbage by
// creating a new Iterator every time even if there is nothing to process.
// See https://github.com/netty/netty/issues/597
if (selectedKeys.isEmpty()) {
return;
}
// Iterate over selectedKeys and process each key
Iterator<SelectionKey> i = selectedKeys.iterator();
for (;;) {
// Take the next key and remove it from the iterator
final SelectionKey k = i.next();
final Object a = k.attachment();
i.remove();
// An AbstractNioChannel attachment means I/O read/write handling
if (a instanceof AbstractNioChannel) {
//4-1-2-1
processSelectedKey(k, (AbstractNioChannel) a);
} else {
// Otherwise the attachment is a NioTask
@SuppressWarnings("unchecked")
NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
processSelectedKey(k, task);
}
if (!i.hasNext()) {
break;
}
if (needsToSelectAgain) {
selectAgain();
selectedKeys = selector.selectedKeys();
// Create the iterator again to avoid ConcurrentModificationException
if (selectedKeys.isEmpty()) {
break;
} else {
i = selectedKeys.iterator();
}
}
}
}
/**
*4-1-2-1
**/
private static void processSelectedKey(SelectionKey k, AbstractNioChannel ch) {
final AbstractNioChannel.NioUnsafe unsafe = ch.unsafe();
// If the key is no longer valid, close the channel and return
if (!k.isValid()) {
// close the channel if the key is not valid anymore
unsafe.close(unsafe.voidPromise());
return;
}
// Inspect readyOps and call the matching Unsafe operation (read, write, connect)
try {
int readyOps = k.readyOps();
// Also check for readOps of 0 to workaround possible JDK bug which may otherwise lead
// to a spin loop
if ((readyOps & (SelectionKey.OP_READ | SelectionKey.OP_ACCEPT)) != 0 || readyOps == 0) {
unsafe.read();
if (!ch.isOpen()) {
// Connection already closed - no need to handle write.
return;
}
}
if ((readyOps & SelectionKey.OP_WRITE) != 0) {
// Call forceFlush which will also take care of clear the OP_WRITE once there is nothing left to write
ch.unsafe().forceFlush();
}
if ((readyOps & SelectionKey.OP_CONNECT) != 0) {
// remove OP_CONNECT as otherwise Selector.select(..) will always return without blocking
// See https://github.com/netty/netty/issues/924
int ops = k.interestOps();
ops &= ~SelectionKey.OP_CONNECT;
k.interestOps(ops);
unsafe.finishConnect();
}
} catch (CancelledKeyException ignored) {
unsafe.close(unsafe.voidPromise());
}
}
/**
* 5-1 Run the queued tasks, including scheduled tasks that are due
* Poll all tasks from the task queue and run them via {@link Runnable#run()} method.
*
* @return {@code true} if and only if at least one task was run
*/
protected boolean runAllTasks() {
//5-1-1
fetchFromScheduledTaskQueue();
Runnable task = pollTask();
if (task == null) {
return false;
}
for (;;) {
try {
task.run();
} catch (Throwable t) {
logger.warn("A task raised an exception.", t);
}
task = pollTask();
if (task == null) {
lastExecutionTime = ScheduledFutureTask.nanoTime();
return true;
}
}
}
/**
* 5-1-1 Move scheduled tasks that are due into taskQueue for execution
*/
private void fetchFromScheduledTaskQueue() {
if (hasScheduledTasks()) {
long nanoTime = AbstractScheduledEventExecutor.nanoTime();
for (;;) {
Runnable scheduledTask = pollScheduledTask(nanoTime);
if (scheduledTask == null) {
break;
}
taskQueue.add(scheduledTask);
}
}
}
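The hand-off from the scheduled queue to the ordinary task queue can be reproduced with JDK collections. The sketch below is a minimal model under assumptions: `ScheduledFetchDemo` and `Scheduled` are hypothetical names, a `PriorityQueue` ordered by deadline stands in for the scheduled task queue, and the caller supplies the current time so the behaviour is deterministic.

```java
import java.util.ArrayDeque;
import java.util.PriorityQueue;
import java.util.Queue;

final class ScheduledFetchDemo {

    // A task with an absolute deadline; ordered by deadline, earliest first.
    static final class Scheduled implements Comparable<Scheduled> {
        final long deadlineNanos;
        final Runnable task;

        Scheduled(long deadlineNanos, Runnable task) {
            this.deadlineNanos = deadlineNanos;
            this.task = task;
        }

        @Override
        public int compareTo(Scheduled other) {
            return Long.compare(deadlineNanos, other.deadlineNanos);
        }
    }

    private final PriorityQueue<Scheduled> scheduledTaskQueue = new PriorityQueue<>();
    private final Queue<Runnable> taskQueue = new ArrayDeque<>();

    void schedule(long deadlineNanos, Runnable task) {
        scheduledTaskQueue.add(new Scheduled(deadlineNanos, task));
    }

    // Moves every task whose deadline is not after nowNanos into taskQueue; returns how many moved.
    int fetchFromScheduledTaskQueue(long nowNanos) {
        int moved = 0;
        for (;;) {
            Scheduled head = scheduledTaskQueue.peek();
            if (head == null || head.deadlineNanos > nowNanos) {
                break; // nothing left, or the earliest task is not due yet
            }
            scheduledTaskQueue.poll();
            taskQueue.add(head.task);
            moved++;
        }
        return moved;
    }

    // Runs everything currently in taskQueue; returns how many tasks ran.
    int runAllTasks() {
        int ran = 0;
        Runnable task;
        while ((task = taskQueue.poll()) != null) {
            task.run();
            ran++;
        }
        return ran;
    }

    public static void main(String[] args) {
        ScheduledFetchDemo demo = new ScheduledFetchDemo();
        demo.schedule(10L, () -> System.out.println("due at 10"));
        demo.schedule(30L, () -> System.out.println("due at 30"));
        demo.fetchFromScheduledTaskQueue(25L); // only the first task is due
        demo.runAllTasks();
    }
}
```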
/**
* 6-1 Close all channels and release threads and other resources
*/
private void closeAll() {
selectAgain();
Set<SelectionKey> keys = selector.keys();
Collection<AbstractNioChannel> channels = new ArrayList<AbstractNioChannel>(keys.size());
for (SelectionKey k: keys) {
Object a = k.attachment();
if (a instanceof AbstractNioChannel) {
channels.add((AbstractNioChannel) a);
} else {
k.cancel();
@SuppressWarnings("unchecked")
NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
invokeChannelUnregistered(task, k, null);
}
}
for (AbstractNioChannel ch: channels) {
ch.unsafe().close(ch.unsafe().voidPromise());
}
}
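Netty's threading model and its Reactor design are excellent, and they directly determine the software's performance and its capacity for concurrency. Keep learning, thinking, revisiting, and summarizing. To be continued.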