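Reading and Analyzing the Netty 4 Source: Server Startup

This article is based on version 4.1.24.Final. We first write a small test example and then read the source step by step while debugging through it.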
EventLoopGroup bossGroup=new NioEventLoopGroup(1);
        EventLoopGroup workerGroup=new NioEventLoopGroup();
        try {
            ServerBootstrap b=new ServerBootstrap();
            b.group(bossGroup, workerGroup)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        public void initChannel(SocketChannel ch) throws Exception {
                            ch.pipeline().addLast(new EchoServerHandler());
                        };
                    })
                    .option(ChannelOption.SO_BACKLOG, 128)
                    .childOption(ChannelOption.SO_KEEPALIVE, true);
            ChannelFuture f=b.bind(port).sync();
            f.channel().closeFuture().sync();
        }finally{
            bossGroup.shutdownGracefully();
            workerGroup.shutdownGracefully();
        }

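Let's start with the NioEventLoopGroup class. Its class hierarchy is:

[Figure: NioEventLoopGroup class hierarchy]

Its constructor eventually calls into MultithreadEventExecutorGroup: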

protected MultithreadEventExecutorGroup(int nThreads, Executor executor,
                                            EventExecutorChooserFactory chooserFactory, Object... args) {
        children = new EventExecutor[nThreads];

        for (int i = 0; i < nThreads; i ++) {
            boolean success = false;
            try {
                children[i] = newChild(executor, args);
                success = true;
            } catch (Exception e) {
               ......
            } finally {
                ... omitted
            }
        }
        chooser = chooserFactory.newChooser(children);
        ... omitted
    }

When no thread count is given, the default is the number of available processors * 2. The interesting part is the newChild method:

protected EventLoop newChild(Executor executor, Object... args) throws Exception {
        return new NioEventLoop(this, executor, (SelectorProvider) args[0],
            ((SelectStrategyFactory) args[1]).newSelectStrategy(), (RejectedExecutionHandler) args[2]);
    }
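
To make the children array and the chooser more concrete, here is a minimal JDK-only sketch. It is not Netty's code: the class name RoundRobinChooserSketch and the use of plain strings in place of EventExecutor instances are assumptions made for illustration. It only shows the idea of a fixed-size array plus a round-robin index, with a bit mask when the array length is a power of two.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not Netty's classes: a fixed array of "loops" plus a round-robin chooser.
final class RoundRobinChooserSketch {
    private final String[] children;                    // stands in for EventExecutor[] children
    private final AtomicInteger idx = new AtomicInteger();

    RoundRobinChooserSketch(int nThreads) {
        if (nThreads <= 0) {
            // Assumed default, mirroring "available processors * 2" from the text.
            nThreads = Math.max(1, Runtime.getRuntime().availableProcessors() * 2);
        }
        children = new String[nThreads];
        for (int i = 0; i < nThreads; i++) {
            children[i] = "loop-" + i;
        }
    }

    static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }

    // Picks the next child in round-robin order.
    String next() {
        int i = idx.getAndIncrement();
        if (isPowerOfTwo(children.length)) {
            return children[i & (children.length - 1)];   // cheap mask instead of modulo
        }
        return children[Math.abs(i % children.length)];
    }

    int size() {
        return children.length;
    }

    public static void main(String[] args) {
        RoundRobinChooserSketch boss = new RoundRobinChooserSketch(1);
        RoundRobinChooserSketch workers = new RoundRobinChooserSketch(3);
        System.out.println(boss.next() + " " + boss.next());
        System.out.println(workers.next() + " " + workers.next() + " " + workers.next() + " " + workers.next());
    }
}
```

With a single child, as in the boss group of the example, every call to next() returns the same element.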

It simply constructs a NioEventLoop instance. Let's look at that class:

[Figure: NioEventLoop class hierarchy]


A NioEventLoop can be thought of as a single thread that keeps taking tasks from its queue and running them. Here is its run method:

protected void run() {
        for (;;) {
            try {
                switch (selectStrategy.calculateStrategy(selectNowSupplier, hasTasks())) {
                    case SelectStrategy.CONTINUE:
                        continue;
                    case SelectStrategy.SELECT:
                        select(wakenUp.getAndSet(false));
                        if (wakenUp.get()) {
                            selector.wakeup();
                        }
                    default:
                }

                cancelledKeys = 0;
                needsToSelectAgain = false;
                final int ioRatio = this.ioRatio;
                if (ioRatio == 100) {
                    try {
                        processSelectedKeys();
                    } finally {
                        runAllTasks();
                    }
                } else {
                    final long ioStartTime = System.nanoTime();
                    try {
                        processSelectedKeys();
                    } finally {
                        final long ioTime = System.nanoTime() - ioStartTime;
                        runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
                    }
                }
            } catch (Throwable t) {
                handleLoopException(t);
            }
            ........
        }
    }
This is an endless for loop. First look at calculateStrategy:
 public int calculateStrategy(IntSupplier selectSupplier, boolean hasTasks) throws Exception {
        return hasTasks ? selectSupplier.get() : SelectStrategy.SELECT;
    }
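If the queue has no tasks, it returns the SELECT strategy, which is the case we focus on here. (If there are tasks, it returns the result of a non-blocking selectNow instead.) Next, the select method runs: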
private void select(boolean oldWakenUp) throws IOException {
        Selector selector = this.selector;
        try {
            int selectCnt = 0;
            long currentTimeNanos = System.nanoTime();
            long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
            for (;;) {
                long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
                if (timeoutMillis <= 0) {
                    if (selectCnt == 0) {
                        selector.selectNow();
                        selectCnt = 1;
                    }
                    break;
                }
                if (hasTasks() && wakenUp.compareAndSet(false, true)) {
                    selector.selectNow();
                    selectCnt = 1;
                    break;
                }
                int selectedKeys = selector.select(timeoutMillis);
                selectCnt ++;
                if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
                    break;
                }
                if (Thread.interrupted()) {
                    selectCnt = 1;
                    break;
                }
                long time = System.nanoTime();
                if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
                    selectCnt = 1;
                } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                        selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
                    rebuildSelector();
                    selector = this.selector;
                    selector.selectNow();
                    selectCnt = 1;
                    break;
                }

                currentTimeNanos = time;
            }

            if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Selector.select() returned prematurely {} times in a row for Selector {}.",
                            selectCnt - 1, selector);
                }
            }
        } catch (CancelledKeyException e) {
           ........
        }
    }
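
The select method above leans on three JDK Selector calls: selectNow (non-blocking), select(timeout) (blocking) and wakeup. The following JDK-only sketch shows how they behave in isolation. The class and method names are made up for this illustration and nothing here is Netty code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;

// Hypothetical demo of the JDK Selector calls used by select(); names are illustrative only.
final class SelectorWakeupSketch {

    // selectNow never blocks; with no registered channels it reports zero ready keys.
    static int selectNowOnEmptySelector() {
        try (Selector selector = Selector.open()) {
            return selector.selectNow();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A wakeup() issued before select(timeout) makes that select return without waiting.
    // Returns the elapsed time of the select call in milliseconds.
    static long selectAfterWakeupMillis(long timeoutMillis) {
        try (Selector selector = Selector.open()) {
            selector.wakeup();
            long start = System.nanoTime();
            selector.select(timeoutMillis);
            return (System.nanoTime() - start) / 1_000_000L;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("ready keys: " + selectNowOnEmptySelector());
        System.out.println("select after wakeup took ms: " + selectAfterWakeupMillis(5000));
    }
}
```

This is why a thread that submits a task only needs to call wakeup once: the loop thread's pending or next select returns promptly and the task queue gets drained.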
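First, selectDeadLineNanos = current time + 1s (the delay is one second when no scheduled task is pending). The method then checks whether that deadline has already been reached. If it has, it calls selectNow directly to see whether any events of interest have arrived; selectNow is non-blocking. If the deadline has not been reached, it continues: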
// If a task was submitted when wakenUp value was true, the task didn't get a chance to call
                // Selector#wakeup. So we need to check task queue again before executing select operation.
                // If we don't, the task might be pended until select operation was timed out.
                // It might be pended until idle timeout if IdleStateHandler existed in pipeline.
                if (hasTasks() && wakenUp.compareAndSet(false, true)) {
                    selector.selectNow();
                    selectCnt = 1;
                    break;
                }
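This covers one specific case: if a task was submitted while wakenUp was already true, the submitting thread had no chance to wake up the selector, so the task queue has to be checked again before blocking. After that, the thread blocks in select for up to timeoutMillis, waiting for events of interest. If an event arrives, a task is added to the queue, or one of the other conditions holds, it breaks out of the loop and returns. The remaining code works around the JDK NIO epoll bug, which we cover in detail in a separate article.

After select returns, execution continues in run. We use the default ioRatio of 50, which means the time spent handling events of interest and the time spent running queued tasks are split roughly half and half.

Back in the main thread: an EventLoopGroup manages these EventLoops. So the first two lines of the example mean that one thread does the boss work and processor_size*2 threads do the worker work; think of it as 1 boss and N workers cooperating. In the rest of the article we mainly want to find out what the boss and the workers each do, and why they are separated in the first place.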
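
Two pieces of arithmetic from the listings above are easy to misread, so here they are as a small standalone sketch: the task time budget derived from ioRatio, and the rounding of the select timeout to milliseconds. The formulas are copied from the run and select listings; the class and method names are hypothetical.

```java
// Hypothetical helper that isolates two formulas from the run()/select() listings.
final class IoRatioSketch {

    // Time budget for runAllTasks, given how long I/O processing took.
    // Only meaningful for 0 < ioRatio < 100; ioRatio == 100 takes a separate branch in run().
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio <= 0 || ioRatio >= 100) {
            throw new IllegalArgumentException("ioRatio: " + ioRatio);
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }

    // Select timeout in milliseconds; adding 0.5 ms before the integer division
    // rounds to the nearest millisecond instead of truncating.
    static long timeoutMillis(long selectDeadLineNanos, long currentTimeNanos) {
        return (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
    }

    public static void main(String[] args) {
        System.out.println(taskBudgetNanos(2_000_000L, 50));   // same time for tasks as for I/O
        System.out.println(timeoutMillis(1_000_000_000L, 0L)); // one-second deadline
    }
}
```

With ioRatio = 50 the budget equals the I/O time; with ioRatio = 75, tasks get one third of the I/O time; with ioRatio = 20, tasks get four times the I/O time.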

Now look at ServerBootstrap. We use its default constructor, so we move on to the next line of the example and see what the group method does:

public ServerBootstrap group(EventLoopGroup parentGroup, EventLoopGroup childGroup) {
        super.group(parentGroup);
        if (childGroup == null) {
            throw new NullPointerException("childGroup");
        }
        if (this.childGroup != null) {
            throw new IllegalStateException("childGroup set already");
        }
        this.childGroup = childGroup;
        return this;
    }

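The parent class holds bossGroup, and childGroup in the subclass ServerBootstrap is our workerGroup. The method returns this, which is the builder pattern at work; it really pays off when there are many optional parameters.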
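
As a rough illustration of that builder-style chaining, here is a JDK-only sketch. FluentBootstrapSketch and its fields are invented for this example; only the null check and the "set already" check mirror the group method shown above.

```java
// Hypothetical fluent configuration object; not Netty's ServerBootstrap.
final class FluentBootstrapSketch {
    private String parentGroup;
    private String childGroup;
    private int backlog = -1;   // -1 means "not set" in this sketch

    FluentBootstrapSketch group(String parentGroup, String childGroup) {
        if (childGroup == null) {
            throw new NullPointerException("childGroup");
        }
        if (this.childGroup != null) {
            throw new IllegalStateException("childGroup set already");
        }
        this.parentGroup = parentGroup;
        this.childGroup = childGroup;
        return this;            // returning this is what enables chaining
    }

    FluentBootstrapSketch backlog(int backlog) {
        this.backlog = backlog;
        return this;
    }

    String describe() {
        return parentGroup + "/" + childGroup + "/backlog=" + backlog;
    }

    public static void main(String[] args) {
        System.out.println(new FluentBootstrapSketch().group("boss", "worker").backlog(128).describe());
    }
}
```

Optional settings can simply be left out of the chain, which is the main convenience over a constructor with many parameters.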

Next, the channel method wraps the NioServerSocketChannel class in a ChannelFactory, which is used later to construct the NioServerSocketChannel instance via reflection.
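
The idea of "store the class now, instantiate it reflectively later" can be sketched with plain JDK reflection. ReflectiveFactorySketch is a made-up name for this illustration, and it assumes the target class has an accessible no-argument constructor.

```java
import java.lang.reflect.Constructor;

// Hypothetical reflective factory: keeps a no-arg constructor and invokes it on demand.
final class ReflectiveFactorySketch<T> {
    private final Constructor<? extends T> constructor;

    ReflectiveFactorySketch(Class<? extends T> clazz) {
        try {
            // Look the constructor up once, when the factory is configured.
            this.constructor = clazz.getDeclaredConstructor();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(clazz.getName() + " has no no-arg constructor", e);
        }
    }

    // Creates a fresh instance each time, much later than configuration time.
    T newInstance() {
        try {
            return constructor.newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unable to create instance", e);
        }
    }

    public static void main(String[] args) {
        ReflectiveFactorySketch<CharSequence> factory = new ReflectiveFactorySketch<>(StringBuilder.class);
        System.out.println(factory.newInstance().getClass().getName());
    }
}
```

The caller only supplies a Class object up front; the actual instance is created when the bootstrap needs it, which is the same shape as channel(NioServerSocketChannel.class) followed by a later newChannel call.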

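Then the childHandler of ServerBootstrap is set. We label it ChannelHandler-ChannelInitializer-1.

Next, ChannelOption.SO_BACKLOG is applied to the server socket. It is the maximum length of the queue that temporarily holds connections which have completed the three-way handshake but have not yet been accepted by the server.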

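ChannelOption.SO_KEEPALIVE controls whether TCP keepalive is enabled. The mechanism only kicks in after both ends have established the connection (both are in the ESTABLISHED state) and the upper layer has sent no data for about two hours.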
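
For orientation, this is roughly what the two options correspond to at the JDK level. It is a sketch under assumptions, not a trace of Netty's code: the class name is invented, the server binds to an ephemeral loopback port, and the keepalive option is set on an unconnected client channel just to show the call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.StandardSocketOptions;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical JDK-only illustration of a listen backlog and of SO_KEEPALIVE.
final class SocketOptionsSketch {

    // The backlog is passed as the second argument of bind on the listening channel.
    static boolean bindWithBacklog(int backlog) {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0), backlog);
            return server.getLocalAddress() != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // SO_KEEPALIVE is a per-connection option, which is why the example sets it via childOption.
    static boolean keepAliveEnabled() {
        try (SocketChannel channel = SocketChannel.open()) {
            channel.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
            return channel.getOption(StandardSocketOptions.SO_KEEPALIVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bound with backlog 128: " + bindWithBacklog(128));
        System.out.println("keepalive enabled: " + keepAliveEnabled());
    }
}
```

The split between option and childOption in the example follows the same line: the backlog belongs to the listening socket, keepalive to each accepted connection.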
Next, let's look at the bind method:

private ChannelFuture doBind(final SocketAddress localAddress) {
        final ChannelFuture regFuture = initAndRegister();//1
        final Channel channel = regFuture.channel();
        if (regFuture.cause() != null) {
            return regFuture;
        }

        if (regFuture.isDone()) {
            // At this point we know that the registration was complete and successful.
            ChannelPromise promise = channel.newPromise();
            doBind0(regFuture, channel, localAddress, promise);
            return promise;
        } else {
            // Registration future is almost always fulfilled already, but just in case it's not.
            final PendingRegistrationPromise promise = new PendingRegistrationPromise(channel);
            regFuture.addListener(new ChannelFutureListener() {
                @Override
                public void operationComplete(ChannelFuture future) throws Exception {
                    Throwable cause = future.cause();
                    if (cause != null) {
                        // Registration on the EventLoop failed so fail the ChannelPromise directly to not cause an
                        // IllegalStateException once we try to access the EventLoop of the Channel.
                        promise.setFailure(cause);
                    } else {
                        // Registration was successful, so set the correct executor to use.
                        // See https://github.com/netty/netty/issues/2586
                        promise.registered();

                        doBind0(regFuture, channel, localAddress, promise);
                    }
                }
            });
            return promise;
        }
    }
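
The shape of doBind (bind immediately if registration has already finished, otherwise bind from a completion listener) can be imitated with CompletableFuture. This is only an analogy: the class name, the String result and the use of CompletableFuture in place of ChannelFuture and ChannelPromise are all assumptions of this sketch.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical analogy for doBind: bind right away or once registration completes.
final class BindAfterRegisterSketch {

    static CompletableFuture<String> doBind(CompletableFuture<Void> regFuture, String address) {
        CompletableFuture<String> promise = new CompletableFuture<>();
        if (regFuture.isDone() && !regFuture.isCompletedExceptionally()) {
            // Fast path: registration already succeeded.
            promise.complete("bound " + address);
        } else {
            // Slow path: wait for registration, then bind or propagate the failure.
            regFuture.whenComplete((ignored, cause) -> {
                if (cause != null) {
                    promise.completeExceptionally(cause);
                } else {
                    promise.complete("bound " + address);
                }
            });
        }
        return promise;
    }

    public static void main(String[] args) {
        CompletableFuture<Void> registration = new CompletableFuture<>();
        CompletableFuture<String> bound = doBind(registration, ":8080");
        System.out.println("done before registration: " + bound.isDone());
        registration.complete(null);
        System.out.println(bound.join());
    }
}
```

In both branches the caller gets a future back immediately, which is why the example calls sync() on the result of bind.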

First look at initAndRegister, with part of the code omitted:

final ChannelFuture initAndRegister() {
        Channel channel = null;
        channel = channelFactory.newChannel();
        init(channel);
        ....
        ChannelFuture regFuture = config().group().register(channel);
        ....
}

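The first step creates a NioServerSocketChannel instance through channelFactory. Here is the class hierarchy of that class:

[Figure: NioServerSocketChannel class hierarchy]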


Now its constructor:

public NioServerSocketChannel() {
        this(newSocket(DEFAULT_SELECTOR_PROVIDER));
    }
Here newSocket asks the JDK to create a ServerSocketChannel instance. Continuing:
public NioServerSocketChannel(ServerSocketChannel channel) {
        super(null, channel, SelectionKey.OP_ACCEPT); // the event of interest is accept
        config = new NioServerSocketChannelConfig(this, javaChannel().socket());
    }
The super call eventually reaches AbstractChannel:
protected AbstractChannel(Channel parent) {
        this.parent = parent;
        id = newId();
        unsafe = newUnsafe();
        pipeline = newChannelPipeline();
    }

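At this point parent == null. First look at newUnsafe: it constructs a NioMessageUnsafe instance, which mainly provides the read method used to handle read events, as we will see later.

Next is pipeline, an instance of DefaultChannelPipeline. It maintains a doubly linked list whose elements are of type AbstractChannelHandlerContext and which handles inbound and outbound channel events: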

protected DefaultChannelPipeline(Channel channel) {
        this.channel = ObjectUtil.checkNotNull(channel, "channel");
        succeededFuture = new SucceededChannelFuture(channel, null);
        voidPromise =  new VoidChannelPromise(channel, true);
        tail = new TailContext(this);
        head = new HeadContext(this);
        head.next = tail;
        tail.prev = head;
    }
final class TailContext extends AbstractChannelHandlerContext implements ChannelInboundHandler {
        TailContext(DefaultChannelPipeline pipeline) {
            super(pipeline, null, TAIL_NAME, true, false);
            setAddComplete();
        }
}
final class HeadContext extends AbstractChannelHandlerContext
            implements ChannelOutboundHandler, ChannelInboundHandler {
        private final Unsafe unsafe;
        HeadContext(DefaultChannelPipeline pipeline) {
            super(pipeline, null, HEAD_NAME, false, true);
            unsafe = pipeline.channel().unsafe();
            setAddComplete();
        }
}

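TailContext is an inbound handler, and HeadContext is flagged as an outbound handler (although, as its declaration above shows, it also implements ChannelInboundHandler). So at this point the ChannelHandlerContext list maintained by the pipeline is:

HeadContext <-> TailContext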
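
A doubly linked list with fixed head and tail sentinels is simple to sketch. The class below is hypothetical and stores only names, not handlers; it shows why addLast always inserts just before tail and how a context can be unlinked again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical miniature of a pipeline: a doubly linked list with head/tail sentinels.
final class PipelineSketch {
    private static final class Ctx {
        final String name;
        Ctx prev;
        Ctx next;

        Ctx(String name) {
            this.name = name;
        }
    }

    private final Ctx head = new Ctx("head");
    private final Ctx tail = new Ctx("tail");

    PipelineSketch() {
        head.next = tail;
        tail.prev = head;
    }

    // "Last" means last before the tail sentinel, which always stays at the end.
    PipelineSketch addLast(String name) {
        Ctx ctx = new Ctx(name);
        Ctx prev = tail.prev;
        ctx.prev = prev;
        ctx.next = tail;
        prev.next = ctx;
        tail.prev = ctx;
        return this;
    }

    // Unlinks the first context with the given name; sentinels are never removed.
    boolean remove(String name) {
        for (Ctx c = head.next; c != tail; c = c.next) {
            if (c.name.equals(name)) {
                c.prev.next = c.next;
                c.next.prev = c.prev;
                return true;
            }
        }
        return false;
    }

    // Names from head to tail, sentinels included.
    List<String> names() {
        List<String> result = new ArrayList<>();
        for (Ctx c = head; c != null; c = c.next) {
            result.add(c.name);
        }
        return result;
    }

    public static void main(String[] args) {
        PipelineSketch p = new PipelineSketch();
        p.addLast("initializer");
        System.out.println(p.names());
        p.remove("initializer");
        p.addLast("acceptor");
        System.out.println(p.names());
    }
}
```

Because the sentinels never move, insertion and removal need no special cases for an empty list.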


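Back in the NioServerSocketChannel constructor, the next step constructs a NioServerSocketChannelConfig instance. Here is that class's hierarchy:

[Figure: NioServerSocketChannelConfig class hierarchy]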


Continuing, back in initAndRegister, the next call is init:

void init(Channel channel) throws Exception {
        .........
        ChannelPipeline p = channel.pipeline();
        final EventLoopGroup currentChildGroup = childGroup; // i.e. workerGroup
        final ChannelHandler currentChildHandler = childHandler; // i.e. ChannelHandler-ChannelInitializer-1
        .........
        p.addLast(new ChannelInitializer<Channel>() { // we label this one ChannelHandler-ChannelInitializer-2
            @Override
            public void initChannel(final Channel ch) throws Exception {
                final ChannelPipeline pipeline = ch.pipeline();
                ChannelHandler handler = config.handler();
                if (handler != null) {
                    pipeline.addLast(handler);
                }

                ch.eventLoop().execute(new Runnable() {
                    @Override
                    public void run() {
                        pipeline.addLast(new ServerBootstrapAcceptor(
                                ch, currentChildGroup, currentChildHandler, currentChildOptions, currentChildAttrs));
                    }
                });
            }
        });
    }
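This adds a new DefaultChannelHandlerContext to the pipeline. After the addition, the list maintained by the pipeline is:

HeadContext <-> DefaultChannelHandlerContext (ChannelHandler-ChannelInitializer-2) <-> TailContext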

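The labels in parentheses in the diagram above are mine, added to tell apart the anonymous class instances created along the way. Back in initAndRegister, look at this line: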

ChannelFuture regFuture = config().group().register(channel);
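config().group() returns bossGroup, and channel is the NioServerSocketChannel instance created earlier. Now look at the register method. Note the next method here: it obtains an EventExecutor (in fact a NioEventLoop instance) through chooser.next(). Remember the array children = new EventExecutor[nThreads] from above? next picks a NioEventLoop from it using index & (children.length - 1); here there is only one boss thread. Continuing, we eventually reach Unsafe.register: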
public final void register(EventLoop eventLoop, final ChannelPromise promise) {
           .......
            AbstractChannel.this.eventLoop = eventLoop; // the NioEventLoop that belongs to the boss

            if (eventLoop.inEventLoop()) { // initially the loop thread has not been started yet
                register0(promise);
            } else {
                try {
                    eventLoop.execute(new Runnable() {
                        @Override
                        public void run() {
                            register0(promise);
                        }
                    });
                } catch (Throwable t) {
                    ......
                }
            }
        }
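
The inEventLoop check in register boils down to "run directly if I am already on the loop thread, otherwise enqueue". Here is a JDK-only sketch of that hand-off. SingleThreadLoopSketch, the thread name boss-loop and the use of CompletableFuture as the promise are assumptions made for illustration; Netty's own executor is more involved.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical single-threaded "event loop" built on a JDK executor.
final class SingleThreadLoopSketch {
    private volatile Thread loopThread;

    // The worker thread is created lazily, on the first submitted task.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "boss-loop");
        t.setDaemon(true);
        loopThread = t;
        return t;
    });

    boolean inEventLoop() {
        return Thread.currentThread() == loopThread;
    }

    // Completes the returned promise with the name of the thread that ran register0.
    CompletableFuture<String> register() {
        CompletableFuture<String> promise = new CompletableFuture<>();
        if (inEventLoop()) {
            register0(promise);                          // already on the loop thread
        } else {
            executor.execute(() -> register0(promise));  // hand off and return at once
        }
        return promise;
    }

    private void register0(CompletableFuture<String> promise) {
        promise.complete(Thread.currentThread().getName());
    }

    void shutdown() {
        executor.shutdown();
    }

    public static void main(String[] args) {
        SingleThreadLoopSketch loop = new SingleThreadLoopSketch();
        System.out.println("registered on: " + loop.register().join());
        loop.shutdown();
    }
}
```

Since all state changes for a channel are funneled through one thread this way, register0 itself needs no locking.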
As you can see, the main thread wraps register into a task, hands it to the boss thread, and then returns; the process is asynchronous. Now let's see what register0 does:
private void register0(ChannelPromise promise) {
            try {
                boolean firstRegistration = neverRegistered;
                doRegister();
                neverRegistered = false;
                registered = true;
                pipeline.invokeHandlerAddedIfNeeded();
                pipeline.fireChannelRegistered();
                if (isActive()) {
                    if (firstRegistration) {
                        pipeline.fireChannelActive();
                    } else if (config().isAutoRead()) {
                        beginRead();
                    }
                }
            } catch (Throwable t) {
                .....
            }
        }

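doRegister performs the JDK-level registration of the ServerSocketChannel, that is, it associates the current NioServerSocketChannel instance with the selector. Next, pipeline.invokeHandlerAddedIfNeeded eventually leads to a task being added to the boss thread's queue, a task that adds a new context to the pipeline, so this step is asynchronous as well. It gets there through the following call: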

public void handlerAdded(ChannelHandlerContext ctx) throws Exception {
        if (ctx.channel().isRegistered()) {
            initChannel(ctx);
        }
    }

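This initChannel call invokes the initChannel method of the handler labeled "ChannelHandler-ChannelInitializer-2". That method submits a task to the boss thread which will add a ServerBootstrapAcceptor context (also an inbound handler) to the pipeline, and then "ChannelHandler-ChannelInitializer-2" is removed from the pipeline. The ServerBootstrapAcceptor task is still in the queue and has not been added to the pipeline yet, so the list maintained by the pipeline is now:

HeadContext <-> TailContext

Next is pipeline.fireChannelRegistered(), which eventually calls invokeChannelRegistered(findContextInbound()). At this point the only inbound handler in the pipeline is TailContext, so its channelRegistered method is called, and it does nothing. Next comes bind, mostly inside doBind0, which eventually calls unsafe.bind: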

public final void bind(final SocketAddress localAddress, final ChannelPromise promise) {

            boolean wasActive = isActive();
            try {
                doBind(localAddress);
            } catch (Throwable t) {
               ......
            }
            if (!wasActive && isActive()) {
                invokeLater(new Runnable() {
                    @Override
                    public void run() {
                        pipeline.fireChannelActive();
                    }
                });
            }
		.....
        }
doBind calls the bind method provided by the JDK. After that, the channel-active notification is submitted to the thread as a task. That task ends up in HeadContext.channelActive:
public void channelActive(ChannelHandlerContext ctx) throws Exception {
            ctx.fireChannelActive();

            readIfIsAutoRead();
        }

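Note that by now the queued task has run, so the list maintained by the pipeline is:

HeadContext <-> ServerBootstrapAcceptor <-> TailContext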

ctx.fireChannelActive finds the first inbound handler (going left to right) and invokes its channelActive; in practice nothing happens here for now. The next line, readIfIsAutoRead(), ends up in AbstractNioChannel.doBeginRead:

protected void doBeginRead() throws Exception {
        final SelectionKey selectionKey = this.selectionKey;
        if (!selectionKey.isValid()) {
            return;
        }
        readPending = true;
        final int interestOps = selectionKey.interestOps(); // 0 at this point
        if ((interestOps & readInterestOp) == 0) {
            selectionKey.interestOps(interestOps | readInterestOp); // readInterestOp is OP_ACCEPT
        }
    }

In other words, this registers interest in the accept event and waits for connections: the open NioServerSocketChannel is interested in accept.
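
The two-step pattern (register with no interest ops, then switch on OP_ACCEPT later) can be reproduced with the JDK alone. The sketch below uses an invented class name and an unbound server channel, which is enough to show the bit manipulation on the SelectionKey.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical demo: register with interest ops 0, then enable OP_ACCEPT afterwards.
final class InterestOpsSketch {

    // Returns { interestOps before, interestOps after }.
    static int[] registerThenEnableAccept() {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false);                 // required before register
            SelectionKey key = server.register(selector, 0); // associated, but interested in nothing
            int before = key.interestOps();
            int readInterestOp = SelectionKey.OP_ACCEPT;
            if ((before & readInterestOp) == 0) {
                key.interestOps(before | readInterestOp);    // same OR as in doBeginRead
            }
            return new int[] { before, key.interestOps() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] ops = registerThenEnableAccept();
        System.out.println("before=" + ops[0] + " after=" + ops[1]);
    }
}
```

Separating the two steps lets the channel be attached to the selector early while delaying event delivery until the bind has completed and the channel is active.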

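At this point the Netty server has finished starting up. To summarize:

1. During startup, most of the work is done by the boss thread: initialization, registration and so on are wrapped as tasks and put on the thread's queue. The thread does two things: it handles events of interest, and it runs the tasks in its queue.

2. Netty's NIO transport binds the NioServerSocketChannel to a selector, registers the events of interest on the channel, and then keeps polling in the boss thread for those events.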




Reposted from blog.csdn.net/chengzhang1989/article/details/80327662