Cloud+ Community Technology Salon: Unpacking the Technology Stacks Behind Tencent's Latest Open Source Projects

On December 21, a technology salon jointly organized by the Tencent Cloud+ Community and Tencent's open source management office was successfully held at the Tencent Building in Shenzhen. The theme of the event was "Tencent open source technology". Open source experts and engineers from Tencent walked through the development of open source projects such as Tencent Kona JDK, TencentOS tiny, and TubeMQ, shared Tencent's latest open source achievements and the practical experience accumulated along the way, and explored in depth how open source technology is developing in scenarios such as big data, the Internet of Things (IoT), and healthcare.

Yang Xiaofeng: "The practice and development of Kona JDK in Tencent's big data field"

Yang Xiaofeng, a Tencent expert engineer and leader of the TEG JDK team, briefly introduced the origins of the Kona JDK project, analyzed current hot spots in OpenJDK technology and the state and trends of the field in China, and shared the pain points, practical experience, and future plans for Kona JDK in Tencent's big data field.
OpenJDK is the free and open source reference implementation of the Java SE standard. In 2006, Sun committed to gradually open-sourcing the core Java platform. In 2007, Red Hat joined the project and released IcedTea. In 2010, Oracle acquired Sun and took over leadership of the project, and vendors such as IBM and SAP joined as well. In 2014, JDK 8 was released and became the version with the fastest and widest adoption. In 2017, JDK 9 was released, after which a six-month release cadence and a new pricing model were established. In 2019, Tencent Kona was announced as open source.
Tencent Kona has several characteristics. First, it is free, with zero cost. Second, Tencent provides reliable long-term support. Third, it is production-ready, having been tested in Tencent's ultra-large-scale internal production environments. Going forward, Tencent will uphold the principles of "maximizing compatibility" and "gradually contributing advanced features in big data, cloud computing, and other areas", actively embrace open source, and keep contributing to the community.
In terms of current JDK usage in China, Oracle JDK still accounts for about 70%, while OpenJDK accounts for 21% but is rising fast. By version, JDK 7/8 is still the mainstream, but it is worth noting that JDK 11 already sees a certain amount of production use, and domestic vendors are going deeper, with growing confidence, in adopting new technology and innovating on the JVM.

In the big data field, Java and the JVM are the uncrowned king today, mainly because of their high productivity, high performance, and high reliability: the JVM provides excellent cross-platform capabilities, mature tooling, and a vast number of libraries and frameworks.

However, in Tencent's scenarios, with massive data volumes and demanding technical requirements, the JVM's current weaknesses have become a bottleneck in some leading-edge cases, as seen in cluster size, SLA, memory density, and other aspects. The classic GCs are a poor match for certain scenarios, and diagnostic and tuning facilities still fall well short.
At the same time, hardware is advancing rapidly. The JVM can harness today's computing power, but it still needs many improvements to use technologies such as vectorization more efficiently and to support sustained performance gains in the future.
Tencent will keep working at more technical levels, from JVM optimization to improved tooling, to build the best Java runtime environment and solutions for this field.

Ye Feng: "Quickly building an IoT mini program from scratch: a practice based on the open source project TencentOS tiny"

Tencent engineer Ye Feng covered the background of the TencentOS tiny project, its software architecture, its IoT solutions, and development practice.

TencentOS tiny is a lightweight real-time operating system that Tencent has open-sourced for the IoT field. It is a key link at the bottom layer of Tencent's IoT product matrix: it channels massive device-side data to the cloud platform, lowers the development threshold, and improves development efficiency, so that IoT terminal devices and services can connect quickly to Tencent Cloud's IoT platform.

Architecturally, TencentOS tiny has been adapted to mainstream chips and modules, provides a highly streamlined RTOS kernel and a rich set of IoT components, and integrates mainstream IoT protocols. With its small footprint, low power consumption, rich IoT components, reliable security framework, good portability, and convenient debugging tools, it meets the differing demands of IoT applications.

TencentOS tiny was officially open-sourced on September 18 this year. Within a week of release it ranked second on GitHub's trending list, and it has since received 3,500+ Stars and 800+ Forks. The project now collaborates with mainstream MCU and hardware vendors in China and abroad, and the number of supported hardware platforms has exceeded 50.
TencentOS tiny already has several IoT solutions in production, such as a smart vending cabinet solution and a smart planting solution.

In the smart vending cabinet solution, TencentOS tiny works with the control system and the AI recognition service to automate the whole flow of scanning a code to open the cabinet, taking goods, and closing the door for automatic billing, creating a self-service scenario. For uncontrollable real-world conditions, such as items blocking one another and lowering the AI recognition rate, TencentOS tiny provides additional sensing, such as weight (gravity) sensing, to assist AI decision making and raise the recognition rate.
In the smart planting solution, TencentOS tiny serves two sides: the environment-sensing side and the regulation-and-control side. The sensing side collects greenhouse data such as temperature and humidity, soil pH, and oxygen, and reports it to the IoT cloud. A decision algorithm in the cloud issues adjustment instructions based on the environmental data, and the control side carries them out to regulate the greenhouse environment. The solution also uses multi-network adaptation, supporting WiFi / NB-IoT / LoRa, and encrypts the full link to keep data secure.
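The sense, decide, and adjust loop of the planting scenario can be sketched in a few lines of Python. This is only an illustration of the flow, not code from TencentOS tiny or Tencent Cloud: the `Reading` type, the `TARGETS` ranges, and the `decide` function are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One report from the sensing side (field names are hypothetical)."""
    temperature_c: float
    humidity_pct: float
    soil_ph: float


# Hypothetical target ranges; a real deployment would configure these per crop.
TARGETS = {
    "temperature_c": (18.0, 28.0),
    "humidity_pct": (50.0, 80.0),
    "soil_ph": (5.5, 7.0),
}


def decide(reading):
    """Cloud-side rule: emit one adjustment command per out-of-range metric."""
    commands = []
    for metric, (low, high) in TARGETS.items():
        value = getattr(reading, metric)
        if value < low:
            commands.append((metric, "raise"))
        elif value > high:
            commands.append((metric, "lower"))
    return commands


# The control side would receive these commands and drive the actuators.
commands = decide(Reading(temperature_c=30.0, humidity_pct=60.0, soil_ph=5.0))
```

A fixed-range rule is the simplest possible decision algorithm; the talk does not say which algorithm the cloud side actually uses.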

In addition, to help attendees better understand TencentOS tiny, an on-site session used a custom TencentOS tiny development board to complete an end-to-end development exercise for a small-scale farming scenario, covering environment sensing, device control, uploading data to the cloud, and mini program integration. Using TencentOS tiny simplifies device-side development, and combining it with Tencent Cloud's IoT platform and mini program cloud development makes it possible to launch and iterate IoT solutions quickly and at low cost.

Zhang Guocheng: "TubeMQ The Apache Way"

Zhang Guocheng, Tencent senior engineer and TubeMQ project leader, analyzed in detail the principles behind TubeMQ, a trillion-scale messaging middleware, and shared his thinking on questions such as the project's future direction.

TubeMQ is a trillion-scale distributed messaging middleware from Tencent. It focuses on storing and transmitting massive data and has distinctive advantages in stability, performance, and cost. In tests whose scenarios and plans were defined from our real big data workloads, TubeMQ sustained a throughput of 140,000 TPS with message latency kept within 10 milliseconds, and even below 5 milliseconds. The system has been iterated and improved inside the company for seven years and is widely used in real-time ad recommendation, massive data reporting, stream processing, metrics and monitoring, and other scenarios.
Key features of TubeMQ's system architecture include: a pure Java implementation; a Master HA coordinator node; weakened dependence on ZooKeeper (zk); decentralized offset management; server-side filtering; an improved message storage model; and a data reliability scheme based on RAID10 multi-copy disks plus fast consumption, rather than multiple replicas across Broker nodes.
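To make the idea of server-side filtering concrete, here is a minimal Python sketch of a toy broker that filters messages before returning them to a consumer. It is not TubeMQ's API or storage design: `ToyBroker`, `publish`, `fetch`, and the in-memory list are hypothetical, and passing the offset in from the caller is a simplification made only to keep the example short.

```python
class ToyBroker:
    """In-memory stand-in for a broker; illustrates the pattern only."""

    def __init__(self):
        self._log = []  # append-only list of (filter_key, payload)

    def publish(self, filter_key, payload):
        self._log.append((filter_key, payload))

    def fetch(self, offset, wanted_keys, max_messages=10):
        """Return (payloads, next_offset).

        Filtering runs here, on the broker side, so messages whose key the
        consumer did not ask for never cross the network.
        """
        payloads = []
        position = offset
        while position < len(self._log) and len(payloads) < max_messages:
            key, payload = self._log[position]
            if key in wanted_keys:
                payloads.append(payload)
            position += 1
        return payloads, position


broker = ToyBroker()
broker.publish("click", "c1")
broker.publish("view", "v1")
broker.publish("click", "c2")
clicks, next_offset = broker.fetch(0, {"click"})
```

The trade-off is that the broker spends some CPU on filtering in exchange for less network traffic and less work on consumers that only care about a subset of the data.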
Growing data volume was the underlying motivation for Tencent to develop TubeMQ in-house. When the volume is small, say 1 billion messages or fewer, any message queue will do. But once the data reaches tens of billions, hundreds of billions, or even trillions and beyond, many constraints appear, such as cost, stability, and performance. TubeMQ's average daily volume grew from 20 billion at the start of 2013 to 35 trillion in November 2019, and over this period Tencent's message queue practice moved from adopting an existing system, to improving it, to developing its own.

TubeMQ was announced as open source at ApacheCon in September this year and donated to the Apache Foundation. There were three main reasons for open-sourcing TubeMQ. First, it responds to the company's technology strategy of actively participating in open source collaboration. Second, we hope that open-sourcing TubeMQ can genuinely help partners who need it solve the many practical problems they meet in this area of their business. Third, embracing open source keeps us from working behind closed doors and strengthens our cooperation with the open source community and the outside world. We chose to contribute TubeMQ to Apache for two reasons: we hope the foundation's neutral, standardized incubation process will make the project more mature, and Apache is known for its big data ecosystem, from which we have benefited ourselves. In the spirit of giving back to open source what we gained from it, we donated the project to the Apache Foundation.

Going forward, TubeMQ will continue to focus on system stability, performance, and low-cost scaling. The plan is to rely on the community to build and further improve the project, to connect TubeMQ with more upstream and downstream systems so that it integrates into the big data ecosystem, and ultimately to promote the project's use among the teams that need it.

Chen Sihong: "MedicalNet: practice and application of a pre-trained model dedicated to 3D medical imaging"

Tencent senior researcher Chen Sihong introduced the basic concepts of medical imaging AI and the relationship between MedicalNet and the pain points of the medical imaging AI industry, and explained how MedicalNet was implemented technically.

What medical imaging AI actually addresses is a common global problem: patients find it hard to see a doctor, and doctors are worn out by diagnostic work.
Because training medical staff requires heavy investment and a long cycle, it is hard to significantly increase the number of healthcare workers in a short time. Artificial intelligence can assist medical work and ease the current shortage of healthcare resources. In the medical field, AI has two main roles: basic population screening and improving diagnostic quality. For some simple diseases, AI can achieve high diagnostic performance and be used for population screening, which eases the shortage of medical staff to some extent. For diseases that are harder to diagnose, AI can give doctors a reference for diagnosis and serve as a reminder.

Medical images contain a wealth of diagnostic information and are a very common diagnostic tool. A medical imaging AI is "made" as follows: collect and annotate data, train an AI model on that data, and finally feed a patient's images into the system to obtain a diagnosis close to that of a senior physician.

In recent years, advances in video and image recognition have greatly helped medical imaging AI. But healthcare resources are limited and annotating data is difficult, so the amount of annotated, identically distributed data available for training is very small. This conflicts with the data-driven nature of deep learning and is the current bottleneck in developing medical imaging AI.

Medical imaging AI research therefore urgently needs large data sets and corresponding models that can support the many medical imaging AI applications that have little data. This was the motivation for developing MedicalNet.
Although each public, identically distributed 3D medical data set is small, data sets from multiple medical scenes can be combined into a large-scale collection. The MedicalNet team gathered data sets from many scenes, used them to train different pre-trained models, and then open-sourced those models. As a result, users who need to train a new model can apply transfer learning directly from a MedicalNet model, and even with only a small amount of data for the new application they can still train a usable model.

Implementing MedicalNet did raise many technical problems: the meaning of pixel values varies and value ranges differ; artifacts are frequent, image quality is low, boundaries are blurry, and contrast is low; data comes from different sources and annotations are missing; and resolution is inconsistent for the same tissue while scale differs greatly between tissues.

The MedicalNet team addresses these challenges mainly through two schemes. The first is data set screening, whose main purpose is to find data sets that share common knowledge. Specifically, a small amount of data is picked from each scene to form a mini data set, a small proxy network is trained quickly on it, and the quality of the proxy network's segmentation predictions on the mini data set determines which data sets are kept.
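The screening step can be sketched as follows. This is a simplified illustration under stated assumptions, not MedicalNet's actual code: the Dice score as the quality measure, the 0.5 threshold, and the names `dice` and `screen_datasets` are all choices made for this example, and the trained proxy network is replaced by an arbitrary `proxy_predict` callable that maps an image to a binary mask.

```python
def dice(predicted, truth):
    """Dice overlap between two binary masks given as flat lists of 0/1."""
    intersection = sum(p & t for p, t in zip(predicted, truth))
    total = sum(predicted) + sum(truth)
    return 1.0 if total == 0 else 2.0 * intersection / total


def screen_datasets(mini_datasets, proxy_predict, threshold=0.5):
    """Keep the data sets on which the proxy's mean Dice reaches the threshold.

    mini_datasets maps a data set name to a list of (image, mask) samples.
    """
    kept = []
    for name, samples in mini_datasets.items():
        scores = [dice(proxy_predict(image), mask) for image, mask in samples]
        if sum(scores) / len(scores) >= threshold:
            kept.append(name)
    return kept


def toy_proxy(image):
    """Stand-in for a trained proxy network: mark every non-zero voxel."""
    return [1 if voxel > 0 else 0 for voxel in image]


mini_datasets = {
    "scene_a": [([0, 5, 7, 0], [0, 1, 1, 0])],  # proxy agrees with the labels
    "scene_b": [([3, 4, 0, 0], [0, 0, 1, 1])],  # proxy disagrees completely
}
kept = screen_datasets(mini_datasets, toy_proxy)
```

In this toy run only `scene_a` survives, which mirrors the idea that data sets the proxy cannot segment well are dropped before joint training.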

Once the data sets have been screened, a joint training scheme is used. First, the data is preprocessed with pixel (intensity) normalization and spatial normalization. To obtain richer label information, MedicalNet uses segmentation data sets throughout.
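As a rough illustration of the two normalization steps, the sketch below standardizes voxel intensities and computes the grid size a volume would have after resampling to a common voxel spacing. The specific choices here (z-score standardization and a fixed target spacing) are assumptions made for the example; the talk does not state which normalization MedicalNet applies, and the function names are hypothetical.

```python
import statistics


def normalize_intensity(voxels):
    """Rescale a flat list of voxel values to zero mean and unit variance."""
    mean = statistics.fmean(voxels)
    std = statistics.pstdev(voxels)
    if std == 0:
        return [0.0 for _ in voxels]
    return [(value - mean) / std for value in voxels]


def resampled_shape(shape, spacing, target_spacing):
    """Grid size after resampling a volume to target_spacing (same units)."""
    return tuple(
        max(1, round(size * step / target))
        for size, step, target in zip(shape, spacing, target_spacing)
    )


normalized = normalize_intensity([1.0, 2.0, 3.0])
new_shape = resampled_shape((100, 100, 50), (1.0, 1.0, 2.0), (2.0, 2.0, 2.0))
```

Bringing every volume to the same intensity scale and voxel spacing is what lets data sets from different scanners and scenes be mixed in one training run.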
MedicalNet consists of an encoder and a decoder, and the encoder is the part released as the open source model. So that as much information as possible is concentrated in the encoder, most of the parameters are placed there. To cope with annotations that are inconsistent across data sets, the decoder takes a multi-task form that keeps the annotation data of different scenes isolated from one another. During training, skip connections are used in different combinations to ease vanishing gradients. After training, the encoder can be transferred to models for a variety of tasks, such as segmentation, classification, and detection.

The final experimental results show that in 3D medical imaging applications, MedicalNet helps networks in small-data scenarios converge faster and improves their prediction performance.

Tian Tian: "Practical application scenarios of Tars and gRPC"

Tian Tian, an early promoter and core developer of Tars-Go and a Tencent senior engineer, analyzed the system architecture and framework design of Tars and gRPC and shared some explorations of Service Mesh, offering ideas and a technical reference for choosing a microservice architecture.

Tars comes from TAF, the unified application framework that Tencent has used for its back-end logic layer from 2008 to the present. It is a microservice framework with multi-language support and built-in service governance that works well with DevOps, helping businesses quickly develop, deploy, test, gray-release, and launch services.

Today's open source microservice frameworks fall into four categories. The first has no service governance and focuses on the communication framework, in RPC or message queue style; some of these frameworks support multi-language development. The second is Service Mesh, which supports service governance and solves the multi-language problem through the Sidecar pattern; it is still maturing. The third is single-language frameworks with service governance, which add governance capabilities on top of a communication framework for one programming language, mostly Java. The fourth is multi-language frameworks with service governance, chiefly Tars, which supports both service governance and multiple programming languages on top of its communication framework.

The overall Tars architecture can be divided into DevOps, OSS, the development framework, and languages. The DevOps part covers code management, compilation, automated testing, and so on. OSS mainly covers service governance and protocol support, including load balancing, circuit breaking, service configuration, and so on. The language part currently supports C++, Java, Node.js, PHP, Go, and others, with more languages to come.
Next, gRPC. gRPC is an open source remote procedure call system initiated by Google. It uses HTTP/2 as its transport protocol and Protocol Buffers as its interface description language. Protocol Buffers is a serialization framework from Google that is independent of development language and platform and is highly extensible. Like any serialization framework, Protocol Buffers can be used for data storage and for communication protocols.
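One reason Protocol Buffers payloads are compact is that integers are written as base-128 varints: seven bits of the value per byte, least significant group first, with the high bit set on every byte except the last. The sketch below hand-rolls that encoding for illustration only; real code should use the generated classes from the official protobuf library. The function names are hypothetical.

```python
def encode_varint(value):
    """Encode a non-negative integer as a Protocol Buffers varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)  # last byte, high bit clear
            return bytes(out)


def decode_varint(data):
    """Decode one varint from the start of data; return (value, bytes_used)."""
    result = 0
    shift = 0
    for index, byte in enumerate(data):
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, index + 1
        shift += 7
    raise ValueError("truncated varint")


def encode_int_field(field_number, value):
    """Encode an integer field: key = (field_number << 3) | wire type 0."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


wire = encode_int_field(1, 150)  # the three bytes 08 96 01
```

Small numbers therefore cost a single byte on the wire, regardless of the declared integer width in the `.proto` file.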

gRPC's overall structure is relatively simple, but putting it into production is harder, so the Tencent teams that adopted gRPC did a lot of work around it. Examples include: scaffolding code generation, where the scaffold generates runnable service framework code and developers only fill in the business logic; automatic service registration, where the framework offers both automatic and manual registration and a management UI can take service instances offline; built-in basic middleware, including a log service, call tracing, a basic ACL implementation, real-time reporting, and panic recovery; multi-language client and server code generation, producing client and server code in several languages from a description file in one step; HTTP and gRPC protocol conversion, exposing both an HTTP and a gRPC endpoint backed by a single piece of logic code; and shared service SDKs, where common components are packaged as SDKs to lower the cost of business integration.

These capabilities rely mainly on Protocol Buffers (PB) plug-ins and gRPC interceptors. A PB plug-in can customize code generation rules, and interceptors can implement most of the functions that microservices need, including remote logging, monitoring and reporting, call tracing, service authentication, service discovery, and so on.
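The interceptor idea is easiest to see as a chain of wrappers around the business handler. The sketch below shows the generic pattern in plain Python and deliberately does not use the gRPC library's own interceptor classes: `chain`, the request dictionary, and the two example interceptors are hypothetical and only illustrate how logging and authentication can be layered without touching business code.

```python
call_log = []  # stands in for a remote logging or tracing backend


def logging_interceptor(request, next_handler):
    """Record the start and end of every call, even when it fails."""
    call_log.append(("start", request["method"]))
    try:
        return next_handler(request)
    finally:
        call_log.append(("end", request["method"]))


def auth_interceptor(request, next_handler):
    """Reject calls that do not carry the expected (hypothetical) token."""
    if request.get("token") != "secret":
        raise PermissionError("unauthenticated")
    return next_handler(request)


def chain(handler, interceptors):
    """Wrap handler so that interceptors run in list order, outermost first."""
    def wrap(interceptor, next_handler):
        def wrapped(request):
            return interceptor(request, next_handler)
        return wrapped

    for interceptor in reversed(interceptors):
        handler = wrap(interceptor, handler)
    return handler


def say_hello(request):
    """Business logic only; it knows nothing about logging or auth."""
    return "hello " + request["name"]


serve = chain(say_hello, [logging_interceptor, auth_interceptor])
reply = serve({"method": "SayHello", "name": "tars", "token": "secret"})
```

Because each cross-cutting concern lives in its own interceptor, adding monitoring or call tracing means appending one more function to the list rather than editing every service method.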

Finally, Service Mesh. Service Mesh is a relatively low-level architecture: things like call tracing can be done directly at the bottom layer, and pushing such capabilities down makes them more widely applicable. Comparing the traditional architecture with Service Mesh: in the traditional direct-connection architecture, mainstream open source frameworks are used to build high-performance services with the infrastructure built into the service, whereas Service Mesh requires zero code changes and meets diverse application needs by proxying network communication through a Sidecar.

In the future, traditional frameworks will continue to exist, but their functions will become more business-oriented. With Tars you get a complete microservice system out of the box, while with gRPC you need to build the surrounding systems yourself. Service Mesh will gradually absorb and replace the capabilities of frameworks. It will keep lowering the cost of microservice governance, but the network architecture will become more complex, which makes it an entry point for the next generation of network architecture.

Cloud+ Community Technology Salon

In recent years, Tencent has been accelerating its pace in open source. As of December 2019, Tencent had open-sourced a total of 92 projects, covering WeChat, Tencent Cloud, big data, games, AI, security, and other fields, and had earned more than 260,000 Stars on GitHub, placing it among the leading open source companies worldwide by Star count.

Besides continuing to contribute high-quality projects, Tencent also contributes actively to the open source community, bringing the strength of Chinese technology companies to bear. To date, Tencent has joined nine major open source foundations, including Linux and Apache, has reached the highest level of membership through in-depth cooperation, and has donated three outstanding open source projects to major open source foundations. Tencent also takes an active part in existing projects at the major foundations and has made important contributions.

The Cloud+ Community Technology Salon is an offline technical event planned and organized by the Cloud+ Community. Through technology sharing, it hopes to help more developers learn and exchange ideas, and to serve as a platform that connects Tencent Cloud with developers to build technical influence together. Each salon focuses on a different technical field and direction, making it a popular platform for developers to gather and share.

A series of salons will be held in 2020. Please follow the organizer, the Cloud+ Community, for updates.


Origin blog.csdn.net/tencent__open/article/details/103742590