Internet back-end infrastructure technology

The purpose of Java back-end technology is to build business applications that provide users with online or offline services. The technologies a business application needs, and the infrastructure it relies on, therefore determine which back-end technologies you need to master. Looking across Internet technology as a whole and at the company's current situation, I consider the back-end infrastructure technologies and facilities described in the sections below to be essential or critical.

Here, back-end infrastructure mainly refers to the key components or services that online applications rely on to run stably. Well-built back-end infrastructure can generally support the business for a very long time. A complete architecture also includes many basic system services that applications are unaware of, such as load balancing, automated deployment, and system security; these are outside the scope of this article.

1. Unified Request Entry: API Gateway

When developing a mobile app, the back-end interfaces generally need to support the following features:

• Load balancing
• API access control
• User authentication


The usual practice is to use Nginx for load balancing and then implement API access control and user authentication inside each business application. A slightly better approach is to turn the latter two into a shared library that all businesses call. On the whole, though, these three features are needs common to every business, so a more desirable approach is to integrate them into a single service. This makes it possible to modify access control and authentication mechanisms dynamically, and it also reduces the cost for each business of integrating them. This service is the API gateway. You can implement your own or use open-source software such as Kong or Netflix Zuul. The general architecture of an API gateway is shown below:

[Figure: general architecture of an API gateway]

This solution brings a problem of its own: because all API requests pass through the gateway, it can easily become a performance bottleneck for the whole system. An alternative is to remove the API gateway and let business applications connect directly to a unified authentication center, ensuring at the base-framework level that every API call is authenticated by that center. Authentication results can be cached so that the unified authentication center is not put under excessive request pressure.
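The caching idea above can be sketched roughly as follows. This is a minimal illustration, not a production design: `CachedAuthenticator` and the `authCenter` predicate are hypothetical names standing in for a real remote call to the unified authentication center, and the cache is a plain in-process map with a fixed time-to-live.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

/** Caches authentication verdicts briefly so that not every API call hits the authentication center. */
class CachedAuthenticator {
    /** A cached verdict plus the time at which it stops being trusted. */
    private static final class Entry {
        final boolean valid;
        final long expiresAtMillis;
        Entry(boolean valid, long expiresAtMillis) {
            this.valid = valid;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Predicate<String> authCenter; // hypothetical remote check: token -> valid?
    private final long ttlMillis;
    private final Map<String, Entry> cache = new ConcurrentHashMap<>();

    CachedAuthenticator(Predicate<String> authCenter, long ttlMillis) {
        this.authCenter = authCenter;
        this.ttlMillis = ttlMillis;
    }

    boolean isAuthenticated(String token) {
        long now = System.currentTimeMillis();
        Entry cached = cache.get(token);
        if (cached != null && cached.expiresAtMillis > now) {
            return cached.valid; // still fresh: no remote call
        }
        boolean valid = authCenter.test(token); // remote call in a real system
        cache.put(token, new Entry(valid, now + ttlMillis));
        return valid;
    }
}

class AuthCacheDemo {
    public static void main(String[] args) {
        AtomicInteger remoteCalls = new AtomicInteger();
        CachedAuthenticator auth = new CachedAuthenticator(token -> {
            remoteCalls.incrementAndGet();
            return token.startsWith("valid-"); // made-up rule for the demo
        }, 60_000);
        System.out.println(auth.isAuthenticated("valid-abc"));
        System.out.println(auth.isAuthenticated("valid-abc"));
        System.out.println("remote calls: " + remoteCalls.get());
    }
}
```

A short time-to-live is the main design trade-off here: a longer one reduces load on the authentication center but delays the effect of revoking a token.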

2. Business Applications and Back-End Base Frameworks

Business applications are divided into online business applications and internal business applications.

  • Online business applications: applications and interfaces that directly serve Internet users. Typical characteristics: large request volume, high concurrency, and low tolerance for failure.

  • Internal business applications: applications mainly for the company's internal users, for example internal data management platforms and advertising platforms. Compared with online business applications, they are characterized by high data confidentiality, low load, low concurrency, and tolerance for failure.

Business applications are developed on top of back-end base frameworks. For a Java back end, you should have the following frameworks:

  • MVC framework: unifies the development process, improves development efficiency, and shields some key details of the Web / back-end framework. Typical examples are SpringMVC, Jersey, the Chinese-developed JFinal, and Alibaba's WebX.

  • IoC framework: a dependency injection / inversion of control framework. The most popular one in Java is Spring, whose core function is IoC.

  • ORM framework: shields the details of the underlying database and provides a uniform data access interface. It can additionally support distributed features such as client-side master-slave, database sharding, and table sharding. MyBatis is currently the most popular ORM framework. The JdbcTemplate provided by Spring is also very good. Requirements such as database and table sharding or master-slave separation generally have to be implemented yourself; open-source options include Alibaba's TDDL and Dangdang's sharding-jdbc (which solves sharding and read/write splitting at the datasource level, transparently to the application and with zero intrusion). In addition, to solve sharding, read/write splitting, master-standby switching, caching, failure recovery, and similar issues at a unified service level, many companies have their own database middleware, such as Alibaba's Cobar, 360's Atlas (based on MySQL-Proxy), and NetEase's DDB. Open-source options include MyCat (based on Cobar) and Kingshard, of which Kingshard already has a certain scale of production use. MySQL also officially provides MySQL Proxy, where Lua scripts can customize master-slave, read/write splitting, and partitioning logic, but its performance is poor and it is rarely used now.

  • Cache framework: wraps operations on cache software such as Redis and Memcached in a uniform way and can support client-side distributed schemes, master-slave, and so on. Spring's RedisTemplate is commonly used; you can also wrap Jedis yourself to support client-side distributed schemes, master-slave, and so on.

  • JavaEE application performance monitoring framework: for online JavaEE applications, a unified framework should be integrated into every service to monitor each request, method call, JDBC connection, Redis connection, and so on, recording elapsed time and status. Jwebap is one performance monitoring tool that can be used, but because it has not been updated for many years, it is advisable to do secondary development on top of the project if you can.

In general, the frameworks above are enough to complete the prototype of a back-end application.
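To make the performance monitoring item more concrete, here is a minimal sketch of timing method calls with a JDK dynamic proxy. It is not how Jwebap works internally; `TimingProxy`, `CallStats`, and `UserDao` are hypothetical names invented for the illustration, and a real framework would also cover JDBC and Redis connections.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Call count and accumulated elapsed time for one method. */
class CallStats {
    final LongAdder count = new LongAdder();
    final LongAdder totalNanos = new LongAdder();
}

/** Wraps any interface implementation and records how long each call takes. */
class TimingProxy {
    static final Map<String, CallStats> STATS = new ConcurrentHashMap<>();

    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, T target) {
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(),
            new Class<?>[] { iface },
            (proxy, method, args) -> {
                long start = System.nanoTime();
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // rethrow the target's own exception
                } finally {
                    String key = iface.getSimpleName() + "." + method.getName();
                    CallStats stats = STATS.computeIfAbsent(key, k -> new CallStats());
                    stats.count.increment();
                    stats.totalNanos.add(System.nanoTime() - start);
                }
            });
    }
}

/** Hypothetical data-access interface used only for the demo. */
interface UserDao {
    String findName(long id);
}

class TimingDemo {
    public static void main(String[] args) {
        UserDao dao = TimingProxy.wrap(UserDao.class, id -> "user-" + id);
        dao.findName(1);
        dao.findName(2);
        CallStats stats = TimingProxy.STATS.get("UserDao.findName");
        System.out.println(stats.count.sum() + " calls, " + stats.totalNanos.sum() + " ns in total");
    }
}
```

Because the wrapping happens at the interface boundary, business code does not need to change; only the place where the implementation is created has to call `wrap`.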

3. Caching, Databases, Search Engines, Message Queues

Caches, databases, search engines, and message queues are the four basic back-end services that applications depend on. Their performance directly affects the overall performance of the application; sometimes, no matter how well you write your code, the application cannot get faster because of these services.

  • Cache: caching is commonly used to solve the problem of hot-data access and is a powerful tool for improving query performance. In high-concurrency back-end applications, loading data from persistent storage into a cache isolates the database from highly concurrent requests and keeps it from being overwhelmed. Besides in-memory local caches, the most commonly used centralized cache software is Memcached and Redis, of which Redis has become the most mainstream.

  • Database: the database can be called the most basic piece of infrastructure for a back-end application. The vast majority of business data is persisted in databases. Mainstream databases include traditional relational databases (MySQL, PostgreSQL) and the NoSQL databases that have become popular in recent years (MongoDB, HBase). HBase is a column-oriented database for the big-data field; limited by its query performance, it is generally not used as a business database.

  • Search engine: search engines are query software designed for full-text search and multi-dimensional data queries. The most widely used open-source options are Solr and Elasticsearch, both built on Lucene; the main differences lie in termIndex storage, distributed architecture support, and so on. Thanks to its good cluster support and high-performance implementation, Elasticsearch has gradually become the mainstream open-source search engine.

  • Message queue: one way of transferring data is through a message queue. Commonly used message queues include Kafka, which was designed for logs, and the transaction-oriented RabbitMQ (or ActiveMQ). In scenarios that are not particularly sensitive to message loss and do not require transactional messaging, choosing Kafka gives higher performance; otherwise, RabbitMQ is the better choice. In addition, ZeroMQ is a message queue library that implements network programming patterns, positioned above sockets and below MQ.
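The way a cache shields the database from repeated reads can be sketched with the common cache-aside read path. This is only an illustration under simplifying assumptions: `CacheAsideStore` is a hypothetical class, a `ConcurrentHashMap` stands in for Redis or Memcached, and a `Function` stands in for the database query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Cache-aside read path: try the cache first, fall back to the database, then populate the cache. */
class CacheAsideStore {
    private final Map<String, String> cache = new ConcurrentHashMap<>(); // stand-in for Redis/Memcached
    private final Function<String, String> database;                     // stand-in for a query; null if absent
    final AtomicInteger databaseReads = new AtomicInteger();

    CacheAsideStore(Function<String, String> database) {
        this.database = database;
    }

    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached; // cache hit: the database is not touched
        }
        databaseReads.incrementAndGet();
        String loaded = database.apply(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }

    /** Call after the underlying row has been updated so the next read reloads it. */
    void invalidate(String key) {
        cache.remove(key);
    }
}

class CacheAsideDemo {
    public static void main(String[] args) {
        Map<String, String> table = new ConcurrentHashMap<>();
        table.put("user:1", "Alice");
        CacheAsideStore store = new CacheAsideStore(table::get);
        store.get("user:1");
        store.get("user:1");
        System.out.println("database reads: " + store.databaseReads.get());
    }
}
```

Note that this sketch has no expiry and does not cache misses, so a real deployment would still need a time-to-live and some protection against repeated lookups of keys that do not exist.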

4. File Storage

Business applications, the back-end services they depend on, and other services all ultimately rely on underlying file storage. In general, file storage needs to provide reliability, disaster recovery, and stability: stored data must not be easily lost, there must be a rollback plan even when a failure occurs, and high availability must be ensured. At the lowest level, traditional RAID can serve as the solution. One level up, Hadoop's HDFS is currently the most widely used distributed file storage solution. Shared file systems such as NFS and Samba also provide simple distributed storage features.

In addition, if file storage really has become the application's bottleneck, or file storage performance must be improved to raise overall system performance, the simplest and most direct way is to abandon traditional mechanical hard disks and replace them with SSDs. Many companies solving business performance problems find that SSDs are often the final key. This is also the most direct and effective way to trade money for time and labor costs. SSDB, for example, is a high-performance KV database that wraps LevelDB and takes advantage of SSD characteristics.

As for HDFS, using the data stored on it requires Hadoop. Technologies of the "xx on Yarn" kind are solutions for running non-Hadoop technologies on HDFS.

5. Unified Authentication Center

The unified authentication center mainly provides authentication services for app users, internal users, and the apps themselves, including:

  • User registration, login verification, and token authentication
  • Management and login authentication for users of internal information systems
  • App management, including generation of app secrets and verification of app information (e.g., verifying interface signatures)

A unified authentication center is needed so that all the information used by apps can be managed centrally, and so that a unified authentication service can be provided to all applications. It is especially necessary when many businesses need to share user data. In addition, building single sign-on for mobile apps on top of the unified authentication center comes naturally: imitating the Web mechanism, the encrypted authentication information is stored in local storage for use by multiple apps.

6. SSO

Many large websites now have single sign-on systems. Put simply, a user logs in only once and can then enter multiple business applications (possibly with different permissions), which is very convenient for users. In mobile Internet companies, the various internal management and information systems, and even external applications, also need a single sign-on system.

Currently, the most mature and most widely used single sign-on system is probably CAS, open-sourced by Yale University.

The basic principle of single sign-on is roughly as shown below:

[Figure: basic principle of single sign-on]

7. Unified Configuration Center

In Java back-end applications, a common way to read and write configuration is to put it in Properties, YAML, HOCON, or similar files, so that a change only requires updating the file and redeploying, without touching the code. A unified configuration center builds on this by managing all configuration of the business applications or basic back-end services in one place, with the following features:

  • Configuration can be modified dynamically online and takes effect
  • Configuration files can distinguish environments (development, test, production, etc.)
  • Configuration can be brought into Java through annotations or XML configuration

Baidu's open-source Disconf and Ctrip's Apollo are solutions that can be used in production. You can also develop your own configuration center according to your needs, usually choosing ZooKeeper as the configuration store.
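The first two features (dynamic updates and per-environment configuration) can be sketched as a tiny in-memory publish/subscribe store. This is not the Disconf or Apollo API: `InMemoryConfigCenter` and its methods are hypothetical, and a real configuration center would persist values (for example in ZooKeeper) and push changes over the network.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** In-memory stand-in for a configuration center: values are keyed by environment and name. */
class InMemoryConfigCenter {
    private final Map<String, String> values = new ConcurrentHashMap<>();
    private final Map<String, List<Consumer<String>>> listeners = new ConcurrentHashMap<>();

    private static String path(String env, String key) {
        return env + "/" + key; // e.g. "prod/order.timeoutMs"
    }

    /** Registers a listener and immediately delivers the current value, if there is one. */
    void subscribe(String env, String key, Consumer<String> listener) {
        String p = path(env, key);
        listeners.computeIfAbsent(p, k -> new CopyOnWriteArrayList<>()).add(listener);
        String current = values.get(p);
        if (current != null) {
            listener.accept(current);
        }
    }

    /** Stores a new value and notifies every listener of that environment and key. */
    void publish(String env, String key, String value) {
        String p = path(env, key);
        values.put(p, value);
        for (Consumer<String> listener : listeners.getOrDefault(p, Collections.emptyList())) {
            listener.accept(value);
        }
    }
}

class ConfigCenterDemo {
    public static void main(String[] args) {
        InMemoryConfigCenter center = new InMemoryConfigCenter();
        AtomicReference<String> timeoutMs = new AtomicReference<>("3000"); // application default
        center.subscribe("prod", "order.timeoutMs", timeoutMs::set);
        center.publish("prod", "order.timeoutMs", "5000"); // takes effect without redeploying
        center.publish("test", "order.timeoutMs", "100");  // other environment, ignored here
        System.out.println("timeout in prod: " + timeoutMs.get());
    }
}
```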

8. Service Governance Framework

For external API calls or client access to back-end APIs, HTTP or RESTful interfaces can be used (you can of course also call directly over raw sockets). Calls between internal services, however, generally go through an RPC mechanism. The current mainstream RPC protocols are:

  • RMI
  • Hessian
  • Thrift
  • Dubbo

These RPC protocols each have advantages and disadvantages; choose the one that best fits your business needs.

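As the number of services in your system grows and the RPC call chain becomes more and more complex, you often have to keep updating documentation to maintain these call relationships. A framework that manages these services can greatly reduce the tedious manual work this brings.

A traditional ESB (enterprise service bus) is essentially a service governance solution. However, the ESB sits between client and server as a proxy, and all requests must pass through it, so it can easily become a performance bottleneck. A better design, building on the traditional ESB, is shown below:

[Figure: service governance design with a configuration center as the hub]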

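In this design, with the configuration center as the hub, call relationships exist only between the client and the server that provides the service, which avoids the performance bottleneck of a traditional ESB. For this design, the ESB should support the following features:

  • Registration and management of service providers
  • Registration and management of service consumers
  • Service version management, load balancing, flow control, service degradation, and resource isolation
  • Service fault tolerance and circuit breaking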

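Alibaba's open-source Dubbo implements the above well and is the solution many companies use; Dangdang's extension project Dubbox adds some new features on top of Dubbo. At the time of writing, Dubbo had been donated by Alibaba to Apache and was in incubating status. For operations and monitoring, Dubbo itself provides a simple management console, dubbo-admin, and a monitoring center, dubbo-monitor-simple. dubboclub/dubbokeeper on GitHub is a more powerful system built on top of them that combines service management and monitoring.

In addition, Netflix's Eureka also provides service registration and discovery. Combined with Ribbon, it enables client-side soft load balancing of services and supports a variety of flexible dynamic routing and load-balancing strategies.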
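Service registration combined with client-side load balancing can be sketched as follows. This is not the Dubbo, Eureka, or Ribbon API: `ServiceRegistry` and `RoundRobinClient` are hypothetical classes, the registry lives in memory, and round-robin is just one of many possible balancing strategies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/** In-memory stand-in for a registry: providers register, consumers look up. */
class ServiceRegistry {
    private final Map<String, List<String>> providers = new ConcurrentHashMap<>();

    void register(String service, String address) {
        providers.computeIfAbsent(service, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    void unregister(String service, String address) {
        List<String> addresses = providers.get(service);
        if (addresses != null) {
            addresses.remove(address);
        }
    }

    /** Returns a snapshot of the current provider addresses. */
    List<String> lookup(String service) {
        return new ArrayList<>(providers.getOrDefault(service, Collections.emptyList()));
    }
}

/** The consumer picks a provider itself, so no proxy sits on the request path. */
class RoundRobinClient {
    private final ServiceRegistry registry;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinClient(ServiceRegistry registry) {
        this.registry = registry;
    }

    String choose(String service) {
        List<String> addresses = registry.lookup(service);
        if (addresses.isEmpty()) {
            throw new IllegalStateException("no provider for " + service);
        }
        int index = Math.floorMod(next.getAndIncrement(), addresses.size());
        return addresses.get(index);
    }
}

class ServiceGovernanceDemo {
    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();
        registry.register("order-service", "10.0.0.1:20880"); // made-up addresses
        registry.register("order-service", "10.0.0.2:20880");
        RoundRobinClient client = new RoundRobinClient(registry);
        for (int i = 0; i < 3; i++) {
            System.out.println(client.choose("order-service"));
        }
    }
}
```

The point of the design is that the registry is consulted only for addresses; the actual call goes straight from consumer to provider.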

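9. Unified Scheduling Center

In many businesses, scheduled tasks are a very common scenario, for example fetching data on a schedule or periodically refreshing order status. The usual approach is for each business to rely on Linux cron or on Quartz in Java. A unified scheduling center manages all scheduled tasks, so that the scheduling cluster can be tuned, scaled, and managed in one place. Azkaban and Yahoo's Oozie are workflow management engines for Hadoop and can also be used as a unified scheduling center. You can also implement your own unified scheduling center with cron or Quartz, supporting features such as:

  • Scheduling tasks by cron expression
  • Dynamically modifying, stopping, and deleting tasks
  • Sharded task execution
  • Task workflows, for example running the next task only after the previous one has finished
  • Tasks in multiple forms such as scripts, code, and URLs
  • Logging of task execution and failure alerts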

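A note on Quartz for Java: it needs to be distinguished from Spring Quartz. The latter is Spring's simple implementation on top of the Quartz framework and is currently the most widely used scheduling approach, but it does not support high-availability clusters. Quartz itself does support clustering, but it is very complicated to configure. Many solutions now use ZooKeeper to implement distributed clustering for Spring Quartz.

In addition, Dangdang's open-source elastic-job adds more powerful features, such as elastic resource utilization, on top of basic distributed scheduling.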
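Sharded task execution can be sketched as each scheduler node processing only the items whose ID falls into its shard. This is an assumption about one simple way to split work, not how elastic-job or Quartz implement it; `ShardedJob` and the modulo rule are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Splits one scheduled task across several nodes by item ID. */
class ShardedJob {
    /** Returns the items that the node with the given shard index is responsible for. */
    static List<Long> itemsForShard(List<Long> itemIds, int shardIndex, int shardCount) {
        if (shardCount <= 0 || shardIndex < 0 || shardIndex >= shardCount) {
            throw new IllegalArgumentException("invalid shard " + shardIndex + " of " + shardCount);
        }
        List<Long> mine = new ArrayList<>();
        for (long id : itemIds) {
            if (Math.floorMod(id, (long) shardCount) == shardIndex) {
                mine.add(id);
            }
        }
        return mine;
    }
}

class ShardedJobDemo {
    public static void main(String[] args) {
        List<Long> orderIds = Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L);
        int shardCount = 3; // three scheduler nodes in this made-up cluster
        for (int shard = 0; shard < shardCount; shard++) {
            System.out.println("shard " + shard + " refreshes orders " + ShardedJob.itemsForShard(orderIds, shard, shardCount));
        }
    }
}
```

Every item lands in exactly one shard, so the nodes never process the same order twice, but changing the shard count reassigns most items.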

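10. Unified Logging Service

Logs are an indispensable part of development. When and how logs are written says a lot about an engineer's coding skill. After all, logs are the most direct information for locating and troubleshooting exceptions in online services.

Normally, having logs scattered across individual businesses makes managing and troubleshooting problems very inconvenient. A unified logging service uses dedicated log servers to record logs, and each business outputs its logs to the log servers through a unified logging framework.

The unified logging framework can be implemented by writing a Log4j or Logback Appender that sends logs to the log server via RPC calls.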
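The same idea can be sketched without external dependencies by using a `java.util.logging.Handler`, which plays the role that an Appender plays in Log4j or Logback. This is a stand-in, not the Log4j or Logback API: `RemoteLogHandler` is a hypothetical class, and the `logServer` consumer represents the RPC call to the log server.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Forwards every log record to a central log server instead of a local file. */
class RemoteLogHandler extends Handler {
    private final String serviceName;
    private final Consumer<String> logServer; // hypothetical RPC client: one call per log line

    RemoteLogHandler(String serviceName, Consumer<String> logServer) {
        this.serviceName = serviceName;
        this.logServer = logServer;
    }

    @Override
    public void publish(LogRecord record) {
        if (!isLoggable(record)) {
            return;
        }
        // Prefix the service name so the log server can tell businesses apart.
        logServer.accept(serviceName + " " + record.getLevel() + " "
                + record.getLoggerName() + " - " + record.getMessage());
    }

    @Override
    public void flush() {
        // nothing is buffered in this sketch
    }

    @Override
    public void close() {
        // no resources to release in this sketch
    }
}

class UnifiedLoggingDemo {
    public static void main(String[] args) {
        List<String> received = new CopyOnWriteArrayList<>(); // pretend this is the log server
        Logger logger = Logger.getLogger("com.example.order"); // made-up logger name
        logger.setUseParentHandlers(false);
        logger.addHandler(new RemoteLogHandler("order-service", received::add));
        logger.info("order created");
        received.forEach(System.out::println);
    }
}
```

A production version would normally buffer and send asynchronously so that a slow log server cannot block business threads.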

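11. Data Infrastructure

Data has been a very hot field in recent years. From Lean Analytics to Growth Hacker, the emphasis is on the extraordinary power of data. Many companies are also using data to drive product design, marketing operations, R&D, and more. One point needs to be made here: adopt big-data technologies only when your data volume has really grown beyond what a single machine can handle, and never do big data for its own sake. In many cases, insisting on Hadoop for a problem that a single-machine program plus MySQL could solve wastes both time and manpower.

One more point: for many companies, especially those whose offline workloads are not that intensive, the resources of the big-data cluster are often wasted. This gave rise to the "xx on Yarn" family of technologies, which let non-Hadoop technologies use the resources of the big-data cluster and greatly improve resource utilization, for example Docker on Yarn.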

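  • Data highway
    Following on from the unified logging service above, the logs it outputs ultimately become data on the data highway, to be consumed by downstream data processing programs. The process in between includes log collection and transport.

    • Collection: after the unified logging service writes logs to the log servers, a log collection mechanism is needed to gather them in one place. Common log collection solutions include Scribe, Chukwa, Kafka, and Flume. A comparison is shown below:
      [Figure: comparison of Scribe, Chukwa, Kafka, and Flume]
      In addition, Logstash is another log collection option. Unlike the above, it leans more toward data preprocessing, its configuration is simple and clear, and it is often used in operations scenarios as part of the ELK (Elasticsearch + Logstash + Kibana) stack.
    • Transport: data is transferred to the data processing services through a message queue. For logs, Kafka is usually the message queue to choose.
      Another key technique here is data synchronization between databases and data warehouses, that is, the approach used to sync the data to be analyzed from a database into a data warehouse such as Hive. Apache Sqoop can be used for timestamp-based synchronization. Alibaba's open-source Canal implements incremental synchronization based on the binlog and is better suited to general synchronization scenarios, but building on Canal still requires a fair amount of business development work.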
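  • Offline data analysis
    Offline data analysis can tolerate delay. It generally targets analysis work without real-time requirements and produces reports that lag by a day. The most commonly used offline analysis technologies are Hadoop and Spark. Compared with Hadoop, Spark has a large performance advantage, though it also demands more hardware resources. Yarn, Hadoop's resource management and scheduling component, serves not only MR (MapReduce) but also Spark (Spark on Yarn); Mesos is another resource management and scheduling system.
    For Hadoop, traditional MR programs are complex to write and hard to maintain; Hive lets you write SQL instead of MR. Spark has Spark SQL, which is similar to Hive.
    Another key issue in offline data analysis is data skew. Data skew means that data is distributed unevenly across regions, so some nodes are lightly loaded while others are heavily loaded, which hurts overall performance. Handling data skew well is critical for data processing.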

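  • Real-time data analysis
    In contrast to offline data analysis, real-time data analysis, also called online data analysis, targets business scenarios with real-time requirements on data, such as ad settlement and order settlement. Relatively mature real-time technologies are Storm and Spark Streaming. Compared with Storm, Spark Streaming is in essence still based on batch computation, so for latency-sensitive scenarios Storm should be used. Besides these two, Flink is a distributed real-time computing framework that has recently become very popular. It supports exactly-once semantics, offers high throughput and low latency at large data volumes, and supports state management and window statistics well, but its documentation, API management platform, and so on still need improvement.
    Real-time data processing is generally incremental and is less reliable than offline processing: once a failure occurs (such as a cluster crash) or data processing fails, it is hard to recover the data or repair abnormal data. Combining offline and real-time processing is therefore the most widely adopted approach. The Lambda architecture is one architecture that combines offline and real-time data processing.
    Real-time data analysis has another very common scenario: real-time multi-dimensional analysis, that is, combining arbitrary dimensions for data display and analysis. There are currently two approaches to this problem: ROLAP and MOLAP.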

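    • ROLAP: uses a relational database or an extended relational database to manage data-warehouse data; represented by Hive, Spark SQL, and Presto.
    • MOLAP: a multi-dimensional storage engine based on data cubes that trades space for time by materializing all analysis cases as physical tables or views. It is represented by Druid, Pinot, and Kylin. Unlike ROLAP (Hive, Spark SQL), it natively supports multi-dimensional queries.
      As noted above, ROLAP solutions are mostly used for offline data analysis and cannot meet real-time needs, so MOLAP is the common solution for real-time multi-dimensional analysis. A comparison of the three commonly used frameworks is as follows:
      [Figure: comparison of Druid, Pinot, and Kylin]
      Of these, Druid is relatively lightweight, widely used, and relatively mature.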
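  • Ad hoc data analysis
    Some of the reports produced by offline and real-time data analysis are for data analysts and product managers to consult, but in many cases the online programs cannot meet these stakeholders' needs. They then need to query and aggregate data in the data warehouse themselves. For these stakeholders, SQL is easy to learn and easy to express queries in, which probably makes it the most suitable approach. Providing an ad hoc SQL query tool can therefore greatly improve the efficiency of data analysts and product managers. Presto, Impala, and Hive are all tools of this kind. To give stakeholders a more intuitive UI, you can set up an internal Hue.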

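12. Failure Monitoring

For user-facing online services, a failure is a very serious matter. Good failure detection and alerting for online services is therefore very important. Failure monitoring can be divided into the following two levels:

  • System monitoring: mainly the monitoring of host hardware resources such as bandwidth, CPU, memory, disk, and I/O. Open-source software such as Nagios and Cacti can be used. There are also many third-party services on the market that monitor host resources, such as Jiankongbao (监控宝). Ganglia can be used to monitor distributed service clusters (such as Hadoop, Storm, Kafka, and Flume clusters). Xiaomi's open-source OpenFalcon is also good; it covers system monitoring, JVM monitoring, and application monitoring, and supports custom monitoring mechanisms.

  • Business monitoring: monitoring above the level of host resources, such as abnormal PV or UV data for an app, or failed transactions. It requires adding monitoring code to the business, for example writing a log entry where an exception is thrown.

Another key step in monitoring is alerting. There are many ways to alert: email, IM, SMS, and so on. Considering that failures differ in importance, that alerts should be reasonable, and that problems should be easy to locate, here are some suggestions: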

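  • Alert logs should record the ID of the machine on which the failure occurred. Especially in clustered services, locating the problem later is very difficult if the machine ID is not recorded.
  • Aggregate alerts. Do not raise a separate alert for every single failure, as this greatly disturbs engineers.
  • Grade alerts by severity. Not all alerts should be handled with the same priority.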
  • Use WeChat as the alerting channel: it saves SMS costs while still ensuring the delivery rate of alerts.
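The suggestions about recording the machine ID and aggregating alerts can be sketched as follows. This is one simple aggregation rule chosen for illustration, not a description of any particular monitoring product: `AlertAggregator` is a hypothetical class that sends at most one alert per machine and alert type within a time window and reports how many similar alerts were suppressed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Sends at most one alert per (machine, alert type) within a time window. */
class AlertAggregator {
    private final long windowMillis;
    private final Map<String, Long> lastSentAt = new HashMap<>();
    private final Map<String, Integer> suppressed = new HashMap<>();

    AlertAggregator(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Returns the message to send, or empty if this alert was folded into an earlier one. */
    synchronized Optional<String> report(String machineId, String alertType, long nowMillis) {
        String key = machineId + "|" + alertType;
        Long last = lastSentAt.get(key);
        if (last != null && nowMillis - last < windowMillis) {
            suppressed.merge(key, 1, Integer::sum); // count it, but do not notify again
            return Optional.empty();
        }
        int skipped = suppressed.getOrDefault(key, 0);
        suppressed.remove(key);
        lastSentAt.put(key, nowMillis);
        String suffix = skipped > 0 ? " (+" + skipped + " similar suppressed)" : "";
        return Optional.of("[" + alertType + "] on " + machineId + suffix);
    }
}

class AlertDemo {
    public static void main(String[] args) {
        AlertAggregator aggregator = new AlertAggregator(60_000); // one-minute window
        long start = System.currentTimeMillis();
        for (int i = 0; i < 5; i++) {
            aggregator.report("web-01", "DB_TIMEOUT", start + i * 1_000L)
                      .ifPresent(System.out::println); // made-up machine ID and alert type
        }
        aggregator.report("web-01", "DB_TIMEOUT", start + 60_000L).ifPresent(System.out::println);
    }
}
```

Severity grading could be layered on top by giving each alert type its own window length, with shorter windows for more critical alerts.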
After a failure alert, the most critical thing is the response. For startups, being on call 24 hours a day is an essential quality: when an alert arrives, you need to respond to the failure as quickly as possible, and find and fix the problem within a controllable time. Troubleshooting basically relies on logs. As long as logs are written sensibly, problems can usually be located quickly. But when services are distributed and the volume of logs is particularly large, locating the right logs becomes a problem of its own. Here are a few options:
  • Build a centralized log analysis platform with ELK (Elasticsearch + Logstash + Kibana) for fast searching and locating of logs. Combined with Yelp's open-source Elastalert, it can also provide alerting.
  • Build a distributed request tracing system (also called a full-link monitoring system). For distributed systems, especially microservice architectures, it makes it very convenient to quickly locate and collect the information for a single abnormal request among a massive number of calls, and to quickly find the performance bottleneck in a request's chain. Vipshop's Mercury, Alibaba's EagleEye, Sina's WatchMan, and Twitter's open-source Zipkin are basically all based on Google's Dapper paper. Dianping's real-time application monitoring platform CAT adds fine-grained call performance statistics on top of distributed request tracing, but it is code-intrusive. In addition, HTrace, incubating at Apache at the time of writing, is a tracing solution designed for large distributed systems such as the HDFS file system and the HBase distributed storage engine. If your microservices are implemented with Spring Cloud, Spring Cloud Sleuth is the best distributed tracing choice. Also worth mentioning is SkyWalking, in the Apache Incubator at the time of writing, a full APM (Application Performance Monitoring) system based on distributed tracing. Its biggest feature is that it is built on a Java agent plus the instrument API, with no intrusion into business code. Pinpoint is another similar APM system that has been used in production.

Origin blog.csdn.net/qq_37651267/article/details/94766789