What is architecture

Introduction : In this era of knowledge-sharing explosion , in view of the integrity and prosperity java ecology , frameworks, middleware and toolkits for our use. Even new trained people know ssm, micro-services, clustering, multiple threads, queues, high concurrency techniques intermittent, technology is getting smaller , as if we need to go step by step using someone else said frameworks technology can solve the problem. If excluded Redis, RabbitMQ, Kafka used to live, Dubbo, what springcloud these specific technical framework, did you stop and really thought about architecture is it? These frameworks are playing exactly how kind of role? If you could give under the framework of a definition , how would you choose to describe the structure?

Background : A Practitioner nearly four years and I remember when I just came from industry. , Was a popular framework is the Spring, struts2, Hibernate, front-end using jsp, business is not so complicated, complicated by the entire project amount is not large , QPS is not more than 5, the prevailing practice all the front and rear ends on deployment together , labeled as war package to be deployed directly on centos on tomcat can run , so full request the actual amount of the production environment can carry , the problem is not large . Later, when I changed the company , the company uses dubbo of micro-services , all services are divided into a number service to provide to the interface calls , when the business complex scenarios than before , using two 3 servers deployed separately , hold the line load in about a few million . service logic no longer have to be modified as required so much trouble to repackage the entire project before re-deployed. after the road in the micro-services go further, all the independent service deploy, packaged into image files become docker example, deployed on a separate server docker, combined with git, deployment, operation and maintenance, the development of a rapid increase efficiency

table of Contents 

A: Architecture in the end what is

II: Common architecture technology

III: Security Issues

Four: Mistakes architecture design

Five: summary

 

A: Architecture in the end what is

Architecture in the end is what? First look at the Wikipedia explanation of the architecture:

Software architecture: is a sketch of a system. Objects software architecture describes the abstract structure of the system components directly. The connection between the various components are relatively clear and detailed description of the communication between the components. In the implementation phase, these abstract components are refined to the actual component, such as a class or to specific image.

Software Architect: interactive, user interface style, external interface method, the object of innovative design features, as well as high-level things, operations, and process logic between software architects and design software-defined modular, module

 According to Wikipedia: software architecture is actually a mutual cooperation between each other and with the components of the individual components, is a logical flow abstraction level of transactions. In simple terms architecture is how the various system components communicate, coordinate, control and logic operations. Can be compared to the body's structure give a simple example to understand is this: the human body is composed of multiple vital organs heart liver and spleen lung and stomach, ear, nose and mouth, etc., various organs carry out their duties, cooperate with each other to maintain their common among people of normal life . Here the heart, stomach and so is the entire body of the component architecture, data is blood vessels is the medium for transferring data, human skin is the user interface looks and style, is the logical organization of food to enter the stomach to digest first before it enters the large intestine rather than into the kidneys. According to this interpretation, kafka, redis, ssm, rabbitmq, xxljob are all the components we use, these components each have their own role, each assume their responsibilities to efficiently together to complete the transfer of the entire system.

II: Common architecture technology

2.1: Distributed

   Distributed : The same set of service code by business function or custom dimensions split different subsystems, each system deployed separately , each subsystem is called service , each service typically by between rpc or invoke webservice

2.1.1: Distributed advantage

Distributed advantage is the decoupling of the original system, so as to facilitate the deployment and operation and maintenance level extensions that provide scalable software, and even services can be achieved by different languages. Various modules to different people to develop, everyone divisions their job, problems can quickly locate

2.1.1: Distributed shortcomings

Distributed and not without shortcomings, the following problems:

: service call invoked by the network, the use of micro-rpc between general service calls, and the underlying TCP protocol rpc is, if a network failure or delay a little high, then there is a service call timeout possibility, if such dubbo there will be RpcException

: costs will be distributed is a disaster-type linear development, development, operation and maintenance of the rise in small business than the body size or division of case

: distributed data and transactional consistency is more difficult to guarantee business is currently the most used is the two-phase commit 2pc, need to submit a comprehensive local transactions and remote transactions, poor performance

: Distributed session maintenance in single project is not necessary to consider the safety of the session, but in a distributed environment it is necessary to consider how to maintain the same session of

⑤: Distributed Transaction issues : distributed how to ensure data consistency of individual services is a challenge, when the program appears abnormal crash to ensure each service is able to roll back to normal very important.

Distributed The most common technologies: distributed cache, distributed storage, distributed computing, distributed static resource

2.2: Cluster

  Cluster: the same set of code deployed on multiple servers and multiple servers can provide more cpu, memory, hard disk and other resources, so as to enhance the ability to process the request finishing. Each server in the cluster, called nodes, each node provides the same service, the relationship between the nodes is a simple copy, which specific node processing, it is determined according to the load balancing strategy; decline in the processing capability of the site when a simple increase in the number of servers in the cluster can significantly increase the overall amount of data processing capabilities (though there is an upper limit);

 The principle is simple: the so-called Many hands make light, polymerization with multiple servers to handle the amount of data and the ability to concurrently access will certainly be much higher than a single server; request processing bottlenecks encountered, they can be a simple increase in the number of server units improve the ability of parallel processing, but it needs to be noted that the increase in the number of servers appear after more than a certain number of performance will stagnate.

 

 

2.3:缓存

  缓存是提高软件的性能第一手段,最有效和最具代表性的方法,缓存分为单机缓存和分布式缓存。最常见的分布式缓存技术为redis、memorycache等,单机缓存比如hashmap、concurrentHashmap、guava等单机缓存的承载容量有限,而分布式缓存的伸缩性和的存储容量会比较可观,就算缓存的空间不足了,也可以通过增加服务器来扩展。

缓存最显著的作用有两个:①加快数据的访问速度②分担后端的数据访问和存储的负载能力,保护数据库

使用缓存需要注意以下几点:

①:缓存雪崩

缓存雪崩指的是所有的缓存在统一时间全部失效,导致大量的请求直接涌入数据库,数据库被击垮。

解决缓存雪崩的方法:缓存过期值在一定的基础上设置随机值

②:缓存击穿

  缓存击穿是指某些热点key在某一时间全部失效了,导致大量的请求涌入后台DB数据库

解决缓存击穿的方法:热点数据设置永不过期

③:缓存穿透

  一直请求不存在的数据,最终走的还是数据库就是缓存穿透

解决缓存穿透的方法:采用布隆过滤器(bloomFilter),布隆过滤会有一定的误差,但是可以晒选出一定不存在的数据,缺点是无法判定某个key是否确定存在。

 2.4:队列

   试想这样一个的高请求量场景:各大电商的双11,在双11的那一刻,有大量订单涌入,后端会接受请求,然后写入数据库,等待数据库的返回.如果请求量非常大的话,数据库读写IO就会阻塞,那么程序就会出现卡死,数据库崩溃等问题

如果采用队列的话,将下单请求发送到队列中,然后立刻返回(可以按照业务决定,比如返回处理中,等到真正成功再通知用户),这样就不需要等待后端必须返回成功。消费端可以按照请求的顺序平滑的去消费,缓解了高峰的请求,并且实现了请求下单和实现下单的解耦从以下图可以看出使用队列以后处理起来比较平滑~

 2.5:多线程 

    多线程真正的意义有两个①提高cpu的利用率  ②:加快程序执行效率,目前已经是多核的时代,服务器六核、八核屡见不鲜,在多核的cpu中如果使用单线程那么无疑是对多核cpu的浪费,多线程能够有效提高cpu利用的效率,多个任务分给多个cpu去处理,可以实现真正的并行处理。如果在单核cpu中,只是cpu在不停的切换cpu时间。假设我们有十个表格的数据需要分析处理(计算密集型),采用单线程需要一个个的轮询表格,而多线程在合理分配线程数的情况下就可以同时处理,提高开发的效率

  2.6:限流

    限流是面对高并发的利器之一,例如秒杀场景:在大量的请求涌入后台,QPS高达几十万,如果不能做到有效控制就可能导致请求击垮数据库,DB基本上是一个网站的命脉。缓存、队列、限流等方式的本质其实都是为了保护DB。限流的简单理解其实就是过滤掉无效的请求,将请求限制在一个可以控制的范围内,最常见的限流有以下方式: 

 ①: Redis限流

  Redis限流的基本思路是采用redis的key过期策略,将业务id和业务值放入到redis中设置一定的过期时间,等请求再次进入的时候,如果能从redis获取到值,那么我就因为是重复性请求。Redis过滤限流是最基础的限流手段,适用于过滤同一个用户请求的场景

②:令牌桶算法

令牌桶算法的思路是在一定的时间内生成以固定的速度生成有限个令牌数量放入桶中,所有的请求首先从令牌桶中去尝试获取令牌,如果能获取到就可以继续执行,否则请求就会被抛弃。Google开源的guava中有RateLimter可以实现单机限流,令牌桶算法是限流非常有效的手段,

③:  漏桶算法

漏桶算法的基本原理是将请求直接存放在一个漏斗中,请求过多的话,那么就会漏斗就会溢出,溢出的请求则会被拒绝服务。漏桶算法可以控制端口的流量输出速率,平滑请求的突发流量,实现流量整形.

因为漏桶算法的漏出速率是有效的,因此漏桶算法相比于令牌桶算法有一个显著的缺点是无法应对突发性的流量.可令牌桶算法是可以的

④:  滑动窗口

http为了控制流量的速率采用的方法就是滑动窗口机制。如果要分布式限流,可采用阿里的Sentinel框架,其基本原理是滑动窗口机制,利用Entry映射资源来平滑的限流

   此外还有nginx限流,比如使用参数来限制某一个ip的在时间范围内的访问频率。客户端限流:发起请求按钮点击后,在后面的几秒内(由业务决定)设置为disabled,这一操作步骤虽然很小,但是带来的限流作用很可观

2.7:服务降级和熔断

    很多人会忽视这个问题,对自己设计出来的架构盲目自信,认为不可能出问题。而事实上,一旦随着微服务和分布式架构的持续推进,服务器会越来越多,宕机的概率和可能性会逐步提升,虽然出现宕机的可能性基本上很渺茫,不过也应该做好服务降级和熔断的准备,以防止那万分之一的概率宕机。假设有1000台服务器发生宕机的概率是0.001%,就因为存在0.001%的概率会导致我们的服务并非100%高可用。

服务的降级和熔断一般采用的是netfly(没错,就是那个美剧巨头公司)出的hystrix,可以实现服务熔断和降级

三:安全性问题

   安全性的问题总是不被重视,其实安全的问题要比我们想像的要严重的多.大公司每时每刻都会有不同程度被攻击者发起攻击,一旦被黑客获取到数据库信息,那么将会有丢失用户信息、服务器被植入木马病毒、服务瘫痪等不容小觑的危险

3.1  sql注入

sql注入是目前所有方式中最频繁也是最严重的攻击手段,sql注入如果被居心叵测的黑客攻击很可能整个数据库都会被删除掉,其情节和结果十分恶劣。防止sql注入的有效方式就是
采用jdbc提供的preparementStatement进行预编译,它能有效保证sql的整体结构不会被破坏,万一被sql攻击也可以在预编译阶段失败,而不会执行成功

3.2 跨域攻击 

 与主站的域名、端口、协议不一致性的请求都可以理解为跨域访问,浏览器有同源策略:浏览器会限制来自于不同源的documet和脚本对当前的document读取或设置部分属性,但是比如src\form表单提交\< img >\< iframe >\< link >是没有跨域限制的。

csrf攻击:登陆网站A,获取到了网站A的cookie用户信息,然后点击了一个恶意网站外链B,网站B可以利用csrf漏洞模拟A网站的用户信息去请求A的某些敏感接口,比如转账、发送消息、邮件、获取部分信息、发起恶意代码等。

如何防止csrf攻击:①接口请求加上随机的token值或者token约束的规则,或者是有时效性的token码。这样的话,外链去访问接口在拦截器中验证token是否有效,②在http的头部加入自定义参数:放到 HTTP 头中自定义的属性里。通过 Ajax,可以一次性给所有该类请求加上 csrftoken 这个 HTTP 头属性,并把 token 值放入其中

③减少使用get提交,get提交会降低门槛

3.3:XSS攻击

   xss攻击指的是攻击者对包含有漏洞的服务器注入js代码,会诱使受害者打开攻击的服务器URL,其中里面的URL会包含一些恶意代码,比如植入病毒、添加广告片段代码、篡改接口信息等。

预防xss攻击的方法:对于用户提交的内容,需要过滤任何有执行能力的脚本或者影响页面的CSS,

 四:  架构设计的误区

 4.1:为了高大上而设计出复杂的架构

试想如果在业务体量不是特点大的情况下,如果一味的追求时髦,追求新颖,采用分布式微服务架构,那么将会增加业务开发的难度,为了维护大量的微服务而多出很多成本。好的架构一定是适应于自身的业务发展的,而高于业务的,它具有顺应业务发展的前瞻性。

4.3:用技术可以解决一切问题

   企图用技术解决一切问题,认为技术是一切的解决之道,是万能的,其实有的时候技术解决不了的问题可以用从业务角度来考虑解决。比如之前楼主做个售票系统,主要是卖某个知名景点的票的业务。后期上线后发现很多人买了很多特价免费票,特价票是针对导游带领的团员的,每个导游每天只能买一张特价免费票和一张半价票,之后看数据发现了很多导游配了两张票,然后那张半价票被退掉,只剩下一张免费票被刷了。技术总监就决定查这部分数据是怎么回事?结果发现部分人利用导游证这个特惠故意购买无价票,这个问题如何从技术上解决呢?如果不允许导游买特惠票不合理,不允许退票也不合理。技术上貌似没有好的手段去杜绝这个问题,只能从线下去处理。

五:总结

    本篇文章的主要概略图我总结了一下,大概如下,其中包括分布式、集群、缓存、微服务、队列等。架构的话题弥足长远和复杂,不是一篇简单的文章能描述清楚的。如果要讲明白,估计得写个连载了。本篇文章只是提纲挈领以下,说实话也是蜻蜓点水,希望能起到抛砖引玉的效果,不过在工作中思考、在实践中总结学习,是有助于提高我们的内功心法的。

 

     最后:如果对学习java有兴趣可以加入群:618626589.本群不同于其他的垃圾群,旨在打造绝无培训广告、无闲聊扯淡、无注水斗图的纯技术交流群,群里每天会分享有价值的问题~目前已分享200多道,欢迎各位技术er随时加入 

 

Guess you like

Origin www.cnblogs.com/wyq178/p/12151160.html