Thoughts on the technological evolution of large-scale websites (8)--Storage bottlenecks (8)

Source: Summer Forest

Dual-core design of the storage layer --(sync issues, delays)--> main/standby design

 

  Before getting into the main content of this article, let's look at a few screenshots. The first is shown below:

 

  This is the homepage of an e-commerce website. When we open the homepage for the first time, the site pops up a mandatory dialog box asking the user to select a delivery region. On Taobao and JD.com, by contrast, the delivery address is chosen on the product itself, as the following figure of Taobao's delivery location selector shows:

 

  The following picture shows JD.com's delivery location selector:

 

  So what is the difference between the site in Figure 1 and JD.com and Taobao? Once the site in Figure 1 has forced the user to select a region, the results of a product search differ from region to region. This is a bit like website internationalization, except that internationalization only concerns languages and language-related static resources, whereas this site's region selection is tied to the business: query results differ by region. The region pop-up works very much like a router. Taobao and JD.com, in contrast, tie delivery to the individual product, so a search on those sites is effectively a nationwide query, and the same query returns the same results anywhere in the country. From a business point of view, this suggests that the first site has not yet rolled out its business nationwide, and even once it does, regional differences will still affect logistics, while Taobao and JD.com are large e-commerce websites operating at national scale.

  Coming back to the technical side, could these two different approaches also be related to technical issues? That is what I want to discuss today.

  No matter how big or small, a website can always be divided into a client, a server, and storage. These components are connected through the network, and distance directly affects the efficiency of network transmission. That is why technologies like CDN have appeared, and why many large Internet companies build machine rooms in different cities: both measures aim to reduce the effect of distance on network transmission. But when this serve-from-the-nearest-location approach reaches the storage layer, problems arise. In the previous article I talked about scaling web services horizontally. That horizontal scaling is based on the stateless principle, but at the storage layer, no matter how we split things up, state is hard to eliminate; state is a natural property of the storage layer. State becomes especially stubborn for contended storage resources, such as product inventory. If we move inventory data into data centers in different regions, how do we guarantee that the inventory information in every location is always accurate? That becomes a real conundrum. It is not much of a problem in a small country, but it is a serious one in a country as vast as China. Storage is therefore the weak point of the nearest-location approach.

  I once learned that when a large information company in China designed its first-generation system, it took the effect of such regional differences into account and built the storage layer as a dual-core system. What is a dual-core storage layer? The company set up two data centers, one in Beijing and one in Shanghai, and deployed the storage layer in both. The two data centers were peers: transactions from northern China went to the Beijing data center, and transactions from southern China went to the Shanghai data center. After the system went live, however, this dual-core design turned into a nightmare for the whole system, and the core of the nightmare was data synchronization. Because the company does business nationwide, a large number of transactions needed data from both north and south, and these could complete normally only after the data centers had finished synchronizing. But synchronizing data between Beijing and Shanghai was extremely inefficient. I once read a document saying that an organization had run a test and found that once two data centers are more than 80 kilometers apart, the network delay becomes basically unbearable. Of course, a company with plenty of money can lay a dedicated line between the two data centers, but the cost is frighteningly high. Even a dedicated line from Pudong to Puxi is practically paved with RMB, never mind one from Beijing to Shanghai. And even if money is no object, the delay will still seriously hold back the company's business. Beyond delay, when large volumes of data are transmitted over the network, reliability is hard to guarantee: packet loss is common, which leads to many retransmissions and makes synchronization even less efficient.
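
  To make the cost of distance concrete, here is a rough back-of-the-envelope sketch in Python. It estimates only the propagation floor. The signal speed in fiber (about 200,000 km/s) is a common approximation, the 1,200 km Beijing-Shanghai path length is an assumed round figure, and the function names are made up for illustration. Real links add switching, queuing and retransmission on top of this floor.

```python
# Back-of-the-envelope lower bound on cross-data-center latency.
# All figures are illustrative assumptions, not measurements.

FIBER_SPEED_KM_PER_S = 200_000  # light in optical fiber, roughly 2/3 of c


def min_round_trip_ms(distance_km: float) -> float:
    """Propagation-only round-trip time in milliseconds
    (ignores routing, queuing and retransmissions)."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


def max_serial_sync_writes_per_second(distance_km: float) -> float:
    """Upper bound on writes per second to ONE contended record when every
    write must wait a full round trip to the remote site before the next
    write to the same record can start."""
    return 1000 / min_round_trip_ms(distance_km)


if __name__ == "__main__":
    paths = [
        ("two sites 80 km apart", 80),
        ("assumed Beijing-Shanghai fiber path", 1200),  # assumed distance
    ]
    for label, km in paths:
        rtt = min_round_trip_ms(km)
        cap = max_serial_sync_writes_per_second(km)
        print(f"{label}: RTT >= {rtt:.1f} ms, <= {cap:.0f} serialized writes/s")
```

  Under these assumptions, a single contended record such as one product's inventory, if it must be updated synchronously across both sites, cannot be written more than about 83 times per second, however fast the databases themselves are.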

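  Because of this flaw in the dual-core storage design, the company immediately began designing and developing a second-generation system, whose core task was to solve the storage layer's dual-core problem. So how should it be solved? Turn the dual core into a single core? Since two data centers are so much trouble, why not keep just one, which saves money and avoids the hassle? That is certainly not the right way to solve the problem, because the motivation behind the dual-core design has very real significance and value. In the end the company replaced the dual-core design with a new scheme, called the main/standby scheme. The storage layer is still deployed in two data centers, but at run time one data center is the main one and the other is secondary. This main/standby scheme is by no means a data backup scheme in the usual sense. It absorbs the advantages of both the single-core and dual-core schemes while avoiding their drawbacks as far as possible. So how does the main/standby scheme achieve this?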

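  First, we need to classify the system's business transactions. Some transactions have very high requirements for real-time behavior and data correctness, and for those a single-core storage system is the better fit. But a business system cannot consist entirely of such real-time transactions; other transactions have looser real-time requirements. We still need to examine how much delay such transactions can tolerate, specifically how long a delay users will generally accept. This is very important, because even in the main/standby scheme the data still has to be synchronized, only at a coarser time granularity. We can set a reasonable synchronization interval from the system and business points of view. If the delay a transaction can tolerate exceeds this interval, that kind of transaction can be handled at the nearest data center and does not need to be sent to the main data center, which reduces the load on the main data center.

  The company's second-generation system had another requirement: once midnight has passed, the previous day's data must be fully synchronized between the two data centers. In other words, the maximum tolerated divergence between the two data centers is one day. Why do this? Some readers will assume it is for data backup. That is indeed one of the goals, but there is a deeper purpose. Besides addressing the effect of distance on network efficiency, the dual-core design has another important goal: disaster recovery. I remember a friend telling me a few years ago that his company's website had been down for 6 hours. I was surprised and asked, isn't your system distributed? He said their production system had no single point of failure. Then why did the whole site go down? The answer is hard to believe: their machine room leaked in the rain and its wiring short-circuited. After this incident, my friend told me, the company rented a new machine room nearby for disaster recovery to prevent this from happening again.

  That really can be called an act of nature. Such events are very unlikely, but when they do happen they can be fatal. When the magnitude 9 earthquake struck Japan, I read a news report saying that many large computers had toppled over, and that the machines in that machine room were practically the lifeline of Asia's Internet. As everyone knows, every website has its own domain name, which is the entrance to the site, and the servers in that Japanese machine room included one of the world-famous 13 root servers, the DNS servers used to resolve domain names. If those machines went down, an entire country might be unable to use the Internet normally. But natural disasters are, after all, local, so setting up data centers in different places across the country or even the world for disaster recovery is a road many large Internet companies must take. Returning to the main/standby scheme: to guarantee disaster tolerance, when we design the scheme we must also ensure that the main and standby data centers can be switched quickly, so that when one data center has a problem the secondary data center can immediately become the main one. To keep this switchover reliable, the company often swaps the main and standby roles back and forth at night, when transaction volume is low.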
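
  The routing rule and the role switch described above can be sketched roughly as follows. This is only a minimal illustration, not the company's actual implementation: the class name, the site names, which site starts as main, and the 5-minute synchronization interval are all assumptions.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass
class StorageCluster:
    """Hypothetical two-site main/standby storage layer."""
    main: str                 # data center currently acting as main
    standby: str              # data center currently acting as standby
    sync_interval: timedelta  # how often the two sites are synchronized

    def route(self, nearest: str, tolerated_delay: Optional[timedelta]) -> str:
        """Pick the data center that should handle one transaction.

        tolerated_delay is None for real-time transactions that need
        strictly correct data; those always go to the main site.
        """
        if tolerated_delay is None:
            return self.main
        # The transaction can live with data older than one sync interval,
        # so it can be handled at the nearest site.
        if tolerated_delay > self.sync_interval:
            return nearest
        return self.main

    def switch_over(self) -> None:
        """Swap roles, as in a failover or a scheduled night-time drill."""
        self.main, self.standby = self.standby, self.main


if __name__ == "__main__":
    # Assumed setup: Beijing is main, sites sync every 5 minutes.
    cluster = StorageCluster("beijing", "shanghai", timedelta(minutes=5))
    print(cluster.route("shanghai", None))                  # real-time
    print(cluster.route("shanghai", timedelta(hours=1)))    # delay-tolerant
    cluster.switch_over()
    print(cluster.main)
```

  The point of keeping the role in one place is that a drill or a real failover only changes `main` and `standby`; the routing rule itself stays the same.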

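  Let's return to the three screenshots from the beginning. On the e-commerce site that pops up a region selection box at the start, choosing a different region produces a different product list for the same query. JD.com also has region selection, but after switching regions the query results are basically unchanged. Taobao and Tmall offer no region option at all; delivery is chosen on the product itself. Perhaps this is because Taobao and Tmall have no self-operated business, so Tmall can hardly control the regional differences among its merchants, whereas JD.com and the first e-commerce site are mostly self-operated, so the delivery address is related to where their warehouses are. Extending this idea, regions can also be used to partition data centers, for example one data center for Jiangsu, Shanghai and Zhejiang and another for the central region. This helps solve the nearest-location problem at the storage layer. From this we can also seem to see some differences between B2C and C2C business scenarios.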
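
  A region-based split like this might look as follows in a toy Python sketch. The data center names, the province list, the default data center and the inventory figures are all made up for illustration.

```python
# Hypothetical region-to-data-center table; every name here is illustrative.
REGION_TO_DATA_CENTER = {
    "Jiangsu": "dc-east",
    "Shanghai": "dc-east",
    "Zhejiang": "dc-east",
    "Hubei": "dc-central",
    "Hunan": "dc-central",
}
DEFAULT_DATA_CENTER = "dc-east"  # assumed fallback for unlisted regions

# Made-up per-data-center inventory: product name -> units in stock.
INVENTORY = {
    "dc-east": {"kettle": 12, "rice cooker": 3},
    "dc-central": {"kettle": 0, "rice cooker": 5},
}


def data_center_for(region: str) -> str:
    """Map the region the user selected to the data center that serves it."""
    return REGION_TO_DATA_CENTER.get(region, DEFAULT_DATA_CENTER)


def search(region: str, keyword: str) -> list:
    """Return in-stock products matching keyword, using only the
    inventory held by the region's own data center."""
    stock = INVENTORY[data_center_for(region)]
    return sorted(name for name, units in stock.items()
                  if keyword in name and units > 0)


if __name__ == "__main__":
    print(search("Shanghai", "kettle"))
    print(search("Hubei", "kettle"))
```

  Because each region reads only its own data center's inventory, the same query can return different results in different regions, which matches the behavior of the first site in the screenshots.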

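  From this I can draw a conclusion. First, a storage layer made of fully equal multiple cores is basically impossible. The main/standby scheme can overcome the drawbacks of the single-core and multi-core designs while keeping their advantages. Distance can also give rise to business differences, and we can use those differences to decentralize data centers, which also satisfies the principle of accessing data from the nearest location.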

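  American Internet companies are very large and have been global from the start, so for them it is very important to decentralize and localize their data centers. A good distributed design for the storage layer is therefore the foundation for a website's global deployment. But for many small and medium-sized companies and startups, building data centers in different regions is very hard, and even for companies with plenty of money, building them quickly is hard. In that case, a global cloud platform such as Amazon's, or a good domestic cloud platform if the business is limited to China, is a sound choice. The spread of cloud computing keeps lowering costs for entrepreneurs.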

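  That is all for this series, which has been about database problems throughout. I once said that any program or piece of software is a combination of computation and storage, and this series has focused on storage. Many large Internet companies have changed a great deal in how they handle storage. In relational databases they have moved away from commercial products to open-source ones, and they have heavily modified those open-source relational databases. This should count as the frontier of relational database development in the Internet field. At the same time, using NoSQL databases for the things relational databases find hard to do is also a major trend.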

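  This series set a major premise: keep the relational database as close as possible to its essential role of storage. For that reason I suggested moving much of the computation to the application layer. I have many reasons to argue for the benefits of this view, but whether it is the best approach in practice depends on the specific case, so I do not want to insist that it is always right. A logically sound scheme still offers much to learn from, and that is what I want to convey. As for computation that concerns the storage layer, I prefer to do it in the data access layer. Following this line of thought, the relational database storage layer eventually becomes a distributed database, and the data access layer is naturally also built on the principles of distributed systems. Distributed systems are what I would like to discuss next. If I have time to continue this large series of blog posts, I will go on to cover the design of the data access layer in the context of distributed systems.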
