Why use distributed cluster task scheduling?

As a developer, you cannot avoid scheduled tasks, and the crudest, most direct solution is crontab. When there are few machines, few tasks, and few dependencies between the tasks, crontab is efficient and convenient. As the number of machines and tasks grows and the tasks become more tightly coupled, managing and configuring them through crontab becomes confusing and seriously hurts productivity.


With many machines and many scheduled tasks, the following problems appear:

1. Crontab entries are scattered across every user account on every server, so they are chaotic to manage and their life cycle cannot be managed in one place.

2. It is difficult to route failure alerts from scheduled tasks into a single alerting channel.

3. If task A and task B must not run at the same time, crontab has no easy way to enforce that mutual exclusion.

4. Over time, once the number of scheduled tasks reaches the thousands or tens of thousands, they become very hard to manage. With many tasks running in production, it is hard to answer when each one runs, which application it belongs to, and which developer is responsible for it.
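
To illustrate problem 3: when tasks run as threads inside one scheduler process, they can share a lock, which separate crontab processes cannot do without external tooling. The sketch below uses only the JDK and is not Quartz API; the class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: two tasks in one JVM kept mutually exclusive by a shared lock.
final class MutualExclusionSketch {

    /** Starts task A and task B together; returns the most tasks ever seen running at once. */
    static int runBoth() {
        ReentrantLock lock = new ReentrantLock();
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        Runnable task = () -> {
            lock.lock(); // blocks while the other task holds the lock
            try {
                peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                pause(50); // stand-in for real work
                active.decrementAndGet();
            } finally {
                lock.unlock();
            }
        };

        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.submit(task); // task A
        pool.submit(task); // task B
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + runBoth()); // prints "peak concurrency: 1"
    }
}
```

Both tasks are submitted at the same moment, yet the peak concurrency stays at 1 because the second task waits for the lock. This only covers tasks within a single process; exclusion across machines needs a shared store.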


Comparison of the native Linux crontab scheduler and Quartz:

1. In terms of execution granularity:

Crontab: Process Scheduling

Quartz: Thread Scheduling

Advantages of thread scheduling: first, it uses fewer resources; second, jobs can exchange data inside the process, which matters a great deal.
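
As a rough illustration of in-process data exchange, the sketch below schedules two jobs as threads in one JVM and lets them pass values through a shared queue. It uses the JDK's `ScheduledExecutorService` as a stand-in for a thread-based scheduler; it is not Quartz code, and the class and method names are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: two scheduled jobs exchanging data through memory they share.
final class InProcessExchangeSketch {

    /** Producer emits 1..items; consumer sums them. Returns the consumer's total. */
    static int produceAndConsume(int items) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(2);
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        AtomicInteger next = new AtomicInteger();
        AtomicInteger sum = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(items);

        // Producer job: fires every 10 ms, well below cron's one-minute granularity.
        scheduler.scheduleAtFixedRate(() -> {
            int n = next.incrementAndGet();
            if (n <= items) {
                queue.add(n);
            }
        }, 0, 10, TimeUnit.MILLISECONDS);

        // Consumer job: drains whatever the producer has left in the shared queue.
        scheduler.scheduleAtFixedRate(() -> {
            Integer value;
            while ((value = queue.poll()) != null) {
                sum.addAndGet(value);
                done.countDown();
            }
        }, 0, 10, TimeUnit.MILLISECONDS);

        try {
            done.await(5, TimeUnit.SECONDS); // wait until every item has been consumed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            scheduler.shutdownNow();
        }
        return sum.get();
    }

    public static void main(String[] args) {
        System.out.println("sum: " + produceAndConsume(5));
    }
}
```

With separate cron processes, the same hand-off would have to go through files, a database, or some other form of inter-process communication.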


2. Platform dependencies:

Crontab is available on Linux (and other Unix-like) systems.

Because Quartz is implemented in Java, it is cross-platform.


3. Scheduling features:

Quartz is more flexible to configure: custom requirements can easily be met in code, and in this respect it clearly outdoes crontab.

Crontab's smallest scheduling unit is one minute, whereas Quartz can schedule more finely (its cron expressions include a seconds field). Implementing custom requirements with crontab is also more cumbersome.


4. Job monitoring:

Quartz supports listeners, which make it convenient to monitor the running status of jobs. It can also persist scheduling information through a JobStore (in memory or in a database), which in turn makes job visualization possible.


5. High availability:

Importantly, Quartz supports distributed clusters, and setting one up is easy.


Why use distributed cluster task scheduling?

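Imagine three machines, A, B, and C, acting as a cluster that jointly provides a service. Each machine has a Quartz job that runs automatically on its defined schedule. Because all three servers run Quartz, the same task gets processed more than once. The usual workaround is to install Quartz on only one server and leave it off the other two, but that is effectively a single-machine setup and makes Quartz a single point of failure: once the server running Quartz goes down, the service is unavailable. The duplication can also be avoided by changing the Quartz job code, but that places high demands on developers and may introduce other problems. Quartz itself, however, provides a good clustering solution. A Quartz cluster requires database support (JobStoreTX or JobStoreCMT); in essence, every node in the cluster works against the same shared database, and that is how high availability is achieved. For distributed cluster task scheduling, Quartz is a good choice: simple, powerful, and stable.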
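
The shared-database idea can be sketched in a few lines. In the example below, an in-memory map stands in for the shared database, and each node tries to claim a firing before running it. This is an illustrative simulation under those assumptions, not Quartz's actual implementation; the class name, method names, and the fire-key format are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: cluster nodes coordinating through one shared store.
final class SharedStoreClusterSketch {

    // Stand-in for the shared database: firing key -> node that claimed it.
    private final Map<String, String> claims = new ConcurrentHashMap<>();

    /** Returns true only for the first node that claims the given firing. */
    boolean tryClaim(String fireKey, String nodeId) {
        return claims.putIfAbsent(fireKey, nodeId) == null;
    }

    /** Every live node races for the same firing; returns how many times the job ran. */
    static int countExecutions(List<String> liveNodes, String fireKey) {
        SharedStoreClusterSketch store = new SharedStoreClusterSketch();
        AtomicInteger executions = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(liveNodes.size());
        for (String node : liveNodes) {
            pool.submit(() -> {
                if (store.tryClaim(fireKey, node)) {
                    executions.incrementAndGet(); // only the winning node runs the job
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return executions.get();
    }

    public static void main(String[] args) {
        // All three nodes are up: the job still runs exactly once.
        System.out.println(countExecutions(Arrays.asList("A", "B", "C"), "report@09:00"));
        // Node A is down: one of the remaining nodes picks the job up.
        System.out.println(countExecutions(Arrays.asList("B", "C"), "report@09:00"));
    }
}
```

Both calls print 1: the atomic claim removes the duplicate processing, and because any live node can win the claim, losing one server does not stop the job.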


One caveat with a distributed cluster: the clocks of all servers should be synchronized, to avoid strange and unpredictable problems.


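Architecture diagram of a data statistics and analysis system

Tags: big data, architecture design, caching. Published 2018-05-01 09:08:17


Development environment and technologies used in the course:

Operating system: Windows 7

Development tools: Eclipse, IDEA

The course project is a comprehensive one, covering Java web, big data, virtualization, Linux servers, and more.

Specifically: Spring, Spark, Spark Streaming, Spark MLlib, Hive, Flume, Kafka, Hadoop, HBase, MongoDB, Dubbo, distributed caching, Redis, Docker, Nginx, EasyUI, Highcharts, and so on.

The course follows the workflow of a real enterprise development project. It is meant to convey how a large big-data project is actually developed and to build proficiency in big-data technologies, Java web, Docker virtualization, distributed systems, caching, Linux, and related topics.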


images/2bpFkktSwzJBKrFyzaWm2wepXN62t58E.png

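Spring Cloud: bypassing the registry and testing against a specified IP

Tags: microservices, SpringCloud. Published 2018-04-27 12:00:22

During development we often share a single registry, which is convenient and saves resources. The downside is that our calls frequently end up hitting a colleague's instance of a service, which makes testing our own code very inconvenient.

Dubbo offers a direct local connection mode that solves this problem.

Spring Cloud has a similar solution, as follows:

# Disable the registry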
ribbon.eureka.enabled=false
service-name1.ribbon.listOfServers=http://127.0.0.1:7710
service-name2.ribbon.listOfServers=http://127.0.0.1:7720

Note: the idea is to disable the registry (Eureka in this case) and specify a request address for each service.

