Summary of Pig design patterns and comparison with SQL design patterns

1 Summarization patterns

Summarization patterns give a high-level view of a data set by grouping records and aggregating them. There are three main types:

1.1 Numerical summarization

1.2 Inverted index summarization

1.3 Counting with counters
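
As a rough illustration of numerical summarization on the SQL side, the sketch below groups records by key and computes min, max, count, and average with a single GROUP BY. It uses Python's built-in sqlite3 module only so that the query can be run; the table name comments, the columns user_id and length, and the sample rows are all hypothetical and not taken from the original.

```python
import sqlite3

# Hypothetical sample data: (user_id, comment_length) pairs.
rows = [("alice", 10), ("alice", 30), ("bob", 5), ("bob", 15), ("bob", 40)]


def numerical_summary(records):
    """Return {user_id: (min, max, count, avg)} computed by a SQL GROUP BY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (user_id TEXT, length INTEGER)")
    conn.executemany("INSERT INTO comments VALUES (?, ?)", records)
    cur = conn.execute(
        "SELECT user_id, MIN(length), MAX(length), COUNT(*), AVG(length) "
        "FROM comments GROUP BY user_id ORDER BY user_id"
    )
    # Each result row is one group: the key followed by its aggregates.
    result = {user: (lo, hi, n, avg) for user, lo, hi, n, avg in cur}
    conn.close()
    return result


summary = numerical_summary(rows)
```

The same shape (group by a key, then apply aggregate functions to each group) is what a grouping followed by per-group aggregation expresses in Pig.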

2 Filtering patterns

Filtering patterns select a subset of the data without modifying the original records. The main variants are:

2.1 Filtering

2.2 Bloom filtering

2.3 Top N

2.4 Deduplication (distinct)
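
A minimal sketch of three of these variants in SQL (plain filtering, top N, and deduplication), again run through Python's sqlite3 module. The scores table, its columns, and the sample rows are hypothetical; the thresholds (score >= 80, N = 2) are arbitrary choices for illustration.

```python
import sqlite3

# Hypothetical sample data: (user, score), including one exact duplicate.
rows = [("alice", 90), ("bob", 72), ("carol", 85), ("bob", 72), ("dave", 40)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (user TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", rows)

# 2.1 Filtering: keep only the records that satisfy a predicate.
passed = conn.execute(
    "SELECT user, score FROM scores WHERE score >= 80 ORDER BY user"
).fetchall()

# 2.3 Top N: sort by the ranking column and keep the first N rows.
top2 = conn.execute(
    "SELECT user, score FROM scores ORDER BY score DESC LIMIT 2"
).fetchall()

# 2.4 Deduplication: collapse identical records into one.
distinct_rows = conn.execute(
    "SELECT DISTINCT user, score FROM scores ORDER BY user"
).fetchall()

conn.close()
```

In every case the surviving records are unchanged copies of input records; only the set of records returned differs.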

3 Data organization patterns

Data organization patterns reorganize a data set, for example by restructuring, partitioning, or sorting it, so that individual records become more useful in the context of the whole. The main design patterns are:

3.1 Structured to hierarchical

3.2 Partitioning

3.3 Binning

3.4 Total order sorting

3.5 Shuffling
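
To make two of these patterns concrete, here is a small in-memory sketch of partitioning and total order sorting. It is plain Python rather than Pig or SQL, and the record layout (an ISO date string plus a payload) and the choice of the year as partition key are assumptions made only for this example.

```python
from collections import defaultdict

# Hypothetical records: (ISO date, payload).
records = [("2011-03-01", "a"), ("2012-07-15", "b"), ("2011-11-30", "c")]


def partition_by_year(recs):
    """Partitioning: route each record to the partition named by its year."""
    parts = defaultdict(list)
    for date, payload in recs:
        parts[date[:4]].append((date, payload))
    return dict(parts)


def total_order(recs):
    """Total order sorting: one global ordering over all records by date."""
    return sorted(recs, key=lambda rec: rec[0])


partitions = partition_by_year(records)
ordered = total_order(records)
```

Neither function changes a record; both only change where it sits relative to the others, which is the common thread of this group of patterns.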

4 Join patterns

Join patterns combine data from multiple sources. The main variants are:

4.1 Reduce-side join

4.2 Replicated join

4.3 Composite join

4.4 Cartesian product
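
The idea behind a replicated join can be sketched in a few lines: the small data set is held entirely in memory as a lookup table, and the large data set is streamed past it one record at a time. The sketch below is a single-process approximation in plain Python, assuming an inner join; the users and comments data and the function names are hypothetical. A Cartesian product is included for contrast, since it pairs every record with every other record and uses no join key at all.

```python
from itertools import product

# Hypothetical data: a small lookup table and a larger stream of records.
users = {1: "alice", 2: "bob"}
comments = [(1, "hi"), (2, "yo"), (3, "orphan"), (1, "again")]


def replicated_join(small, large):
    """Inner join: stream `large` and probe the in-memory `small` table."""
    return [(uid, small[uid], text) for uid, text in large if uid in small]


def cartesian(left, right):
    """Cartesian product: every left record paired with every right record."""
    return list(product(left, right))


joined = replicated_join(users, comments)
pairs = cartesian([1, 2], ["a", "b", "c"])
```

The replicated approach only works when one side fits in memory; the output size of the Cartesian product is the product of the two input sizes, which is why it is usually the most expensive of the four.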

5 Metapatterns

5.1 Job chaining

5.2 Chain folding

5.3 Job merging

6 Input and output patterns

 
