SQL Server: Analytic Functions

I. Analytic functions

Analytic functions work on groups of rows (partitions) and compute an aggregate value over the data in each partition. They are used together with the window clause OVER(). With analytic functions it is easy to compute year-over-year and period-over-period comparisons, medians, and the maximum and minimum values of each group. Unlike aggregate functions, analytic functions do not need a GROUP BY clause: the rows of the SELECT result set are partitioned by the OVER() clause.
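To make the contrast with GROUP BY concrete, here is a minimal sketch. The table variable @emp and its three rows are hypothetical and exist only for this illustration; max() is used as a windowed aggregate to show the per-group maximum mentioned above.

```sql
-- Hypothetical data for illustration only (not part of the sample script).
declare @emp table (Dept varchar(10), Name varchar(10), Rate decimal(5,2));
insert into @emp values ('A','Ann',10.00), ('A','Bob',20.00), ('B','Cid',30.00);

-- Aggregate with GROUP BY: one row per group, the detail columns are lost.
-- Returns 2 rows: (A, 20.00) and (B, 30.00).
select Dept, max(Rate) as MaxRate
from @emp
group by Dept;

-- OVER(): all 3 rows are kept, and each row carries its group's value.
select Dept, Name, Rate
    ,max(Rate) over(partition by Dept) as MaxRateInDept
from @emp
order by Dept, Name;
```

The second query is the shape used by every example in this article: the detail rows stay, and the partition-level value is added as an extra column.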

Insert sample data using the following script:

;with cte_data as 
(
select 'Document Control' as Department,'Arifin' as LastName,17.78 as Rate 
union all 
select 'Document Control','Norred',16.82 
union all 
select 'Document Control','Kharatishvili',16.82
union all 
select 'Document Control','Chai',10.25 
union all 
select 'Document Control','Berge',10.25 
union all 
select 'Information Services','Trenary',50.48
union all 
select 'Information Services','Conroy',39.66 
union all 
select 'Information Services','Ajenstat',38.46
union all 
select 'Information Services','Wilson',38.46
union all 
select 'Information Services','Sharma',32.45
union all 
select 'Information Services','Connelly',32.45
union all 
select 'Information Services','Berg',27.40
union all 
select 'Information Services','Meyyappan',27.40
union all 
select 'Information Services','Bacon',27.40
union all 
select 'Information Services','Bueno ',27.40
)
select Department,LastName,Rate into #data from cte_data
go

Analytic functions are normally used together with OVER(). SQL Server provides four groups of analytic functions.

Inside OVER(), the rows of a window are normally sorted. The sorted rows are read as a sequence from top to bottom: relative to the current row, rows above it come before it in the sequence and rows below it come after it. Within a partition, the top row is the first row of the partition and the bottom row is the last.

Note: In the order of execution, the DISTINCT clause is applied after the analytic functions have been evaluated.
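The effect of this ordering can be seen with a small sketch. It assumes the #data temp table created by the sample script above; the column aliases are illustrative.

```sql
-- The window count is computed on all 15 rows first; DISTINCT then removes
-- the duplicate (Department, EmployeeCount) pairs, leaving 2 rows:
-- (Document Control, 5) and (Information Services, 10).
select distinct Department
    ,count(*) over(partition by Department) as EmployeeCount
from #data;

-- Here DISTINCT removes nothing: row_number() has already made every
-- (Department, rn) pair unique, so all 15 rows come back.
select distinct Department
    ,row_number() over(partition by Department order by Rate) as rn
from #data;
```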

1. CUME_DIST and PERCENT_RANK

  • CUME_DIST is calculated as: (number of rows in the partition whose value is less than or equal to the current row's value) / (total number of rows in the partition).
  • PERCENT_RANK is calculated for the current row within its partition as: (RANK - 1) / (total number of rows in the partition - 1), where RANK is the ranking value returned by the RANK() function for the same ordering.

The following query computes the cumulative distribution and the percent rank:

select Department,LastName ,Rate
    ,cume_dist() over(partition by Department order by Rate) as CumeDist
    ,percent_rank() over(partition by Department order by Rate) as PtcRank
    ,rank() over(partition by Department order by Rate asc) as rank_number
    ,count(0) over(partition by Department) as count_in_group
from #data
order by Department
    ,Rate desc
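As a cross-check of the two formulas, the same values can be rebuilt from COUNT and RANK. This is only a sketch that assumes the #data table above; the *Manual column names are made up for the comparison. It relies on the default window frame when ORDER BY is present, which includes rows tied with the current row.

```sql
select Department, LastName, Rate
    ,cume_dist() over(partition by Department order by Rate) as CumeDist
    -- rows with Rate <= current Rate, divided by rows in the partition
    ,count(*) over(partition by Department order by Rate) * 1.0
        / count(*) over(partition by Department) as CumeDistManual
    ,percent_rank() over(partition by Department order by Rate) as PctRank
    -- (RANK - 1) / (rows in partition - 1); nullif avoids dividing by zero
    -- for a single-row partition (where PERCENT_RANK itself returns 0)
    ,(rank() over(partition by Department order by Rate) - 1) * 1.0
        / nullif(count(*) over(partition by Department) - 1, 0) as PctRankManual
from #data
order by Department, Rate;
-- Document Control (5 rows), Rate = 16.82: 4 rows have Rate <= 16.82,
-- so CumeDist = 4/5 = 0.8; RANK = 3, so PctRank = (3-1)/(5-1) = 0.5.
```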

2. PERCENTILE_CONT and PERCENTILE_DISC

PERCENTILE_CONT and PERCENTILE_DISC compute percentile values, for example the value that sits at a given percentile of a column.

  • PERCENTILE_CONT is continuous (CONT stands for continuous). It treats the data as a continuous interval and interpolates between adjacent values, so the result is the exact middle value and need not be a value that occurs in the data.
  • PERCENTILE_DISC is discrete (DISC stands for discrete). It does not interpolate; it picks one of the existing values, namely the smallest value whose cumulative distribution is greater than or equal to the requested percentile.

The following script computes the median of each department:

select Department  ,LastName  ,Rate
    ,PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Rate) OVER (PARTITION BY Department) AS MedianCont
    ,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Rate) OVER (PARTITION BY Department) AS MedianDisc
    ,row_number() over(partition by Department order by Rate) as rn
from #data order by Department ,Rate asc
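In the sample data both functions return the same median for each department (16.82 and 32.45), so the difference is not visible there. A minimal sketch with four hypothetical values, chosen so that the median falls between two rows, shows it:

```sql
-- Hypothetical values 10, 20, 30, 40; not part of the sample data.
select distinct
     percentile_cont(0.5) within group (order by v.Val) over() as MedianCont
    ,percentile_disc(0.5) within group (order by v.Val) over() as MedianDisc
from (values (10),(20),(30),(40)) as v(Val);
-- MedianCont = 25 : interpolated halfway between 20 and 30
-- MedianDisc = 20 : the smallest existing value whose CUME_DIST is >= 0.5
```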

3. LAG and LEAD

In a query, once the rows are sorted, the sorted data is read as a sequence from top to bottom: relative to the current row, the rows above it precede it and the rows below it follow it.

Within the same partition, relative to the current row:

  • The LAG() function returns the value from the row N rows before (above) the current row.
  • The LEAD() function returns the value from the row N rows after (below) the current row.

LAG ( scalar_expression [ ,offset ] [ ,default ] )  OVER ( [ partition_by_clause ] order_by_clause )
LEAD ( scalar_expression [ ,offset ] [ ,default ] )  OVER ( [ partition_by_clause ] order_by_clause )

Parameter notes:

  • scalar_expression: the scalar expression whose value is returned.
  • offset: the default value is 1, and it must be a positive integer. For LAG() it is the number of rows to go back from the current row; for LEAD() it is the number of rows to go forward from the current row.
  • default: the value to return when the offset falls outside the partition. If default is not specified, NULL is returned. default can be a column, subquery, or other expression, but its type must be compatible with scalar_expression.

For results ordered by date, these two functions are particularly well suited to computing year-over-year and period-over-period comparisons.

select Department ,LastName,Rate
    ,lag(Rate,1,0) over(partition by Department order by LastName) as LastRate
    ,lead(Rate,1,0) over(partition by Department order by LastName) as NextRate
from #data order by Department ,LastName

The rows are partitioned by Department. Taking the Document Control group as an example:

  • First row: for the LastRate column no preceding row exists, so the value of the default parameter (0) is returned; NextRate is the Rate of the second row.
  • Second row: LastRate is the Rate of the first row, and NextRate is the Rate of the third row. The rows in the middle follow the same pattern.
  • Last row: LastRate is the Rate of the second-to-last row; for the NextRate column no following row exists, so the value of the default parameter (0) is returned.
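The year-over-year and period-over-period use mentioned above can be sketched as follows. The table variable @sales, its columns, and its four rows are hypothetical; the sketch assumes one row per month and compares each month with the previous month of the same year and with the same month of the previous year.

```sql
-- Hypothetical monthly totals for illustration only.
declare @sales table (SalesMonth date, Amount decimal(10,2));
insert into @sales values
    ('20230101', 100.00), ('20230201', 120.00),
    ('20240101', 150.00), ('20240201',  90.00);

select SalesMonth, Amount
    -- period-over-period: previous month within the same year
    ,lag(Amount) over(partition by year(SalesMonth) order by SalesMonth) as PrevMonth
    ,Amount - lag(Amount) over(partition by year(SalesMonth) order by SalesMonth) as ChangeVsPrevMonth
    -- year-over-year: same calendar month, previous year
    ,lag(Amount) over(partition by month(SalesMonth) order by year(SalesMonth)) as SameMonthLastYear
    ,Amount - lag(Amount) over(partition by month(SalesMonth) order by year(SalesMonth)) as ChangeVsLastYear
from @sales
order by SalesMonth;
-- For 2024-02: PrevMonth = 150.00, ChangeVsPrevMonth = -60.00,
--              SameMonthLastYear = 120.00, ChangeVsLastYear = -30.00.
```

No default is passed to LAG() here, so rows without a comparison row return NULL rather than a misleading 0.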

4. FIRST_VALUE and LAST_VALUE

These functions return the value from the row ranked first and from the row ranked last within the group:

LAST_VALUE ( [ scalar_expression ] ) OVER ( [ partition_by_clause ] order_by_clause [ rows_range_clause ] )
FIRST_VALUE ( [ scalar_expression ] ) OVER ( [ partition_by_clause ] order_by_clause [ rows_range_clause ] )
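A short sketch against the #data table above shows why the rows_range_clause matters for LAST_VALUE. When ORDER BY is given without a frame, the default frame ends at the current row (including rows tied with it), so LAST_VALUE does not look at the rest of the partition. The column aliases are illustrative.

```sql
select Department, LastName, Rate
    ,first_value(Rate) over(partition by Department order by Rate) as LowestRate
    -- default frame stops at the current row's peers: this just returns Rate
    ,last_value(Rate) over(partition by Department order by Rate) as LastValueDefaultFrame
    -- explicit frame covering the whole partition: the highest Rate
    ,last_value(Rate) over(partition by Department order by Rate
        rows between unbounded preceding and unbounded following) as HighestRate
from #data
order by Department, Rate;
-- Document Control:     LowestRate = 10.25, HighestRate = 17.78
-- Information Services: LowestRate = 27.40, HighestRate = 50.48
```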

Origin www.cnblogs.com/springsnow/p/12426959.html