Custom Operators in PostgreSQL

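PostgreSQL's CREATE OPERATOR command lets users define their own operators for the data types they work with. The operator itself does not do the work, though: it is a name bound to an underlying function, and evaluating the operator calls that function. For example: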

bill=# \do+ @>
                                          List of operators
   Schema   | Name | Left arg type | Right arg type | Result type |      Function       | Description 
------------+------+---------------+----------------+-------------+---------------------+-------------
 pg_catalog | @>   | aclitem[]     | aclitem        | boolean     | aclcontains         | contains
 pg_catalog | @>   | anyarray      | anyarray       | boolean     | arraycontains       | contains
 pg_catalog | @>   | anyrange      | anyelement     | boolean     | range_contains_elem | contains
 pg_catalog | @>   | anyrange      | anyrange       | boolean     | range_contains      | contains
 pg_catalog | @>   | box           | box            | boolean     | box_contain         | contains
 pg_catalog | @>   | box           | point          | boolean     | box_contain_pt      | contains
 pg_catalog | @>   | circle        | circle         | boolean     | circle_contain      | contains
 pg_catalog | @>   | circle        | point          | boolean     | circle_contain_pt   | contains
 pg_catalog | @>   | jsonb         | jsonb          | boolean     | jsonb_contains      | contains
 pg_catalog | @>   | path          | point          | boolean     | path_contain_pt     | contains
 pg_catalog | @>   | polygon       | point          | boolean     | poly_contain_pt     | contains
 pg_catalog | @>   | polygon       | polygon        | boolean     | poly_contain        | contains
 pg_catalog | @>   | tsquery       | tsquery        | boolean     | tsq_mcontains       | contains
(13 rows)

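As the listing shows, applying @> to two arrays actually calls the arraycontains function.
This post focuses on a few things to keep in mind when defining an operator. First, the syntax: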

CREATE OPERATOR name (
    {FUNCTION|PROCEDURE} = function_name
    [, LEFTARG = left_type ] [, RIGHTARG = right_type ]
    [, COMMUTATOR = com_op ] [, NEGATOR = neg_op ]
    [, RESTRICT = res_proc ] [, JOIN = join_proc ]
    [, HASHES ] [, MERGES ]
)
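
As a minimal sketch of this syntax, the following defines an operator on integers. The function int4_near and the operator name =~= are made up for illustration and are not part of PostgreSQL. Because the test is symmetric, the operator is declared as its own commutator.

```sql
-- Hypothetical function: true when two integers differ by at most 1.
CREATE FUNCTION int4_near(a integer, b integer) RETURNS boolean
    LANGUAGE sql IMMUTABLE STRICT
    AS $$ SELECT abs(a - b) <= 1 $$;

-- The operator is only a name bound to int4_near.
-- "Nearness" is symmetric, so =~= is listed as its own commutator.
CREATE OPERATOR =~= (
    FUNCTION   = int4_near,
    LEFTARG    = integer,
    RIGHTARG   = integer,
    COMMUTATOR = =~=
);

SELECT 3 =~= 4 AS near, 3 =~= 9 AS far;   -- near is true, far is false
```

Running \do+ =~= in psql afterwards should list int4_near in the Function column, in the same way that arraycontains is listed for @>.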

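Let's go through the parameters one by one.
First, COMMUTATOR. It names the operator that gives the same result when the two operands are swapped. For example, 2>1 is equivalent to 1<2 because > and < are commutators of each other. This matters for index scans: an index condition needs the indexed column on the left side of the operator, as in col1>10. Writing 10<col1 can still use the index, because the planner uses the commutator to rewrite it as col1>10.
Let's verify this with the > and < operators on int4: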

bill=# select oprcom::regoper from pg_operator where oprname='>' and oprcode='int4gt'::regproc; 
    oprcom    
--------------
 pg_catalog.<
(1 row)

bill=# select oprcom::regoper from pg_operator where oprname='<' and oprcode='int4lt'::regproc;  
    oprcom    
--------------
 pg_catalog.>
(1 row)

Now look at the OIDs behind oprcom. The OID of > is 521 and its oprcom is 97, which is exactly the OID of <; conversely, the oprcom of < is 521.

bill=# select * from pg_operator where oprname='>' and oprcode='int4gt'::regproc;
 oid | oprname | oprnamespace | oprowner | oprkind | oprcanmerge | oprcanhash | oprleft | oprright | oprresult | oprcom | oprnegate | oprcode |   oprrest   |     oprjoin     
-----+---------+--------------+----------+---------+-------------+------------+---------+----------+-----------+--------+-----------+---------+-------------+-----------------
 521 | >       |           11 |       10 | b       | f           | f          |      23 |       23 |        16 |     97 |       523 | int4gt  | scalargtsel | scalargtjoinsel
(1 row)

bill=# select * from pg_operator where oprname='<' and oprcode='int4lt'::regproc;  
 oid | oprname | oprnamespace | oprowner | oprkind | oprcanmerge | oprcanhash | oprleft | oprright | oprresult | oprcom | oprnegate | oprcode |   oprrest   |     oprjoin     
-----+---------+--------------+----------+---------+-------------+------------+---------+----------+-----------+--------+-----------+---------+-------------+-----------------
  97 | <       |           11 |       10 | b       | f           | f          |      23 |       23 |        16 |    521 |       525 | int4lt  | scalarltsel | scalarltjoinsel
(1 row)
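
The same relationship can be read without comparing OIDs by hand. As a small sketch, casting the OID columns to regoperator prints each operator together with its argument types. The column aliases and the filter are just one way to write the query.

```sql
-- For the int4 comparison operators, show each operator next to its commutator.
SELECT oid::regoperator    AS op,
       oprcom::regoperator AS commutator
FROM pg_operator
WHERE oprleft  = 'int4'::regtype
  AND oprright = 'int4'::regtype
  AND oprname IN ('<', '>', '<=', '>=')
ORDER BY oprname;
```

A commutator value of 0 means that no commutator is recorded for that operator.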

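Next, we break the commutator relationship by updating pg_operator directly and setting oprcom to 0. (Editing a system catalog by hand requires superuser privileges and belongs on a test instance only.)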

bill=# update pg_operator set oprcom=0 where oprname='>' and oprcode='int4gt'::regproc;  
UPDATE 1
bill=# update pg_operator set oprcom=0 where oprname='<' and oprcode='int4lt'::regproc; 
UPDATE 1

Now let's rerun the index example from above:

bill=# explain select * from t1 where id > 10;
                             QUERY PLAN                              
---------------------------------------------------------------------
 Index Scan using idx_t1 on t1  (cost=0.15..21.86 rows=423 width=36)
   Index Cond: (id > 10)
(2 rows)

bill=# explain select * from t1 where 10 < id;
                      QUERY PLAN                      
------------------------------------------------------
 Seq Scan on t1  (cost=0.00..25.88 rows=423 width=36)
   Filter: (10 < id)
(2 rows)

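The first query uses an index scan, while the second falls back to a sequential scan: without a commutator, the planner cannot rewrite 10 < id as id > 10.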

Now restore the commutator relationship between the two operators:

bill=# update pg_operator set oprcom=521 where oprname='<' and oprcode='int4lt'::regproc;  
UPDATE 1
bill=# update pg_operator set oprcom=97 where oprname='>' and oprcode='int4gt'::regproc;  
UPDATE 1
bill=# explain select * from t1 where 10 < id;
                             QUERY PLAN                              
---------------------------------------------------------------------
 Index Scan using idx_t1 on t1  (cost=0.15..21.86 rows=423 width=36)
   Index Cond: (id > 10)
(2 rows)

With the commutator back in place, the condition 10 < id uses the index again, and the plan shows it rewritten as id > 10.
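
The session does not show how t1 was created. To reproduce the experiment, a setup along the following lines should be enough. The column list is an assumption; only the names t1, idx_t1, and id appear in the plans.

```sql
-- Assumed definition: the article never shows t1's DDL.
CREATE TABLE t1 (id integer, info text);
CREATE INDEX idx_t1 ON t1 (id);

-- Constant on the left: the planner needs the commutator of < to turn
-- this into an index condition on id.
EXPLAIN SELECT * FROM t1 WHERE 10 < id;
```

Whether the planner actually picks the index depends on the table's contents and statistics, so the costs and row estimates may differ from those shown above.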

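Now for the remaining parameters.
NEGATOR: names the operator that returns the opposite result for the same operands. For example, x != y is equivalent to NOT (x = y), so != is the negator of =. This lets the planner simplify an expression such as NOT (x = y) into x != y.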

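RESTRICT: the restriction selectivity estimation function, which estimates what fraction of rows satisfies a condition of the form column OP constant (scalargtsel for int4 > above). It applies only to binary operators that return boolean.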

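JOIN: the join selectivity estimation function (scalargtjoinsel for int4 > above), which estimates the selectivity of a join condition that uses the operator.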

HASHES and MERGES: declare that the operator can support hash join and merge join, respectively. Both apply only to binary operators that return boolean.
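
These settings can be inspected on a built-in operator. The sketch below looks up integer equality in pg_operator and then asks the planner to explain a negated condition. The temporary table neg_demo is a made-up name used only for this illustration.

```sql
-- Negator, selectivity estimators, and join flags of the int4 = operator.
SELECT oprnegate::regoperator AS negator,
       oprrest                AS restrict_fn,
       oprjoin                AS join_fn,
       oprcanhash             AS hashes,
       oprcanmerge            AS merges
FROM pg_operator
WHERE oprcode = 'int4eq'::regproc;

-- Hypothetical scratch table to watch the negator being applied.
CREATE TEMP TABLE neg_demo (id integer);
-- The planner can replace NOT (id > 10) with the negator of >,
-- so the filter is expected to be displayed as (id <= 10).
EXPLAIN (COSTS OFF) SELECT * FROM neg_demo WHERE NOT (id > 10);
```

For int4 =, both flags should come back true, whereas the > and < rows shown earlier have oprcanmerge and oprcanhash set to f.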