Correlated MySQL subquery to Postgre

Muhammad Hashir Anwaar :

MySQL support Correlated Subqueries and also working properly to fecth records from my db ...but i need same query to run on AWS Redshift to bring same records but it gives error of This type of correlated subquery pattern is not supported yet

MYSQL Query

SELECT IFNULL((SELECT IF (c.c_id,'',0) FROM customer c WHERE c.c_id = o.c_id AND DATE (c.date) >= DATE ('2020-01-24') AND DATE (c.date) <= DATE ('2020-02-22')),'BCK') AS cstatus,
   IF (o.os_id = 0,'Missing','Place') AS 'os',
   IFNULL(COUNT(DISTINCT o.c_id),0) AS 'customers',
   IFNULL(COUNT(o.o_id),0) AS 'orders',
   IFNULL(SUM(o.total),0) AS 'cell'
FROM order o
WHERE 1
AND   DATE (o.date) >= DATE ('2020-01-24')
AND   DATE (o.date) <= DATE ('2020-02-22')
GROUP BY cstatus,
     ostatus
ORDER BY cstatus,
     ostatus DESC

Postgresql Query

SELECT COALESCE((SELECT CASE WHEN c.c_id THEN NULL ELSE 0 END FROM customer c WHERE c.c_id = o.c_id AND DATE (c.date) >= DATE ('2020-01-24') AND DATE (c.date) <= DATE ('2020-02-22')),'BCK') AS cstatus,
   CASE WHEN o.os_id = 0 THEN 'Missing' ELSE 'Place' END AS ostatus,
   COALESCE(COUNT(DISTINCT o.c_id),0) AS customers,
   COALESCE(COUNT(o.o_id),0) AS orders,
   COALESCE(SUM(o.total),0) AS cell
FROM order o
WHERE 1
AND   DATE (o.date) >= DATE ('2020-01-24')
AND   DATE (o.date) <= DATE ('2020-02-22')
GROUP BY cstatus,
     ostatus
ORDER BY cstatus,
     ostatus DESC

This postgre query gives me 2 errors. 1- at 'BCK' , 2- correlated subquery error. (For 'BCK' if i put integer it works, but i need string)

expected result will be like in attached image.

enter image description here

How can i handle this issue in AWS Redshift

Gordon Linoff :

Your code would be much easier to follow with explanations on what it is trying to do. The code looks suspicious, simply because it is a scalar subquery, but there is no guarantee that it never returns more than one row (say by using LIMIT or aggregation).

But, it appears to, suggesting that there is at most one customer per order. With that assumption, you can replace the logic with some sort of JOIN without affecting the calculations:

SELECT (CASE WHEN c.c_id IS NOT NULL THEN 'BCK' END) AS cstatus,
       (CASE WHEN o.os_id = 0 THEN 'Missing' ELSE 'Place' END) AS os,
       COUNT(DISTINCT o.c_id) AS customers,
       COUNT(o.o_id) AS orders,
       COALESCE(SUM(o.total), 0) AS cell
FROM order o LEFT JOIN
     customer c
     ON c.c_id = o.c_id AND
        c.date >= DATE('2020-01-24') AND
        c.date < DATE ('2020-02-23')
WHERE o.date >= DATE('2020-01-24') AND
      o.date < DATE('2020-02-23')
GROUP BY cstatus, ostatus
ORDER BY cstatus, ostatus DESC;

Note other changes to the query:

  • The date comparisons work for both times and dates by using < for the second comparison. This can be important for optimization and partition pruning (depending on the database).
  • COUNT() and COUNT(DISTINCT) never return NULL values, so replacing NULLs with 0 is just wasted code.
  • Use single quotes only for string and date constants. Do not use them -- in any database -- for column aliases.

Guess you like

Origin http://10.200.1.11:23101/article/api/json?id=5627&siteId=1