I believe there should be a simple way to do this but I'm stuck here! I've read 20+ pages on here and elsewhere but couldn't find what I want.
I have a single table with a few thousand rows and tens of columns, but to simplify, let's imagine the following table, qu, is my data:
+----+-----+-----+------+
| id | PID | AID | Data |
+----+-----+-----+------+
| 1 | A | 1 | 56 |
| 2 | A | 2 | 234 |
| 3 | B | 1 | 23 |
| 4 | B | 2 | 78 |
| 5 | B | 3 | 65 |
| 6 | C | 2 | 89 |
| 7 | C | 3 | 74 |
+----+-----+-----+------+
I want to have a query that generates the following results:
+-----+-----+------+
| PID | AID | Data |
+-----+-----+------+
| A | 1 | 56 |
| A | 2 | 234 |
| A | 3 | NULL |
| B | 1 | 23 |
| B | 2 | 78 |
| B | 3 | 65 |
| C | 1 | NULL |
| C | 2 | 89 |
| C | 3 | 74 |
+-----+-----+------+
Basically, I want the query to fill in all the missing AID values for each PID and add NULL or NA for their Data values. I can achieve this with loops outside MySQL, but it's very slow because I need to run an individual query for every single PID and AID combination to get the Data value.
Here is one of my latest tries with no success!
SELECT
*
FROM
(
SELECT
`id`,
`PID`
FROM
`qu`
GROUP BY
`PID`
) `a`
LEFT OUTER JOIN(
SELECT
`id`,
`AID`
FROM
`qu`
GROUP BY
`AID`
) `b`
ON
`a`.`id` = `b`.`id`
LEFT OUTER JOIN `qu` `c` ON
`a`.`id` = `c`.`id` AND `b`.`id` = `c`.`id`
You may use a calendar table approach here:
SELECT
    q1.PID,
    q2.AID,
    q3.Data    -- or use COALESCE(q3.Data, 'NA') AS Data
FROM (SELECT DISTINCT PID FROM qu) q1
CROSS JOIN (SELECT DISTINCT AID FROM qu) q2
LEFT JOIN qu q3
    ON q3.PID = q1.PID AND
       q3.AID = q2.AID
ORDER BY
    q1.PID,
    q2.AID;
The idea here is that we generate all possible combinations of PID and AID using the cross join between the distinct subqueries aliased as q1 and q2. Then, we left join with your actual qu table, which either brings in the data if available, or else NULL if not available.
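To check the cross join plus left join idea against the sample data from the question, here is a minimal sketch. It makes some assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, the column types are guessed from the sample rows, and the names ROWS, QUERY, and fill_missing_pairs are made up for this illustration. The SQL itself is the same query as above.

```python
import sqlite3

# Sample rows copied from the question: (id, PID, AID, Data)
ROWS = [
    (1, "A", 1, 56),
    (2, "A", 2, 234),
    (3, "B", 1, 23),
    (4, "B", 2, 78),
    (5, "B", 3, 65),
    (6, "C", 2, 89),
    (7, "C", 3, 74),
]

# Same query as in the answer: every PID paired with every AID,
# then a left join back to qu to pick up Data where a row exists.
QUERY = """
SELECT q1.PID, q2.AID, q3.Data
FROM (SELECT DISTINCT PID FROM qu) q1
CROSS JOIN (SELECT DISTINCT AID FROM qu) q2
LEFT JOIN qu q3
    ON q3.PID = q1.PID AND q3.AID = q2.AID
ORDER BY q1.PID, q2.AID
"""


def fill_missing_pairs():
    """Build the sample table in memory and return the filled-in rows."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for MySQL
    try:
        # Column types are an assumption based on the sample data.
        conn.execute(
            "CREATE TABLE qu (id INTEGER PRIMARY KEY, PID TEXT, AID INTEGER, Data INTEGER)"
        )
        conn.executemany("INSERT INTO qu VALUES (?, ?, ?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in fill_missing_pairs():
        print(row)  # missing combinations show Data as None (SQL NULL)
```

With 3 distinct PID values and 3 distinct AID values, the cross join yields 9 rows, and the two combinations absent from qu (A/3 and C/1) come back with NULL for Data. Note that an AID that appears in no row at all cannot be generated this way; in that case the list of AID values would have to come from somewhere else, such as a separate lookup table.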