How to count distinct values from two columns into one number (follow-up)

Aurel Drejta :

This is a follow-up question to "How to count distinct values from two columns into one number".

There I asked only about the counting part and neglected to mention that I am already joining some other tables into the query.

The answer given on the previous question is the correct one for that case.

Here's my additional problem now.

I have 3 tables:

Assignments

+----+-------------------+
| id |       name        |
+----+-------------------+
| 1  | first-assignment  |
| 2  | second-assignment |
+----+-------------------+

Submissions

+----+---------------+------------+
| id | assignment_id | student_id |
+----+---------------+------------+
|  1 |             1 |          2 |
|  2 |             2 |          1 |
|  3 |             1 |          3 |
+----+---------------+------------+

Group_submissions

+----+---------------+------------+
| id | submission_id | student_id |
+----+---------------+------------+
| 1  |             1 |          1 |
| 2  |             2 |          2 |
+----+---------------+------------+

Each submission belongs to an assignment.

Submissions can be individual submissions or group submissions.

For an individual submission, the student who submitted it for an assignment (assignment_id) is recorded in the submissions table (student_id).

For a group submission the same thing happens, with two additional details:

  1. The student who makes the submission goes into the submissions table.
  2. The other group members go into the group_submissions table and are associated with the id in the submissions table (so submission_id is a FK to the submissions table).

I want to return every assignment with its columns, plus the number of students who have made submissions for that assignment. Keep in mind that students who have not made a submission themselves (are not in the submissions table) but have participated in a group submission (are in the group_submissions table) also count.

Something like this:

+----+-------------------+----------+
| id |       name        | students |
+----+-------------------+----------+
| 1  | first-assignment  |       11 |
| 2  | second-assignment |        2 |
+----+-------------------+----------+

I tried 2 ways of getting the numbers:

count(distinct case
    when group_submissions.student_id is not null then group_submissions.student_id
    when assignment_submissions.student_id is not null then assignment_submissions.student_id
end)

This doesn't work because the CASE expression short-circuits: once the first condition is met, the remaining branches are never evaluated, so each joined row contributes at most one student id. For example, a student who has only taken part in group submissions and never submitted directly appears only in the group_submissions table. If a joined row has student 1 from the submissions table and student 2 from the group_submissions table, the CASE returns only one of the two ids (the group_submissions one, as written), and the other student is not counted unless they happen to show up in some other row.

count(distinct case when group_submissions.student_id is not null then group_submissions.student_id end) 
+ count(distinct case when submissions.student_id is not null then submissions.student_id end)

This one doesn't work because a student who appears in both tables is counted twice.
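
Both failure modes can be reproduced on a small dataset. The sketch below is an illustration under several assumptions, not the original query: it uses Python's built-in sqlite3 module as a stand-in for MySQL, it guesses that the three tables are combined with two LEFT JOINs (the question does not show the join), and it adds one hypothetical Group_submissions row (student 2 credited on submission 3) so that a student appears in both tables for the same assignment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Assignments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Submissions (id INTEGER PRIMARY KEY, assignment_id INTEGER, student_id INTEGER);
    CREATE TABLE Group_submissions (id INTEGER PRIMARY KEY, submission_id INTEGER, student_id INTEGER);
    INSERT INTO Assignments VALUES (1, 'first-assignment'), (2, 'second-assignment');
    INSERT INTO Submissions VALUES (1, 1, 2), (2, 2, 1), (3, 1, 3);
    -- Rows 1 and 2 come from the question; row 3 is hypothetical.
    INSERT INTO Group_submissions VALUES (1, 1, 1), (2, 2, 2), (3, 3, 2);
""")

# Assumed join shape (not shown in the question).
JOINS = """
    FROM Assignments AS a
    LEFT JOIN Submissions AS s ON s.assignment_id = a.id
    LEFT JOIN Group_submissions AS g ON g.submission_id = s.id
    GROUP BY a.id
    ORDER BY a.id
"""

# Attempt 1: a single CASE returns only one student id per joined row.
attempt_1 = conn.execute("""
    SELECT a.id, COUNT(DISTINCT CASE
        WHEN g.student_id IS NOT NULL THEN g.student_id
        WHEN s.student_id IS NOT NULL THEN s.student_id END)
""" + JOINS).fetchall()

# Attempt 2: two separate distinct counts, added together.
attempt_2 = conn.execute("""
    SELECT a.id,
           COUNT(DISTINCT CASE WHEN g.student_id IS NOT NULL THEN g.student_id END)
         + COUNT(DISTINCT CASE WHEN s.student_id IS NOT NULL THEN s.student_id END)
""" + JOINS).fetchall()

print(attempt_1)  # [(1, 2), (2, 1)]
print(attempt_2)  # [(1, 4), (2, 2)]
```

With this data the distinct students are 1, 2 and 3 for assignment 1 and 1 and 2 for assignment 2, so the expected counts are 3 and 2. The first attempt undercounts both assignments, and the second overcounts assignment 1 because student 2 is in both tables.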

NOTE: This is a MySQL database

Uueerdo :

Since you can't change the data, you'll need to use a UNION subquery, and then aggregate over that.

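-- Collect every (assignment, student) pair, whether the student submitted
-- directly or was credited through a group submission. UNION (without ALL)
-- already removes duplicate pairs.
SELECT a.id, a.name, COUNT(DISTINCT x.student_id) AS students
FROM Assignments AS a
LEFT JOIN (
   -- students who made the submission themselves
   SELECT assignment_id, student_id FROM Submissions
   UNION
   -- students credited via a group submission
   SELECT s.assignment_id, g.student_id
   FROM Submissions AS s
   INNER JOIN Group_submissions AS g ON s.id = g.submission_id
) AS x ON a.id = x.assignment_id
GROUP BY a.id, a.name;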
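
As a quick check, the query can be run against the three sample tables from the question. This is only a sketch: it uses Python's built-in sqlite3 module in place of MySQL (the query uses no MySQL-specific syntax, but the substitution is still an assumption), and the column types and the added ORDER BY are choices made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample rows copied from the question; column types are assumed.
conn.executescript("""
    CREATE TABLE Assignments (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Submissions (id INTEGER PRIMARY KEY, assignment_id INTEGER, student_id INTEGER);
    CREATE TABLE Group_submissions (id INTEGER PRIMARY KEY, submission_id INTEGER, student_id INTEGER);
    INSERT INTO Assignments VALUES (1, 'first-assignment'), (2, 'second-assignment');
    INSERT INTO Submissions VALUES (1, 1, 2), (2, 2, 1), (3, 1, 3);
    INSERT INTO Group_submissions VALUES (1, 1, 1), (2, 2, 2);
""")

# The UNION subquery lists each (assignment, student) pair once,
# and the outer query counts the students per assignment.
rows = conn.execute("""
    SELECT a.id, a.name, COUNT(DISTINCT x.student_id) AS students
    FROM Assignments AS a
    LEFT JOIN (
        SELECT assignment_id, student_id FROM Submissions
        UNION
        SELECT s.assignment_id, g.student_id
        FROM Submissions AS s
        INNER JOIN Group_submissions AS g ON s.id = g.submission_id
    ) AS x ON a.id = x.assignment_id
    GROUP BY a.id, a.name
    ORDER BY a.id
""").fetchall()

print(rows)  # [(1, 'first-assignment', 3), (2, 'second-assignment', 2)]
```

With the sample rows, assignment 1 counts students 1, 2 and 3, and assignment 2 counts students 1 and 2. Because of the LEFT JOIN, an assignment with no submissions at all would still be listed, with a count of 0.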

Edit: vhu's first part is better, as long as the following cannot happen: assignment X is submitted by student Y with a group_submission credit for student Z, and there is another submission for assignment X that is either submitted directly by student Z or carries a group_submission credit for student Y (because then they would be counted twice).
