How to get all items that has added action but wasn't deleted

Nikita Lvov :

I have an Activity table in which user's activity on actions is stored. There is two type of actions in column type that are required to check:

  • add_favorite
  • remove_favorite

So same object can be added or removed from favorites multiple times so table has records like:

| id | type            | timestamp  | object_id | user  |
|----|-----------------|------------|-----------|-------|
| 1  | add_favorite    | 1584150783 | 4         | user1 |
| 2  | add_favorite    | 1584151341 | 5         | user1 |
| 3  | remove_favorite | 1584161259 | 4         | user1 |
| 4  | add_favorite    | 1584168758 | 4         | user1 |
| 5  | remove_favorite | 1584171635 | 10        | user1 |
| 6  | add_favorite    | 1584174201 | 11        | user1 |
| 7  | remove_favorite | 1584177194 | 5         | user1 |

First I was constructing two queries that was collecting all object_id's with add_favorite actions for the user in array. Then I was filtering this array with another one set of object_id's which had type of action removed_favorite

SELECT object_id FROM activity WHERE user='user1' AND type='add_favorite';
SELECT object_id FROM activity WHERE user='user1' AND type='remove_favorite';

But this is highly ineffective on large number of activities. Please advise me query with LEFT JOIN which would be more effective and will give the end result

Gordon Linoff :

You can simplify your query to:

SELECT object_id
FROM activity
WHERE user = 'user1' AND
      type IN ('add_favorite', 'remove_favorite');

Then I would recommend aggregation:

SELECT object_id
FROM activity
WHERE user = 'user1' AND
      type IN ('add_favorite', 'remove_favorite')
GROUP BY object_id
HAVING SUM(type = 'add_favorite') > SUM(type = 'remove_favorite');

This gets the objects that have net more adds than removes.

An alternative is to look at the last action and only choose those that are adds:

SELECT a.object_id
FROM activity a
WHERE a.user = 'user1' AND
      a.timestamp = (SELECT MAX(a2.timestamp)
                     FROM activity a2
                     WHERE a2.user = a.user AND
                           a2.type IN ('add_favorite', 'remove_favorite')
                    ) AND
      a.type = 'add_favorite'

With an index on activity(user, type, timestamp), this probably has better performance.

Guess you like

Origin http://43.154.161.224:23101/article/api/json?id=336926&siteId=1