Google SQL

1. Table views: dt | user_id | views

1.1. Find the active users of the current month

select distinct user_id from views
where date_format(dt, '%Y-%m') = date_format(curdate(), '%Y-%m') and views > 0;
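A minimal sketch of the same query, run against an in-memory SQLite database so it can be tried locally. It assumes SQLite's strftime in place of MySQL's date_format, and it takes the reference date as a parameter instead of calling curdate() so the result is reproducible. The sample rows and the helper name active_users_of_month are made up for illustration.

```python
import sqlite3


def active_users_of_month(conn, today):
    """Return user_ids with at least one view in the month containing `today`."""
    rows = conn.execute(
        """
        SELECT DISTINCT user_id
        FROM views
        WHERE strftime('%Y-%m', dt) = strftime('%Y-%m', ?)
          AND views > 0
        ORDER BY user_id
        """,
        (today,),
    ).fetchall()
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (dt TEXT, user_id INTEGER, views INTEGER)")
conn.executemany(
    "INSERT INTO views VALUES (?, ?, ?)",
    [
        ("2024-03-01", 1, 5),
        ("2024-03-15", 2, 0),  # zero views: not counted as active
        ("2024-03-20", 3, 2),
        ("2024-02-28", 4, 7),  # previous month
        ("2024-03-21", 1, 1),  # same user again: DISTINCT keeps one row
    ],
)

result = active_users_of_month(conn, "2024-03-25")
print(result)  # [1, 3]
```

Wrapping dt in a function usually prevents an index on dt from being used; on a large table a range condition (dt >= first day of the month and dt < first day of the next month) is the usual alternative.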

1.2. Find the rolling 7-day active viewers: for each day, count the users who viewed a video within the most recent 7 days (that day included)

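-- use interval arithmetic: in MySQL, dt - 7 treats the date as a number and breaks across month boundaries
select v1.dt, count(distinct v2.user_id) as active_users from views v1
join views v2 on v2.dt > v1.dt - interval 7 day and v2.dt <= v1.dt
where v2.views > 0
group by v1.dt
order by v1.dt;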
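The rolling window can be checked on a small data set. The sketch below is an assumption-laden port to SQLite: date(d.dt, '-7 days') stands in for MySQL date arithmetic, the distinct dates are taken from a subquery so each day appears once on the left side of the join, and the sample rows are invented to cross a month boundary.

```python
import sqlite3

ROLLING_SQL = """
SELECT d.dt, COUNT(DISTINCT v.user_id) AS active_users
FROM (SELECT DISTINCT dt FROM views) AS d
JOIN views AS v
  ON v.dt > date(d.dt, '-7 days')   -- window is (dt - 7 days, dt]
 AND v.dt <= d.dt
WHERE v.views > 0
GROUP BY d.dt
ORDER BY d.dt
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (dt TEXT, user_id INTEGER, views INTEGER)")
conn.executemany(
    "INSERT INTO views VALUES (?, ?, ?)",
    [
        ("2024-01-30", 1, 3),
        ("2024-02-01", 2, 1),
        ("2024-02-05", 1, 0),  # zero views: ignored
        ("2024-02-06", 3, 4),
        ("2024-02-08", 2, 2),
    ],
)

rolling = conn.execute(ROLLING_SQL).fetchall()
for dt, active_users in rolling:
    print(dt, active_users)
```

On 2024-02-06 the window starts after 2024-01-30, so user 1's January view has dropped out and only users 2 and 3 are counted. A day whose whole window contains no row with views > 0 would not appear in the output at all, because the join is an inner join.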

Q9 - When joining multiple tables is slow, what do you do?

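Check the execution plan to find the bottleneck.

Filter or aggregate each table in a subquery before the join, so that the join processes fewer rows.

Add indexes on the columns used in the join and filter conditions.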

If we are sure every record in table1 has a match in table2, we should use an inner join instead of a left join.

If we need a left join, it helps when the right table is the smaller one, because in a nested-loop join the engine has to search the right table for every record on the left side. An index on the right table's join column keeps each of those lookups cheap.

Assume we have three tables: table 1 and table 3 are large and table 2 is small. We can join table 1 with table 2 first, and then join table 3 on t2.id instead of t1.id. That way we only pull rows out of t3 when t1 and t2 already match, so a filter on t2.col1 reduces the number of rows visited in t3.
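The query shape described here can be sketched as follows. The table layouts, the name and payload columns, and the 'keep' filter value are hypothetical, and SQLite is used only so the example runs anywhere. The sketch checks that keying t3 on t2.id returns the same rows as keying it on t1.id for inner joins. Whether it is actually faster depends on the engine, since many optimizers choose the inner-join order themselves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT);     -- large in practice
CREATE TABLE t2 (id INTEGER PRIMARY KEY, col1 TEXT);     -- small
CREATE TABLE t3 (id INTEGER PRIMARY KEY, payload TEXT);  -- large in practice
INSERT INTO t1 VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
INSERT INTO t2 VALUES (1, 'keep'), (2, 'drop'), (3, 'keep');
INSERT INTO t3 VALUES (1, 'p1'), (2, 'p2'), (3, 'p3'), (4, 'p4');
""")

# {key} is the column t3 is joined on: t2.id (suggested) or t1.id (baseline).
QUERY = """
SELECT t1.id, t1.name, t3.payload
FROM t1
JOIN t2 ON t2.id = t1.id
JOIN t3 ON t3.id = {key}
WHERE t2.col1 = 'keep'
ORDER BY t1.id
"""

via_t2 = conn.execute(QUERY.format(key="t2.id")).fetchall()
via_t1 = conn.execute(QUERY.format(key="t1.id")).fetchall()

print(via_t2)  # [(1, 'a', 'p1'), (3, 'c', 'p3')]
```

Only ids 1 and 3 survive the t2.col1 filter, so those are the only rows that need to be looked up in t3.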

We can put EXPLAIN at the beginning of any (working) query to see its execution plan and get a sense of how expensive it will be. This is most useful iteratively: run EXPLAIN on a query, rewrite the steps that are expensive, then run EXPLAIN again to check whether the cost has gone down.
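The explain, change, explain-again loop can be sketched with SQLite's EXPLAIN QUERY PLAN, which plays a similar role to MySQL's EXPLAIN but prints a different format. The users and orders tables, the index name idx_orders_user_id, and the plan helper are assumptions made for this example. The exact plan wording varies between SQLite versions, so no expected output is shown.

```python
import sqlite3


def plan(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
""")

QUERY = """
SELECT u.name, o.amount
FROM users AS u
JOIN orders AS o ON o.user_id = u.id
WHERE u.id = ?
"""

# Step 1: inspect the plan before any tuning.
plan_before = plan(conn, QUERY, (1,))

# Step 2: add an index on the join column, then inspect the plan again.
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
plan_after = plan(conn, QUERY, (1,))

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists, orders has to be scanned. Afterwards the plan should name idx_orders_user_id for the lookup into orders, which is the kind of change to look for when comparing two EXPLAIN runs.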
