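In MySQL, to get the maximum value of a column within each group after a GROUP BY, we can use the MAX() function directly.
Suppose we have a table that records user logins, with the following structure: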
CREATE TABLE `tb` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`uid` int(11) NOT NULL,
`login_time` datetime,
PRIMARY KEY (`id`));
Let's insert some test data:
INSERT INTO tb SELECT null, 1001, '2017-01-21 16:30:47';
INSERT INTO tb SELECT null, 1003, '2017-01-21 19:30:51';
INSERT INTO tb SELECT null, 1001, '2017-01-21 16:50:41';
INSERT INTO tb SELECT null, 1002, '2017-01-21 18:30:21';
INSERT INTO tb SELECT null, 1002, '2017-01-21 19:12:32';
INSERT INTO tb SELECT null, 1001, '2017-01-21 19:53:09';
INSERT INTO tb SELECT null, 1001, '2017-01-21 19:55:34';
To find each user's most recent login time, we can write:
SELECT uid, max(login_time)
FROM tb
GROUP BY uid;
Now suppose we also want the id column in addition to uid. If we simply write:
SELECT id,uid, max(login_time)
FROM tb
GROUP BY uid;
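MySQL may report an error: ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'zhenxi.tb.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
When sql_mode includes ONLY_FULL_GROUP_BY, MySQL applies strict rules to grouped queries: every column in the SELECT list that is not inside an aggregate function must either appear in the GROUP BY clause or be functionally dependent on the GROUP BY columns. Here id satisfies neither condition: it is not listed in GROUP BY, and a given uid does not determine a single id, so the query is rejected.
If we disable ONLY_FULL_GROUP_BY, the statement runs without error, but the value in the id column is indeterminate: MySQL is free to return the id of any row in the group, not necessarily the row that holds the maximum login_time. The result is misleading to whoever reads it, so it is best not to write the query this way.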
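To make the behaviour described above easier to see, here is a minimal sketch, assuming MySQL 5.7.5 or later and the tb table defined earlier. It checks which SQL modes are active for the session, then uses ANY_VALUE() to state explicitly that any id from the group is acceptable. The aliases some_id and last_login are illustrative names, not part of the original queries.

```sql
-- Show the SQL modes active for the current session;
-- look for ONLY_FULL_GROUP_BY in the returned list.
SELECT @@SESSION.sql_mode;

-- ANY_VALUE() suppresses error 1055 under ONLY_FULL_GROUP_BY,
-- but the id it returns is an arbitrary one from each group,
-- NOT necessarily the id of the row with the latest login_time.
SELECT ANY_VALUE(id) AS some_id, uid, MAX(login_time) AS last_login
FROM tb
GROUP BY uid;
```

This only silences the error; it does not solve the problem of finding the row that matches the maximum.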
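So how do we get the other columns of the row that holds the maximum value within each group?
The approach is to join the table against the grouped result: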
SELECT tb.*
FROM tb
JOIN (
SELECT uid, MAX(login_time) AS login_time
FROM tb
GROUP BY uid) AS tb_new
ON tb.uid = tb_new.uid AND tb.login_time = tb_new.login_time;
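One caveat with matching on login_time: if a user has two rows sharing the same latest login_time, the query above returns both of them. As an alternative technique (not the one used in this article), a window function can pick exactly one row per user. The sketch below assumes MySQL 8.0 or later; the alias ranked and the column rn are illustrative names, and breaking ties by the larger id is an arbitrary choice.

```sql
-- Requires MySQL 8.0+ (window functions).
-- Number each user's rows from newest to oldest, then keep row 1.
SELECT id, uid, login_time
FROM (
    SELECT id, uid, login_time,
           ROW_NUMBER() OVER (
               PARTITION BY uid
               ORDER BY login_time DESC, id DESC  -- id breaks ties
           ) AS rn
    FROM tb
) AS ranked
WHERE rn = 1
ORDER BY uid;
```

With the test data above, inserted into a freshly created table, this should return the rows with id 7 (uid 1001), id 5 (uid 1002), and id 2 (uid 1003).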
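Taking this one step further: suppose I want to keep only each user's last login record and delete the rest. How should that be written?
We can use a multi-table DELETE with a join: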
-- Delete every row that is older than that user's latest login.
DELETE tb FROM tb LEFT JOIN (
SELECT uid, MAX(login_time) AS login_time
FROM tb
GROUP BY uid) AS tb_new ON tb.uid = tb_new.uid
WHERE tb_new.uid IS NOT NULL AND tb.login_time < tb_new.login_time;
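Before running a destructive statement like this, it can help to preview which rows would be affected. The following is a minimal sketch, assuming the goal stated above (keep only each user's latest login): it selects every row whose login_time is earlier than that user's maximum.

```sql
-- Preview: rows that are NOT the user's latest login,
-- i.e. the rows a "keep only the last login" cleanup would remove.
SELECT tb.*
FROM tb
JOIN (
    SELECT uid, MAX(login_time) AS login_time
    FROM tb
    GROUP BY uid
) AS tb_new ON tb.uid = tb_new.uid
WHERE tb.login_time < tb_new.login_time
ORDER BY tb.uid, tb.login_time;
```

With the test data above, inserted into a freshly created table, this should list the rows with id 1, 3, 6 (uid 1001) and id 4 (uid 1002); uid 1003 has a single row, so nothing is listed for it.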