Finding duplicate values in a single column

To find rows with a duplicate email, group by that column, count the rows in each group, and keep only the groups with more than one row:

mysql> select email, count(email) from people group by email having count(email) > 1;

|---------------|------------|
|email          |count(email)|
|---------------|------------|
|lisi@test.com  |           2|
|wangwu@test.com|           3|
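
The grouping query can also be tried without a MySQL server. The following is a minimal sketch using Python's built-in sqlite3 module. The table definition and row 1 (张三) are assumptions made for illustration; rows 2 to 6 follow the query output shown in this article.

```python
import sqlite3

# In-memory database; the schema and row 1 are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "张三", "zhangsan@test.com"),  # assumed non-duplicate row
        (2, "李四", "lisi@test.com"),
        (3, "王五", "wangwu@test.com"),
        (4, "李斯", "lisi@test.com"),
        (5, "王五", "wangwu@test.com"),
        (6, "王五", "wangwu@test.com"),
    ],
)

# Group by email and keep only the groups with more than one row.
duplicate_emails = conn.execute(
    "SELECT email, COUNT(email) FROM people "
    "GROUP BY email HAVING COUNT(email) > 1 ORDER BY email"
).fetchall()
print(duplicate_emails)
```

With this sample data, only the two repeated addresses come back, each with its row count.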

The result shows that two email addresses are duplicated. To see the complete duplicate rows, use either a subquery or a join:

mysql> select *
from people
where email in (
    select email
    from people
    group by email
    having count(email) > 1
)
order by email;

|--|------|---------------|
|id|name  |email          |
|--|------|---------------|
| 2|李四  |lisi@test.com  |
| 4|李斯  |lisi@test.com  |
| 3|王五  |wangwu@test.com|
| 5|王五  |wangwu@test.com|
| 6|王五  |wangwu@test.com|

mysql> select p.*
from people p
join (
    select email
    from people
    group by email
    having count(email) > 1
) d on p.email = d.email
order by p.email;

|--|------|---------------|
|id|name  |email          |
|--|------|---------------|
| 2|李四  |lisi@test.com  |
| 4|李斯  |lisi@test.com  |
| 3|王五  |wangwu@test.com|
| 5|王五  |wangwu@test.com|
| 6|王五  |wangwu@test.com|
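
The subquery form and the join form should return the same rows. A small sketch with Python's built-in sqlite3 module checks this. The schema and row 1 are assumptions; rows 2 to 6 follow the output above. A secondary sort on id is added only to make the row order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "张三", "zhangsan@test.com"),  # assumed non-duplicate row
        (2, "李四", "lisi@test.com"),
        (3, "王五", "wangwu@test.com"),
        (4, "李斯", "lisi@test.com"),
        (5, "王五", "wangwu@test.com"),
        (6, "王五", "wangwu@test.com"),
    ],
)

# Variant 1: filter with an IN subquery over the duplicated emails.
in_rows = conn.execute("""
    SELECT * FROM people
    WHERE email IN (
        SELECT email FROM people GROUP BY email HAVING COUNT(email) > 1
    )
    ORDER BY email, id
""").fetchall()

# Variant 2: join against the same list of duplicated emails.
join_rows = conn.execute("""
    SELECT p.* FROM people p
    JOIN (
        SELECT email FROM people GROUP BY email HAVING COUNT(email) > 1
    ) d ON p.email = d.email
    ORDER BY p.email, p.id
""").fetchall()

print([row[0] for row in in_rows])
print(in_rows == join_rows)
```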

Another way to find duplicate records is to use a self-join together with the DISTINCT keyword, for example:

mysql> select distinct p.*
from people p
join people d on p.email = d.email
where p.id <> d.id
order by p.email;

|--|------|---------------|
|id|name  |email          |
|--|------|---------------|
| 4|李斯  |lisi@test.com  |
| 2|李四  |lisi@test.com  |
| 6|王五  |wangwu@test.com|
| 5|王五  |wangwu@test.com|
| 3|王五  |wangwu@test.com|

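Note that DISTINCT cannot be omitted. Without it, some rows (ids 3, 5, and 6) are returned more than once, because each of them matches two other rows with the same email.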
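
The effect of DISTINCT on the self-join can be observed directly. This sketch uses Python's built-in sqlite3 module; the schema and row 1 are assumptions. For brevity it selects only p.id rather than p.*, which gives the same de-duplication here because id is unique.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "张三", "zhangsan@test.com"),  # assumed non-duplicate row
        (2, "李四", "lisi@test.com"),
        (3, "王五", "wangwu@test.com"),
        (4, "李斯", "lisi@test.com"),
        (5, "王五", "wangwu@test.com"),
        (6, "王五", "wangwu@test.com"),
    ],
)

# Same self-join, with and without DISTINCT.
join_sql = (
    "SELECT {distinct} p.id FROM people p "
    "JOIN people d ON p.email = d.email "
    "WHERE p.id <> d.id"
)
without_distinct = [row[0] for row in conn.execute(join_sql.format(distinct=""))]
with_distinct = [row[0] for row in conn.execute(join_sql.format(distinct="DISTINCT"))]

# How many times each id is returned when DISTINCT is left out.
counts = dict(Counter(without_distinct))
print(counts)
print(sorted(with_distinct))
```

Rows 2 and 4 each have one partner and appear once. Rows 3, 5, and 6 each have two partners and appear twice unless DISTINCT collapses them.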

Finding duplicate values across multiple columns

To find rows in which both name and email are duplicated, the approach is similar:

mysql> select *
from people
where (name, email) in (
    select name, email
    from people
    group by name, email
    having count(1) > 1
)
order by email;

|--|-----|---------------|
|id|name |email          |
|--|-----|---------------|
| 3|王五 |wangwu@test.com|
| 5|王五 |wangwu@test.com|
| 6|王五 |wangwu@test.com|

mysql> select distinct p.*
from people p
join people d on p.name = d.name and p.email = d.email
where p.id <> d.id
order by p.email;

|--|------|---------------|
|id|name  |email          |
|--|------|---------------|
| 6|王五  |wangwu@test.com|
| 5|王五  |wangwu@test.com|
| 3|王五  |wangwu@test.com|

Rows count as duplicates only when both name and email match, so rows 2 and 4 are not duplicates.
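
The multi-column case can be sketched the same way with Python's built-in sqlite3 module. The schema and row 1 are assumptions. The sketch uses the grouping query and the self-join rather than the row-value IN form, since row values need a reasonably recent SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "张三", "zhangsan@test.com"),  # assumed non-duplicate row
        (2, "李四", "lisi@test.com"),
        (3, "王五", "wangwu@test.com"),
        (4, "李斯", "lisi@test.com"),
        (5, "王五", "wangwu@test.com"),
        (6, "王五", "wangwu@test.com"),
    ],
)

# Groups in which the (name, email) pair occurs more than once.
groups = conn.execute(
    "SELECT name, email, COUNT(*) FROM people "
    "GROUP BY name, email HAVING COUNT(*) > 1"
).fetchall()

# Full rows whose name AND email both match another row.
rows = conn.execute(
    "SELECT DISTINCT p.* FROM people p "
    "JOIN people d ON p.name = d.name AND p.email = d.email "
    "WHERE p.id <> d.id ORDER BY p.id"
).fetchall()

print(groups)
print([row[0] for row in rows])
```

Rows 2 and 4 share an email but have different names, so neither query reports them.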

Finding duplicate values across two tables in SQL

Method 1: Using DISTINCT and COUNT

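Combining DISTINCT with the COUNT function is a quick way to find duplicated values and how often they occur. Suppose there are two tables, table A and table B, and we want the values that appear in both. The following SQL query can be used: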

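SELECT value, COUNT(value) AS cnt
FROM (
    SELECT DISTINCT value FROM tableA
    UNION ALL
    SELECT DISTINCT value FROM tableB
) AS temp
GROUP BY value
HAVING COUNT(value) > 1;

This query returns the values that occur in both table A and table B. Because each branch is de-duplicated with DISTINCT before the UNION ALL, cnt is the number of tables that contain the value, so a value repeated within only one table is not reported.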
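
To see how de-duplicating each branch affects the UNION ALL approach, the following sketch runs the query with and without DISTINCT in Python's built-in sqlite3 module. The table contents are hypothetical sample data chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (value INTEGER)")
conn.execute("CREATE TABLE tableB (value INTEGER)")
# Hypothetical data: 1 is repeated inside tableA only, 2 is in both tables.
conn.executemany("INSERT INTO tableA VALUES (?)", [(1,), (1,), (2,), (3,)])
conn.executemany("INSERT INTO tableB VALUES (?)", [(2,), (4,)])

query = """
    SELECT value, COUNT(value) AS cnt
    FROM (
        SELECT {d} value FROM tableA
        UNION ALL
        SELECT {d} value FROM tableB
    ) AS temp
    GROUP BY value
    HAVING COUNT(value) > 1
    ORDER BY value
"""

plain = conn.execute(query.format(d="")).fetchall()            # no DISTINCT
deduped = conn.execute(query.format(d="DISTINCT")).fetchall()  # DISTINCT per branch

print(plain)
print(deduped)
```

Without DISTINCT, the value 1 is reported even though it exists only in tableA. With DISTINCT in each branch, only the value 2, which is present in both tables, remains.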

Method 2: Using INNER JOIN

INNER JOIN combines two tables, and that property can be used to find duplicated values. Here is an example:

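SELECT tableA.value, COUNT(*) AS cnt
FROM tableA
INNER JOIN tableB ON tableA.value = tableB.value
GROUP BY tableA.value;

This query returns every value that occurs in both table A and table B. The column is qualified with the table name because an unqualified value would be ambiguous after the join. Here cnt is the number of matching row pairs, that is, the occurrences in table A multiplied by the occurrences in table B.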
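
A sketch of the join approach in Python's built-in sqlite3 module, using hypothetical sample data. The column is qualified with its table name, and the sketch also shows what an extra HAVING COUNT(*) > 1 filter would do to the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (value INTEGER)")
conn.execute("CREATE TABLE tableB (value INTEGER)")
# Hypothetical data: 1 and 2 are in both tables; 1 occurs twice in tableA.
conn.executemany("INSERT INTO tableA VALUES (?)", [(1,), (1,), (2,), (3,)])
conn.executemany("INSERT INTO tableB VALUES (?)", [(1,), (2,), (4,)])

base = """
    SELECT tableA.value, COUNT(*) AS cnt
    FROM tableA
    INNER JOIN tableB ON tableA.value = tableB.value
    GROUP BY tableA.value
    {having}
    ORDER BY tableA.value
"""

# Every value present in both tables, with the number of matching row pairs.
shared = conn.execute(base.format(having="")).fetchall()
# Same query, keeping only values that produced more than one row pair.
filtered = conn.execute(base.format(having="HAVING COUNT(*) > 1")).fetchall()

print(shared)
print(filtered)
```

The value 2 appears exactly once in each table, so it yields a single row pair and is dropped by the HAVING filter even though it is present in both tables.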

Method 3: Using the EXISTS keyword

With the EXISTS keyword we can check whether a subquery returns any rows. Here is an example:

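SELECT value, COUNT(*) AS cnt
FROM tableA
WHERE EXISTS (
    SELECT 1
    FROM tableB
    WHERE tableB.value = tableA.value
)
GROUP BY value;

This query returns every value in table A that also occurs in table B. Here cnt is the number of times the value occurs in table A; occurrences in table B do not affect it.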
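
The EXISTS approach can be sketched in the same way with Python's built-in sqlite3 module. The table contents are hypothetical sample data chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (value INTEGER)")
conn.execute("CREATE TABLE tableB (value INTEGER)")
# Hypothetical data: 1 and 2 are in both tables; 1 occurs twice in tableA.
conn.executemany("INSERT INTO tableA VALUES (?)", [(1,), (1,), (2,), (3,)])
conn.executemany("INSERT INTO tableB VALUES (?)", [(1,), (2,), (4,)])

# Keep tableA rows that have at least one match in tableB, then count per value.
shared = conn.execute("""
    SELECT value, COUNT(*) AS cnt
    FROM tableA
    WHERE EXISTS (
        SELECT 1
        FROM tableB
        WHERE tableB.value = tableA.value
    )
    GROUP BY value
    ORDER BY value
""").fetchall()

print(shared)
```

Unlike the join, EXISTS never multiplies rows, so the count reflects occurrences in tableA only.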