Searching MySQL db for duplicates

DewChugr

Photoshop God
Jun 26, 2006
1,977
66
0
48.655139,-119.644032
I need to search a MySQL database for duplicate records. I would like to search for records that are a complete match and I also need to search for records where first name and last name match.

Can someone point me in the right direction for doing this?
 


Are first name and last name in separate fields?

Use LIKE for partial string matches.
 
This is psuedo code for you since I use T-SQL but the equivalent should exist in MySQL if it's ANSI compliant....

-- test data
create table #foo
(
r1 int,
fname nvarchar(50),
lanme nvarchar(50)
)

insert into #foo
select 1, 'Justin', 'Smith'
union all
select 2, 'Justin', 'Smith'
union all
select 3, 'Robert', 'Smith'
union all
select 4, 'Santa', 'Claus'
union all
select 5, 'Santa', 'Claus'
union all
select 6, 'Justin', 'Jones'
go

-- Here be answer
select count(fname), fname, lanme
from #foo
group by fname, lanme
having count(fname)>1


That takes care of first and last name dupes, replicate what I did for Fname and LName for all fields but the PK if you want whole row dupes outside of the primary key.
 
My above solution assumes you're searching a single table, if you are wanting to scan the entire DB that gets a lot more complicated (and way outside the scope of forum answer.)
 
If you just want to eliminate dupes: dump it to a file, empty your table, configure unique indexes, then re-import the file.
 
i suggest creating a temp table and inserting all the values into it by grouping the nonprimary keys (eg the fields you want to see are duplicates of each other)

this should hopefully do it

CREATE TABLE temp_mytable AS SELECT * FROM mytable GROUP BY field1,field2;
DROP TABLE mytable;
RENAME TABLE temp_mytable TO mytable;
 
CREATE TABLE temp_mytable AS SELECT * FROM mytable GROUP BY field1,field2;

Not being nitpicky, I'm just a database nerd, (and maybe this isnt the case for MySQL if it doesnt adhere to ANSI SQL) but assuming there are non dupe fields the OP wants filtered out SELECT * wont work unless they are in the GROUP BY or have an aggregate function applied to them. So if the table structure is:

PKField, Field1, Field2

You would need to do something along the lines of:
SELECT MAX(PKField), Field1, Field2 from MyTable GROUP BY Field1, Field2

If there is no PK field then the GROUP BY can be skipped altogether and DISTINCT used instead:

SELECT DISTINCT Field1, Field2 from MyTable
 
This is a big help. I'm going to have to play with it a bit, but I think this will do what I need. And yes, only one table. The problem is the table started out as an Excel file of about 15k records and has since been added to and I want to weed out the duplicates.

I haven't tried the other methods yet, but I'm sure they would work also.

Thank you for the help!

This is psuedo code for you since I use T-SQL but the equivalent should exist in MySQL if it's ANSI compliant....

-- Here be answer
select count(fname), fname, lanme
from #foo
group by fname, lanme
having count(fname)>1


That takes care of first and last name dupes, replicate what I did for Fname and LName for all fields but the PK if you want whole row dupes outside of the primary key.
 
You could export it to a .csv file, clean it up with excel (there's a dupe removal function)
then reimport the new .csv
 
Not being nitpicky, I'm just a database nerd, (and maybe this isnt the case for MySQL if it doesnt adhere to ANSI SQL) but assuming there are non dupe fields the OP wants filtered out SELECT * wont work unless they are in the GROUP BY or have an aggregate function applied to them. So if the table structure is:

PKField, Field1, Field2

You would need to do something along the lines of:
SELECT MAX(PKField), Field1, Field2 from MyTable GROUP BY Field1, Field2

If there is no PK field then the GROUP BY can be skipped altogether and DISTINCT used instead:

SELECT DISTINCT Field1, Field2 from MyTable

Thanks i was kind of wishy washy for a lack of better words about that part. I'm not sure what selecting the fields or doing the catch all would do to the primary key if he had one. I guess it would also depend on if he wanted to maintain the numbers or let them get reordered after the duplicates were removed.
 
i'm sorry i just now realized i read the OP's post wrong. I thought he wanted to remove duplicates didn't know he was just wanting to find them.
 
You can do a self join which will return the IDs of the dupes:

SELECT tblb.id AS dupe
FROM tablename tbla, tablename tblb
WHERE tbla.id > tblb.id
AND tbla.name = tblb.name
 
If you want to do it with PHP you can use this:

Code:
<?php
mysql_connect("localhost", "usrnam", "password") or die(mysql_error());
mysql_select_db("database") or die(mysql_error());



if(isset($_POST['submit'])) {

if(isset($_POST['fname'])) {
$result = mysql_query("SELECT * FROM tablename") 
or die(mysql_error());  

while($row = mysql_fetch_array( $result )) {
$fnamedb = $row['fname'];
$fname = $_POST['fname'];

if($fname == $fnamedb) {
echo "The Firstname $fname Exists<br>";
} else {
echo "The Firstname $fname Does Not Exist<br>";
}
}
}

if(isset($_POST['lname'])) {

$result = mysql_query("SELECT * FROM tablename") 
or die(mysql_error());  
while($row = mysql_fetch_array( $result )) {

$lnamedb = $row['lname'];
$lname = $_POST['lastname'];

if($lname == $lnamedb) {
echo "The Lastname $lname Exists<br>";
} else {
echo "The lastname $lname Does Not Exist<br>";
}
}
}

if(isset($_POST['searchrecord'])) {

$result = mysql_query("SELECT * FROM tablename") 
or die(mysql_error());  
while($row = mysql_fetch_array( $result )) {

$recorddb = $row['record'];
$record = $_POST['searchrecord'];

if($record == $recorddb) {
echo "The Record $record Exists<br>";
} else {
echo "The Record $record Does Not Exist<br>";
}
}
}

}

?>
<html>
<head>
<title> Record Checker </title>
</head>
<body>
<center>
<h1> Record Checker </h1>
<form action="" method="post">
Firstname: <input type="text" name="fname"><br>
Lastname: <input type="text" name="lname"><br>
Other Record: <input type="text" name="searchrecord"><br>
<input type="submit" value="Search" name="submit">
</form>
</center>
</body>
</html>

-John
 
If you want to do it with PHP you can use this:

<newb code snipped>

-John

Why would he want to use that bullshit, that's some of the worst newb shit code I've seen in a while. It doesn't even do what the OP requested.

Maybe you should learn PHP and SQL before you post in a thread that requires more than a basic understanding of what they are.

edit: lol, I see, your a "script developer" for hire. Good luck, Bro