Rob_Quads
Member
Registered: 29th Mar 01
Location: southampton
User status: Offline
|
I have a mysql database with a table with the following structure...
CREATE TABLE `perf_v8_ga_results` (
`Build` varchar(30) DEFAULT NULL,
`Host` varchar(50) DEFAULT NULL,
`OS` varchar(30) DEFAULT NULL,
`MQ` varchar(30) DEFAULT NULL,
`DB` varchar(30) DEFAULT NULL,
`Start` datetime DEFAULT NULL,
`End` datetime DEFAULT NULL,
`TestCase` varchar(100) DEFAULT NULL,
`State` varchar(30) DEFAULT NULL,
`Ignore` varchar(30) CHARACTER SET utf8 COLLATE utf8_unicode_ci NOT NULL DEFAULT 'NO',
`Persistence` varchar(10) DEFAULT NULL,
`MsgSize` int(11) DEFAULT NULL,
`MsgPerSec` float DEFAULT NULL,
`MsgPerSec_FullStdDeviation` float NOT NULL,
`MsgPerSec_IQRange` float NOT NULL,
`Server_Machine_CPU_Pct` float DEFAULT NULL,
`Server_Machine_CPU_Pct_FullStdDeviation` float NOT NULL,
`Server_Machine_CPU_Pct_IQRange` float NOT NULL,
`CPUCostPerMsg` float DEFAULT NULL,
`CPUS` int(11) DEFAULT NULL,
`id` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`id`),
UNIQUE KEY `index2` (`Host`,`Start`,`End`,`Build`),
KEY `IGNORE` (`Ignore`)
)
For some reason the unique key does not seem to be working as I have a number of entries where the Host/Start/End/Build are all the same.
a) Anyone know why the unique key might not be working?
b) Anyone know a query which will delete all the duplicates (without using a temp table as it needs to be done on the live DB)
Seem stuff around the following but it does not seem to match anything
SELECT *
FROM perf_v8_ga_results, perf_v8_ga_results AS vtable
WHERE vtable.id > perf_v8_ga_results.id
AND perf_v8_ga_results.Host = vtable.Host
AND perf_v8_ga_results.Start = vtable.Start
AND perf_v8_ga_results.End = vtable.End
AND perf_v8_ga_results.Build = vtable.Build
Cheers
|
Ian
Site Administrator
Registered: 28th Aug 99
Location: Liverpool
User status: Offline
|
Is index2 a compound key?
Give me some rows of data that you have the issue with.
|
Rob_Quads
Member
Registered: 29th Mar 01
Location: southampton
User status: Offline
|
How do I check if its a compound key?
Here is some data
INSERT INTO `perf_v8_ga_results` (`Build`, `Host`, `OS`, `MQ`, `DB`, `Start`, `End`, `TestCase`, `State`, `Ignore`, `Persistence`, `MsgSize`, `MsgPerSec`, `MsgPerSec_FullStdDeviation`, `MsgPerSec_IQRange`, `Server_Machine_CPU_Pct`, `Server_Machine_CPU_Pct_FullStdDeviation`, `Server_Machine_CPU_Pct_IQRange`, `CPUCostPerMsg`, `CPUS`, `id`) VALUES
('S000-L110914.1_P', 'lime11.hursley.ibm.com', 'WIN2008R2_SVR_GOLD', 'MQ_7_0_1_5', 'DB29_5_0', '2011-09-18 05:59:49', NULL, 'REG_A_V6LARGEMESSAGE', 'PASSED', 'NO', 'NONE', 204800, 22.47, 0.85, 1, 37.78, 1.61, 1.75, 67.25, 4, 710081),
('S000-L110914.1_P', 'lime11.hursley.ibm.com', 'WIN2008R2_SVR_GOLD', 'MQ_7_0_1_5', 'DB29_5_0', '2011-09-18 05:59:49', NULL, 'REG_A_V6LARGEMESSAGE', 'PASSED', 'NO', 'NONE', 204800, 22.47, 0.85, 1, 37.78, 1.61, 1.75, 67.25, 4, 710748),
('S000-L110914.1_P', 'lime11.hursley.ibm.com', 'WIN2008R2_SVR_GOLD', 'MQ_7_0_1_5', 'DB29_5_0', '2011-09-18 065:15', NULL, 'REG_A_V6LARGEMESSAGE', 'PASSED', 'NO', 'NONE', 204800, 22.47, 0.98, 1, 37.44, 3.01, 3.75, 66.65, 4, 710082),
('S000-L110914.1_P', 'lime11.hursley.ibm.com', 'WIN2008R2_SVR_GOLD', 'MQ_7_0_1_5', 'DB29_5_0', '2011-09-18 065:15', NULL, 'REG_A_V6LARGEMESSAGE', 'PASSED', 'NO', 'NONE', 204800, 22.47, 0.98, 1, 37.44, 3.01, 3.75, 66.65, 4, 710749),
('S000-L110914.1_P', 'lime11.hursley.ibm.com', 'WIN2008R2_SVR_GOLD', 'MQ_7_0_1_5', 'DB29_5_0', '2011-09-18 068:31', NULL, 'REG_A_V6LARGEMESSAGE', 'PASSED', 'NO', 'NONE', 204800, 22.36, 0.98, 1, 37.17, 4.98, 3, 66.49, 4, 710083),
('S000-L110914.1_P', 'lime11.hursley.ibm.com', 'WIN2008R2_SVR_GOLD', 'MQ_7_0_1_5', 'DB29_5_0', '2011-09-18 068:31', NULL, 'REG_A_V6LARGEMESSAGE', 'PASSED', 'NO', 'NONE', 204800, 22.36, 0.98, 1, 37.17, 4.98, 3, 66.49, 4, 710750),
('S000-L110914.1_P', 'lime11.hursley.ibm.com', 'WIN2008R2_SVR_GOLD', 'MQ_7_0_1_5', 'DB29_5_0', '2011-09-18 06:11:46', NULL, 'REG_A_V6LARGEMESSAGE', 'PASSED', 'NO', 'NONE', 204800, 22.18, 0.99, 1, 37.78, 1.96, 3.75, 68.13, 4, 710084),
('S000-L110914.1_P', 'lime11.hursley.ibm.com', 'WIN2008R2_SVR_GOLD', 'MQ_7_0_1_5', 'DB29_5_0', '2011-09-18 06:11:46', NULL, 'REG_A_V6LARGEMESSAGE', 'PASSED', 'NO', 'NONE', 204800, 22.18, 0.99, 1, 37.78, 1.96, 3.75, 68.13, 4, 710751),
('S000-L110914.1_P', 'lime11.hursley.ibm.com', 'WIN2008R2_SVR_GOLD', 'MQ_7_0_1_5', 'DB29_5_0', '2011-09-18 06:152', NULL, 'REG_A_V6LARGEMESSAGE', 'PASSED', 'NO', 'NONE', 204800, 22.1, 0.74, 1, 36.28, 1.34, 2.75, 65.67, 4, 710085),
('S000-L110914.1_P', 'lime11.hursley.ibm.com', 'WIN2008R2_SVR_GOLD', 'MQ_7_0_1_5', 'DB29_5_0', '2011-09-18 06:152', NULL, 'REG_A_V6LARGEMESSAGE', 'PASSED', 'NO', 'NONE', 204800, 22.1, 0.74, 1, 36.28, 1.34, 2.75, 65.67, 4, 710752);
|
Ian
Site Administrator
Registered: 28th Aug 99
Location: Liverpool
User status: Offline
|
Compound means in the index relies on all the fields together as one to make the decision on duplication.
So if all four fields are the same you have a duplicate, but if you have identical rows but for example the date stamp is different, that is not a duplicate row, even though some of the fields named in the index are duplicated.
You can count the duplicates using GROUP BY
SELECT host, start, end, build, COUNT(*) FROM table GROUP BY host, start, end, build
You can remove duplicates using CREATE TABLE AS SELECT DISTINCT field field field
There is no easy way to remove all duplicates with DELETE statements on the table, without caching the locations of the duplicates in a bit of programming. It can be done programmatically but the simpler solution is a new table based on a DISTINCT query on the old one.
|
Rob_Quads
Member
Registered: 29th Mar 01
Location: southampton
User status: Offline
|
My understanding was that the index was compound - thats why I want it as I don't want the table to include multiple entries of the same result.
Not quite sure why its not working.
|
|