Dom
Member
Registered: 13th Sep 03
User status: Offline
|
Right i'm having a few woes searching within a delimiter field (called `Specialisation`) in a table, field data example - '2;19;14;13;1;5;6'. The field can also be empty or only be a single number (and not have a delimiter).
Now i've been using LIKE in the form of LIKE '%19%' which is fine but i've just realised if i'm searching for the number 3 and use LIKE '%3%', it will return rows where this field contains 3 and that includes 13/23/33/43 etc which is not what i'm after.
I've currently knocked up the following -
code: WHERE `Specialisation` LIKE '%;3;%' OR `Specialisation` LIKE '3;%' OR `Specialisation` LIKE '%;3' OR `Specialisation` LIKE '3'
And so far it seems to work but surely there must be a better solution (regex perhaps?)?
Cheers for any help,
Dom
p.s - i've looked into full searches, but it's possible for the table to contain no data or search string to appear in more that 50% of the data....
[Edited on 06-01-2010 by Dom]
|
Paul_J
Member
Registered: 6th Jun 02
Location: London
User status: Offline
|
Thing is, you don't usually have a sub set of data within a field.
Usually a field is a value, and therefore you are matching that value with the like.
However, by having delimiter'd values within the field, it's like having another sub set of data inside (like an array).
So you're not searching for all fields which contain 3. You're searching for sub sets that contain 3. Which is more a pain due to as you found out, the possibilities being
3 - on its own
3; - first element
;3 - last element
;3; - in the middle
I would personally be looking into why this data is like this and re designing it to be better.
- either You could use the sql you've got.
- or You could always add ';' to the object... e.g.
;3; - on its own
;3;1;2; - first element
;1;2;3; - last element
;2;3;1; - middle
Then you only need
select * from table where Specialisation like '%;3;%'
- Or you could look at moving the sub set data to another table perhaps?
e.g.
Table 1 Specialisation
Table 2 SubSet data
Specialisation has a header id
Subset has a detail id and a link back to the headerid e.g. a row for each 'element'
Then
Select * from specialisation where special_ID in (select special_ID from subset where element = 3)
Thus returning all header lines that contain a subset element 3.
[Edited on 06-01-2010 by Paul_J]
|
Ian
Site Administrator
Registered: 28th Aug 99
Location: Liverpool
User status: Offline
|
That database structure is so shit you need to stop right now and do it properly.
If your data right now is:
1,2;19;14
2,6;3;7
3,33;2;4
Make a table called specialisation and populate it:
pid,sid
1,2
1,19
1,14
2,6
2,3
2,7
3,33
3,2
3,4
If you really really want the list back in that format and you're using MySQL, you can use GROUP_CONCAT to reconstruct it:
SELECT pid, GROUP_CONCAT(sid) FROM specialisation GROUP BY pid
You can reformat the GROUP_CONCAT with the separator argument, order them etc.
Presumably then you can also write some queries because those specialisation values have some textual equivalent in another table.
What you actually have I think is a many-to-many relationship between the person and the specialisation, therefore the table in question which I describe is the junction which facilitates this relationship.
My strong advice - stop messing with LIKE which is a nightmare and fix the structure! If you can't fix the structure, refuse to work on such a poor data structure until such time at that whoever is in charge sees sense! I really mean that!
If you stick with LIKE and get the substring syntax good, you've still got to deal with needing to find multiple values if they appear in a different order.
If you search for 3%4 and your list contains 4;3, you won't find the record. Dogs dinner.
Your original query, you can replace that massive line of LIKE conditons with a WHERE on the junction table, joined to either of side of the many-to-many.
SELECT p.pid, p.name, p.phonenumber, s.sid FROM person p INNER JOIN specialisation s ON p.pid = s.pid WHERE s.sid = 3
Again, this can be GROUP_CONCAT'd if you wish. Just add that in the field list and a GROUP BY at the end.
If you want multiple specialisations, get an OR in there
SELECT p.pid, p.name, p.phonenumber, s.sid FROM person p INNER JOIN specialisation s ON p.pid = s.pid WHERE s.sid = 3 OR s.sid = 4;
No worry about order, you can DISTINCT it if you don't want duplication etc.
[Edited on 07-01-2010 by Ian]
|
Dom
Member
Registered: 13th Sep 03
User status: Offline
|
Cheers all,
Here's the basics -
We have a `subject` table, with an auto ID field and subject_title field (this is currently 550 rows but can be added to, so unlimited). The `Specialisation` field is part of a `teachers` table, and holds the subject IDs which as you see are delimiter'd (as a teacher can be assigned to multiple subjects, was originally only 3 subjects but the client changed their minds so it can be unlimited, but never multiples).
So there (should) never be any multiples and i only need to search one subject (ID) at a time. Hence why i was using LIKE.
And yes the DB structure is toss
The idea of having another table (with the teacher_id and subject_id fields) was briefly talked about but a few of the devs said that we were already duplicating data and it was becoming "silly" (having to do a shed load of joins just to pull a single result). But i suppose it's the only decent way of doing it.
cheers for the help though
|
ed
Member
Registered: 10th Sep 03
User status: Offline
|
I thought the same as Ian, but I didn't want to say in case I looked n00b if I was wrong
[Edited on 07-01-2010 by ed]
|
James
Member
Registered: 1st Jun 02
Location: Surrey
User status: Offline
|
quote: Originally posted by Dom
The idea of having another table (with the teacher_id and subject_id fields) was briefly talked about but a few of the devs said that we were already duplicating data and it was becoming "silly" (having to do a shed load of joins just to pull a single result). But i suppose it's the only decent way of doing it.
Shed loads of joins is often the case in industry, I came across a query with 35 joins in the other day. Admittedly it was badly written but still.
And a shed load of joins will more often than not perform better than a shed load of likes (providing the DB has been well designed).
[Edited on 07-01-2010 by James]
|
Paul_J
Member
Registered: 6th Jun 02
Location: London
User status: Offline
|
quote: Originally posted by ed
I thought the same as Ian, but I didn't want to say in case I looked n00b if I was wrong
[Edited on 07-01-2010 by ed]
which was pretty much what I said in my point 3.
Reading what the actual situation is dom, you need to seperate that data out of there. It's rediculous.
For this - I'd of personally suggested.
A Teachers Table
A Subject Table
And a joining table.
So you create your teacher table
Table: Teachers
Columns:
TeacherID *Primary Key*
TeacherName
TeacherAddress or whatever
Table: Subjects
Columns:
SubjectID *Primary Key*
SubjectName
SubjectDescription
Subjectwhatever
Table: SubjectAssignment(or whatever)
Columns:
AssignmentID *PrimaryKey*
TeacherID *ForeignKey*
SubjectID *ForeignKey*
...............
Fill with sample data.
e.g.
Teacher Table
1 - Julie Makov
2 - Mickey Mouse
3 - Gary Glitter
Subjects Table
1 - Maths
2 - English
3 - Science
4 - Bumming Children
SubjectAssignment Table
1 - 1 - 2
2 - 1 - 3
3 - 2 - 1
4 - 2 - 3
5 - 3 - 4
... Another words...
Julie teaches english and science
Mickey teaches maths and science
Gary teaches bumming children
... If you want all subjects beloging to a teacher, say julie.
Select SubjectName from Subjects inner join SubjectAssignment on SubjectID = SubjectID where TeacherID = '1'
You may need to do what Ian did above, I can't really say without bashing it into a query and trying to process it.
e.g.
Select a.SubjectName from Subjects a, SubjectAssignment b where a.SubjectID = b.SubjectID and b.TeacherID = '1'
and to take the other example...
to find all teacher that specialise in a subject.
Select a.TeacherName from Teachers a, SubjectAssignment b where a.TeacherID = b.TeacherID and b.SubjectID = '1'
[Edited on 07-01-2010 by Paul_J]
|
DaveyLC
Member
Registered: 8th Oct 08
Location: Berkshire
User status: Offline
|
Ive got some functions I've written to split by a character and return a dataset if you are interested.
|
Paul_J
Member
Registered: 6th Jun 02
Location: London
User status: Offline
|
quote: Originally posted by Paul_J
quote: Originally posted by ed
I thought the same as Ian, but I didn't want to say in case I looked n00b if I was wrong
[Edited on 07-01-2010 by ed]
which was pretty much what I said in my point 3.
Reading what the actual situation is dom, you need to seperate that data out of there. It's rediculous.
For this - I'd of personally suggested.
A Teachers Table
A Subject Table
And a joining table.
So you create your teacher table
Table: Teachers
Columns:
TeacherID *Primary Key*
TeacherName
TeacherAddress or whatever
Table: Subjects
Columns:
SubjectID *Primary Key*
SubjectName
SubjectDescription
Subjectwhatever
Table: SubjectAssignment(or whatever)
Columns:
AssignmentID *PrimaryKey*
TeacherID *ForeignKey*
SubjectID *ForeignKey*
...............
Fill with sample data.
e.g.
Teacher Table
1 - Julie Makov
2 - Mickey Mouse
3 - Gary Glitter
Subjects Table
1 - Maths
2 - English
3 - Science
4 - Bumming Children
SubjectAssignment Table
1 - 1 - 2
2 - 1 - 3
3 - 2 - 1
4 - 2 - 3
5 - 3 - 4
... Another words...
Julie teaches english and science
Mickey teaches maths and science
Gary teaches bumming children
... If you want all subjects beloging to a teacher, say julie.
Select SubjectName from Subjects inner join SubjectAssignment on SubjectID = SubjectID where TeacherID = '1'
You may need to do what Ian did above, I can't really say without bashing it into a query and trying to process it.
e.g.
Select a.SubjectName from Subjects a, SubjectAssignment b where a.SubjectID = b.SubjectID and b.TeacherID = '1'
and to take the other example...
to find all teacher that specialise in a subject.
Select a.TeacherName from Teachers a, SubjectAssignment b where a.TeacherID = b.TeacherID and b.SubjectID = '1'
[Edited on 07-01-2010 by Paul_J]
On a side note...
As i said originally at the top, if you don't want to completely restructure everything / still wanted to use like (but only one like) then you could just append a ; to the front of the first and ; to the end which would mean to search for 3 would literally be like '%;3;%'
If the data already exists without this, it wouldn't be hard to quickly write a script to fix the data into this way (going through each row appending).
I personally would ditch it and move to a better system though.
|
xa0s
Banned
Registered: 4th Mar 08
Location: Dartford, Kent Car: Turbo'd Fabia vRS
User status: Offline
|
Database structure is so important, it needs to be gotten right from the start to avoid future problems like this.
|
Dom
Member
Registered: 13th Sep 03
User status: Offline
|
Cheers for all of the help have proposed the changes, will have to wait and see what the rest of the team says...
DaveyLC - YES! would be great if you could email/u2u them me
|
Ian
Site Administrator
Registered: 28th Aug 99
Location: Liverpool
User status: Offline
|
Can't see how normalisation would duplicate data? If anything you have less data, not more.
And if joins are slow, index them properly. James is quite right.
quote: Originally posted by James
And a shed load of joins will more often than not perform better than a shed load of likes
And the joins are a damn sight easier to maintain.
Did the guy actually use the word 'silly' at the meeting? A proper structure and elegant, inexpensive queries is not silly.
|
Ian
Site Administrator
Registered: 28th Aug 99
Location: Liverpool
User status: Offline
|
quote: Originally posted by Dom
DaveyLC - YES! would be great if you could email/u2u them me
No - you neither need nor want them. Fix the db.
|
Ian
Site Administrator
Registered: 28th Aug 99
Location: Liverpool
User status: Offline
|
quote: Originally posted by Paul_J
Table: SubjectAssignment(or whatever)
Columns:
AssignmentID *PrimaryKey*
There's referential integrity on the relationships so you can use a compound key to uniquely identify the row (if ever you actually need to). This would of course be unique, which means staff can't be awarded the same specialism twice. If you use an regular primary, you don't enforce this condition.
Only reason to have one would be if you want to use it as a foreign elsewhere which in this scenario might take the form of some sort of accreditation tables in which you list the quals or awards which qualify how come that guy is granted the specialism. Although I think if you raised that one at your team meeting they would probably fall off their chairs.
|
Ian
Site Administrator
Registered: 28th Aug 99
Location: Liverpool
User status: Offline
|
quote: Originally posted by Paul_J
if you don't want to completely restructure everything / still wanted to use like
Then you are mad as a box of frogs.
|
Paul_J
Member
Registered: 6th Jun 02
Location: London
User status: Offline
|
quote: Originally posted by Ian
quote: Originally posted by Paul_J
if you don't want to completely restructure everything / still wanted to use like
Then you are mad as a box of frogs.
Yeah but I also understand, if someone else is in charge and has made it into this fucking mess, they will probably be an idiot / too stubborn to change.
So I was just saying 'IF' he can't change the structure due to it being out of his control, then he can do it with a like still.
e.g. if it's an existing system, designed by some other moron, with a lot of existing data... Then you trying to change the structure, will make your bosses get angry... (as they prob won't understand what the problem is).
Just saying.
|
Dom
Member
Registered: 13th Sep 03
User status: Offline
|
no this is a completely new system. it's the fact that the client has some really obscure specs (so far from common sense that it's unreal) that have made it a complete bulls up trying to structure data and how data is processed. Eitherway, i've re-proposed the idea of a relationship table, hopefully the rest will agree and it'll get sorted.
Although using LIKE as paul previously said works ok, even doing multiple searches on the `specialisation` field. However, like you say it's far from ideal as it stands....
[Edited on 08-01-2010 by Dom]
|
Ian
Site Administrator
Registered: 28th Aug 99
Location: Liverpool
User status: Offline
|
With respect, if a student submitted the current one I would fail it.
There are loads and loads of great reasons to go with the proper approach and a few terrible ones to stick with what you have.
I'm curious to see what else in the spec prohibits normalisation? If you've got some more information here I would be interested to see/adjust the entire ERD.
|
Paul_J
Member
Registered: 6th Jun 02
Location: London
User status: Offline
|
I've got to agree with Ian.
There's no excuse for it, if it's a new system. I could half expect it on some old legacy system you're supporting ... but these days there's no excuse.
|