corsasport.co.uk
 

Corsa Sport » Message Board » Off Day » Geek Day » SQL Query Help (LIKE searching)


New Topic

New Poll
  Subscribe | Add to Favourites

You are not logged in and may not post or reply to messages. Please log in or create a new account or mail us about fixing an existing one - register@corsasport.co.uk

There are also many more features available when you are logged in such as private messages, buddy list, location services, post search and more.


Author SQL Query Help (LIKE searching)
Dom
Member

Registered: 13th Sep 03
User status: Offline
6th Jan 10 at 20:50   View User's Profile U2U Member Reply With Quote

Right i'm having a few woes searching within a delimiter field (called `Specialisation`) in a table, field data example - '2;19;14;13;1;5;6'. The field can also be empty or only be a single number (and not have a delimiter).

Now i've been using LIKE in the form of LIKE '%19%' which is fine but i've just realised if i'm searching for the number 3 and use LIKE '%3%', it will return rows where this field contains 3 and that includes 13/23/33/43 etc which is not what i'm after.

I've currently knocked up the following -
code:
WHERE `Specialisation` LIKE '%;3;%' OR `Specialisation` LIKE '3;%' OR `Specialisation` LIKE '%;3' OR `Specialisation` LIKE '3'

And so far it seems to work but surely there must be a better solution (regex perhaps?)?

Cheers for any help,
Dom

p.s - i've looked into full searches, but it's possible for the table to contain no data or search string to appear in more that 50% of the data....

[Edited on 06-01-2010 by Dom]
Paul_J
Member

Registered: 6th Jun 02
Location: London
User status: Offline
6th Jan 10 at 23:05   View User's Profile U2U Member Reply With Quote

Thing is, you don't usually have a sub set of data within a field.

Usually a field is a value, and therefore you are matching that value with the like.

However, by having delimiter'd values within the field, it's like having another sub set of data inside (like an array).

So you're not searching for all fields which contain 3. You're searching for sub sets that contain 3. Which is more a pain due to as you found out, the possibilities being

3 - on its own
3; - first element
;3 - last element
;3; - in the middle

I would personally be looking into why this data is like this and re designing it to be better.

- either You could use the sql you've got.

- or You could always add ';' to the object... e.g.
;3; - on its own
;3;1;2; - first element
;1;2;3; - last element
;2;3;1; - middle

Then you only need

select * from table where Specialisation like '%;3;%'

- Or you could look at moving the sub set data to another table perhaps?

e.g.

Table 1 Specialisation
Table 2 SubSet data

Specialisation has a header id
Subset has a detail id and a link back to the headerid e.g. a row for each 'element'

Then

Select * from specialisation where special_ID in (select special_ID from subset where element = 3)

Thus returning all header lines that contain a subset element 3.


[Edited on 06-01-2010 by Paul_J]
Ian
Site Administrator

Avatar

Registered: 28th Aug 99
Location: Liverpool
User status: Offline
7th Jan 10 at 02:34   View Garage View User's Profile U2U Member Reply With Quote

That database structure is so shit you need to stop right now and do it properly.

If your data right now is:

1,2;19;14
2,6;3;7
3,33;2;4

Make a table called specialisation and populate it:

pid,sid
1,2
1,19
1,14
2,6
2,3
2,7
3,33
3,2
3,4

If you really really want the list back in that format and you're using MySQL, you can use GROUP_CONCAT to reconstruct it:

SELECT pid, GROUP_CONCAT(sid) FROM specialisation GROUP BY pid

You can reformat the GROUP_CONCAT with the separator argument, order them etc.

Presumably then you can also write some queries because those specialisation values have some textual equivalent in another table.

What you actually have I think is a many-to-many relationship between the person and the specialisation, therefore the table in question which I describe is the junction which facilitates this relationship.

My strong advice - stop messing with LIKE which is a nightmare and fix the structure! If you can't fix the structure, refuse to work on such a poor data structure until such time at that whoever is in charge sees sense! I really mean that!

If you stick with LIKE and get the substring syntax good, you've still got to deal with needing to find multiple values if they appear in a different order.

If you search for 3%4 and your list contains 4;3, you won't find the record. Dogs dinner.

Your original query, you can replace that massive line of LIKE conditons with a WHERE on the junction table, joined to either of side of the many-to-many.

SELECT p.pid, p.name, p.phonenumber, s.sid FROM person p INNER JOIN specialisation s ON p.pid = s.pid WHERE s.sid = 3

Again, this can be GROUP_CONCAT'd if you wish. Just add that in the field list and a GROUP BY at the end.

If you want multiple specialisations, get an OR in there

SELECT p.pid, p.name, p.phonenumber, s.sid FROM person p INNER JOIN specialisation s ON p.pid = s.pid WHERE s.sid = 3 OR s.sid = 4;

No worry about order, you can DISTINCT it if you don't want duplication etc.

[Edited on 07-01-2010 by Ian]
Dom
Member

Registered: 13th Sep 03
User status: Offline
7th Jan 10 at 14:06   View User's Profile U2U Member Reply With Quote

Cheers all,
Here's the basics -
We have a `subject` table, with an auto ID field and subject_title field (this is currently 550 rows but can be added to, so unlimited). The `Specialisation` field is part of a `teachers` table, and holds the subject IDs which as you see are delimiter'd (as a teacher can be assigned to multiple subjects, was originally only 3 subjects but the client changed their minds so it can be unlimited, but never multiples).
So there (should) never be any multiples and i only need to search one subject (ID) at a time. Hence why i was using LIKE.

And yes the DB structure is toss

The idea of having another table (with the teacher_id and subject_id fields) was briefly talked about but a few of the devs said that we were already duplicating data and it was becoming "silly" (having to do a shed load of joins just to pull a single result). But i suppose it's the only decent way of doing it.

cheers for the help though
ed
Member

Registered: 10th Sep 03
User status: Offline
7th Jan 10 at 16:28   View User's Profile U2U Member Reply With Quote

I thought the same as Ian, but I didn't want to say in case I looked n00b if I was wrong

[Edited on 07-01-2010 by ed]
James
Member

Registered: 1st Jun 02
Location: Surrey
User status: Offline
7th Jan 10 at 16:59   View User's Profile U2U Member Reply With Quote

quote:
Originally posted by Dom


The idea of having another table (with the teacher_id and subject_id fields) was briefly talked about but a few of the devs said that we were already duplicating data and it was becoming "silly" (having to do a shed load of joins just to pull a single result). But i suppose it's the only decent way of doing it.



Shed loads of joins is often the case in industry, I came across a query with 35 joins in the other day. Admittedly it was badly written but still.

And a shed load of joins will more often than not perform better than a shed load of likes (providing the DB has been well designed).

[Edited on 07-01-2010 by James]
Paul_J
Member

Registered: 6th Jun 02
Location: London
User status: Offline
7th Jan 10 at 17:20   View User's Profile U2U Member Reply With Quote

quote:
Originally posted by ed
I thought the same as Ian, but I didn't want to say in case I looked n00b if I was wrong

[Edited on 07-01-2010 by ed]


which was pretty much what I said in my point 3.

Reading what the actual situation is dom, you need to seperate that data out of there. It's rediculous.

For this - I'd of personally suggested.

A Teachers Table
A Subject Table
And a joining table.

So you create your teacher table

Table: Teachers
Columns:
TeacherID *Primary Key*
TeacherName
TeacherAddress or whatever

Table: Subjects
Columns:
SubjectID *Primary Key*
SubjectName
SubjectDescription
Subjectwhatever

Table: SubjectAssignment(or whatever)
Columns:
AssignmentID *PrimaryKey*
TeacherID *ForeignKey*
SubjectID *ForeignKey*

...............

Fill with sample data.

e.g.

Teacher Table
1 - Julie Makov
2 - Mickey Mouse
3 - Gary Glitter

Subjects Table
1 - Maths
2 - English
3 - Science
4 - Bumming Children

SubjectAssignment Table
1 - 1 - 2
2 - 1 - 3
3 - 2 - 1
4 - 2 - 3
5 - 3 - 4

... Another words...
Julie teaches english and science
Mickey teaches maths and science
Gary teaches bumming children

... If you want all subjects beloging to a teacher, say julie.

Select SubjectName from Subjects inner join SubjectAssignment on SubjectID = SubjectID where TeacherID = '1'

You may need to do what Ian did above, I can't really say without bashing it into a query and trying to process it.

e.g.
Select a.SubjectName from Subjects a, SubjectAssignment b where a.SubjectID = b.SubjectID and b.TeacherID = '1'

and to take the other example...
to find all teacher that specialise in a subject.

Select a.TeacherName from Teachers a, SubjectAssignment b where a.TeacherID = b.TeacherID and b.SubjectID = '1'


[Edited on 07-01-2010 by Paul_J]
DaveyLC
Member

Registered: 8th Oct 08
Location: Berkshire
User status: Offline
7th Jan 10 at 17:23   View User's Profile U2U Member Reply With Quote

Ive got some functions I've written to split by a character and return a dataset if you are interested.
Paul_J
Member

Registered: 6th Jun 02
Location: London
User status: Offline
7th Jan 10 at 17:32   View User's Profile U2U Member Reply With Quote

quote:
Originally posted by Paul_J
quote:
Originally posted by ed
I thought the same as Ian, but I didn't want to say in case I looked n00b if I was wrong

[Edited on 07-01-2010 by ed]


which was pretty much what I said in my point 3.

Reading what the actual situation is dom, you need to seperate that data out of there. It's rediculous.

For this - I'd of personally suggested.

A Teachers Table
A Subject Table
And a joining table.

So you create your teacher table

Table: Teachers
Columns:
TeacherID *Primary Key*
TeacherName
TeacherAddress or whatever

Table: Subjects
Columns:
SubjectID *Primary Key*
SubjectName
SubjectDescription
Subjectwhatever

Table: SubjectAssignment(or whatever)
Columns:
AssignmentID *PrimaryKey*
TeacherID *ForeignKey*
SubjectID *ForeignKey*

...............

Fill with sample data.

e.g.

Teacher Table
1 - Julie Makov
2 - Mickey Mouse
3 - Gary Glitter

Subjects Table
1 - Maths
2 - English
3 - Science
4 - Bumming Children

SubjectAssignment Table
1 - 1 - 2
2 - 1 - 3
3 - 2 - 1
4 - 2 - 3
5 - 3 - 4

... Another words...
Julie teaches english and science
Mickey teaches maths and science
Gary teaches bumming children

... If you want all subjects beloging to a teacher, say julie.

Select SubjectName from Subjects inner join SubjectAssignment on SubjectID = SubjectID where TeacherID = '1'

You may need to do what Ian did above, I can't really say without bashing it into a query and trying to process it.

e.g.
Select a.SubjectName from Subjects a, SubjectAssignment b where a.SubjectID = b.SubjectID and b.TeacherID = '1'

and to take the other example...
to find all teacher that specialise in a subject.

Select a.TeacherName from Teachers a, SubjectAssignment b where a.TeacherID = b.TeacherID and b.SubjectID = '1'


[Edited on 07-01-2010 by Paul_J]


On a side note...

As i said originally at the top, if you don't want to completely restructure everything / still wanted to use like (but only one like) then you could just append a ; to the front of the first and ; to the end which would mean to search for 3 would literally be like '%;3;%'

If the data already exists without this, it wouldn't be hard to quickly write a script to fix the data into this way (going through each row appending).

I personally would ditch it and move to a better system though.
xa0s
Banned

Registered: 4th Mar 08
Location: Dartford, Kent Car: Turbo'd Fabia vRS
User status: Offline
7th Jan 10 at 18:34   View User's Profile U2U Member Reply With Quote

Database structure is so important, it needs to be gotten right from the start to avoid future problems like this.
Dom
Member

Registered: 13th Sep 03
User status: Offline
7th Jan 10 at 20:06   View User's Profile U2U Member Reply With Quote

Cheers for all of the help have proposed the changes, will have to wait and see what the rest of the team says...

DaveyLC - YES! would be great if you could email/u2u them me
Ian
Site Administrator

Avatar

Registered: 28th Aug 99
Location: Liverpool
User status: Offline
8th Jan 10 at 00:23   View Garage View User's Profile U2U Member Reply With Quote

Can't see how normalisation would duplicate data? If anything you have less data, not more.

And if joins are slow, index them properly. James is quite right.

quote:
Originally posted by James
And a shed load of joins will more often than not perform better than a shed load of likes


And the joins are a damn sight easier to maintain.

Did the guy actually use the word 'silly' at the meeting? A proper structure and elegant, inexpensive queries is not silly.
Ian
Site Administrator

Avatar

Registered: 28th Aug 99
Location: Liverpool
User status: Offline
8th Jan 10 at 00:23   View Garage View User's Profile U2U Member Reply With Quote

quote:
Originally posted by Dom
DaveyLC - YES! would be great if you could email/u2u them me
No - you neither need nor want them. Fix the db.
Ian
Site Administrator

Avatar

Registered: 28th Aug 99
Location: Liverpool
User status: Offline
8th Jan 10 at 00:31   View Garage View User's Profile U2U Member Reply With Quote

quote:
Originally posted by Paul_J
Table: SubjectAssignment(or whatever)
Columns:
AssignmentID *PrimaryKey*
There's referential integrity on the relationships so you can use a compound key to uniquely identify the row (if ever you actually need to). This would of course be unique, which means staff can't be awarded the same specialism twice. If you use an regular primary, you don't enforce this condition.

Only reason to have one would be if you want to use it as a foreign elsewhere which in this scenario might take the form of some sort of accreditation tables in which you list the quals or awards which qualify how come that guy is granted the specialism. Although I think if you raised that one at your team meeting they would probably fall off their chairs.
Ian
Site Administrator

Avatar

Registered: 28th Aug 99
Location: Liverpool
User status: Offline
8th Jan 10 at 00:33   View Garage View User's Profile U2U Member Reply With Quote

quote:
Originally posted by Paul_J
if you don't want to completely restructure everything / still wanted to use like
Then you are mad as a box of frogs.
Paul_J
Member

Registered: 6th Jun 02
Location: London
User status: Offline
8th Jan 10 at 00:52   View User's Profile U2U Member Reply With Quote

quote:
Originally posted by Ian
quote:
Originally posted by Paul_J
if you don't want to completely restructure everything / still wanted to use like
Then you are mad as a box of frogs.


Yeah but I also understand, if someone else is in charge and has made it into this fucking mess, they will probably be an idiot / too stubborn to change.

So I was just saying 'IF' he can't change the structure due to it being out of his control, then he can do it with a like still.

e.g. if it's an existing system, designed by some other moron, with a lot of existing data... Then you trying to change the structure, will make your bosses get angry... (as they prob won't understand what the problem is).

Just saying.
Dom
Member

Registered: 13th Sep 03
User status: Offline
8th Jan 10 at 00:59   View User's Profile U2U Member Reply With Quote

no this is a completely new system. it's the fact that the client has some really obscure specs (so far from common sense that it's unreal) that have made it a complete bulls up trying to structure data and how data is processed. Eitherway, i've re-proposed the idea of a relationship table, hopefully the rest will agree and it'll get sorted.

Although using LIKE as paul previously said works ok, even doing multiple searches on the `specialisation` field. However, like you say it's far from ideal as it stands....

[Edited on 08-01-2010 by Dom]
Ian
Site Administrator

Avatar

Registered: 28th Aug 99
Location: Liverpool
User status: Offline
8th Jan 10 at 01:03   View Garage View User's Profile U2U Member Reply With Quote

With respect, if a student submitted the current one I would fail it.

There are loads and loads of great reasons to go with the proper approach and a few terrible ones to stick with what you have.

I'm curious to see what else in the spec prohibits normalisation? If you've got some more information here I would be interested to see/adjust the entire ERD.
Paul_J
Member

Registered: 6th Jun 02
Location: London
User status: Offline
8th Jan 10 at 09:25   View User's Profile U2U Member Reply With Quote

I've got to agree with Ian.

There's no excuse for it, if it's a new system. I could half expect it on some old legacy system you're supporting ... but these days there's no excuse.

 
New Topic

New Poll

  Related Threads Author Forum Replies Views Last Post
Unsure if anyone can help me with this Steve Geek Day 21 972
4th Oct 06 at 19:49
by Steve
 
SQL - pronounciation Sam Geek Day 31 1818
8th Jan 10 at 17:41
by Tom G
 
(my)SQL dudes Dom Geek Day 9 642
3rd May 07 at 23:50
by Dom
 
VB6 problem - strings Ian Geek Day 19 509
12th Mar 08 at 23:30
by Ian
 
PHP help again Dan Lewis Geek Day 25 1371
3rd Feb 09 at 21:26
by Dan Lewis
 

Corsa Sport » Message Board » Off Day » Geek Day » SQL Query Help (LIKE searching) 29 database queries in 0.0204930 seconds