I was looking for a well-known CMS (Content Management System) that I could easily run in a Docker container as a target for information security reconnaissance tools, such as WhatWeb.
One of my servers runs a query once a week to remove all rows from a Syslog table (>20,000,000 rows) in a MySQL database that are older than 60 days. This was running terribly slowly and interfering with other tasks on the server.
Although the original query used a DELETE statement I’ve used SELECT statements in the examples below.
WHERE ReceivedAt < DATE_SUB(NOW(), INTERVAL 60 DAY);
That selects about 900,000 rows and takes about 45 seconds.
WHERE ReceivedAt < DATE_ADD(CURRENT_DATE, INTERVAL -60 DAY);
Likewise takes about 48 seconds.
Is MySQL running a function every time it makes a comparison? I decided to try using a hard-coded date to find out:
WHERE ReceivedAt < '2015-11-12 12:00:00';
6 seconds! Much faster.
I created a user-defined variable:
SET @sixty_days_ago = DATE_SUB(NOW(), INTERVAL 60 DAY);
Then ran the query:
WHERE ReceivedAt < @sixty_days_ago;
12 seconds. No 6 seconds, but still a fraction of the original time!
This one was quite a sticky wicket. So what if you have two tables in mySQL, with a one-to-many relationship (say a forum and comments in that forum), and each of the comments is dated. Now let’s say you want a result set containing each forum’s name, and with it the text of the newest comment in that forum.
The obvious way to do this is with a subselect, which mySQL doesn’t have. So how does one do it?
At first I thought that it might be impossible, but I have figured out the answer:
max(c.created_on) as last_date,
c2.created_on as created_on,
FROM forum f
LEFT JOIN comment c ON (c.forum_id = f.forum_id)
LEFT JOIN comment c2 ON (f.forum_id = c2.forum_id)
GROUP BY f.forum_id, c2.comment_id
HAVING created_on = last_date OR last_date IS NULL;
See, it’s pretty clever actually, though it leans on the “HAVING” option, which is sort of crappy. The trick is to join the comment table with the forum table twice. One of those joins gets grouped by forum id so that you can use the max function on it. The other doesn’t get grouped (by adding its primary key to the group by clause) so that you can still pull data out of it. Then you use the having clause to find only the row that has the max date.
I am also accepting values with a max value of NULL so that if a forum has no comments, it still comes back in the results.
Hope this helps someone, somewhere.
mysql_query(): 14 is not a valid MySQL-Link resource in <bla bla bal> on line 47
A few other people seem to have gotten this error, but no one has posted a solution. (though one guy oh-so-annoyingly posted “I figured it out, so never mind” … Grrr. I mean, if you’re going to post a question, the answer should be in that thread if you ever figure it out….
So as soon as I figure out the answer I’m going to post it here.
I’m back with the solution:
I got clued onto it from this page: http://bytes.com/forum/thread638479.html . The error is coming because mysql_close was being called by the destructor, and because I was in safe mode the same MySQL resource was being used for each instance. What threw me even more though was that the destructor was being called at all, because I thought I only _had_ one instance. Turns out that there is a spot in my code where I (accidentaly) passed my DB object by value rather than refrence. This made a new copy of the object, which ran mysql_connect again, because it was in safe mode it returened the _same_ refrence. Then the object got unloaded, the destructor ran and closed the refrence, even though there was another instance of the object out there still using the same refrence.