MySQL

Just a collection of MySQL stuff that I find most useful and often forget and then end up having to search to time and time again...

To Do/To Read

Installing on Ubuntu and adding a non-root user

Install mysql with root password. The installer will pop up a screen asking for the root password which you should set following usual strong password guidelines.

 sudo apt-get install mysql-server 

You can check that the install worked by seeing if the MySQL server is up and running:

$ sudo netstat -tap | grep mysql
tcp    0      0 localhost:mysql    *:*        LISTEN      7168/mysqld
                ^
                Note: by default MySQL binds to local loopback address

You should see the above if the server is running. Note that by default MySQL will only bind to localhost. For any other setup you must edit the configuration file (/etc/mysql/my.cnf on Ubuntu) to change the line "bind-address = 127.0.0.1" to bind to your host's address (or 0.0.0.0 to listen on all interfaces) and then restart the server ("sudo /etc/init.d/mysql restart").

The MySQL manual has this to say about binding to a specific address: ...If you intend to bind the server to a specific address, be sure that the mysql.user grant table contains an account with administrative privileges that you can use connect to that address. Otherwise, you will not be able to shut down the server...

If, for example, you set bind-address=0.0.0.0, running netstat would show the following.

$ sudo netstat -tap | grep mysql
tcp    0      0 *:mysql    *:*        LISTEN      7168/mysqld
                ^
                Note: the new bind address

To check if the MySQL DB is accepting remote connections you can use the following from a remote machine.

mysqladmin ping -u <your-username> -p<your-password> -h <host-ip-address>
                                     ^
                                     Note: no space between '-p' and pwd string

If it isn't then you will need to trouble shoot. If the netstat on the MySQL host is showing the server as running and bound to the correct interface, then try checking firewall rules using sudo iptables -L to list the firewall rules being enforced, but to allow access to your MySQL server through your filewall is beyond the scope of this page... I contact my sysadmin!

Now you can login to the MySQL console, on the machine hosting the database, as root (root access is only local):

$ mysql -u root -p
Enter password: password-you-chose-during-install

Create a new user for your database (NewUser in example below) that is not root. Only has access to a specific database (called A_Specfic_DB database in example below) but can access all tables in that database and do almost anything in that one database except create and drop tables and modify privalidges. Doing this so that the scripts I use to access the DB will have limited scope to do terrible damage. Might also consider not granting DELETE priviladges too.

CREATE USER 'NewUser'@'localhost' IDENTIFIED BY 'user-password';
GRANT DELETE,INSERT,SELECT,UPDATE ON A_Specific_DB . * TO 'NewUser'@'localhost';
FLUSH PRIVILEGES;

MySQL account names consist of a user name and a host name. This enables creation of accounts for users with the same name who can connect from different hosts. In the above example the DB will only accept connections made by NewUser from the local machine. To specify that NewUser can connect from anywhere we would CREATE USER 'NewUser', which is equivalent to CREATE USER 'NewUser'@'%'.

If you make any mistakes you can delete the user using the following.

DROP USER 'NewUser'@'localhost';

Or, to further debug if something goes wrong the following will be useful.

SELECT User FROM mysql.user;             /*< List all users */
SHOW GRANTS;                             /*< Show all grants for all user accounts */
SHOW GRANTS FOR CURRENT_USER();          /*< Show grants for the currently logged in user */
SHOW GRANTS FOR 'user-name'@'host-name'; /*< Show grants for a specific user account */

For example execiting the statement on line 4, once you have completed the above steps to create a user should output the following.

mysql> SHOW GRANTS FOR 'NewUser'@'locahost';
+-----------------------------------------------------------------------------------+
| Grants for NewUser@locahost                                                       |
+-----------------------------------------------------------------------------------+
| GRANT USAGE ON *.* TO 'NewUser'@'locahost'                                        |
| GRANT SELECT, INSERT, UPDATE, DELETE ON `A_Specific_DB`.* TO 'NewUser'@'locahost' |
+-----------------------------------------------------------------------------------+

Worth noting that GRANT OPTION is not covered in a normal grant all command. It has to be specifically and explicitly added to a user's permissions.

Dump and Restore A Database

To dump and restore the database schema and all data:

mysqldump --databases <db-name> --result-file=filename.sql -u root -p<root_password>
mysql -u root -p<root_password> <database_name> < filename.sql

To dump just the schema:

mysqldump --databases <db-name> --result-file=filename.sql -u root -p<root_password> --no-data
or just
mysqldump -u root -p<password> --no-data <db-name>

Describing Tables & Information Schemas

The easist way to get information about a specific table is to see the SQL that would be required to generate it. To do this you would type the following...

SHOW CREATE TABLE <table name>

This isn't particularly easy to parse from a script though. I wanted to get information about primary keys etc so the following solution works a lot better in this case.

SELECT COLUMN_NAME,CONSTRAINT_NAME, REFERENCED_TABLE_NAME,REFERENCED_COLUMN_NAME
FROM INFORMATION_SCHEMA.KEY_COLUMN_USAGE 
WHERE TABLE_SCHEMA = <database name> AND TABLE_NAME = <table name>;

Note: If you are seeing a lot of NULLs it could be that you do not have sufficient priviledges to see those entries. For example, with InnoDB tables you only need the PROCESS privilege. Basically, regarless of which engine you use, you will need have some (appropriate) privilege on an object to see information about it.

Some SQL Revision

Select Unique Values

SELECT DISTINCT ... 

Sorting Data

SELECT ...
FROM ...
WHERE ...
ORDER BY col1, ..., colN [ASC|DESC]

Limit Number Of Rows Returned

SELECT ... FROM ... LIMIT 10;   # Selects the first 10 rows
SELECT ... FROM ... LIMIT 5,10; # Select rows 6-15. First argument specifies
                                # the offset of the first row to return (indexed
                                # from 0 not 1), and the second specifies the
                                # maximum number of rows to return 

Filtering Data

WHERE [NOT] x OP y
WHERE x <> y                   # Non matches
WHERE x != y                   # Non matches
WHERE x IS NULL                # How to check for NULL values
WHERE x = 1 AND y = 2 OR x = 3 # AND has higest precedence
WHERE x IN (1, 2, ...)         # Normally quicker than list of ORs
WHERE x IN (SELECT ...)        # IN can contain sub-selects
WHERE x LIKE '%something%'     # Wildcard matching
WHERE x BETWEEN a AND b        # Check a range of values.
                               # For dates use #mm/dd/yyyy#

When wildcard matching using the LIKE operator the following wildcards could be useful: %, _, [].

% means match any number of occurences of any character. Note that fixed width char fields may be padded with spaces!

_ matches any single character.

[] specifies a set of characters. Can negate the set of chars using [^].

Be aware that when comparing NULLs you must use IS NULL:

-- WRONG!!
select case when NULL = NULL then 'Null Is Null' else 'WTF' end as Result;

The above will output the text "WTF"! The reason is that to compare NULL values, we should use the following to get the correct result:

-- RIGHT
select case when NULL IS NULL then 'Null Is Null' else 'WTF' end as Result; -- CORRECT :)

Grouping Data

SELECT group_by_col, FUNC1(*) AS new_col_name, FUNC2(*) AS newer_col_name, ...
FROM ...
WHERE ...
GROUP BY group_by_col
HAVING FUNC1(*) OP condition

Note the difference between WHERE and HAVING. The former selects rows and the latter selects groups. Therefore WHERE will select what is aggregated into groups (i.e., which rows have the aggregating functions applied to them) and HAVING specifies which of the resulting group measures make it to the final "cut". Put another way, WHERE filters data before it is grouped and HAVING filters data after it has been grouped.

FUNC() is generally one of AVG(), COUNT([DISTINCT]), MAX(), MIN(), STD(), SUM(), VARIANCE() etc...

Subqueries

Using subqueries to build a list of selection criteria for the WHERE clause:

SELECT ...
FROM ...
WHERE x IN (SELECT field
   FROM ...
   WHERE y IN (SELECT field
      FROM ...
      WHERE ... and so on...));

Also can test for at least one match or all matches of an operator comarison. For example, wanting to select a row if a column is greater than at least one value from a subquery etc...

SELECT ...
FROM
WHERE x <OP> ANY|ALL (SELECT field ...);

Using subqueries to generate calculated fields:

SELECT x,
       y,
       (SELECT FUNC(*)
               FROM ...
               WHERE ...) AS z
WHERE ....

Multiple Joins On Same Table

To JOIN on the same table more than once the key is to use aliases to rename each instance of the target table so that we can distinguish the multiple copies of the same sets of columns from the target table...

A little test to demonstrate this... first let's create two tables. The Main table contains two foreign keys, each of which references a (possible different but not necessarily) row in the same target table, Ref.

CREATE TABLE Ref
(
refId INT NOT NULL AUTO_INCREMENT,
refName VARCHAR(10) NOT NULL,
PRIMARY KEY(refId)
) ENGINE=InnoDB;

CREATE TABLE Main
(
mainId INT NOT NULL AUTO_INCREMENT,
refA INT NOT NULL,
refB INT NOT NULL,
PRIMARY KEY(mainId),
FOREIGN KEY(refA) REFERENCES Ref(refId),
FOREIGN KEY(refB) REFERENCES Ref(refId)
) ENGINE=InnoDB;

Next let's put some values into the references table and then insert values into the Main table so that it's foreign keys both reference rows in the referenced table, Ref.

INSERT INTO Ref VALUES (NULL, 'RefA');
INSERT INTO Ref VALUES (NULL, 'RefB');
INSERT INTO Ref VALUES (NULL, 'RefC');

INSERT INTO Main
   SELECT NULL, r1.refId, r2.refId
      FROM Ref AS r1 CROSS JOIN Ref AS r2
      WHERE r1.refName='RefA' AND r2.refName='RefB';

INSERT INTO Main
   SELECT NULL, r1.refId, r2.refId
      FROM Ref AS r1 CROSS JOIN Ref AS r2
      WHERE r1.refName='RefA' AND r2.refName='RefC';

INSERT INTO Main
   SELECT NULL, r1.refId, r2.refId
      FROM Ref AS r1 CROSS JOIN Ref AS r2
      WHERE r1.refName='RefB' AND r2.refName='RefC';

To insert the correct foreign keys into the Main table I'm using a CROSS JOIN. The MySQL manual explains that "...In MySQL, JOIN, CROSS JOIN, and INNER JOIN are syntactic equivalents (they can replace each other). In standard SQL, they are not equivalent. INNER JOIN is used with an ON clause, CROSS JOIN is used otherwise...". I'm joining Ref on Ref so that I can select the primary key for each referenced item by name to insert the key as a foreign key in Main.

SELECT * FROM Main;

SELECT Main.*, r1.refName AS refAName, r2.refName as refBName
FROM Main
INNER JOIN Ref AS r1 ON Main.refA=r1.refId
INNER JOIN Ref AS r2 ON Main.refB=r2.refId;

This produces the following output,

mainId   refA   refB
1        1      2
2        1      3
3        2      3

mainId   refA   refB   refAName   refBName
1        1      2      RefA       RefB
2        1      3      RefA       RefC
3        2      3      RefB       RefC

Note that to produce this effect we do need to do two distinct INNER JOINs. Just joining on multiple conditions is not appropriate here as this will only join one column:

SELECT * FROM Main INNER JOIN Ref AS r1 ON Main.refA=r1.refId OR Main.refB=r1.refId;

Produces:

mainId   refA   refB   refId   refName
1        1      2      1       RefA
1        1      2      2       RefB
2        1      3      1       RefA
2        1      3      3       RefC
3        2      3      2       RefB
3        2      3      3       RefC

ON vs. USING vs Natural Join

Credits to Shlomi Noach and Tom Mac who posted their answers to this on SO.

CREATE TABLE A( aId INT NOT NULL AUTO_INCREMENT, PRIMARY KEY (aId));
CREATE TABLE B( bId INT NOT NULL AUTO_INCREMENT, aId INT NOT NULL, PRIMARY KEY (bId));
> SELECT * FROM A;
+-----+
| aId |
+-----+
|   1 |
|   2 |
|   3 |
+-----+





  and



> SELECT * FROM B;
+-----+-----+
| bId | aId |
+-----+-----+
|   1 |   1 |
|   2 |   1 |
|   3 |   1 |
|   4 |   2 |
|   5 |   3 |
+-----+-----+
-- Using USING aId appears once...
> SELECT * FROM A JOIN B USING (aId);
+-----+-----+
| aId | bId |
+-----+-----+
|   1 |   1 |
|   1 |   2 |
|   1 |   3 |
|   2 |   4 |
|   3 |   5 |
+-----+-----+





  VS.



-- Using ON aId appears twice...
> SELECT * FROM A JOIN B ON A.aId=B.aId;
+-----+-----+-----+
| aId | bId | aId |
+-----+-----+-----+
|   1 |   1 |   1 |
|   1 |   2 |   1 |
|   1 |   3 |   1 |
|   2 |   4 |   2 |
|   3 |   5 |   3 |
+-----+-----+-----+

Using ON is not much different from a NATRUAL JOIN. From the MySQL documentation: The NATURAL [LEFT] JOIN of two tables is defined to be semantically equivalent to an INNER JOIN or a LEFT JOIN with a USING clause that names all columns that exist in both tables.

For example, in the following, the two tables t1 and t2 are joined on all commonly named columns between the two tables, and those common column names only appear once in the result.

SELECT * FROM t1 NATURAL JOIN t2;

Use Left Join To Find What Isn't In The Other Table

We have two tables, "A" and "B", table "A" with a column (not necessarily the primary key) id and "B" with id, which would usually be a foreign key referecing A.id. The challenge is to find all the ids in table "A" that do not occur in table "B".

To do this, left join A with B on id and select all the rows in B which are NULL.

SELECT DISTINCT ...
FROM a
LEFT JOIN b
ON id
WHERE b.id IS NULL

Find Rows Associated With All Other Rows From A Set

Credits to Marcin Juraszek who posted his answer to this on SO.

In this example there is a table of items and a table of groups. Each group is identified by a unique name. There is an items2groups table to provide a many-to-many relationship.

The challenge is to find all items that belong to all groups in a set of groups...

SELECT *
FROM items
WHERE ( items.groupId IN ( SELECT groupId
                           FROM items2groups
                           JOIN groups
                           USING (groupId)
                           WHERE groups.groupName IN (<LIST>)
                           GROUP BY groupId
                           HAVING COUNT(itemId) NOT = LEN(<LIST>)
                         )
      )

The drawback here is that the expression LEN(<LIST>) is not SQL. It should resolve to the number of items in <LIST> which implies the query has been built dynamically.

INTERSECT And MINUS In MySQL

Credits to Carsten H. Pedersen's article Doing INTERSECT and MINUS in MySQL.