January 5, 2021
(Updated on April 3, 2025)
MySQL - Helpful Tidbits
My experiences at Rev360 (RevolutionEHR) and Mercato brought about developing for back-ends that leverage MySQL for data storage solutions. All my other previous experiences were within the context of Oracle, DB2, and SQL Server database technologies, and I found myself constantly Googling to see how to do things better in MySQL. Below are some of those findings.
Perform batch UPDATEs with a temp table
When performing UPDATE
s against a table in bulk, MySQL is pretty non-performant when given a raw UPDATE
statement with a WHERE
clause:
UPDATE A
SET foo = 'bar' WHERE id < 1000;
The best way of handling bulk UPDATE
s turns out to be establishing a temp table containing the values that encapsulate the data for the UPDATE
, and then performing the UPDATE
with a JOIN
against the temporary table.
DROP TABLE IF EXISTS `tempFooValues`;
CREATE TABLE `tempFooValues` (
`id` INT(11) UNSIGNED NOT NULL,
`foo` VARCHAR(255) NOT NULL,
PRIMARY KEY(`id`)
);
INSERT INTO tempFooValues (id, foo) VALUES
(1, 'bar'),
(2, 'bar'),
...,
...,
(1000, 'bar');
UPDATE A
INNER JOIN tempFooValues ON tempFooValues.id = A.id
SET
A.foo = tempFooValues.foo;
Leveraging CASE within a JOIN
This one is pretty self-explanatory...
SELECT * FROM A
JOIN B
ON CASE
WHEN B.type IN (1, 3) AND B.a_id = A.id THEN 1
WHEN B.type IN (2) AND B.other_id = A.other_id THEN 1
ELSE 0
END = 1;
Converting Timezones from UTC
The arguments for the "from" and "to" timezones in CONVERT_TZ
are the offsets from
UTC. Therefore, +00:00
is itself UTC, and -07:00
would equate to PT in the U.S.
SELECT CONVERT_TZ(created, '+00:00','-07:00') FROM users;
Performing Maintenance on Huge Tables
Old faithful has been holding down the fort for 9 years. It's full of juicy data that's both important for reporting and critical for application business logic for end users. It is also missing a ton of helpful index values underneath the columns matter most.
Your reports won't report, and you know in your gut that ALTER TABLE
in real time on
old faithful will take about 10,000 years to execute.
So, you perform the following:
- Create a new table that contains the index values you require
- Copy the data from old faithful into this clone table with indices
- Rename (or drop) the original old faithful, and rename the clone to become old faithful
BEGIN;
CREATE TABLE IF NOT EXISTS oldFaithfulWithIndices LIKE oldFaithful;
ALTER TABLE oldFaithfulWithIndices
ADD INDEX `account_id_index`(`account_id`),
ADD INDEX `email_address_index`(`email_address`),
ADD INDEX `created_datetime_index`(`created_datetime`),
ADD INDEX `updated_datetime_index`(`updated_datetime`);
INSERT INTO oldFaithfulWithIndices SELECT * FROM oldFaithful;
RENAME TABLE oldFaithful TO oldFaithful_archive;
RENAME TABLE oldFaithfulWithIndices TO oldFaithful;
COMMIT;
Adding composite unique indexes to existing table
When working with a table that already contains data that would violate a desired unique
index
you wish to apply to a table, it's important you scrub the dupes.
DELETE FROM my_table
WHERE id NOT IN (
SELECT MIN(id)
FROM my_table
GROUP BY col_1, col_2
)
AND (col_1, col_2) in (
SELECT col_1, col_2
FROM my_table
GROUP BY col_1, col_2
HAVING count(*) > 1
);
GROUP_CONCAT()
The GROUP_CONCAT()
function is used to aggregate the results of a column into a single field.
Use cases...
Concatenating all single-column results into a single field results:
SELECT GROUP_CONCAT(email_address) FROM customer;
Grouping all results of a column into a single field relative to a shared group identifier
SELECT customer, GROUP_CONCAT(order.id)
FROM customer
JOIN order ON customer.id = order.customer_id
GROUP BY customer.id
GROUP_CONCAT()
allows for customizability of the structure of the aggregated data:
SELECT customer, GROUP_CONCAT(order.id ORDER BY order.id DESC SEPARATOR '| ')
FROM customer
JOIN order ON customer.id = order.customer_id
GROUP BY customer.id
LAST_INSERT_ID()
For any given session in the database engine, the session user can inspect their latest insertion ID with:
SELECT LAST_INSERT_ID();
This function operates on a per-connection basis, meaning that when the DB session user terminates their
current connection, their next session will yield a different result. A fresh session yields 0
.