Tuesday, January 20, 2009

Testing MySQL Query Results

Often times, developers talk about testing code, but few spend the time to test data changes made in SQL adequately. I've found that using code similar to this has been a real help to me in isolating potential bugs in how I've coded my own inserts, updates, and deletes. Many would argue that there are better ways to do this, often from an external language, but when all you have is plain old SQL, there are effective ways to perform tests.

Breaking it down, there are three basic steps I use:


  1. Drop and re-create a table to store results in

  2. Write inserts to that table from selects that verify key parts of expected results

  3. Inspect the outcome of the tests



Drop and re-create a table to store results in

SQL Code:


DROP TABLE IF EXISTS test_resultset;
CREATE TABLE test_resultset (
test_name VARCHAR(255) NOT NULL
, created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
, passfail ENUM('Pass', 'Fail') NOT NULL DEFAULT 'Fail'
, result_count INT UNSIGNED NOT NULL DEFAULT 0
);


Write inserts to that table from selects that verify key parts of expected results


INSERT INTO test_resultset ( test_name, passfail, result_count )
SELECT '01 first_test_table.first_test_column' AS test_name
, IF( SOME CONDITION , 'Pass', 'Fail' ) AS passfail
, COUNT(*) AS result_count
...
GROUP
BY passfail
;


Inspect the outcome of the tests


SELECT *
FROM test_resultset
;


I've found that when I write my tests first, it helps me know when I've written "enough" code to be able to say a particular step is complete. When someone asks me what my code is supposed to do, I use my test cases to explain it to them.

Recently, I made modifications to reports that were very intricate. To make it easier on me, I wrote SQL code that verified my modifications before I made them. I used tests that looked something like this (the real tables are named descriptively, unlike these):


INSERT
INTO test_resultset
SELECT '41 mlr.withheld_amount = SUM(mlrd.withheld_amount)' AS test_name
, IF( ABS( mlr.withheld_amount - mlrdj.withheld_amount ) < 0.001
, 'Pass'
, 'Fail' ) AS passfail
, COUNT(*)
FROM mlr

JOIN
(
SELECT id
, SUM(withheld_amount) AS withheld_amount
FROM mlrd
)
AS mlrdj
ON mlr.mlrd_id = mlrdj.id

GROUP
BY passfail
;


When I looked at my result set, I saw my test passed, what the name of the test was, the number of records that matched, and the time when the test was last run. I know to look for test 41 in my result set so that if the test doesn't show up, I know it didn't return any rows so it's likely that either my test is not valid or there is no data to run my test against.

By storing the test results in a table, it also provides reference to help you know when you ran your last test set and gives you an easy way to find out what the results of each test were.

If you wanted to keep results from multiple runs of tests, you may want to create a separate table that will give you a test run number, then add that to your test_resultset table as a cross-reference. Then, removing data (like dropping the test_resultset table) between runs becomes not only unnecessary but undesirable.

While this is a very rudimentary test method, it can save lots of work by giving yourself an easy way to run a test set quickly. As you develop code, keeping a test set like this around (whether it's in a SQL file or driven by some other language), you'll find benefits from having test sets like this to rely on as your code becomes more robust and complex.

I could go on, but I'm sure you get the idea and can see how useful this could be when verifying SQL code works as expected. Want to learn more about writing tests before code? Check out "Test Driven Development" in your favorite search engine.

No comments:

Post a Comment