Friday 13 November 2009

Not getting much done

This morning I benchmarked my MySQL shortURL functions, but found something quite wrong - the auto increment numbers were going up by one even when a row wasn't inserted because it already existed. This is no good for me, if 50% of requests were for URLs that already existed in the database, this would mean that your table can only use 50% of its capacity.

Doing some tests I found that this problem exists with the InnoDB storage engine, but not with the MyISAM storage engine.

After finding the problem was due to InnoDB I did some googling, and eventually found the MySQL Manual page for InnoDB Auto Increment Handling, which says:
“Lost” auto-increment values and sequence gaps

In all lock modes (0, 1, and 2), if a transaction that generated auto-increment values rolls back, those auto-increment values are “lost.” Once a value is generated for an auto-increment column, it cannot be rolled back, whether or not the “INSERT-like” statement is completed, and whether or not the containing transaction is rolled back. Such lost values are not reused. Thus, there may be gaps in the values stored in an AUTO_INCREMENT column of a table.


For a while I did look at alternative storage engines, as I need one with row level locking and auto increment values that are only increased when an actual record is inserted. One that sounded quite promising (though I didn't actually install or test it) was the PrimeBase XT Storage Engine for MySQL.

But in the end I decided it would be easier and more compatible to just check whether a row exists and then try to insert it if it doesn't exist. So here are the benchmarks.
  • makesURL4 is the one I plan using that checks whether the value already exists before trying to insert it.
  • makesURL1 is the one I wrote yesterday that uses a 'variable' with scope limited to the function.
  • makesURL2 is the one I was using before with a global variable, and the variable reset to NULL at the start of the function.
  • makesURL3 is the one I was using before that doesn't reset the global variable, and so doesn't work when run multiple times on one connection. Despite working okay yesterday when I benchmarked it, today it inserted the rows okay but set sURL to an empty string for all records, so its benchmark times are essentially void.
DROP FUNCTION IF EXISTS makesURL4//
CREATE FUNCTION makesURL4 (lURL CHAR(255))
RETURNS CHAR(5) DETERMINISTIC
BEGIN
DECLARE sURL CHAR(5);
DECLARE CONTINUE HANDLER FOR NOT FOUND
BEGIN
#If the record doesn't already exist, insert a new record, create the short URL and update the record
INSERT INTO shortURL VALUES(NULL, '', lURL);
SET sURL = strFromNum(LAST_INSERT_ID());
UPDATE shortURL SET shortURL.sURL = sURL
WHERE shortURL.id = LAST_INSERT_ID()
LIMIT 1;
END;
#First try and get the sURL based on the long URL
SELECT shortURL.sURL INTO sURL FROM shortURL WHERE shortURL.lURL = lURL LIMIT 1;
RETURN sURL;
END//

SELECT BENCHMARK(10000, ( SELECT makesURL4( SUBSTRING( MD5(RAND()),1,4 ) ) )); #2.17 #1.80 #1.61 #1.48


DROP FUNCTION IF EXISTS makesURL1//
CREATE FUNCTION makesURL1 (lURL CHAR(255))
RETURNS CHAR(5) DETERMINISTIC
BEGIN
DECLARE sURL CHAR(5);
DECLARE CONTINUE HANDLER FOR SQLSTATE '23000'
BEGIN
SELECT shortURL.sURL INTO sURL FROM shortURL WHERE shortURL.lURL = lURL;
RETURN sURL;
END;
INSERT INTO shortURL VALUES(NULL, '', lURL);
SET sURL = strFromNum(LAST_INSERT_ID());
UPDATE shortURL SET shortURL.sURL = sURL
WHERE shortURL.id = LAST_INSERT_ID()
LIMIT 1;
RETURN sURL;
END//

SELECT BENCHMARK(10000, ( SELECT makesURL1( SUBSTRING( MD5(RAND()),1,4 ) ) )); #2.05 #1.72 #1.69 #1.65


DROP FUNCTION IF EXISTS makesURL2//
CREATE FUNCTION makesURL2 (lURL CHAR(255))
RETURNS CHAR(5) DETERMINISTIC
BEGIN
SET @sURL = NULL;
#First try and insert the long URL
INSERT INTO shortURL VALUES(NULL, '', lURL)
#If the long URL already exists, then set @sURL to the short URL value
ON DUPLICATE KEY UPDATE
id = IF( (@sURL := sURL), id, id);
#If the record didn't already exist, so we've just inserted a new record, create the short URL and update the record
IF @sURL IS NULL THEN
SET @sURL = strFromNum(LAST_INSERT_ID());
UPDATE shortURL SET shortURL.sURL = @sURL
WHERE shortURL.id = LAST_INSERT_ID()
LIMIT 1;
END IF;
RETURN @sURL;
END//

SELECT BENCHMARK(10000, ( SELECT makesURL2( SUBSTRING( MD5(RAND()),1,4 ) ) )); #1.99 #1.78 #1.67 #1.61


DROP FUNCTION IF EXISTS makesURL3//
CREATE FUNCTION makesURL3 (lURL CHAR(255))
RETURNS CHAR(5) DETERMINISTIC
BEGIN
#First try and insert the long URL
INSERT INTO shortURL VALUES(NULL, '', lURL)
#If the long URL already exists, then set @sURL to the short URL value
ON DUPLICATE KEY UPDATE
id = IF( (@sURL := sURL), id, id);
#If the record didn't already exist, so we've just inserted a new record, create the short URL and update the record
IF @sURL IS NULL THEN
SET @sURL = strFromNum(LAST_INSERT_ID());
UPDATE shortURL SET shortURL.sURL = @sURL
WHERE shortURL.id = LAST_INSERT_ID()
LIMIT 1;
END IF;
RETURN @sURL;
END//

#Didn't actually write any sURLs?!?
SELECT BENCHMARK(10000, ( SELECT makesURL3( SUBSTRING( MD5(RAND()),1,4 ) ) )); #0.81 #0.78 #0.83 #0.95


After writing this blog post so far I checked my selftrade account, and found that my at Limit order for some Astra Zeneca shares had failed. Weirdly, my At Limit order was for 35 shares at £28.1028 each, and looking at the share price chart for Astra Zeneca, their share price had been below that point quite a bit since I put the at Limit order in. Also, the At Limit expiry date was 26/11/09, so I don't know why it had failed.

Anyway, though the price of Astra Zeneca had gone up a bit in the past couple of weeks, it was still below £28 so I just bought 35 shares using At Best.

I tidied up my bedroom a bit, then after lunch I cut out some Chupa Caps in photoshop. Unfortunately Vista has developed a nasty habit of not letting me cut or delete folders, which means I have to resort to the much slower and more clunky method of using commands in a DOS prompt. If you right click on a folder you can't move or delete, Windows Explorer will crash. It only does this sometimes though, so weird.

I spent the afternoon cutting out the Chupa Caps with Yellow Backs and Coca Cola Series 1 and 2 caps in photoshop, then uploading them to my pog website. While I was waiting for the uploads to process and PNG Gauntlet to compress the PNGs I checked DpReview, Canon Rumors and Nikon Rumors websites.#

After dinner I added the pages to the pog website for the new caps I'd uploaded to the pog website and also checked the stats for the pog website. I couldn't remember how to get CGI running for awstats to work, so I had to refer back to my post Getting awstats working/perl running as CGI in Nginx.

After looking at the stats, which were quite boring apart from someone visting the site using a PSP, I watched Autumn Watch with Grandad, Brian, L and Clare.

Then after that I checked the WebFaction forums and read that actually the Apache mpm per user processes don't count towards your total memory usage. It would have been nice if they had put this information on the knowledge base article about working out your memory usage rather than just leaving people to think that they can't have many static/php/cgi apps as it will use up all their memory.

The weather was rainy most of the day and it also got quite windy in the evening.

Food
Breakfast: Bowl of Asda Golden Balls Cereal; cup o' tea.
Lunch: Peppered ham with mustard and crunchy salad sandwich; bag of prawn cocktail flavour crisps; Clementine; Home-made Milkybar button muffin; cup o' tea.
Dinner: Battered fish portion; baked beans; mashed potato; very small button mushrooms; ground black pepper. Pudding was Lemon Meringue with spleenvap. Coffee; 2x pieces of Sainsbury's Caramel Chocolate.

No comments: