Thursday 25 February 2010

Metadataring and processing a few photos

This morning I checked my email, then finished off adding metadata to a few Korea photos.

I went in the garden to put some food out for the birds. I found that the food in the feeder, which I'd put some food in during the winter, but the birds hadn't touched, had all stuck together and didn't look that good, so I cleaned it out.

After lunch I uploaded one of the Korea photos to my photo website, then remembered that I hadn't fixed the problem with locations being duplicated in the location table on the database the website uses.

I had previously modified the mysql trigger to prevent duplicate records in the location table, but hadn't actually made the change to the trigger on the database on the web server. I noticed a possible problem with the trigger, so I modified it a bit more, then did a bit of local testing with it.

When I was satisfied that it seemed to be working okay I made a backup of the databases on the web server, and also replaced my local copy of the photo website database with the copy of the one from the web server. I then ran my procedure, which I'd written previously, to remove duplicates from the location table. However, I found that while it did remove duplicates, it messed up the locations that images were referencing i.e. before running the procedure an image would reference one location, and running the procedure, the same image would now be referencing a different location.

Obviously, I didn't want to mess up my data, so I looked into what the problem was. When I saw the problem, it was quite obvious. The problem was that the procedure was updating the lookup tables that hold the image and location relationships as each location was added to the location table. This meant that after a record had been updated, if its new location_id matched the old location_id of another record that was subsequently updated by the procedure, the record would get its location_id updated again (and so to a wrong value) since the procedure provides no way for telling what records have been updated and what records haven't.

So I modified my procedure to add in an updated column with default 0 to the lookup tables, and set that to 1 when a record was updated. Then I modified the procedure to only update records where updated was 0. An alternative would have been to copy the lookup tables to temporary tables, and then use them for selecting the records that needed updating. They way I did it was quite slow, so using temporary tables may have been a better idea.

Either way, the modified procedure worked okay, and since it was hopefully a run once only, the slowness wasn't too bad (It took 2.63s, but I only had something like 400 records for it to go through).

Anyway, here's the modified procedure as I used it:
DELIMITER //
DROP PROCEDURE IF EXISTS remove_location_dupes//
CREATE PROCEDURE remove_location_dupes()
BEGIN
DECLARE id, Sublocation, City, ProvinceState, country, WorldRegion INT;
DECLARE done INT DEFAULT 0;
DECLARE cur1 CURSOR FOR SELECT location.id, location.Sublocation, location.City, location.ProvinceState, location.country, location.WorldRegion FROM location;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
OPEN cur1;
# Disable Foreign Key Constraints so we can empty the location table
SET foreign_key_checks = 0;
# Truncate location table
TRUNCATE TABLE location;
ALTER TABLE `img_locationShown` ADD `Updated` TINYINT UNSIGNED NOT NULL DEFAULT 0;
ALTER TABLE `img_locationCreated` ADD `Updated` TINYINT UNSIGNED NOT NULL DEFAULT 0;
# Loop through location table
the_loop: LOOP
FETCH cur1 INTO id, Sublocation, City, ProvinceState, country, WorldRegion;
IF done THEN
ALTER TABLE `img_locationShown` DROP `Updated`;
ALTER TABLE `img_locationCreated` DROP `Updated`;
SET foreign_key_checks = 1;
CLOSE cur1;
LEAVE the_loop;
END IF;
# Try inserting the location into the location table
INSERT IGNORE INTO location SET location.Sublocation = Sublocation, location.City = City, location.ProvinceState = ProvinceState, location.country = country, location.WorldRegion = WorldRegion;
IF LAST_INSERT_ID()
THEN
# Update img_locationShown and img_locationCreated with the correct location id
UPDATE img_locationShown SET img_locationShown.location_id = LAST_INSERT_ID(), UPDATED = 1 WHERE img_locationShown.location_id = id AND UPDATED = 0;
UPDATE img_locationCreated SET img_locationCreated.location_id = LAST_INSERT_ID(), UPDATED = 1 WHERE img_locationCreated.location_id = id AND UPDATED = 0;
END IF;
END LOOP the_loop;
END
//
DELIMITER ;

I had quite a bit of touble getting the procedure on the web server though. When I entered it through the SQL option on phpMyAdmin, it would break PHP?!? (Made PHP not respond, didn't kill it). So I ssh'd into the server (and after restarting PHP), logged into mysql. But now when I tried to create the procedure, I would get the following partway through
Display all 1034 possibilities? (y or n)
Googling, I found this was because I had tabs in the code. So I removed the tabs, and then I could create the procedure okay, and call it. I guess I could remove the procedure now as well, but I don't think there's any harm in leaving it there.

After checking the locations didn't seem to be messed up at all, I uploaded a couple more Korea photos to the website, processed a couple more Korea photos, then uploaded them to the website as well.

In the evening I processed a few more Korea photos.

The weather was a mixture of sun and cloud, but it clouded over around sunset. The snow hadn't melted much the last couple of days, but the rest of it (apart from the piles) melted away today.

Food
Breakfast: Bowl of Choco Moons Cereal; Cup o' Tea.
Lunch: Mature Cheddar Cheese Sandwich; Apple; Clementine; Crumpet with Blackcurrant Jam and I cannae believe it's nae butter jimmy; Slice of Lemon Drizzle Cake; Cup o' Tea.
Dinner: Southern Fried Chicken; Jacket Potato; Baked Beans; Tinned Plum Tomato; Grated Mature Cheddar Cheese. Pudding was a slice of Bo's birthday cake - a Raspberry Gateaux. Coffee; One of Bo's Birthday Sweets - a Lemon Chocolate.

No comments: