Monday, 15 December 2008

ebay is annoying

Got up this morning about 6.30am. Had breakfast, then finished running the mysql SELECT IN test that I was doing on Friday. It is much faster to build up the query inside the loop and then execute the query in one go when the loop's finished.

Loop 5 times

OR IN IN2

0.00798606872559 0.00504803657532 0.00319099426270

0.00821709632874 0.00626087188721 0.00343608856201

0.00728511810303 0.00468301773071 0.00320792198181

0.00834417343140 0.00567698478699 0.00334596633911

0.00859093666077 0.00527000427246 0.00325489044189

0.01122093200680 0.00462007522583 0.00328207015991

0.00714206695557 0.00437498092651 0.00323915481567

0.00991106033325 0.00634288787842 0.00347900390625

0.00950288772583 0.00515413284302 0.00312805175781

0.00950288772583 0.00515413284302 0.00312805175781
Avg 0.00877032279968 0.00525851249695 0.00326921939850


And the test script I used:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css">
div{
border: 1px solid #000;
}
.left{ float: left;}
.right{float: right;}
</style>
<title>Untitled Document</title>
</head>

<body>
<?php
include('globalFuncs.inc.php');

$words = array('a', 'ac', 'accumsan', 'ad', 'adipiscing', 'Aenean', 'Aliquam', 'amet', 'ante', 'aptent', 'arcu', 'at', 'auctor', 'augue', 'bibendum', 'Class', 'commodo', 'condimentum', 'consectetuer', 'consequat', 'conubia', 'convallis', 'Cras', 'cubilia', 'Cum', 'Curabitur', 'Curae;', 'cursus', 'dapibus', 'diam', 'dictum', 'dignissim', 'dis', 'dolor', 'Donec', 'dui', 'Duis', 'egestas', 'eget', 'eleifend', 'elementum', 'elit', 'enim', 'erat', 'eros', 'est', 'et', 'Etiam', 'eu', 'euismod', 'facilisi', 'facilisis', 'fames', 'faucibus', 'felis', 'fermentum', 'feugiat', 'fringilla', 'Fusce', 'gravida', 'habitant', 'himenaeos', 'iaculis', 'id', 'imperdiet', 'in', 'inceptos', 'Integer', 'interdum', 'ipsum', 'justo', 'laoreet', 'lectus', 'leo', 'libero', 'litora', 'Lorem', 'luctus', 'Maecenas', 'magna', 'magnis', 'malesuada', 'massa', 'mattis', 'Mauris', 'metus', 'mi', 'molestie', 'montes', 'Morbi', 'mus', 'Nam', 'nascetur', 'natoque', 'nec', 'neque', 'netus', 'nibh', 'nisi', 'nisl', 'non', 'nostra', 'Nulla', 'Nullam', 'nunc', 'orci', 'ornare', 'parturient', 'pellentesque', 'penatibus', 'per', 'Phasellus', 'placerat', 'porttitor', 'posuere', 'Praesent', 'primis', 'Proin', 'purus', 'quam', 'quis', 'rhoncus', 'ridiculus', 'risus', 'rutrum', 'sagittis', 'sapien', 'Sed', 'sem', 'semper', 'senectus', 'sit', 'sociis', 'sociosqu', 'sollicitudin', 'suscipit', 'Suspendisse', 'taciti', 'tellus', 'tempor', 'tempus', 'tincidunt', 'torquent', 'tortor', 'tristique', 'turpis', 'ultrices', 'ultricies', 'urna', 'ut', 'varius', 'vehicula', 'vel', 'velit', 'venenatis', 'Vestibulum', 'vitae', 'Vivamus', 'volutpat', 'vulputate');
/*
$sql = "INSERT IGNORE INTO tagsTest (Tag) VALUES ('";
$sql .= implode("'),('",$words)."')";
$conn->query($sql);
if($conn->errno){echo '<br>'.$sql.'<br><br>'.$conn->error;}
*/
echo '<div class="left"><p>OR</p>';
$start = microtime(true);
for($i=0; $i<5; $i++)
{
$tags = array();
for($j=0; $j<20; $j++)
{
$tags[] = $words[rand(0, (count($words)-1))];
}

$sql = "SELECT `Key`, Tag FROM tagsTest WHERE Tag = '";
$sql .= implode("' OR Tag = '", $tags)."'";
$tags = array();
//if($conn){echo $sql;}
if($result = $conn->query($sql))
{
while($row = $result->fetch_row())
{$tags[] = $row;}
}
if($conn->errno){echo '<br>'.$sql.'<br><br>'.$conn->error;}
echo '<div>';
foreach($tags as $tmp)
{echo "{$tmp[0]} =&gt; {$tmp[1]}<br>";}
echo '</div>';
}
$end = microtime(true) - $start;
echo "<p>$end seconds</p>";?>
</div>
<div class="left"><p>IN</p>
<?php
$start = microtime(true);
for($i=0; $i<5; $i++)
{
$tags = array();
for($j=0; $j<20; $j++)
{
$tags[] = $words[rand(0, (count($words)-1))];
}

$sql = "SELECT `Key`, Tag FROM tagsTest WHERE Tag IN ('";
$sql .= implode("', '", $tags)."')";
$tags = array();
//if($conn){echo $sql;}
if($result = $conn->query($sql))
{
while($row = $result->fetch_row())
{$tags[] = $row;}
}
if($conn->errno){echo '<br>'.$sql.'<br><br>'.$conn->error;}
echo '<div>';
foreach($tags as $tmp)
{echo "{$tmp[0]} =&gt; {$tmp[1]}<br>";}
echo '</div>';
}
$end = microtime(true) - $start;
echo "<p>$end seconds</p>";?>
</div>
<div class="right"><p>IN2</p>
<?php
$start = microtime(true);
$sql = array();
for($i=1; $i<6; $i++)
{
$tags = array();
for($j=0; $j<20; $j++)
{
$tags[] = $words[rand(0, (count($words)-1))];
}

$sql[] = implode("', '", $tags)."') AND pages.Key = $i)";
$tags = array();
}
$sql = implode(" OR (tagsTest.Tag IN ('", $sql);
$sql = "SELECT tagsTest.Key, tagsTest.Tag, pages.Key FROM tagsTest, pages WHERE (tagsTest.Tag IN ('".$sql;
if($result = $conn->query($sql))
{
while($row = $result->fetch_row())
{$tags[$row[2]][] = $row;}
}
if($conn->errno){echo '<br>'.$sql.'<br><br>'.$conn->error;}
for($i=1; $i<6; $i++)
{
echo '<div>';
foreach($tags[$i] as $tmp)
{echo "{$tmp[0]} =&gt; {$tmp[1]}<br>";}
echo '</div>';
}
$end = microtime(true) - $start;
echo "<p>$end seconds</p>";?>
</div>
</body>
</html>


After doing that test I started writing a test to compare running an INSERT query followed by a SELECT LAST_INSERT_ID() in a loop with making a long INSERT and SELECT query and then running them when the loop finished, and also with writing the queries in the loop but then executing them with multi_query() outside the loop. However, after writing this I came across a few problems.

First off, I have been SELECT LAST_INSERT_ID() rather than $mysqli->insert_id() because my understanding was that LAST_INSERT_ID() was connection specific whilst insert_id() probably wasn't. This was due to the following statement in the 'Friends of ED PHP Solutions - Dynamic Web Design Made Easy' book, which says:
The SELECT query highlighted in step 6 uses the filename and caption of the record just
entered as search criteria. This is a more accurate way of finding the primary key than a
technique that you often see recommended. By calling the mysql_insert_id() function,
you can get the primary key of the most recently inserted record (as long as it uses
auto_increment). MySQL Improved and PDO both offer equivalents with the insert_id
and lastInsertId properties. respectively. Most of the time, this will give you the information
that you want, but on a busy server, someone else might insert another record at
the same time. To be sure that you get the correct primary key, it’s best to be specific in
your request.
However, it seems that insert_id() is connection specific, and actually is the exact same thing as LAST_INSERT_ID(), except of course you can just run it from your mysql object in PHP rather than having to perform another query to get the ID.

The following sites/posters all say that insert_id() is reliable so long as you don't use a shared connection or close and open a connection between the insertion and calling insert_id():
Not getting last insert id
PHP and MySQL Insert ID
[PHP-DB] Is persistent connection + MySQL + Apache + insert_id === safe?
And even the mysql website says it's connection specific:
You can retrieve the most recent AUTO_INCREMENT value with the LAST_INSERT_ID() SQL function or the mysql_insert_id() C API function. These functions are connection-specific, so their return values are not affected by another connection which is also performing inserts.
(I think I must have read this before but not understood that the mysql_insert_id() C API function is what PHP uses when using insert_id()).

So I will actually need(?) to rewrite my test to use insert_id().

The other thing was that with the long built up query, because we are inserting all the rows in one go, we can't get the last_insert_id, so instead I have to build up a long SELECT statement that selects the key based on all the data that's just been inserted. This is the same way the author suggests you get the last_insert_id in the PHP Solutions book I mentioned earlier.

So my code to pull the rows look like this:
if($result = $conn->query($sqlSELECT))
{
while($row = $result->fetch_row())
{$insertedCountries[] = $row;}
$result->close();
}
However, there is a problem with this, since your data might exist in the table more than once. If you are just selecting one id, then this okay, you can just loop through the rows retrieved until you get to the last one, which should be the last inserted row. But I can't see how to get it to work with selecting the id/keys of multiple records without selecting duplicate records.

This led me on to looking at how to avoid duplicate rows. This post explains how to do this using a Unique key on multiple columns: how to prevent duplicate rows but allow duplicate column entries. However, this doesn't work with columns that allow NULL, e.g. Unique indexes and NULL column values.

Then when I thought about it a bit more, I don't actually want UNIQUE rows in the percountry table (which is the table these tests are inserting/selecting from) as I'm using a lookup table to associate the page id with the percountry record. So in the percountry table I should be able to have 2 rows exactly the same, but then in the lookup table one row might be assigned to page 10, whilst the other row might be for page 12.

Then this got me thinking - why am I using a lookup table for this? Because each percountry record must be associated to one page, and one page only, I might as well just have the page key/id in the percountry table. Then before I insert a record to the percountry table I think I will need to do a check to make sure the record doesn't already exist.

After lunch I did a bit more reading/coding, then played on Animal for about an hour. I didn't do much really, just spoke to Sahara who said she wanted some old carpets, then I spoke to all the animals, and only one had an old carpet. Then I had to run around to try and find Sahara to give her the old carpet.

The rest of the afternoon I did more coding and reading, though I still haven't finished writing my test script yet.

After dinner I bid on a Lowepro Topload Zoom AW bag, I bid £20 with 30 seconds of the auction left, but annoying ebay didn't put the bid through and it sold to someone else for £16.66 + £7 P&P. Annoying for the person selling it as well since now they won't get as much for it as they would have done if ebay hadn't blocked my bid.

Then I watched Lost with Mac & Ben. I did a bit more coding and then watched the end of Shaolin soccer with Mac and Ben. After that Ben went on Animal again since it was his bed time and he wanted to see the meteor shower before he went to bed. Maccy looked it up on the pinternet, but it seems there's not really anything special about the meteor shower except you can see shooting stars (though you can't wish on them) in the sky.

Food
Breakfast: Bowl of fake coco pops; cup o' tea.
Lunch: Cheese straw; chicken tikka sandwich spread sandwich; 1½ bananas; toasted tea cake with butter; cup o' tea.
Dinner: Potato; peas; ham quiche. Pudding was a slice of chocolate swiss roll with tinned mandarin segments and chocolate custard. Coffee.

No comments: