Tuesday, 10 March 2009

EXIF/Image metadata extraction comparison

This morning I was trying to figure out why PHP wouldn't parse the JSON produced by ExifTool. After much testing and messing about I eventually figured it out: there was a © sign in the EXIF data, and this was messing it up. Running utf8_encode() on the data before passing it to json_decode() worked (and this is mentioned on the php.net JSON page). The © sign is now appearing with an accented A before it though, so I still have to figure out what's happening there.

I checked my email and saw that RED have announced some new lenses, though there doesn't seem to be any price indication at the moment.

I sorted out the problem with the © symbol not displaying properly: I wasn't sending any HTML headers, so making the page a proper HTML page with the encoding set to UTF-8 fixed it. I made a page that extracts the EXIF and IPTC data using Evan Hunter's PHP JPEG Metadata Toolkit, and another page that uses PHP's iptcparse and exif_read_data functions.
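For anyone hitting the same thing, here is a minimal sketch of what was probably going on with the © sign. It is written in Python rather than PHP purely to show the byte-level behaviour, and it assumes the EXIF string held © as a single ISO-8859-1 byte. The "Copyright" tag name and the "2009" value are made up for illustration.

```python
import json

copyright_sign = "\u00a9"  # the © character

utf8_bytes = copyright_sign.encode("utf-8")      # two bytes: C2 A9
latin1_bytes = copyright_sign.encode("latin-1")  # one byte: A9

# UTF-8 bytes misread as ISO-8859-1: each byte becomes its own character,
# so an accented A appears in front of the ©.
misread = utf8_bytes.decode("latin-1")

# The reverse problem: a lone A9 byte is not valid UTF-8, so a strict
# UTF-8 JSON parser rejects it. Tag name and value are hypothetical.
raw_json = b'{"Copyright": "' + latin1_bytes + b' 2009"}'
try:
    json.loads(raw_json)
    parsed_ok = True
except UnicodeDecodeError:
    parsed_ok = False

# Re-encoding the Latin-1 bytes as UTF-8 first (the job utf8_encode()
# does in PHP) lets the parser accept the data.
fixed = json.loads(raw_json.decode("latin-1").encode("utf-8"))

# ascii() keeps the output printable on terminals that are not UTF-8.
print(ascii(misread), parsed_ok, ascii(fixed["Copyright"]))
```

The same reasoning explains both symptoms: the parser wants UTF-8 input, and the browser needs to be told the page is UTF-8 so it doesn't fall back to a single-byte encoding.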

After lunch I went on Animal for a bit. Then I ran each exif/iptc extracting page 10 times to see how they compared:

Run      PHP-iptcparse-exif_read_data.php  PHP_JPEG_Metadata_Toolkit-Test.php  ExifTool-Perl-Module-Test.php  ExifTool-Command-line-Test.php
1        0.077036857605                    0.166090011597                      0.293696880341                 0.269569873810
2        0.110205173492                    0.141455888748                      0.280531167984                 0.207340955734
3        0.110219955444                    0.106762886047                      0.264120101929                 0.257773876190
4        0.135658979416                    0.182878017426                      0.276825904846                 0.275245904922
5        0.092431068420                    0.164710998535                      0.294065952301                 0.259418964386
6        0.086300134659                    0.178915023804                      0.310618877411                 0.257565021515
7        0.108342885971                    0.176629066467                      0.267220973969                 0.268435955048
8        0.108730077744                    0.183045148849                      0.291265964508                 0.248710870743
9        0.073312997818                    0.129640102386                      0.255465984344                 0.326677083969
10       0.120223999023                    0.143432855606                      0.292721986771                 0.190216064453
Average  0.102246212959                    0.157355999947                      0.282653379440                 0.256095457077



Comparing the averages of the ten runs (the final set of four figures above), Evan Hunter's PHP JPEG Metadata Toolkit takes about 1.5 times as long as PHP's built-in functions (roughly 54% slower), running a Perl script that uses the ExifTool Perl module takes about 2.8 times as long, and running the ExifTool Perl script directly takes about 2.5 times as long. So the built-in PHP functions win on speed, but there are also other factors to consider.

  • Both ExifTool and Evan Hunter's PHP JPEG Metadata Toolkit can extract far more information than the built-in PHP functions.
  • Both ExifTool and Evan Hunter's PHP JPEG Metadata Toolkit can write EXIF and other metadata.
  • ExifTool works with many image formats, including RAW files.
  • ExifTool is being continuously developed.
  • Assuming you are parsing the image metadata and then writing it to a database, this only needs to be done once, when the file is first uploaded. So the speed differences between the methods aren't particularly important.
  • Evan Hunter's PHP JPEG Metadata Toolkit presents data as HTML rather than an array. You could edit the code so it creates an array rather than HTML, but that would be a very big job.
  • The IPTC data produced by the PHP function iptcparse has unfriendly array key names, e.g. the keywords (tags) are in $iptc['2#025'].
  • The array produced by exif_read_data cannot be passed straight to utf8_encode, which works on a single string (and the array also contains binary data), so I guess you would need to utf8_encode each piece of info you want to extract.
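The slowdown figures quoted before this list can be rechecked from the run times. The sketch below is in Python rather than PHP, just for the arithmetic. It averages the ten runs for each test page and compares each average with the built-in PHP functions. The times are copied as recorded in the post, with no unit given there.

```python
# Ten run times per test page, copied from the post (units not stated there).
runs = {
    "PHP-iptcparse-exif_read_data.php": [
        0.077036857605, 0.110205173492, 0.110219955444, 0.135658979416,
        0.092431068420, 0.086300134659, 0.108342885971, 0.108730077744,
        0.073312997818, 0.120223999023,
    ],
    "PHP_JPEG_Metadata_Toolkit-Test.php": [
        0.166090011597, 0.141455888748, 0.106762886047, 0.182878017426,
        0.164710998535, 0.178915023804, 0.176629066467, 0.183045148849,
        0.129640102386, 0.143432855606,
    ],
    "ExifTool-Perl-Module-Test.php": [
        0.293696880341, 0.280531167984, 0.264120101929, 0.276825904846,
        0.294065952301, 0.310618877411, 0.267220973969, 0.291265964508,
        0.255465984344, 0.292721986771,
    ],
    "ExifTool-Command-line-Test.php": [
        0.269569873810, 0.207340955734, 0.257773876190, 0.275245904922,
        0.259418964386, 0.257565021515, 0.268435955048, 0.248710870743,
        0.326677083969, 0.190216064453,
    ],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


averages = {name: mean(times) for name, times in runs.items()}

# Use the built-in PHP functions as the baseline for the comparison.
baseline = averages["PHP-iptcparse-exif_read_data.php"]
ratios = {name: avg / baseline for name, avg in averages.items()}

for name, avg in averages.items():
    ratio = ratios[name]
    print(f"{name}: average {avg:.6f}, {ratio:.2f}x the baseline, "
          f"{100 * (ratio - 1):.0f}% slower")
```

Note the difference between "takes 2.5 times as long" and "250% slower": a ratio of 2.5 means 150% slower, which is easy to mix up when reading the averages.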


After that I checked the web squeeze, and asked a question there about whether most hosts have perl installed. I think CentOS is one commonly used OS for web servers, and it seems that includes perl.

There was a Google car with the big camera(s) on top of it going up our street today (though I didn't see it myself), so our road should be on Google Street View whenever they get round to updating it with the data from today.

I checked the DPReview Canon lens forum, and then took some photos of a daffodil. I found this time that lighting a white piece of card for the background with a gelled flash didn't work very well.

After dinner I watched 'Elite Squad' with Mac, which was good. I did a backup of some stuff, and then took some more daffodil photos.

The weather today was overcast in the morning, then brightened up in the afternoon and there was quite a nice sunset. I listened to Amiga chip music nearly all day.

Food
Breakfast: Toasted tea cake with butter; cup o' tea.
Lunch: Farmy cheddar with sweet & crunchy salad sandwich; plum; clementine; Chocolate Wacko (like a Rocky); cup o' tea; piece of Sainsbury's Caramel Chocolate.
Dinner: Thin & crispy pepperoni pizza; peas; chips; salt. Pudding was a piece of Tiramisu. Coffee; Quality Street; piece of Sainsbury's Mint Chocolate.
