There is a conversation going on in Flickr Central right now called “Google is Penalizing Flickr, Right?” based on a conversation that Robert Scoble and I had the other night on our photo shoot where I mentioned that I felt that Picasa images were better indexed on Google Image Search than Flickr’s images were indexed on either Google or Yahoo Image search.
What follows are my thoughts on why this is so and what the opinion of this might be over at Flickr.
I’ve written about image search quite a bit in the past and spent a lot of time trying to figure it out. It’s difficult to fully analyze because so much of what Google and Yahoo do with regard to search is a black box, so a lot of trying to figure it out is educated guessing and speculation.
The first issue to discuss is whether it is a good thing or a bad thing to have your images indexed for search. Here you are going to get two different opinions. Many people are not going to want their images indexed. Matt at Flickr for instance thinks that having Flickr images in Google’s Image Search will increase spam at Flickr. Others just don’t want their images to show up there for whatever personal reason they may have.
I think Stewart said a while back that I was the only person who ever brought up wanting to be indexed at Flickr. I might be wrong on that but I seem to recall him saying something like that.
And so the flip side of the coin is that someone would *want* to be indexed in search with their images. Why would someone want to be indexed? Well it depends on the person. Some people might like the extra attention that their photographs might get. Bloggers certainly might enjoy the traffic that Google provides (Google “juice”). Google Images is the number one overall referrer to thomashawk.com for instance.
A while back Choice Hotels was doing a search for “Grand Lake” for a commercial that they were doing. It was a commercial to the tune “I’ve Been Everywhere” by Johnny Cash and part of the song says “Grand Lake” and they wanted a photo for this. So they searched in Google Image Search and as you’ll see from the image search result mine is the first result. So they bought my image for $500 for their commercial. Some might think that this is kind of cool. Others might say I’m selling out. But they did find my image because it was indexed very highly on Google Image Search.
But then there are some that say well yes they want more views, attention, etc. But they only want it organically from inside flickr from other flickr users, not from non flickr users and the outside world.
Interestingly enough, my flickr image of the exact same theater does not show up in Google Images. Even if you do a search for “Grand Lake” + flickr it doesn’t show up. It is of course in flickr here.
In fact, if you do a flickr search for “Grand Lake” you’ll actually find tons of photos, the vast majority of which are not showing up in Google Images.
So Flickr photos index poorly in Google Image Search (and Yahoo Image Search as well by the way). Note the same search at Yahoo yields virtually no flickr results. Although again my image hosted at thomashawk.com and originally uploaded with Google’s Hello Picasa is on the fourth page results at Yahoo.
Many people are going to say, who cares, or even hurrah, image search would be the death of the flickr community, etc. But then there are others (like Jeff Clow yesterday who talked about selling some of his images at microstock sites) who might say, yeah, that would be cool if someone found my image on Google and offered me $500 for it like they did for Thomas Hawk.
Different strokes for different folks is all. But I think that even if people don’t realize it, that there are a lot of marketers (gasp, the evil word) who use Google Images today and some of Flickr’s users wouldn’t mind picking up the phone and getting $500 for doing what they love anyways.
So, that’s the debate.
Flickr, by the way (they can chime in and speak for themselves of course), probably wants images indexed for search. At least at Yahoo. They probably want images indexed for search because flickr’s algorithm for interesting images is better than either Google or Yahoo Image Search and it would make a more compelling Image Search experience at Yahoo to have access to these better images. Plus it would drive more traffic to flickr in the end (although they may say they don’t want this kind of traffic, etc.). Striatic has a story about how flickr once had some of their photos show up in broader Yahoo News search for a disaster and it kind of freaked them out. Etc. But I do suspect that longer term Flickr does want to be indexed for search (again, at least at Yahoo) but that they want to approach this cautiously and patiently.
Flickr has built a way for you to opt out of search indexes so they are obviously aware and thinking about it at least but want to make sure that those that might object would have a way out, etc. Community first perhaps.
So that’s my analysis on where things stand with image search.
Now. On to my thoughts as to *why* images uploaded with Picasa index so much better than images at Flickr. Some Flickr images after all do still make their way onto Google irrespective of any desire for Flickr to have them there or not.
I used to think that there was some conspiracy here. I used to think that as Yahoo and Google were competitors that somehow Google was purposely dinging Flickr’s Google juice and filtering flickr results out so as not to send traffic to a competitor. Or that maybe Flickr was somehow blocking or making it more difficult for Google to crawl their index (this still may be true), etc.
But I don’t think this anymore.
I think it just comes down to how difficult it is to analyze what’s inside of an image. I mean, when I name an image of mine, “Feel So Cold and I Long for Your Embrace,” how is Google supposed to know it’s a snake? Flickr knows and indexes it highly because I tagged it that way. But Google can’t really scour Flickr tags. They have to go by headlines and text around an image and the image file title etc. And so my snake shot doesn’t show up in image search on Google or Yahoo very well even though it does show up in Flickr ranked highly.
I now believe the reason Google indexes my Picasa images higher than my Flickr images has more to do with naming conventions than anything. Back to another Google Image Search for the term “San Francisco” (a popular one no doubt). You’ll see my image there (at least for now) is on the first page of results (second row). If you actually click through the image you will see that the actual image file itself is named San Francisco Rain. This is telling Google (which probably uses file names as the highest indicator of what’s in a photo, even ahead of title, text around it etc.) that this very well may be a photo of San Francisco.
Combined with the text around the image (its headline, caption, etc.) all referencing San Francisco and combined with my overall high Google PageRank. Magic. It goes to the top of the list of over 1.8 million images of San Francisco in Google image search.
Conversely, back over at Flickr. I have the same shot (and it’s pretty popular on Flickr) but check out how the file (not the image title) is named at Flickr. It’s named 38874619. It is named a number, not “San Francisco Rain.” Flickr has to do it this way because they’d have too many “San Francisco Rain,” or “cute cat” or “hot donkey” if they didn’t use unique numbers to identify their images instead of user naming conventions.
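To make the file name point concrete, here is a rough sketch of what a crawler could pull out of each kind of file name. This is purely an illustration of my guess, not how Google actually works: the `filename_keywords` helper and both example.com URLs are made up for the example.

```python
import posixpath
import re
from urllib.parse import unquote, urlparse


def filename_keywords(url):
    """Return the lowercase alphabetic words found in an image URL's file name.

    Hypothetical helper: a stand-in for whatever a search engine might do
    with file names, which is not publicly documented.
    """
    path = unquote(urlparse(url).path)            # decode %20 back into spaces
    stem = posixpath.splitext(posixpath.basename(path))[0]  # drop dirs and ".jpg"
    return [word.lower() for word in re.findall(r"[A-Za-z]+", stem)]


# Hypothetical URLs: one keeps the descriptive name, one uses a numeric ID.
descriptive_url = "http://example.com/images/San%20Francisco%20Rain.jpg"
numeric_url = "http://example.com/photos/38874619.jpg"

print(filename_keywords(descriptive_url))
print(filename_keywords(numeric_url))
```

The descriptive name hands over three usable words. The numeric name hands over nothing, so anything a crawler learns about that photo has to come from the text around it.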
Flickr of course still has the tags and knows that this is rain and I think that this is how they will eventually use this info for Yahoo Image Search (and maybe Google Image search will be allowed to use these descriptive tags in the future or maybe not). But not until they are comfortable with their images showing up in broader image search vis a vis community concerns etc.
I do hope this is coming soon in fact though as I would like to get more of those calls from people wanting to buy my images for $500.
And while many on Flickr might not give a rat’s ass about image search, geeks like Scoble and me geek out talking about things like this, and it’s super interesting to us, which is why he probably included it in his post about our shoot the other day.
So now that you host all of your blog images on Zooomr, won’t you run into the same problem? Zooomr file names also appear to use numbers instead of titles. Can we expect to see Zooomr fix this?
Eric. Yep same problem there. Something we may need to work on — although we have higher priorities at the present than necessarily optimizing for search. But optimizing Zooomr for search should be something that we look at at some point in the future.
Thomas, I’d already picked up on Robert’s post and hopefully added some clarification both at his blog, and within my own post “Picasa uses real filenames when uploading – good for indexing”.
Unfortunately I think that there are three distinct things that get associated with the word Picasa:
1) the desktop program – used to organize photos on the PC hard drive, but which is capable of uploading to online services
2) Picasa Web Albums – the online service which offers photo sharing on the web, and is what is normally meant by Picasa when flickr appears in the same sentence
3) other desktop programs produced by the Picasa company (which everyone is very aware is a Google subsidiary) – of which Hello is the only real example.
Your writing has not made it explicit which of the 3 you were referring to, though I think you probably have longest experience of using Hello to do your uploads. Have you any further insights from your experience of using the main Picasa client app to do your uploading, or of using Picasa Web Albums for image hosting? Your central point, that uploads that preserve filenames are good for search engines, I think is unconnected to which actual tool does the uploading.
(I’ve also written quite extensively about how it’s currently very difficult to find any images hosted in Picasa Web Albums – external indexing spiders such as the Google Image search one are excluded via the robots.txt, and there is no internal search provided at all).
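For anyone who wants to see how a robots.txt exclusion plays out, here is a small sketch using Python's standard `urllib.robotparser`. The rules below are invented for illustration and are not copied from the real Picasa Web Albums robots.txt; the `is_crawl_allowed` helper, the crawler name, and the example.com URL are assumptions too.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_lines, user_agent, url):
    """Check whether the given robots.txt lines let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)   # parse rules from memory, no network access
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: one site shuts every crawler out, the other does not.
blocking_rules = ["User-agent: *", "Disallow: /"]
open_rules = ["User-agent: *", "Disallow:"]

photo_url = "http://example.com/album/photo.jpg"
print(is_crawl_allowed(blocking_rules, "Googlebot-Image", photo_url))
print(is_crawl_allowed(open_rules, "Googlebot-Image", photo_url))
```

With the blocking rules a well-behaved image crawler never requests the photo at all, so descriptive file names would make no difference there.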