How to Build a Better Photo Recommendation Engine

The other day I tweeted out the number one photo on Flickr’s Explore. It was a popular tweet. Out of the millions of photos that get uploaded to Flickr every single day, this photo was the one that Flickr felt was the absolute most awesomest photo on all of Flickr.

Nothing against the photo linked above, or the photographer who captured both a puppy and a full moon at *exactly* the same time, but it wasn’t what I would have picked as the number one most interesting photo on Flickr.

Different strokes for different folks, though, as the saying goes.

Along with others, I’ve abandoned Flickr’s Explore section as a sort of cheesy photo watermark ghetto, but it remains a popular place across the network. If Flickr cared about innovating, there are lots of ways that they could improve this area, but that’s another conversation.

What I want to talk about today is how other companies that *are* innovating, like Google Photos and 500px, might build a better photo recommendation engine.

I do believe that all photo sharing sites need a photo recommendation area. We all love to look at engaging photography and it’s a useful tool to find new and interesting photographers to follow as well as to see better work uploaded by people on the site.

The number one problem with most photo recommendation engines is that they serve the exact same recommendations to every individual. If you go to Flickr’s Explore page, it is the exact same 100 photos for everybody, every day. It doesn’t matter what sort of photography *you* like. It doesn’t matter where you live. It doesn’t matter what you fave.

Dumb algorithms that don’t take into consideration available data for personalization are not as good as smart algorithms that do.

So what should a smarter photo recommendation engine do?

1. Don’t show me blocked content. Any accounts that I block should not appear in the recommendation engine for me. If I’m blocking an account it’s for a reason — maybe the person is a stalker/harasser, maybe the person puts 24 point Helvetica copyright watermarks over every single picture that make me want to vomit, maybe the individual focuses on a niche that I’m not interested in — whatever. I don’t want to see it and I shouldn’t have to see it if I’ve gone through enough trouble to block somebody. Google’s new Ignore setting should also be a strong signal.

2. Analyze my fave/+1 vs. view ratio by photographer. Do I +1 100% of a certain photographer’s work? Do I see 100 photos from another photographer and not +1 a single photo? Get to know my faving/+1ing activity and show me more stuff by the photographers that I fave/+1 most. My existing fave/view ratio for each photographer is a good signal to take into consideration.

3. Analyze the tags/keywords on what I’m faving/+1ing. Do I seem to fave/+1 tons of photos of trains? Maybe I’m really into trains. Is graffiti my thing? Do I like abandoned photography? Show me more of this stuff and fewer puppies and moons.

4. Are there geographical clues that can provide information? What is my fave/view ratio by geographical location? Do I live in San Francisco and fave a higher number of photos in SF? Or maybe I live in SF, I’m sick of it, and I fave a lower number of photos taken there.

Maybe I dream about lying on my stomach on a glacier in Antarctica and fave a much higher rate of photos taken there. Everybody’s different.

5. What can my Google/Flickr search activity tell you about me? Do I search for neon signs a lot? Do I search especially for “San Francisco” AND “neon”? Maybe a neon sign in San Francisco with only 5 faves should be shown to me before showing me a photo of a puppy and a photoshopped moon with 100 faves.

6. New users should be able to provide input to a recommendation engine. You’d be surprised at the sort of things people will voluntarily tell you. Does someone like Creative Commons photos more than all rights reserved? Do they care about watermarks? Do they LOVE them? Do they HATE them? Do they prefer local photos? Or do they want more travel abroad photos? How might they rate subject matter on a slider from one to ten? If some users *want* to provide this, use this data as a jumping off point for the recommendation engine.

Do they want to see artistic nudes? (This one is big as it represents a big genre in social photo sharing). Google+ also needs to get this figured out.

7. Don’t use the engine to blacklist. Currently Flickr blacklists certain members. This is bad for community. Blacklisting certain members creates enormous ill will.
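As a rough illustration of points 1 through 3, here is a minimal Python sketch, not anything Flickr, Google or 500px actually runs. Every name in it (`Photo`, `ViewerHistory`, `fave_rates`, `recommend`) is hypothetical, and the 50/50 weighting and the smoothing prior are assumptions picked only to keep the example readable. It drops blocked accounts outright, then ranks what is left by my smoothed fave/view ratio per photographer and per tag.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Photo:
    photo_id: str
    owner: str
    tags: frozenset = frozenset()


@dataclass
class ViewerHistory:
    blocked: set = field(default_factory=set)   # accounts the viewer has blocked
    viewed: list = field(default_factory=list)  # Photo objects the viewer has seen
    faved: set = field(default_factory=set)     # photo_ids the viewer faved/+1'd


def fave_rates(history, keys_of, prior_faves=1.0, prior_views=10.0):
    """Smoothed fave/view ratio for each key (a photographer or a tag).

    The prior (an assumed 1 fave per 10 views) keeps one fave on one view
    from looking like a 100% hit rate.
    """
    views, faves = Counter(), Counter()
    for photo in history.viewed:
        for key in keys_of(photo):
            views[key] += 1
            if photo.photo_id in history.faved:
                faves[key] += 1
    return {k: (faves[k] + prior_faves) / (views[k] + prior_views) for k in views}


def personal_score(photo, owner_rates, tag_rates, baseline=0.1):
    """Average of photographer affinity and the best-matching tag affinity."""
    owner_part = owner_rates.get(photo.owner, baseline)
    tag_part = max((tag_rates.get(t, baseline) for t in photo.tags), default=baseline)
    return 0.5 * owner_part + 0.5 * tag_part


def recommend(candidates, history, limit=100):
    owner_rates = fave_rates(history, lambda p: [p.owner])
    tag_rates = fave_rates(history, lambda p: p.tags)
    # Point 1: blocked accounts never appear, whatever their score would be.
    visible = [p for p in candidates if p.owner not in history.blocked]
    visible.sort(key=lambda p: personal_score(p, owner_rates, tag_rates), reverse=True)
    return visible[:limit]
```

The same `fave_rates` helper could be keyed on location for point 4 by passing a function that returns a photo’s city or region, assuming the photo carries one.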

Once a smarter recommendation engine is built around photographs customized to me it should be presented to me in a prominent place.

On Google+ specifically, I’d recommend adding it as a link in the section to the left of the photos displayed on the photos tab in G+. Right now they have “photos from your circles,” “photos from your phone,” “photos of you,” and “your albums” there.

I think they should add a menu item and call it Awesomeness (or whatever). Obviously they can’t call it “Explore,” but they can come up with something better.

From that menu item you should be able to expand it with a little triangle and then filter the recommendation engine by circle.

Overall comments and faves/+1s should still factor heavily into any algorithm — but every user should get a unique set of photos tailored to their taste and input into the system.
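One way to picture that balance is a blended score. This is only a sketch under stated assumptions: `blended_score`, the 0.7 personal weight and the 10,000-fave cap are all invented for illustration, and `personal_affinity` stands for whatever 0-to-1 number the personalization side produces. Overall faves are squashed with a logarithm so that 100 faves counts for more than 5, but not twenty times more.

```python
import math


def blended_score(global_faves, personal_affinity, personal_weight=0.7, fave_cap=10_000):
    """Mix site-wide popularity with a per-viewer affinity in [0, 1].

    personal_weight and fave_cap are assumed tuning knobs, not known values.
    """
    popularity = min(math.log1p(global_faves) / math.log1p(fave_cap), 1.0)
    return personal_weight * personal_affinity + (1.0 - personal_weight) * popularity


# A neon sign in San Francisco with 5 faves, for a viewer who loves neon,
# against a puppy-and-moon photo with 100 faves that the viewer rarely faves.
neon = blended_score(global_faves=5, personal_affinity=0.9)
puppy = blended_score(global_faves=100, personal_affinity=0.05)
print(neon > puppy)
```

With equal affinity the more-faved photo still wins, so overall popularity keeps its say. It just stops being the only thing that matters.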

Google could also play around with ranking incentives.

When Flickr first launched geotagging, I suggested to Stewart Butterfield that, if geotagging was activity Flickr wanted to encourage (and they *should*), Flickr should tell users that geotagged photos would be rated higher in Explore. Flickr actually did end up doing this.

Similarly, Google+ may want to consider what sort of activity they want to promote with photo ranking. Certainly social activity itself should rank high. If a user is uber social, maybe their photos should rank higher. They are perhaps of more value to your network than someone who just pumps in Flickr photos and never engages.

Likewise, metadata should be rewarded — both keywords and geotags. This is valuable information for Google to use in other ways and for search in the future.
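A ranking incentive like this could be as simple as a capped multiplier on whatever score a photo already has. The sketch below is hypothetical from top to bottom: the `Upload` fields, the bonus sizes and the caps are assumptions made up for the example, not anything Google or Flickr has published.

```python
from dataclasses import dataclass


@dataclass
class Upload:
    base_score: float                  # score before any incentive is applied
    has_geotag: bool = False
    keyword_count: int = 0
    owner_comments_last_30d: int = 0   # assumed stand-in for "how social is this user"


def incentive_multiplier(upload, geotag_bonus=0.10, keyword_bonus=0.02,
                         max_keyword_bonus=0.10, social_bonus=0.01,
                         max_social_bonus=0.20):
    """Return a multiplier of 1.0 or more that rewards metadata and engagement."""
    bonus = 0.0
    if upload.has_geotag:
        bonus += geotag_bonus
    # Caps stop keyword stuffing or comment spam from buying unlimited rank.
    bonus += min(upload.keyword_count * keyword_bonus, max_keyword_bonus)
    bonus += min(upload.owner_comments_last_30d * social_bonus, max_social_bonus)
    return 1.0 + bonus


def ranked_score(upload):
    return upload.base_score * incentive_multiplier(upload)
```

The caps are the important design choice here. Without them, the incentive rewards whoever pastes in the most keywords rather than whoever adds useful ones.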

A side note about negative voting systems in social networks. 500px currently employs a negative voting system with their photographs. You can essentially anonymously vote down a photo. They have some tools put in place to prevent mass downvoting and other abuses of the system and seem to feel that having access to this data is helpful for serving up great photographs (and they do a good job at that).

My own advice, though, is to kill it. The problem is not that negative voting systems don’t provide valuable information. The problem is not that they are being abused. The problem is not in negative votes themselves, but rather in how they are perceived and how people react to them.

It’s the same thing that hurt digg in my opinion. At digg a user would submit a story, they would watch it rise up the charts, and then just before it would hit the front page it would be killed. Boom. Buried. Gone.

Because burying on digg is anonymous (like downvoting on 500px), it leaves people to suspect the worst. Was there a coordinated effort by people who hate me to bury my story? Was it killed by a coordinated effort by those who wanted their own content to have a better chance? This speculation is a negative input for a user. It’s probably the number one complaint I’ve heard about 500px and I know it’s kept some users away. I’m not sure the value of the data and information outweighs the negative feelings it creates for some in the community.

If you *absolutely* must have a negative voting system, take away the anonymity. Of course, this would also take away 98% of the down votes.



  1. Statsr says:

    Hi Thomas,
    That’s a nice article. The main point to me is really that it’s weird that photo sharing services don’t use all the data they have to compute a kind of personal Explore for each user. That’s really the best way, I think, to help people discover new photographs.

    There are some tools using the Flickr API that do this for you, and I’ve recently tried to do the same thing for 500px, browsing your contacts to suggest new photographers you could like and browsing your contacts’ favorites to suggest photos you should like.

    Sure, we could do better by looking at tags you like or don’t like and other things, but through an API it’s quite difficult as it takes a lot of time…

    Here is my tool for 500px :

    If you try it, don’t hesitate to send me feedback, as I agree with you that we can do better…

  2. boscbo says:

    Basically, we want our photo services’ recommendation engines to act as ‘Big Brother’.

    I grew tired of Flickr’s Explore a long time ago. A tool I’ve found incredibly helpful is Ffffl*ckr (… it works WITH Flickr, but is contained OUTSIDE of it. Allow Ffffl*ckr to connect to your account, login, and select 1 of your recent 100 favs, and it will then show you 10 favs that that person has recently fav’d because it assumes you may like what that person likes, since you’ve fav’d some of their own work… sooner or later you’re down the rabbit hole looking at awesome Flickr photos. Works great, we just needed someone other than Flickr to find a way to do it.

  3. Tim says:

    There are any number of these criteria that can be considered – including “extended circles” influencing the decision, which is probably computationally intensive.

    The main problem I have is that, what*ever* the formulae might be – and it should be a linear combination of sub-formulae – if it’s presented as a list of descending relevance/interest/tastiness/favourability, then I don’t want to always gravitate toward the top of the list – if I’m going to actually *explore* then I want the odd curve-ball thrown my way.

    It’s the fundamental flaw with naive Bayes – the twist from “I liked this photo of a kitten” to “you shall always have kittens first” and degeneration to “nothing but blo00dy kittens”. Show me 50 cat photos and then a tomato or something instead.

  4. Tom says:

    Interesting post – thanks.

    Free pointers for building a recommendation engine on this site.

  5. Cutcaster says:

    I agree with the negative commenting thing and how you related it to what happened at digg.

    Interesting ideas for a recommendation engine but that is a ton of data to capture, organize, store and then continuously analyze. It’s a fun problem to solve and think about. I laughed at boscbo’s comment.

    I think the recommendation engine should also take into account how many clicks farther down the “rabbit hole” the user goes after clicking on the first image in the recommendation. It should also analyze the context of the page that the user clicks through to, so the engine knows better what the user was interested in the next time it recommends something.

    I’m sure Google will use all the behavioral search data it has relating to their users as well as their internet browsing history to serve up some pretty relevant and personalized recommendations in the future.

  6. Sherwin says:

    What you are proposing, Sir, will be like a stumbleupon for photographs, and that would be cool and awesome. I hope GOOGLE implements your suggestions to Picasa or Google+. While they’re at it, they should also create an app for Picasa and Google+ for Android, which follows the exact same parameters you’ve set.