How to Build a Better Photo Recommendation Engine
The other day I tweeted out the number one photo on Flickr’s Explore. It was a popular tweet. Out of the millions of photos that get uploaded to Flickr every single day, this photo was the one that Flickr felt was the absolute most awesomest photo on all of Flickr.
Nothing against the photo linked above, or the photographer who captured both a puppy and a full moon at *exactly* the same time, but it wasn’t what I would have picked as the number one most interesting photo on Flickr.
Different strokes for different folks though as the saying goes.
Along with others, I’ve abandoned Flickr’s Explore section as a sort of cheesy photo watermark ghetto, but it remains a popular place across the network. If Flickr cared about innovating, there are lots of ways that they could improve this area, but that’s another conversation.
What I want to talk about today is how other companies that *are* innovating like Google Photos and 500px might build a better photo recommendation engine.
I do believe that all photo sharing sites need a photo recommendation area. We all love to look at engaging photography and it’s a useful tool to find new and interesting photographers to follow as well as to see better work uploaded by people on the site.
The number one problem with most photo recommendation engines is that they serve the exact same recommendations to every individual. If you go to Flickr’s Explore page, it is the exact same 100 photos for everybody, every day. It doesn’t matter what sort of photography *you* like. It doesn’t matter where you live. It doesn’t matter what you fave.
Dumb algorithms that don’t take into consideration available data for personalization are not as good as smart algorithms that do.
So what should a smarter photo recommendation search engine do?
1. Don’t show me blocked content. Any accounts that I block should not appear in the recommendation engine for me. If I’m blocking an account it’s for a reason — maybe the person is a stalker/harasser, maybe the person puts 24 point Helvetica copyright watermarks over every single picture that make me want to vomit, maybe the individual focuses on a niche that I’m not interested in — whatever. I don’t want to see it and I shouldn’t have to see it if I’ve gone through enough trouble to block somebody. Google’s new Ignore setting should also be a strong signal.
2. Analyze my fave/+1 vs. view ratio by photographer. Do I +1 100% of a certain photographer’s work? Do I see 100 photos from another photographer and not +1 a single photo? Get to know my faving/+1ing activity and show me more stuff by the photographers that I fave/+1 most. The existing fave/view percentage is also a good signal to take into consideration.
3. Analyze the tags/keywords on what I’m faving/+1ing. Do I seem to fave/+1 tons of photos of trains? Maybe I’m really into trains. Is graffiti my thing? Do I like abandoned photography? Show me more of this stuff and fewer puppies and moons.
4. Are there geographical clues that can provide information? What is my fave/view ratio by geographical location? Do I live in San Francisco and fave a higher number of photos in SF? — or maybe I live in SF and I’m sick of it and fave a lower number of photos.
Maybe I dream about lying on my stomach on a glacier in Antarctica and fave a much higher rate of photos taken there. Everybody’s different.
5. What can my Google/Flickr search activity tell you about me? Do I search for neon signs a lot? Do I search especially for “San Francisco” AND “neon”? Maybe a neon sign in San Francisco with only 5 faves should be shown to me before showing me a photo of a puppy and a photoshopped moon with 100 faves.
6. New users should be able to provide input to a recommendation engine. You’d be surprised the sort of things people will voluntarily tell you. Does someone like Creative Commons photos more than all rights reserved? Do they care about watermarks? Do they LOVE them? Do they HATE them? Do they prefer local photos? Or do they want more travel abroad photos? How might they rate subject matter on a slider from one to ten? If some users *want* to provide this, use this data as a jumping-off point for the recommendation engine.
Do they want to see artistic nudes? (This one is big as it represents a big genre in social photo sharing). Google+ also needs to get this figured out.
7. Don’t use the engine to blacklist. Currently Flickr blacklists certain members. This is bad for community. Blacklisting certain members creates enormous ill will.
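To make a few of these ideas concrete, here is a minimal Python sketch of points 1 through 3: drop blocked accounts, then rank what is left by my own fave/view ratio per photographer and per tag. Everything in it is an assumption for illustration (the `Photo` shape, the `fave_rates` and `recommend` names, the smoothing prior, and simply adding the two scores together); it is not how Flickr, Google or 500px actually do it. The same `fave_rates` helper could just as easily be keyed by location for point 4.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    photo_id: str
    owner: str
    tags: frozenset


def fave_rates(history, key_fn, prior_faves=1.0, prior_views=10.0):
    """Smoothed fave/view ratio per key (photographer, tag, place, ...).

    history is a list of (photo, faved) pairs for photos the user viewed.
    The prior (an arbitrary choice) keeps a single lucky fave from
    looking like a 100% rate.
    """
    faves = defaultdict(float)
    views = defaultdict(float)
    for photo, faved in history:
        for key in key_fn(photo):
            views[key] += 1
            if faved:
                faves[key] += 1
    return {k: (faves[k] + prior_faves) / (views[k] + prior_views) for k in views}


def recommend(candidates, history, blocked, limit=10):
    by_owner = fave_rates(history, lambda p: [p.owner])
    by_tag = fave_rates(history, lambda p: p.tags)
    baseline = 1.0 / 10.0  # assumed rate for photographers and tags never seen

    def score(photo):
        owner_score = by_owner.get(photo.owner, baseline)
        tag_scores = [by_tag.get(t, baseline) for t in photo.tags]
        tag_score = max(tag_scores) if tag_scores else baseline
        return owner_score + tag_score

    # Point 1: blocked accounts never reach the ranking step at all.
    visible = [p for p in candidates if p.owner not in blocked]
    return sorted(visible, key=score, reverse=True)[:limit]


# A user who faves trains and ignores puppies.
train = Photo("p1", "alice", frozenset({"trains"}))
puppy = Photo("p2", "bob", frozenset({"puppy", "moon"}))
history = [(train, True), (train, True), (puppy, False), (puppy, False)]
candidates = [
    Photo("p3", "carol", frozenset({"trains"})),
    Photo("p4", "dave", frozenset({"puppy"})),
    Photo("p5", "mallory", frozenset({"trains"})),
]
picks = recommend(candidates, history, blocked={"mallory"})
print([p.photo_id for p in picks])  # ['p3', 'p4']
```

The train photo comes first, the puppy photo second, and the blocked account never shows up even though its photo matches my taste.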
Once a smarter recommendation engine is built around photographs customized to me it should be presented to me in a prominent place.
On Google+ specifically, I’d recommend adding it as a link in the section to the left of the photos displayed on the photos tab in G+. Right now they have “photos from your circles,” “photos from your phone,” “photos of you,” and “your albums” there.
I think they should add a menu item and call it Awesomeness (or whatever). Obviously they can’t call it “Explore,” but they can come up with something better.
From that menu item you should be able to expand it with a little triangle and then filter the recommendation engine by circle.
Overall comments and faves/+1s should still factor heavily into any algorithm — but every user should get a unique set of photos tailored to their taste and input into the system.
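One rough way to picture that balance, as a sketch only: log-scale the site-wide popularity and add a weighted personal affinity on top. The function name `blended_score`, the 0-to-1 affinity value and the weight of 5.0 are all made up for illustration, and the affinity numbers in the example are invented.

```python
import math


def blended_score(faves, comments, personal_affinity, personal_weight=5.0):
    """Combine site-wide popularity with a per-user affinity in [0, 1].

    Popularity is log-scaled so that 100 faves does not count 20 times
    as much as 5 faves. personal_weight is an arbitrary illustrative knob.
    """
    popularity = math.log1p(faves + comments)
    return popularity + personal_weight * personal_affinity


# The neon sign with 5 faves vs. the puppy and moon with 100 faves,
# for a user who loves neon (0.9) and is lukewarm on puppies (0.1).
neon_sign = blended_score(faves=5, comments=0, personal_affinity=0.9)
puppy_moon = blended_score(faves=100, comments=0, personal_affinity=0.1)
print(neon_sign > puppy_moon)  # True
```

With no personal signal at all (affinity of zero for both), the 100-fave photo still wins, so overall popularity keeps its say.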
Google could also play around with ranking incentives.
When Flickr first launched geotagging, I suggested to Stewart Butterfield that, if geotagging was activity Flickr wanted to encourage (and they *should*), Flickr should tell users that geotagged photos would be rated higher in Explore. Flickr actually did end up doing this.
Similarly Google+ may want to consider what sort of activity they want to promote with photo ranking. Certainly social activity itself should rank high. If a user is uber social maybe their photos should rank higher — they are of more value to your network perhaps than someone that just pumps in flickr photos and never engages.
Likewise, metadata should be rewarded — both keywords and geotags. This is valuable information for Google to use in other ways and for search in the future.
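As a sketch of what such incentives might look like, here is a small multiplier applied on top of whatever base score a photo already has. The function name, the inputs and every weight are guesses for illustration, not anything Google or Flickr has published.

```python
def incentive_multiplier(has_geotag, keyword_count, comments_left_last_30_days):
    """Boost factor applied to a photo's base score; every weight is a guess."""
    boost = 1.0
    if has_geotag:
        boost += 0.10
    if keyword_count > 0:
        boost += 0.10
    # Cap the social bonus so heavy commenters cannot dominate the ranking.
    boost += min(comments_left_last_30_days, 50) / 50 * 0.20
    return boost


# An engaged user who tags and geotags vs. someone who uploads and leaves.
engaged = incentive_multiplier(True, 5, 25)
drive_by = incentive_multiplier(False, 0, 0)
print(engaged > drive_by)  # True
```

Telling users up front which of these behaviors get a boost is the point; a hidden incentive encourages nothing.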
A side note about negative voting systems in social networks. 500px currently employs a negative voting system with their photographs. You can essentially anonymously vote down a photo. They have some tools put in place to prevent mass downvoting and other abuses of the system and seem to feel that having access to this data is helpful for serving up great photographs (and they do a good job at that).
My own advice though is to kill it. The problem is not that negative voting systems don’t provide valuable information. The problem is not that they are being abused. The problem is not in negative votes themselves, but rather their perception and people’s reactions to them.
It’s the same thing that hurt digg in my opinion. At digg a user would submit a story, they would watch it rise up the charts, and then just before it would hit the front page it would be killed. Boom. Buried. Gone.
Because burying on digg is anonymous (like 500px) it leaves people to suspect the worst. Was there a coordinated effort by people who hate me to bury my story? Was there a coordinated effort by those that wanted to see their own content have a better chance that killed it? This speculation is a negative input for a user. It’s probably the number one complaint I’ve heard about 500px and I know it’s kept some users away. I’m not sure the value of the data and information outweighs the negative feelings it provides to some in the community.
If you *absolutely* must have a negative voting system, take away the anonymity. Of course, this would also take away 98% of the down votes though.