~ Loki on Yahoo's filter - a short pointer ~

Lately, all eyes are turned on the diva Google, and every sidestep is
noticed, blogged, commented on, flamed, etc. For good reason
or not, I don't really care. But people tend to forget about the other
big players in this industry, be it for good or bad things.

Recently, Google launched its China version of web search, generating a
lot of discussion about censoring results for Chinese users at
the request of the Chinese government.
http://images.google.cn/images?q=Tiananmen
Shortly after, some people published a way to bypass this filter by using
capitalised queries, and managed to get uncensored results.
But what about Yahoo (or MSN)? Are they also filtering the results?
http://images.search.yahoo.com/search/images?p=tiananmen
It's not even filtered: you do not get ANY results at all. Do you really
think that is a better solution?
http://www.yahoo.com.cn/search?p=test
No problem.
http://www.yahoo.com.cn/search?p=tiananmen
Response: HTTP/1.x 302 Found
Bounced to Yahoo News. Also note the source parameter:
ysearch_www_filter_noresult
So. Is there anything we can do, as some did for Google, to bypass this
filter? And how long will it take to be spotted and corrected by
I tried to bypass the filter using similar "poke around" techniques. I
tried different approaches: mixing caps, adding useless keywords
(-dsfasdfds, for example),
[tiananmen]:
http://xinwen.yahoo.com.cn/search.html?p=tiananmen&ei=utf-8&source=ysearch_www_filter_noresult
'+' is percent-encoded in URLs as '%2B' (3 chars)
338 times '+' -> 338*3 = 1014
tiananmen -> 9 chars
338 '+' and tiananmen -> 1023 chars
We've reached and crossed the 1024-byte limit for the value used by
the filter. So this query does bypass it :)
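The arithmetic can be double-checked with a few lines of Python. This is only a sketch: the names FILTER_LIMIT, encoded_length and min_padding are made up for illustration, the 1024 figure is the one observed in these tests (not anything Yahoo documents), and treating the bypass as "encoded value longer than 1024" is an assumption, since only 1023 and 1026 were actually tried.

```python
from urllib.parse import quote

# Limit observed in the tests described in this article (an assumption,
# not an official figure).
FILTER_LIMIT = 1024


def encoded_length(term: str, pad: int) -> int:
    """Length of '+' * pad followed by term, once percent-encoded."""
    # safe="" forces every reserved character, including '+', to be encoded.
    return len(quote("+" * pad + term, safe=""))


def min_padding(term: str, limit: int = FILTER_LIMIT) -> int:
    """Smallest number of '+' whose encoded query is longer than limit."""
    pad = 0
    while encoded_length(term, pad) <= limit:
        pad += 1
    return pad


if __name__ == "__main__":
    for pad in (338, 339):
        print(pad, encoded_length("tiananmen", pad))
    print("minimum padding:", min_padding("tiananmen"))
```

Each '+' costs 3 encoded chars, so a 9-char keyword jumps from 1023 straight to 1026; other keyword lengths would need a different padding count.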
But this is changing quickly: between the time I ran those tests
and now, they seem to have added more limits, and the query field seems
to be restricted to 1024 chars. But if you feed the
parameter directly into the URL, it will still work (as of late February 2006).
By Loki
New interesting features coming out of Yahoo's labs are ignored, useful
MSN sliders are underused, yet nobody misses the latest crappy packaged
solution promoted by Google and its partners. And the same goes for
all the bad stuff.
Everybody knows by now the visual proof of this censorship, obtained by
performing the following (infamous) queries:
http://images.google.com/images?q=Tiananmen
(i.e. [Tienanmen] instead of [tienanmen]).
See here:
http://www.crypticide.com/dropsafe/articles/security/post20060129233439.html
But it was quickly corrected and this trick no longer works.
Compare the same query on Yahoo (.com TLD) and Yahoo China:
http://image.yahoo.com.cn/search?p=tiananmen
The same goes for the web search, but instead of getting no results you
are redirected to the news results, where sources are obviously filtered
and subject to censorship.
Location:
http://xinwen.yahoo.com.cn/search.html?p=tiananmen&ei=utf-8&source=ysearch_www_filter_noresult
The usual value is 'ysearch_www_result_topsearch' when the query is not filtered.
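Spotting the filter from the redirect can be sketched in Python as below. It only parses a Location value you already have in hand (no request is sent); the function name is_filtered_redirect is made up for illustration, and relying on the 'source' parameter as the tell-tale is an assumption drawn from the single redirect observed here.

```python
from urllib.parse import urlparse, parse_qs

# Value seen in the redirect quoted in this article when a query is filtered.
FILTERED_SOURCE = "ysearch_www_filter_noresult"


def is_filtered_redirect(location: str) -> bool:
    """Return True if a 302 Location URL carries the 'filtered' source marker."""
    query = parse_qs(urlparse(location).query)
    # parse_qs maps each parameter name to a list of values.
    return FILTERED_SOURCE in query.get("source", [])


if __name__ == "__main__":
    location = (
        "http://xinwen.yahoo.com.cn/search.html"
        "?p=tiananmen&ei=utf-8&source=ysearch_www_filter_noresult"
    )
    print(is_filtered_redirect(location))
```

A redirect whose source is 'ysearch_www_result_topsearch', or one with no source parameter at all, comes back as not filtered under this check.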
Yahoo's teams? Yahoo and other competitors of Google don't have the
same hype around them, and if you publish something about them,
it won't spread like Google-related news.
multiple quotes, etc. Nothing. But finally, I tried to 'overflow' it by
feeding the query parameter a large number of chars... and it worked!
Apparently, if you add enough '+' before your query, the filter is
bypassed, and you get censorship-free output.
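Building such a padded URL by hand is tedious, so here is a rough Python sketch. The helper name padded_query_url is made up for illustration; the endpoint and the 'p' / 'ei' parameters are simply copied from the URLs quoted in this article, and the host may well no longer answer.

```python
from urllib.parse import urlencode

# Endpoint copied from the URLs quoted in this article; it may no longer exist.
NEWS_SEARCH = "http://xinwen.yahoo.com.cn/search.html"


def padded_query_url(term: str, pad: int, base: str = NEWS_SEARCH) -> str:
    """Build a search URL whose query is term preceded by pad '+' characters."""
    params = {"p": "+" * pad + term, "ei": "utf-8"}
    # urlencode percent-encodes each literal '+' in the value as %2B.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    print(padded_query_url("tiananmen", 3))
```

Passing pad=339 gives the same kind of query value as the long URL used in the tests here, minus the extra tracking parameters.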
['+' (338 times) tiananmen]:
http://xinwen.yahoo.com.cn/search.html?p=%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2Btiananmen&ei=utf-8&source=ysearch_www_filter_noresult
['+' (339 times) tiananmen]:
http://www.yahoo.com.cn/search?ei=UTF-8&fr=fp-tab-web-ycn&p=%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2B%2Btiananmen&meta=vl%3Dlang_zh-CN%26vl%3Dlang_zh-TW&pid=ysearch&source1=ysearch_www_hp_button
339 times '+' -> 339*3 = 1017
339 '+' and tiananmen -> 1026 chars
Also, I did not manage to make it work on Yahoo Images.