WTF is /ai/love?

BacklinksMonkey

I'm getting shitloads of requests from spiders for bogus /ai/love/* URLs on several domains hosted on the same server. I'm even getting them on a domain that was only just registered. These are the kinds of URLs I'm talking about:

Code:
216.129.119.47 - - [29/Jul/2010:14:14:48 +0400] "GET /ai/love/is-there-any-wholesale-clothing-that-are-specifically-for-women.html/feed HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
216.129.119.47 - - [29/Jul/2010:14:29:14 +0400] "GET /ai/love/i-have-a-super-bad-credit-and-i-really-have-to-take-out-a-personal-loan-help.html HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
95.108.248.30 - - [29/Jul/2010:14:35:57 +0400] "GET /ai/love/in-what-ways-are-the-stock-market-and-the-real-estate-market-related.html HTTP/1.1" 301 20 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
95.108.248.30 - - [29/Jul/2010:14:35:59 +0400] "GET /ai/love/in-what-ways-are-the-stock-market-and-the-real-estate-market-related.html/ HTTP/1.1" 200 1936 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
216.129.119.47 - - [29/Jul/2010:14:42:31 +0400] "GET /ai/love/where-to-buy-furniture-electronics-in-chennai.html HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
66.249.65.207 - - [29/Jul/2010:14:46:33 +0400] "GET /ai/love/where-can-i-buy-affordable-furniture-or-new-kicthen-cabnets.html HTTP/1.1" 301 20 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.65.207 - - [29/Jul/2010:14:46:34 +0400] "GET /ai/love/where-can-i-buy-affordable-furniture-or-new-kicthen-cabnets.html/ HTTP/1.1" 200 1936 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
216.129.119.47 - - [29/Jul/2010:14:54:55 +0400] "GET /ai/love/how-can-i-find-if-co-i-worked-for-many-years-ago-theyve-since-changed-names-has-a-pension-for-me.html/feed HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
216.129.119.47 - - [29/Jul/2010:15:00:12 +0400] "GET /ai/love/what-is-the-best-way-to-remove-poison-oak-from-furniture.html/trackback HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
95.108.248.30 - - [29/Jul/2010:15:09:07 +0400] "GET /ai/love/tag/half HTTP/1.1" 301 20 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
95.108.248.30 - - [29/Jul/2010:15:09:10 +0400] "GET /ai/love/tag/half/ HTTP/1.1" 200 1936 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
216.129.119.47 - - [29/Jul/2010:15:16:08 +0400] "GET /ai/love/how-can-the-average-person-hedge-against-a-fall-in-the-value-of-the-us-dollar.html/feed HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
216.129.119.47 - - [29/Jul/2010:15:30:28 +0400] "GET /ai/love/how-long-of-a-break-do-airplanes-get-between-flights.html/trackback HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
95.108.248.30 - - [29/Jul/2010:15:42:18 +0400] "GET /ai/love/yipes-i-applied-to-fafsa-and-i-have-not-yet-enrolled-in-a-college.html HTTP/1.1" 301 20 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
95.108.248.30 - - [29/Jul/2010:15:42:20 +0400] "GET /ai/love/yipes-i-applied-to-fafsa-and-i-have-not-yet-enrolled-in-a-college.html/ HTTP/1.1" 200 1936 "-" "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
216.129.119.47 - - [29/Jul/2010:15:43:52 +0400] "GET /ai/love/why-does-cheap-alcohol-give-you-a-worse-hangover-than-the-more-expensive-stuff.html/feed HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
216.129.119.47 - - [29/Jul/2010:15:56:17 +0400] "GET /ai/love/any-body-know-how-security-works-for-us-bound-international-flights.html/feed HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
216.129.119.47 - - [29/Jul/2010:16:01:18 +0400] "GET /ai/love/whats-the-name-of-the-old-school-clothing-brand-that-featured-african-american-men.html HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"
216.129.119.47 - - [29/Jul/2010:16:17:49 +0400] "GET /ai/love/tag/spring/page/2 HTTP/1.1" 301 20 "-" "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)"

Needless to say, none of those URLs ever existed on any of these domains. WTF is this? And I do mean shitloads of requests:

Code:
~# grep "/ai/love" /var/log/apache2/* | wc -l
3475
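
A rough sketch for breaking that total down by user agent, to see which spiders are responsible. It assumes the logs are in Apache's combined format like the lines above; `count_by_agent` is just a made-up helper name, and rotated `.gz` logs would need decompressing first.

```shell
# Tally /ai/love requests per user agent (hypothetical helper).
# Assumes combined log format: the request line and the user agent
# are the 1st and 3rd double-quoted fields on each line.
count_by_agent() {
    # Splitting on double quotes: $2 is the request line, $6 the user agent.
    awk -F'"' '$2 ~ /^GET \/ai\/love\// { hits[$6]++ }
               END { for (ua in hits) printf "%d\t%s\n", hits[ua], ua }' "$@" |
        sort -rn   # busiest agent first
}
```

Usage would be something like `count_by_agent /var/log/apache2/*`, which prints one line per user agent with its hit count in front.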
 


Rather than blocking access to /ai/love, would it make sense to set up 301s from all affected domains to a splog and generate content for each requested URL on the fly? Even though I've got no fucking idea what this is, I might profit from it!

Any input?
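
Before bothering with redirects, it might help to know how many distinct pages are actually being asked for, since the log shows the same page requested with `/feed`, `/trackback` and trailing-slash variants (these look like WordPress-style permalinks, but that is a guess). A minimal sketch, again assuming combined-format logs; `unique_pages` is a hypothetical name.

```shell
# List distinct /ai/love pages with request counts (hypothetical helper).
# Folds /feed, /trackback and trailing-slash variants into one page.
unique_pages() {
    awk -F'"' '$2 ~ /^GET \/ai\/love\// {
        split($2, req, " ")                      # req[2] is the request path
        path = req[2]
        sub(/\/(feed|trackback)\/?$/, "", path)  # drop feed and trackback suffixes
        sub(/\/$/, "", path)                     # drop a trailing slash
        seen[path]++
    }
    END { for (p in seen) printf "%d\t%s\n", seen[p], p }' "$@" |
        sort -rn   # most requested page first
}
```

Running `unique_pages /var/log/apache2/* | wc -l` would then give the number of distinct pages rather than the raw request count.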
 
BOOH *blush* got it

Several domains belonging to my hosting company's previous customers still point to my IPs. On a closer look at the log files, I found some spider entries that show the original request.
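
For anyone hitting the same thing, here is a rough way to see which hostnames the spiders are really asking for. It assumes the access log has been switched to a custom format that starts with the requested Host header (for example an Apache `LogFormat` beginning with `%{Host}i`), which is not the format of the lines posted above; `count_by_host` is a made-up name.

```shell
# Tally /ai/love requests per requested hostname (hypothetical helper).
# Assumes the FIRST whitespace-separated field of each log line is the
# Host header sent by the client; this is not the stock combined format.
count_by_host() {
    awk '$0 ~ /"GET \/ai\/love\// { hits[tolower($1)]++ }
         END { for (h in hits) printf "%d\t%s\n", hits[h], h }' "$@" |
        sort -rn   # most requested hostname first
}
```

Any hostname in the output that isn't one of your own domains is a stale DNS record pointing at your IP.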

Nevertheless, I still find it funny that none of the URLs are indexed by those SEs, yet their spiders keep trying to retrieve them. So, even though "bots don't have cc's" (you got a +1 for that, dude :D), does it make sense to get those URLs indexed on some domain? I mean, I would get some pages from a site (splog, whatever) indexed for free...