Off to Sorrento, Italy to meet with a hundred or so other M y S Q L development and support people. Lots of hard work and personal contact instead of the usual online things we do. Oh, and some fun mixed in... :)
Google, for one, indexed even the journals here which had the nobots setting, via their RSS or Atom feeds, and last I checked hadn't undone that inappropriate collection activity, though at the time blog search launched there were indications that undoing it was planned.
For high visibility stuff I plan to open a blog (as opposed to journal) at some point. This one is intended to be low visibility, mostly personal things.
What prompted it initially was things like LiveJournal installing circumventions for nobots via RSS and Atom, then not providing any way to completely disable them. The killer was LiveJournal adding a feed deliberately aimed at high-volume indexing services, so it clearly moved from good and respecting nobots to working with those disregarding it. Basically, it had switched from good to evil so far as harvesting goes by that point. That means I no longer trust LiveJournal to respect that setting, because it's been demonstrated too many times that LiveJournal doesn't act in a way which respects it.
I have some small hope that LiveJournal might move the RSS feed URLs to the username.livejournal.com domain so that the main robots.txt files now presented there will clearly include them. I won't hold my breath, though. And anyway, that will still present the content to bulk harvesters via that alternative route so it's not going to affect the biggest problem sources, I expect.
I'm sure that there are good intentions around at LiveJournal; it's just the implementation that's lacking when it comes to fun new features.
So, key word obscuring as a second line of defence against those who go around nobots or simply ignore it. This matters because I don't want to just go private with everything and kill the human social networking here - I want the humans to be able to see (much of) the content.