Treat deleted entries consistently to allow deletion of abusive comments

Jan 22, 2009 20:19



Short, concise description of the idea
If an entry does not exist, the LiveJournal server should ALWAYS return an Error 404. Otherwise, abusive comments stay cached in Google indefinitely.
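
A minimal sketch of the requested behavior, assuming a hypothetical in-memory store. The names `ENTRIES` and `status_for` are illustrative and are not LiveJournal's actual code, and applying the same rule to individual comment threads is an assumption drawn from the discussion below:

```python
# Hypothetical store: entry id -> comments keyed by thread id.
# Illustrative only; not LiveJournal's real data model.
ENTRIES = {917740: {"comments": {14525420: "first comment"}}}


def status_for(entry_id, thread_id=None, entries=ENTRIES):
    """Return the HTTP status a consistently behaving server would send."""
    entry = entries.get(entry_id)
    if entry is None:
        return 404  # entry was deleted or never existed
    if thread_id is not None and thread_id not in entry["comments"]:
        return 404  # assumption: deleted comment threads get the same treatment
    return 200


print(status_for(917740))            # existing entry
print(status_for(917740, 99999999))  # missing comment thread
print(status_for(1))                 # missing entry
```

The point is only that every "does not exist" path ends in the same status code, rather than some paths returning a normal page with an error message in it.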

Full description of the idea
Currently, if the LiveJournal server is requested to display ...

entry deletion, searches, § no status


danceinacircle January 27 2009, 20:54:03 UTC
I'm mongoose on the suggestion (though consistent behavior is always a plus), but I fail to see how it's LiveJournal's fault that Google won't remove a cached page without a 404. If the content is abusive and/or criminal, Google shouldn't be hosting it on their servers.


polyfrog January 27 2009, 21:02:46 UTC
But how do they programmatically determine that?

I get an abusive comment.
Google caches that page.
I delete the comment.
It's still in the cache; it gets served on the Google search results page under "cached".

vs

I get an abusive comment.
Google caches that page.
I delete the comment.
Google sees the 404 next time the spider comes through and deletes the cache and references to it from results.
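
The second scenario can be sketched as a toy model. This is not how Google's crawler is known to work; `recrawl`, the `cache` dict, and the `fetch` callable are all hypothetical names used only to illustrate the idea:

```python
def recrawl(cache, url, fetch):
    """Update a toy search-engine cache after revisiting one URL.

    fetch(url) is a hypothetical callable returning (status, body).
    """
    status, body = fetch(url)
    if status == 404:
        cache.pop(url, None)  # page is gone: drop the cached copy
    elif status == 200:
        cache[url] = body     # page still exists: refresh the cached copy
    # Any other status leaves the cache untouched in this sketch.
    return cache


cache = {"comment-page": "abusive comment"}
recrawl(cache, "comment-page", lambda url: (404, ""))
print(cache)
```

In this model the cached copy only disappears when the revisit yields a 404; a server that answers 200 with some other page leaves an entry behind.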


danceinacircle January 27 2009, 21:05:06 UTC
I know very, very little about Google's caching system - they don't have a way to manually remove items from the cache?

Also, wouldn't deleting the comment just remove that specific page (i.e. http://community.livejournal.com/suggestions/917740.html?thread=14525420) from the cache? Would it still be cached under the http://community.livejournal.com/suggestions/917740.html address? (Trying to understand!)


polyfrog January 27 2009, 21:21:26 UTC
They might have such a way. But they won't use it, because it would be a huge logistical nightmare to keep up with all the requests for its use.

We're venturing further away from what I know for sure about Google, but here we go:
Yes, they would still show a cache of the main comment page with the deleted comment for a while, until the cache was refreshed.

But the cache of the single comment would persist for quite a while longer (because it was changing less often before being deleted), and be its own hit on the results page ... unless the page is reported as a 404 the next time the spider comes through.


mooism January 27 2009, 21:54:49 UTC
If LJ supported the sitemap protocol (it seems not to) then Google (and other search engines) could be expected to visit the sitemap regularly, and discover that the page had changed.
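
For illustration, a sitemap is an XML file listing URLs with optional last-modified dates. A minimal sketch of generating one follows; `build_sitemap` is a hypothetical helper, and the `lastmod` date in the usage line is made up for the example:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs, lastmod as 'YYYY-MM-DD'."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# The date here is an invented example value.
print(build_sitemap([
    ("http://community.livejournal.com/suggestions/917740.html", "2009-01-27"),
]))
```

A crawler that reads the file could compare `lastmod` with the date of its cached copy to decide whether to fetch the page again.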


azurelunatic January 28 2009, 02:21:03 UTC
Just from skimming that sitemap protocol there, this would be one very large file listing all the public pages on LJ, including all entries and every expanded comments page?

That would be one very, very large file.


mooism January 28 2009, 09:20:48 UTC
One file per lj, but yes.

I started writing it up as a suggestion last night, then stopped when I realised it would probably be impractical.
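
On the size concern: the sitemap protocol caps a single sitemap file at 50,000 URLs, so a busy journal would need several files tied together by a sitemap index. A rough sketch of the splitting step, where `split_for_sitemaps` is a hypothetical helper:

```python
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit in the sitemap protocol


def split_for_sitemaps(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a journal's URLs into chunks small enough for one sitemap each."""
    urls = list(urls)
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


# Invented example: a journal with 120,000 public pages.
chunks = split_for_sitemaps(f"page-{n}" for n in range(120_000))
print(len(chunks))
```

Every chunk would still have to be regenerated whenever a page in it changes, which is part of what makes the approach heavy for a site of this size.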


pauamma January 27 2009, 22:45:23 UTC
But if the Google indexer sees different content (even if not a 404) on its next visit, why doesn't it update its cached copy? (I'm confused. Am I missing something obvious?)


azurelunatic January 27 2009, 23:17:55 UTC
Perhaps the Google indexer is a very stupid robot that assumes that since the page still exists, it would not have changed?


