I was just thinking about the fact that all of the major search engines are essentially gigantic, closed, proprietary systems. Even Google, which talks about major points of its algorithm, basically operates as a big black box. Sure, there are thousands of books and articles and opinions out there on the subject of SEO (Search Engine Optimization), and even the search providers themselves offer guidelines and caveats and terms of use/abuse, but how they actually function is kept a secret, ostensibly to protect the intellectual property and interests of the companies which provide those services.
But what impact does the secrecy of the search companies' operations and algorithms have on the usefulness of those services? On the one hand, keeping some secrets gives the companies the option (one that not all of them appear to use much) to thwart abuse of the service via link spam, meta-tag/keyword spam, and other forms of misrepresenting content for the purpose of manipulating the search index.
On the other hand, it is precisely because of the closed nature of the search algorithms and policies that so many people try everything under the sun to manipulate the system, which turns SEO into some kind of mystical discipline instead of a commonly understood process. This creates real frustration for people who have no interest in intentionally abusing the system but who, for lack of definitive policy and algorithmic information, are unable to get indexed in one or more systems.
It could perhaps be said that Open systems are the easiest to find weaknesses in and exploit, but only insofar as they cannot be designed to curtail exploitation. Is it possible to design and maintain a successful, useful, public search engine with a published algorithm? Could it be economically viable? Does it already exist? Are there applications for which an Open search system is a better or worse choice?
I tend to have tunnel vision on the issue of Open vs. Secret systems, as I have a knee-jerk emotional reaction that sharing and Open-ness are what life is all about, let alone the Internet and even Searching (after all, the purpose of a search engine is to enable the sharing of information, right?). But even if I am biased, I think this is the kind of situation which probably has a lot of hidden factors that could only really be brought to light if a project to create a truly Open (as in Source) web search engine were actually undertaken.
Update: Right before I clicked Post on this, I decided to quickly Google for 'open source search engine' and I found an article on SearchEngineWatch.com from Sept 11, 2003 which identifies www.nutch.org as one current attempt at this. Promising?