One of the most frustrating things about researching from a paper book is the limits upon its searchability. If it's a work of non-fiction, it may have an index -- but did the indexer think significant the things that you're trying to find? If it's fiction and you're trying to locate that one scene that sticks in your mind, that quote that used such an apt turn of phrase, you can spend hours searching and never be sure that you didn't misremember, or worse, confabulate a non-existent scene out of several half-remembered actual ones.
Which was why Google's plans to scan and index every book in the world seemed like such an exciting prospect. Until they ran aground on the realities and practicalities of copyright law.
The central problem is the enormous number of books that are, or may still be, in copyright but have little or no paper trail leading to any rights-holders. Many authors were obscure, and it is often impossible to identify them or their possible heirs. Publishers have gone out of business and their records have been discarded, particularly those relating to books that had gone out of print or were otherwise regarded as having no continued value.
And as copyright law is currently written, a would-be reprinter or digitizer is hampered by the fact that absence of evidence does not constitute evidence of absence. No matter how many null results your search for a living copyright holder turns up, there's always the possibility that somewhere out there is a line of searching that will turn up someone who holds a legitimate claim to the rights in question. As a result, the cost of searching for copyright holders can quickly outstrip the value of the work in question.
However, any effort to create a system to allow such "orphaned works" to be published without finding the rights-holder, even one that requires payments to be held in escrow, runs into the problem of defining an adequately thorough search in such a way that it can't be gamed by large corporations eager to define works as orphaned even when the rights-holders are identifiable with relatively trivial effort. We've seen far too many instances of corporations bending legislation to their advantage to believe it could work otherwise unless the definition of an adequate search for a rights-holder is absolutely airtight.
As a result, it looks like it's going to be a long, long time before we get a searchable digital library of all books. Not because of limits on the technology, but because of the legal issues involved.