Comparing Spotlight to Grokker and DevonThink Pro
by Erik Vlietinck, Publisher IT-Enquirer and Freelance IT-editor
January 20th, 2005

With Spotlight again having enjoyed considerable attention from the Mac crowd, thanks to the presentation by Apple’s CEO himself, it is time to put Spotlight under the microscope and compare it with other search technologies available on the Mac.

Why is Spotlight considered such an advance? Because we are flooded with information, contained in e-mail messages, web pages, and plain old office documents. Finding our way through all that information is becoming increasingly hard, and so a little automated help is more than welcome. Enter Spotlight, the search feature in Mac OS X 10.4 aka Tiger.

Steve Jobs always compares Spotlight to Google Desktop, and while this may actually make sense from a point of view where you try to convince PC users to switch to the Power Mac platform, it doesn’t do much good for Mac users. Google Desktop isn’t available to us (yet?), but there are search and knowledge disclosure applications available on the Mac.

This article compares two of those with Spotlight, both in how they behave and in the results you obtain. The two alternative technologies we cover are Grokker 2.2 and DEVONTechnologies’ Think Pro. The latter is a beta product, which I will take into account when I cover the modus operandi and results with DEVON’s product.

One of the queries I ran across these three applications was a search for “proofing colour”. This one was especially tricky because I know the query string doesn’t appear as an exact match in any of the documents on my system. The term “colour proofing” does, but this is different from the reverse, even for Google.
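The difference between the two word orders can be made concrete with a small sketch. The Python below is purely illustrative: the sample sentence and the function names `matches_phrase` and `matches_all_terms` are assumptions made for this example, and nothing here describes how any of the three applications work internally.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def matches_phrase(text, query):
    """True only if the query words appear adjacent and in the same order."""
    words, wanted = tokenize(text), tokenize(query)
    n = len(wanted)
    return any(words[i:i + n] == wanted for i in range(len(words) - n + 1))

def matches_all_terms(text, query):
    """True if every query word appears somewhere, in any order."""
    return set(tokenize(query)) <= set(tokenize(text))

# Hypothetical document text, invented for this example.
sample = "Colour proofing on an inkjet printer requires a calibrated workflow."

print(matches_phrase(sample, "proofing colour"))     # False
print(matches_phrase(sample, "colour proofing"))     # True
print(matches_all_terms(sample, "proofing colour"))  # True
```

A tool that treats the query as a literal phrase would miss this document, while one that treats it as a set of terms would find it.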

Concepts

Spotlight is a search engine that largely depends on meta data to return accurate and complete result lists. Spotlight extracts metadata attributes from your files, so you can use them in your search. If you can’t remember the name of an attribute, just select Other from the search filter to choose from a complete list of supported attributes. 

Spotlight combines full-content indexing with meta data indexing. This combination is more or less what advanced search engines such as Verity do as well. The results of this combination are immediately visible in both the accuracy and completeness of the document list that results from your query.
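To illustrate what combining a content index with a metadata index can mean in practice, here is a minimal sketch in Python. The file names, the attribute names (`author`, `kind`, `keywords`) and the matching rule are all assumptions made for the example; it is not Apple's implementation and does not use any Spotlight API.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

# Hypothetical sample files: body text plus a few metadata attributes.
FILES = {
    "report.pdf": {"text": "Quarterly results for the print division",
                   "meta": {"author": "Erik", "kind": "PDF"}},
    "proof.tif": {"text": "",
                  "meta": {"author": "Anna", "kind": "image",
                           "keywords": "colour proofing"}},
    "notes.txt": {"text": "Notes on colour proofing for the print job",
                  "meta": {"author": "Erik", "kind": "text"}},
}

def build_index(files):
    """Build two term-to-files maps: one from content, one from metadata."""
    content, metadata = defaultdict(set), defaultdict(set)
    for name, doc in files.items():
        for term in tokenize(doc["text"]):
            content[term].add(name)
        for value in doc["meta"].values():
            for term in tokenize(value):
                metadata[term].add(name)
    return content, metadata

def search(query, content, metadata):
    """Return files in which every query term matches content or metadata."""
    hits = None
    for term in tokenize(query):
        found = content.get(term, set()) | metadata.get(term, set())
        hits = found if hits is None else hits & found
    return sorted(hits or [])

content_index, metadata_index = build_index(FILES)
print(search("proofing colour", content_index, metadata_index))
# ['notes.txt', 'proof.tif']
print(search("erik print", content_index, metadata_index))
# ['notes.txt', 'report.pdf']
```

Note that `proof.tif` has no text at all and is found through its keywords alone, which is the kind of hit a content-only index would miss.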

The results list itself is divided into sections. The most probable bull’s-eye documents are listed at the top, in a separate section. They are followed by documents that match the query string, but less closely than the first few. These documents are listed and grouped by topic. The user can choose to change the grouping of documents.

Disclaimer: the above information was obtained through Apple’s public web site. No information contained in the paragraphs above was subject to any non-disclosure agreement.

Grokker 2.2 in essence is a search engine categorizer. It uses the output from a number of search engines like Google and AltaVista to draw a clustered graphical representation of the search results. Grokker can be used to search your desktop files as well. It is impossible to find out how Grokker searches local files. As far as I can tell, no indexing is performed, nor is the Mac OS X index used to perform the search. Nor does Grokker use any available meta data.

Perhaps Grokker’s method for searching local documents will improve when it is used on a Tiger system, but the application clearly lacks the power and the underlying technology to find even remotely accurate document results.

The last application, DEVONTechnologies’ Think Pro, is not a search application in the strict sense of the word. It really is a knowledge tool, in that you first have to set up your search domain. This can be one or more folders, your total email message collection, or one file that you drop on the application and which you want to search purely on content.

Think Pro is database-oriented, so the application will index content and store its index and content information inside one database file per knowledge set --basically a set of documents that you will search through. Still, if you are patient enough to wait 11 minutes (this was a beta version, so I expect the indexing to be faster when all the debugging code has been removed) until 17,000 documents have been indexed, you can start a query much the same way as you would with Spotlight.

Think Pro’s concept goes one step further, though. As soon as you enter the query, the results start appearing in a separate window panel, with the text you were looking for in another panel. You can then expand your search by using the “See also” feature, which automatically and simultaneously creates a list of documents that do not contain your query terms, but contextually or conceptually resemble your query terms.
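How Think Pro decides that two documents resemble each other is not described here, so the following Python is only a guess at the general idea. It assumes a plain word-count comparison (cosine similarity) between a document that matched the query and the documents that did not. The file names, texts and the function `see_also` are hypothetical.

```python
import math
import re
from collections import Counter

def vector(text):
    """Represent a text as a bag of lower-cased word counts."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Hypothetical knowledge set.
DOCS = {
    "a.txt": "colour proofing on inkjet printers with icc profiles",
    "b.txt": "calibrating inkjet printers with icc profiles",
    "c.txt": "recipe for apple pie and cinnamon",
}

def see_also(hit, docs, query_terms):
    """Rank documents that lack the query terms by similarity to a hit."""
    base = vector(docs[hit])
    scored = []
    for name, text in docs.items():
        vec = vector(text)
        if name == hit or any(term in vec for term in query_terms):
            continue  # skip the hit itself and anything a normal search finds
        scored.append((cosine(base, vec), name))
    return [name for score, name in sorted(scored, reverse=True) if score > 0]

print(see_also("a.txt", DOCS, ["colour", "proofing"]))  # ['b.txt']
```

Here `b.txt` never mentions colour or proofing, yet it surfaces because it shares most of its vocabulary with a document that does.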

This is what makes Think Pro a knowledge instrument, more than a simple search tool. But it can be used as such too, of course.

Query Method

I’m sure you have already heard of boolean search. That’s the kind of query method most people do not know how to use, as it depends on concepts from mathematical logic. Boolean queries force you to follow a specific syntax and to use specific operators, such as AND, OR and NOT, to “bind together” multiple search terms.
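For readers who have never used one, a boolean query boils down to set operations on lists of matching documents. The sketch below uses Python sets for this. The tiny index and the file names are invented for the example.

```python
# Hypothetical index: each term maps to the set of files that contain it.
INDEX = {
    "colour": {"a.txt", "b.txt"},
    "proofing": {"a.txt", "c.txt"},
    "inkjet": {"b.txt", "c.txt"},
}

both = INDEX["colour"] & INDEX["proofing"]    # colour AND proofing
either = INDEX["colour"] | INDEX["proofing"]  # colour OR proofing
without = INDEX["colour"] - INDEX["inkjet"]   # colour NOT inkjet

print(sorted(both))     # ['a.txt']
print(sorted(either))   # ['a.txt', 'b.txt', 'c.txt']
print(sorted(without))  # ['a.txt']
```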

None of the applications use the boolean search method. At least, it doesn’t make a difference whether you enter a boolean search or just type search terms one after the other. At the other end of the spectrum you have the natural language search method. This enables the user to enter a query in the form of a sentence or a question. You might enter: “Show me all the documents on proofing colours.”

A natural language search engine will recognize the redundancy in the question, i.e. it will only search for “documents” and “proofing colours” and will return documents containing those terms. If the engine has been programmed smartly enough, it will also filter out the word “documents”, and that is what we want.
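The simplest way to strip the redundancy out of such a question is a stop-word list. The Python below is a bare-bones sketch of that idea, not a real natural language engine. The contents of `STOP_WORDS` are an assumption chosen to fit the example sentence.

```python
import re

# Hypothetical stop list: filler words plus "document(s)", which says
# nothing about the topic the user is actually interested in.
STOP_WORDS = {"show", "me", "all", "the", "on", "a", "an", "of",
              "document", "documents"}

def extract_terms(question):
    """Keep only the words of the question that carry search meaning."""
    words = re.findall(r"\w+", question.lower())
    return [word for word in words if word not in STOP_WORDS]

print(extract_terms("Show me all the documents on proofing colours."))
# ['proofing', 'colours']
```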

None of the three applications are capable of natural language search.

All three applications accept Google-type queries, i.e. you just enter one or more search terms and put quotation marks around them if you want them to be searched as a literal string. How well each application honours the quotation marks differs, however.
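As an illustration of Google-type query handling, the following Python splits a query into quoted phrases and loose terms. It is a sketch under the assumption that only straight double quotes mark a phrase. The function name `parse_query` is made up for this example.

```python
import re

def parse_query(query):
    """Split a query into quoted phrases and loose terms."""
    phrases, terms = [], []
    # Each match fills either the first group (a quoted phrase)
    # or the second group (a single unquoted word).
    for phrase, term in re.findall(r'"([^"]+)"|(\S+)', query):
        if phrase:
            phrases.append(phrase.lower())
        else:
            terms.append(term.lower())
    return phrases, terms

print(parse_query('"colour proofing" inkjet'))
# (['colour proofing'], ['inkjet'])
```

A search tool would then require each phrase to appear literally and each loose term to appear anywhere in the document.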

Spotlight doesn’t support quotation marks, which means you cannot search for exact sentences. This may not matter much, however, since Spotlight seems capable of returning accurate lists as it is. My attempts to fool the system, which went beyond the example I gave above, all failed. This is what you would expect from a search technology that not only depends on content indexing but on meta data as well.

Grokker offers support for quotation marks but that doesn’t mean it gets any more accurate. In fact, neither the presence of quotation marks nor the use of multiple search terms seems to matter much for the accuracy of the results when searching locally. Groxis should definitely rewrite the local search plug-in or deliver a user-driven means of tuning the plug-in.

Think Pro supports quotation marks in that it doesn’t return zero documents when you put them around a query string. The resulting list is identical to that of the same string without quotation marks, however. Multiple search terms do make a difference in Think Pro, just as they do with Spotlight.

Disclaimer: the above information on Spotlight is based on the information found on Apple’s public web site. None of the information in the paragraphs above is to be considered a result of personal testing.

The Results

The most important “features” of a search engine technology are its accuracy and completeness. Accuracy means the results list returns documents that contain the query string. Completeness means that all documents containing the query string are returned.
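In information retrieval terms, accuracy and completeness as defined above roughly correspond to what is usually called precision and recall. The short Python sketch below shows how the two could be measured for a single query. The file names and the list of relevant documents are invented, and in practice you rarely know the full relevant set.

```python
def precision(returned, relevant):
    """Share of the returned documents that are actually relevant."""
    returned, relevant = set(returned), set(relevant)
    return len(returned & relevant) / len(returned) if returned else 0.0

def recall(returned, relevant):
    """Share of the relevant documents that were actually returned."""
    returned, relevant = set(returned), set(relevant)
    return len(returned & relevant) / len(relevant) if relevant else 0.0

# Hypothetical query outcome: four documents really contain the query,
# the engine returned three, and one of those is a false hit.
relevant = {"a.pdf", "b.doc", "c.txt", "d.eml"}
returned = ["a.pdf", "b.doc", "x.jpg"]

print(round(precision(returned, relevant), 2))  # 0.67
print(recall(returned, relevant))               # 0.5
```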

Completeness usually isn’t the problem. If you look at AltaVista or Google, you can easily get a few million documents returned on almost any query (although you can’t verify whether that list is complete, of course...). It’s the accuracy that is the problem with most search technologies. Today, there are a number of algorithms in the field that are capable of returning very accurate result lists with an acceptable degree of completeness.

Grokker is the worst of the three applications under review here. Grokker clearly only shines when used on web searches. There it relies on Google’s and others’ technology to perform the actual search. The visualisation of the results is equally good and quite astonishing whether you perform a web search or a local one. But visualising a totally inaccurate result list doesn’t help you at all.

Think Pro is very accurate and complete. It supports query term highlighting in the found document text, making your life a lot easier when you’re searching through a pile of documents for a particular term. The See Also feature is very useful because it too is quite accurate and capable of disclosing relationships that remain occult otherwise.

On Spotlight’s results I can’t comment much beyond what was shown at Macworld Expo, for legal reasons. Still, it is obvious from what was shown there that Spotlight is capable in terms of accuracy and completeness, with slightly more weight given to accuracy. Spotlight has other advantages too. It will allow you to keep some documents private, it will enable you to set search performance preferences, it will enable you to create folders based on saved queries that will update their result lists automatically, etc.

This makes Spotlight a hard-to-beat technology, as it delivers all those capabilities at the operating system level. That raises the question of why you would need an application like Think Pro at all. The answer is that Spotlight, for all its power, isn’t capable of discovering possible relationships between documents, and that it will not show you where your search terms can be found inside the document.

Spotlight works at file level, while Think Pro works at document level. So, to find the 1,200 page document that contains the term “proofing colour” you will use Spotlight, but to find the term itself inside the document, you’d better use an application like DEVONtechnologies’ Think Pro or Enterprise.
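The difference between the two levels can be sketched in a few lines of Python. A file-level search only answers whether a file matches. A document-level search also reports where the term occurs. The page texts and the function `locate` are assumptions made for this example and are not taken from either product.

```python
def locate(pages, term):
    """Return (page number, character offset) for each occurrence of term."""
    needle = term.lower()
    hits = []
    for number, text in enumerate(pages, start=1):
        haystack = text.lower()
        pos = haystack.find(needle)
        while pos != -1:
            hits.append((number, pos))
            pos = haystack.find(needle, pos + 1)
    return hits

# Hypothetical three-page document.
pages = [
    "Introduction to prepress.",
    "Colour proofing basics. Colour proofing tools.",
    "Index.",
]

# File-level answer: does the document match at all?
print(bool(locate(pages, "colour proofing")))  # True

# Document-level answer: on which page and at which offset?
print(locate(pages, "colour proofing"))  # [(2, 0), (2, 24)]
```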

Disclaimer: the above information on Spotlight was obtained from the Technology Preview, a PDF document downloadable from Apple’s public web site. No information contained in the paragraphs above was subject to any non-disclosure agreement.


Maintained by the Staff of ResExcellence. This entire site ©1997-2006 ResExcellence