Monday, March 9, 2009

''A Search Engine''
A program that searches documents for specified keywords and returns a list of the documents where the keywords were found. Although search engine is really a general class of programs, the term is often used to specifically describe systems like Google, Alta Vista and Excite that enable users to search for documents on the World Wide Web and USENET newsgroups.
Typically, a search engine works by sending out a spider to fetch as many documents as possible. Another program, called an indexer, then reads these documents and creates an index based on the words contained in each document. Each search engine uses a proprietary algorithm to create its indices such that, ideally, only meaningful results are returned for each query.
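The indexer step described above can be illustrated with a minimal sketch of an inverted index. This is only an assumption about the general idea, not any real engine's proprietary algorithm; the sample pages and the function names `tokenize`, `build_index`, and `search` are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the sorted ids of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)

# Hypothetical pages standing in for what a spider might have fetched.
documents = {
    "page1": "Search engines crawl the web",
    "page2": "A directory is maintained by human editors",
    "page3": "Crawlers feed the search index",
}
index = build_index(documents)
print(search(index, "search"))
```

Here a query for "search" matches page1 and page3, because those are the only pages whose text contains that word. Real engines add ranking on top of this lookup, which the sketch leaves out.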
A Web search engine is a tool designed to search for information on the World Wide Web. The search results are usually presented in a list and are commonly called hits. The information may consist of web pages, images, and other types of files. Some search engines also mine data available in newsbooks, databases, or open directories. Unlike Web directories, which are maintained by human editors, search engines operate algorithmically or use a mixture of algorithmic and human input.
''Three Types of Search Engines''

The term "search engine" is often used generically to describe crawler-based search engines, human-powered directories, and hybrid search engines. These three types gather their listings in different ways.
Crawler-based search engines
Crawler-based search engines, such as Google (http://www.google.com), create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found. If web pages are changed, crawler-based search engines eventually find these changes, and that can affect how those pages are listed. Page titles, body copy and other elements all play a role.
A typical web query lasts less than half a second, yet it involves a number of different steps that must be completed before results can be delivered to the person seeking information. The following graphic (Figure 1) illustrates this life span (from http://www.google.com/corporate/tech.html):



1. The web server sends the query to the index servers. The content inside the index servers is similar to the index in the back of a book - it tells which pages contain the words that match the query.
2. The query travels to the doc servers, which actually retrieve the stored documents. Snippets are generated to describe each search result.
3. The search results are returned to the user in a fraction of a second.
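The steps above can be sketched in a few lines. This is a toy model under stated assumptions, not Google's actual architecture: `INDEX` and `DOC_STORE` are hypothetical in-memory stand-ins for the index servers and doc servers, and `index_lookup`, `make_snippet`, and `handle_query` are illustrative names.

```python
# Hypothetical data: a tiny index and a tiny document store.
INDEX = {
    "search": {"doc1", "doc2"},
    "engine": {"doc1"},
    "spider": {"doc2"},
}
DOC_STORE = {
    "doc1": "A search engine returns a list of documents for a query.",
    "doc2": "A spider fetches pages so that people can search them later.",
}

def index_lookup(words):
    """Index-server step: ids of documents that contain every query word."""
    postings = [INDEX.get(w, set()) for w in words]
    return sorted(set.intersection(*postings)) if postings else []

def make_snippet(text, word, width=40):
    """Doc-server step: a short excerpt around the first match of `word`."""
    pos = text.lower().find(word)
    start = max(0, pos - width // 2) if pos >= 0 else 0
    return text[start:start + width].strip()

def handle_query(query):
    """Run both steps and return (doc_id, snippet) pairs."""
    words = query.lower().split()
    return [(doc_id, make_snippet(DOC_STORE[doc_id], words[0]))
            for doc_id in index_lookup(words)]

for doc_id, snippet in handle_query("search"):
    print(doc_id, "->", snippet)
```

The index lookup only decides which documents match; the stored text is touched afterwards, and only to build the snippet shown with each hit.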
Human-powered directories
A human-powered directory, such as the Open Directory Project (http://www.dmoz.org/about.html), depends on humans for its listings. (Yahoo!, which used to be a directory, now gets its information from crawlers.) A directory gets its information from submissions, in which site owners supply a short description of the entire site, or from editors who write a description for the sites they review. A search looks for matches only in the descriptions submitted. Changing web pages, therefore, has no effect on how they are listed. Techniques that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.
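A small sketch can show why changing a page does not change its directory listing: the search only ever reads the description. The entries, URLs, and the function name `directory_search` below are hypothetical.

```python
# Hypothetical directory entries: each site has an editor-written description
# and page content that the directory search never looks at.
DIRECTORY = [
    {"url": "http://example.com/a",
     "description": "Guide to garden tools",
     "page_text": "We also sell bicycles and helmets"},
    {"url": "http://example.com/b",
     "description": "Bicycle repair tips",
     "page_text": "Fix a flat tire in five minutes"},
]

def directory_search(entries, keyword):
    """Return urls whose description (not page text) contains the keyword."""
    keyword = keyword.lower()
    return [e["url"] for e in entries if keyword in e["description"].lower()]

print(directory_search(DIRECTORY, "bicycle"))
```

The first site mentions bicycles on its page, but it is not returned for "bicycle" because that word does not appear in its description.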
Hybrid search engines
Today, it is extremely common for crawler-type and human-powered results to be combined when conducting a search. Usually, a hybrid search engine will favor one type of listing over another. For example, MSN Search (http://www.imagine-msn.com/search/tour/moreprecise.aspx) is more likely to present human-powered listings from LookSmart (http://search.looksmart.com/). However, it also presents crawler-based results, especially for more obscure queries.
''Three Search Engines: AltaVista, HotBot and Dogpile''
==AltaVista==

AltaVista was one of the three largest and most important search engines for many years, but it is no longer as popular as it used to be. It had two distinct search modes: Basic Search and Advanced Search. In August 2000, it introduced a third: the Power Search. In Feb. 2002, the Power Search features were added to the Advanced Search page, and in Nov. 2002 they were also moved to the More Precision page. There are some significant differences between the Basic and Advanced search pages. In Feb. 2003, AltaVista was bought by Overture. Overture expected to merge the AltaVista and AlltheWeb databases later in 2003, but once Yahoo! bought Overture, AltaVista's database was replaced by a Yahoo!/Inktomi database on March 25, 2004.

Database: AltaVista has a variety of databases:
* Web database: AltaVista's own indexed Web pages, including PDF files
* Directory: Open Directory (formerly LookSmart)
* News: AltaVista's own crawled pages (formerly from Moreover)
* Ads: from Overture
* Images: AltaVista's own crawled image files
* Audio and Video: AltaVista's own crawled multimedia files

AltaVista has experimented with a variety of databases in addition to its regular Web page database. In the past, it has served results from Ask Jeeves, its own Usenet database, Remarq Usenet, Overture (formerly GoTo) ads, RealNames Internet Keywords, and LookSmart categories. As of Dec. 2002, most of these additional databases are gone, except for the Overture paid positioning results, which may appear at the top and bottom of results, labeled as "Sponsored Matches." AltaVista does have other databases available, including images, MP3/audio, video, directory, and News databases. In addition, there are AltaVista Shortcuts, which may show up at the top of regular search results. These provide quick links to selected popular information.

==HotBot==

HotBot, owned by Terra/Lycos, is one of the older Web search engines. Originally it just used the Inktomi database, and then it added Direct Hit and the Open Directory. In Dec. 2002, it relaunched as a multiple search engine with Inktomi, Fast, Google, and Teoma. In July 2003, it stayed with the same four databases but renamed them HotBot, Lycos, Google, and Ask Jeeves. Lycos was dropped in March 2004. This review covers HotBot using the Inktomi database, which it now calls "HotBot." See the Google and Teoma (Ask Jeeves) reviews for more details on how their databases and interfaces work, bearing in mind that not all features are available at HotBot. The basic search screen shows no options, but choose Advanced Search for the full range of search features. To see how HotBot used to work, see the old Search Engine Showdown Review.

HotBot offers the choice of three search engine databases:
* HotBot (which is actually a Yahoo!/Inktomi database, and the version reviewed here)
* Google
* Ask Jeeves (the Teoma database)

HotBot is one of the early Internet search engines and was launched in May 1996 as a service of Wired Magazine. It was launched using a "new links" marketing strategy, claiming to update its search database more often than its competitors. It also offered free webpage hosting, but only for a short time, and the hosting was taken down without any notice to its users. Though competitive when it was acquired by Lycos in 1998, HotBot has in recent years reduced its scope. Today the website is merely a front end for the third-party search engines Yahoo.com and MSN, as well as Lycos' own lyGo.com. It was one of the first search engines to offer the ability to search within search results. The site still exists; however, it is mainly run by Yahoo!

==Dogpile==

Dogpile is a meta-search engine that transmits a search simultaneously to several individual search engines and their databases of Web pages. The default list of search engines queried varies. You can also customize Dogpile to search specific search engines. Results are retrieved in lists of 10 hits from each engine queried. If more than 10 results are found, a link to the next list of hits is given. It is important to remember that meta-search engines only spend a short time in each database and may only retrieve a small percentage of the results in any of the databases queried.

For me, Dogpile is also one of the best examples of a search engine, because it fetches and ranks results from multiple search engines, letting you search for keywords, pictures, audio, video, news, or phone numbers.
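The meta-search idea can be sketched as follows. This is not Dogpile's actual ranking method, which the text does not describe; it is a simple round-robin merge chosen for illustration. The functions `engine_a` and `engine_b` are hypothetical stubs standing in for real engines, and `meta_search` is an illustrative name.

```python
from itertools import zip_longest

def engine_a(query):
    """Hypothetical engine stub; a real one would be queried over the network."""
    return ["http://example.com/1", "http://example.com/2", "http://example.com/3"]

def engine_b(query):
    """Second hypothetical engine stub with partly overlapping results."""
    return ["http://example.com/2", "http://example.com/4"]

def meta_search(query, engines, per_engine=10):
    """Take up to `per_engine` hits from each engine and interleave them,
    dropping URLs that were already added to the merged list."""
    result_lists = [engine(query)[:per_engine] for engine in engines]
    merged, seen = [], set()
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

print(meta_search("search engines", [engine_a, engine_b]))
```

The `per_engine` cut-off mirrors the point made above: each engine contributes only its first few hits, so the merged list may cover a small fraction of what each database holds.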
