Classification of Search Engines
Written by Seo Master   
Thursday, 16 August 2007

The term "search engine" is often used loosely to describe both directories and pure search engines. They are not the same thing: the difference lies in the way result listings are generated.

There are four major kinds of search engines you should know about:

  • crawler-based (traditional, common) engines;
  • directories (mostly human-edited catalogs);
  • hybrid engines (META engines and those using other engines' results);
  • pay-per-performance and paid inclusion engines.

Crawler-based search engines (SEs), also called spider-based engines, use special software to automatically and regularly visit Web sites in order to build and update their giant repositories of Web pages.

This software is referred to as a "bot", "robot", "spider", or "crawler"; all these terms denote the same thing. The programs are run by the search engine. They browse the pages that already exist in the engine's repository and find your site by following links from those pages. Alternatively, if you have submitted your pages to a search engine, they are placed in a queue for scanning, and the spider finds your page by working through the list of pages awaiting its visit.
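The discovery process described above can be sketched as a queue of URLs awaiting a visit. This is only a minimal illustration, not the way any particular engine works: the `LINKS` dictionary is a hypothetical stand-in for the Web (a real spider fetches each page over HTTP), and all URLs are made up.

```python
from collections import deque

# Hypothetical stand-in for the Web: each URL maps to the links found on that page.
LINKS = {
    "http://known.example/": ["http://known.example/a", "http://yoursite.example/"],
    "http://known.example/a": ["http://known.example/"],
    "http://yoursite.example/": ["http://yoursite.example/about"],
    "http://yoursite.example/about": [],
    "http://submitted.example/": [],
}


def crawl(seed_urls, submitted_urls):
    """Visit pages breadth-first, starting from known pages and submitted URLs."""
    queue = deque(seed_urls)        # pages already in the repository
    queue.extend(submitted_urls)    # pages queued through manual submission
    seen = set(queue)               # never queue the same URL twice
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        for link in LINKS.get(url, []):   # follow the links found on the page
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


order = crawl(["http://known.example/"], ["http://submitted.example/"])
for url in order:
    print(url)
```

Note that the made-up site `yoursite.example` was never submitted; the spider reaches it only because a page it already knows links to it.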

After a spider has found a page to scan, it retrieves the page via HTTP (like any ordinary Web surfer who types a URL into the browser's address field and presses "Enter"). Like a human visitor, the crawling software leaves a record of its visit on your server, so your server log can tell you when a search engine has dropped in on your online estate.
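As a rough sketch of how you might spot such visits, the snippet below scans access-log lines for user-agent strings that contain a crawler's name. It assumes your server writes the common "combined" log format; the sample lines and addresses are made up, and the token list is illustrative rather than complete.

```python
import re

# Illustrative substrings that crawler user-agent strings are known to contain.
CRAWLER_TOKENS = ("Googlebot", "Slurp", "msnbot")

# Assumed "combined" format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def crawler_visits(lines):
    """Return (time, request, crawler token) for each line left by a listed crawler."""
    visits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent").lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in agent:
                visits.append((match.group("time"), match.group("request"), token))
                break
    return visits


# Made-up sample lines for demonstration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [16/Aug/2007:10:15:32 +0000] "GET /index.html HTTP/1.1" 200 5120 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [16/Aug/2007:10:16:02 +0000] "GET /index.html HTTP/1.1" 200 5120 '
    '"http://example.com/" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"',
]

visits = crawler_visits(SAMPLE_LOG)
for visit in visits:
    print(visit)
```

Keep in mind that a user-agent string is supplied by the visitor and can be faked, so a match is a hint rather than proof.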

Your Web server returns the HTML source code of your page to the spider. The spider then reads it (this process is referred to as "crawling" or "spidering") – and here the difference begins between the human visitor and the crawling software.

A human visitor can appreciate the quality graphics and impressive Flash animation you've stuffed your page with. A spider can't. A human visitor does not see HTML comments or the META description. A spider does. A human visitor will first notice the largest and most attractive text on the page. A spider will give more value to the text that's closest to the beginning and end of the page.

Perhaps you've spent a fortune creating a killer Web site designed to immediately captivate your visitors and gain their admiration. You've even embedded lots of quality Flash animation and JavaScript tricks. Yet a search engine spider is a robot that sees only that there are some images on the page and some code embedded in the "<script>" tag, which it is instructed to skip. These design elements put additional obstacles on its way to your content. What's the result? The spider ranks your page low. No one finds it on the search engine and no one is able to appreciate the design.
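To make the difference concrete, here is a minimal sketch of a text-only reader built on Python's standard `html.parser` module. It is an illustration under simplifying assumptions, not the parser any real engine uses: the `SpiderView` class and the sample page are made up, and real spiders are far more elaborate.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collect what a simple text-only crawler could read from a page."""

    def __init__(self):
        super().__init__()
        self.text = []            # text found outside <script> and <style>
        self.comments = []        # HTML comments, invisible in a browser
        self.description = None   # content of <meta name="description">
        self.images = 0           # images are counted, not "seen"
        self._skip_depth = 0      # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img":
            self.images += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text.append(data.strip())

    def handle_comment(self, data):
        self.comments.append(data.strip())


# Made-up sample page.
PAGE = """<html><head>
<meta name="description" content="Hand-made oak furniture">
<script>var banner = "Buy now";</script>
</head><body>
<!-- keywords: oak, tables -->
<img src="intro.png">
<h1>Oak tables</h1>
<p>Built to order.</p>
</body></html>"""

view = SpiderView()
view.feed(PAGE)
print(view.text, view.description, view.comments, view.images)
```

The script's "Buy now" banner never reaches `view.text`, the image is reduced to a count, and the comment and META description, which a browser hides, are both picked up.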

SEO is the way to make your page more search engine compatible. It is mostly aimed at crawler-based engines. We're not telling you to avoid design innovations; instead, we will teach you how to properly combine them with your optimization needs.

Let's return to the way a spider works. After it reads your pages, it compresses them into a form convenient to store in a giant repository of Web pages called a search engine index. The data is stored in the index so that the engine can quickly determine whether a page is relevant to a particular query and pull it out to include in the result page shown to a Web surfer. The process of placing your page in the index is referred to as "indexing". After your page has been indexed, it will appear on search engine results pages for the words and phrases most common on the indexed page. However, its position in the list may vary.
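One common way to make such lookups fast is an inverted index, which maps every word to the pages that contain it. The sketch below assumes this structure purely for illustration; the article does not say how any particular engine stores its index, and the pages and function names are made up.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def lookup(index, query):
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up pages: URL -> text extracted by the spider.
PAGES = {
    "http://a.example/": "Oak tables built to order",
    "http://b.example/": "Pine tables and oak chairs",
    "http://c.example/": "Garden chairs",
}

index = build_index(PAGES)
print(sorted(lookup(index, "oak tables")))
```

Because the index is built once, at indexing time, answering a query only requires intersecting a few small sets instead of rereading every page.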

Later, when someone searches the engine for particular terms, your page will be pulled out of the index and included in the search results. The search engine now applies a sophisticated technique to determine how relevant your page is to these terms. It considers many on-the-page and off-the-page factors, and finally the page is assigned a certain position, or rank, among the other results found for the surfer's query. This process is called ranking.
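Real ranking formulas are far more sophisticated than anything we can show here. Purely as a toy illustration of combining an on-the-page factor with an off-the-page factor, the sketch below scores a page by how often it mentions the query words and adds a bonus for inbound links. The formula, the `link_weight` value, and the sample data are all assumptions made for this example.

```python
import re


def score(page_text, inbound_links, query, link_weight=0.5):
    """Toy relevance score: query-word frequency plus a bonus for inbound links."""
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    terms = re.findall(r"[a-z0-9]+", query.lower())
    on_page = sum(words.count(term) for term in terms)   # on-the-page factor
    if on_page == 0:
        return 0.0                                       # page never mentions the query
    return on_page + link_weight * inbound_links         # add the off-the-page factor


def rank(pages, query):
    """Return the matching URLs ordered from highest to lowest score."""
    scored = [(score(text, links, query), url) for url, (text, links) in pages.items()]
    matching = [(s, url) for s, url in scored if s > 0]
    return [url for s, url in sorted(matching, key=lambda pair: (-pair[0], pair[1]))]


# Made-up pages: URL -> (page text, number of inbound links).
PAGES = {
    "http://a.example/": ("oak tables, oak chairs, oak shelves", 0),
    "http://b.example/": ("oak tables", 6),
    "http://c.example/": ("garden chairs", 10),
}

ranking = rank(PAGES, "oak")
print(ranking)
```

In this example the second page mentions "oak" only once but still outranks the first because other sites link to it, while the third page has the most links and is left out because it does not mention the query at all.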

Google (www.google.com) is a perfect example of a crawler-based SE.

Human-edited directories are different. Pages are added to their repositories solely through manual submission, and most directories use certain mechanisms to prevent pages from being submitted automatically. After you complete the submission procedure, your URL will be queued for review by an editor, who is, luckily, a human.

When directory editors visit and read your site, the only decision they make is to accept or reject the page. Most directories do not have their own ranking mechanism – they use some obvious factor to sort the URLs, such as alphabetical order or Google PageRank (explained later in this course). It is very important to submit a relevant and precise description to the directory editor, and to take the other parts of this manual submission seriously as well.

Spider-based engines often use directories as a source of new pages to crawl. As a result, it's self-evident in SEO that you should treat directory submission and directory listings as seriously and responsibly as possible.

While a crawler-based engine will visit your site regularly after it has first indexed it and detect any change you make to your pages, it's not the same with directories. In a directory, result listings are influenced by humans. Either you enter a short description of your website, or the editors will. When searching, only these descriptions are scanned for matches, so website changes do not affect the result listing at all.
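The following sketch shows why a change to your site goes unnoticed by a directory: the search runs over the stored descriptions only. The entries, the live page text, and the function name are all hypothetical.

```python
# Hypothetical directory entries: URL -> description approved by an editor.
DIRECTORY = {
    "http://a.example/": "Hand-made oak furniture, built to order",
    "http://b.example/": "Garden chairs and outdoor tables",
}

# What the first site says today, after a redesign. The directory never reads this.
LIVE_PAGES = {
    "http://a.example/": "Now selling pine wardrobes",
}


def search_directory(directory, query):
    """Return URLs whose stored description contains every query word, in alphabetical order."""
    terms = query.lower().split()
    return sorted(
        url
        for url, description in directory.items()
        if all(term in description.lower() for term in terms)
    )


print(search_directory(DIRECTORY, "oak"))    # matches the stored description
print(search_directory(DIRECTORY, "pine"))   # the live page mentions pine, the listing does not
```

Until someone edits the stored description, a search for "pine" will not find the site, no matter what the live page says.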

As directories are usually created by experienced editors, they generally produce better (at least better filtered) results. The best-known and most important directories are Yahoo (www.yahoo.com) and DMOZ (www.dmoz.org).

META and Hybrid engines. Some engines also have an integrated directory linked to them. These directories contain websites which have already been reviewed or evaluated. When a search query is sent to a hybrid engine, the sites already evaluated are usually not scanned for matches; the user has to select them explicitly. Whether a site is added to an engine's directory generally depends on a mixture of luck and content quality. Sometimes you may "apply" for a review of your website, but there is no guarantee that it will be done.

Yahoo (www.yahoo.com) and Google (www.google.com), although mentioned here as examples of a directory and a crawler respectively, are in fact hybrid engines, as are most major search engines nowadays. As a rule, a hybrid search engine will favor one type of listings over another. For example, Yahoo is more likely to present human-powered listings and Google its crawled listings. Another example is MSN Search, which presents human-powered listings from LookSmart. However, it also presents crawler-based results (provided by its own Web crawler), especially for more obscure queries.

Meta Search Engines. Another approach to searching the vast Internet is the multi-engine search, or meta-search engine, which queries a number of search engines at the same time and lays their combined results out in a formatted result page. A plain or natural-language request is passed on to multiple search engines, each directed to find the information the searcher requested. The engines' responses are then gathered into a single result list. This type of search allows the user to cover a great deal of material very efficiently while remaining tolerant of imprecise search questions or keywords.

Examples of multi-engines are MetaCrawler (http://www.metacrawler.com) and DogPile (http://www.dogpile.com). MetaCrawler refers your search to seven of the most popular search engines (including AltaVista and Lycos), then compiles and ranks the results for you.
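The merging step can be sketched as follows. The three "engines" are hypothetical stub functions with made-up results, and the merging rule (each engine gives a URL 1/position points, and the points are added up) is just one simple strategy chosen for illustration; we do not claim that MetaCrawler or DogPile work this way.

```python
from collections import defaultdict


# Hypothetical stand-ins for real engines: each returns URLs in its own ranked order.
def engine_a(query):
    return ["http://x.example/", "http://y.example/", "http://z.example/"]


def engine_b(query):
    return ["http://y.example/", "http://x.example/"]


def engine_c(query):
    return ["http://y.example/", "http://w.example/"]


def meta_search(query, engines):
    """Forward the query to every engine and merge the ranked lists into one."""
    scores = defaultdict(float)
    for engine in engines:
        for position, url in enumerate(engine(query), start=1):
            scores[url] += 1.0 / position   # a higher position earns more points
    # Highest total first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


merged = meta_search("oak tables", [engine_a, engine_b, engine_c])
for url in merged:
    print(url)
```

A page that several engines place near the top ends up first in the merged list, while duplicates collapse into a single entry.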

Figure: How META search engines work

Pay-for-performance and paid inclusion. As is clear from the title, with these engines you have no choice but to pay a recurring or one-time fee to keep your site listed, re-spidered, or top-ranked for the keywords of your choice. Very few search engines focus solely on paid listings; the most notable example was the former Overture. However, most major search engines offer a paid listing option as a part of their indexing and ranking system.

Unlike paid inclusion, where you just pay to be included in search results, an advertising program guarantees that your listings appear in response to particular search terms, and the higher your bid, the higher your position will be for these terms. Paid placement listings can be purchased from a portal or a search network. Search networks are often set up in an auction environment where keywords and phrases are associated with a cost-per-click (CPC) fee. Such a scheme is referred to as Pay-Per-Click (PPC). Yahoo and Google are the largest paid listing providers, but MSN and other portals sometimes sell paid placement listings directly as well.
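The bidding rule described above, where the higher bid gets the higher position and the advertiser pays per click, can be sketched in a few lines. The advertisers, bids, and function names are made up, bids are kept in whole cents to avoid rounding issues, and actual advertising programs may apply additional rules that are not modeled here.

```python
# Hypothetical bids in cents per click: keyword -> {advertiser: bid}.
BIDS = {
    "oak tables": {"a.example": 40, "b.example": 55, "c.example": 25},
}


def paid_listings(bids, keyword):
    """Order the advertisers for a keyword by their cost-per-click bid, highest first."""
    offers = bids.get(keyword, {})
    return sorted(offers, key=lambda advertiser: (-offers[advertiser], advertiser))


def click_cost(bids, keyword, advertiser, clicks):
    """Total charge in cents under pay-per-click: the bid is paid once for each click."""
    return bids[keyword][advertiser] * clicks


positions = paid_listings(BIDS, "oak tables")
print(positions)
print(click_cost(BIDS, "oak tables", "b.example", 20))
```

Here `b.example` takes the top position for "oak tables" and, after 20 clicks, owes 1100 cents; a keyword nobody has bid on simply produces no paid listings.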
