Let's first define "site architecture". In terms of SEO / SEM, it refers to the entire framework that supports your website content and thus determines how search engine spiders index it. Site architecture consists of the navigation structure of your Web site, the page layout and the structure of the various elements on each page, your file and directory system, and the types of files you use. Site architecture affects search engine rankings because it determines which pages will and will not be indexed. To make the architecture search-engine compatible, you should always have unique and relevant content (as we already know), use design elements that are spider-compatible, and use a navigation and linking structure that encourages regular indexing by search engines.
The elements of site architecture that influence your rankings include:
- the file and directory system
- file names and extensions
- navigation menus on your pages
- entry points / pages (e.g. landing pages)
- the robots META tag (or, alternatively, the robots.txt file)
- error pages
- site maps
- includes (Server Side Includes, or SSI; to find out whether you're using them, search for files with the .shtml or .shtm extension)
- introduction pages
- dynamic content

File and folder structure

It's a good idea to have SEO in mind from the moment you start creating your site. When you are building a site from scratch, you would normally first create the logical structure, i.e. how users will see and operate the site from the outside; this scheme then defines the physical structure, i.e. how files and directories are placed on the hosting server. Often there is no opportunity to build a brand-new file system, and you are forced to redesign an existing structure. In that case, it's useful to have a broken link checker at hand, one similar to the Site Quality Auditor tool, when you start making changes to your directory structure.

So let's get down to business. It is believed that most search engines only index websites to a depth of 2 levels, or a maximum of 50-60 files. This is usually the case, although some advanced spiders (like Google's) can dive 4 levels deep from your home directory or the page where they started. Therefore, try to keep important content that you want indexed on the top two directory levels of your site, i.e. www.yoursite.com/level1/level2/page.htm. If you have a very large site and need more of your pages and deeper content indexed, it is recommended that you use the Trusted Feed and Paid Inclusion services provided by some engines.

As a general rule, pages closest to the root directory are considered the most important pages on your site by the search engine spiders.
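
The two-level guideline above can be checked mechanically. The following is a minimal sketch in Python, not part of the original tutorial: the function names `directory_depth` and `pages_too_deep` and the `MAX_DEPTH` constant are hypothetical, and the sketch assumes every URL is absolute (includes `http://` or `https://`) and that a last path segment containing a dot is a file rather than a directory.

```python
from urllib.parse import urlparse

# Assumption: the limit comes from the "top two directory levels" guideline.
MAX_DEPTH = 2


def directory_depth(url: str) -> int:
    """Count the directories between the site root and the page in a URL."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    # A trailing segment with a dot (page.htm) is treated as a file,
    # unless the path ends with "/" and therefore names a directory.
    if segments and not path.endswith("/") and "." in segments[-1]:
        segments = segments[:-1]
    return len(segments)


def pages_too_deep(urls, max_depth=MAX_DEPTH):
    """Return the URLs that sit deeper than max_depth directory levels."""
    return [u for u in urls if directory_depth(u) > max_depth]


if __name__ == "__main__":
    sample = [
        "http://www.yoursite.com/index.htm",
        "http://www.yoursite.com/level1/level2/page.htm",
        "http://www.yoursite.com/level1/level2/level3/page.htm",
    ]
    for url in pages_too_deep(sample):
        print("deeper than", MAX_DEPTH, "levels:", url)
```

Run against a list of your own page URLs, the sketch reports only the pages that sit below the second directory level, which are the candidates for moving closer to the root.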
The two most important files residing in the root directory should be the home page, commonly named index with an extension that depends on the technology your hosting server uses, and the Robots Exclusion Protocol file, commonly named robots.txt. For large sites, around 100 to 200 pages should be kept in the root directory; for smaller sites, it's best to keep all pages under the root. While trying to place your most important pages at the first or second directory level, limit each directory to about 50 files. On a larger site (250-plus pages), organizing content-related files into separate directories is a sound strategy.

Be sure to name your files and directories with your keywords. To separate keywords in directory names, use hyphens. Don't stuff too many keywords into your file or directory names: make them keyword-rich but not too long.

There hasn't been extensive research done on file extensions. However, there is some anecdotal evidence (as yet unconfirmed) that gives some clues about how search engine spiders deal with certain file extensions and URLs. As a rule, search engines do not have trouble scanning dynamic URLs like

http://www.yoursite.com/gallery.php?category=widgets&color=red&price=20&pricelimit=below&phpsessid=2o389rhfawe89gp9e4ibugq9p348

At least some pages of this type have a PageRank value associated with them, although Google's official terms claim their spider won't index a URL that has the "?" and "&" marks in it. Research on this topic is currently in progress. One thing that is certain is that if a dynamic page is called as a result of a user submitting a form, it will never get indexed, even if it contains your richest product catalog. However, if the page is called just by clicking a link, chances are good that it is open for spidering.

Some SE experts are reluctant to use .shtm and .shtml file extensions.
To some extent they are right, because search spiders have shown some disregard for files of this type (files with these extensions use Server Side Include technology, which assembles pages dynamically at the moment they are requested instead of serving them as stored on the Web server). File extensions that are completely OK to use are "htm", "html", "txt", and all kinds of Web images (.jpeg, .gif, .png and .bmp) for the Web image search services of certain engines. Some advanced spiders like Google's will also index your .pdf and .doc files if you link to them. Google claims to be able to index .swf files as well, but this has yet to be confirmed. Naming images after keywords is particularly important now that AltaVista and Google have image searches. Name your .pdf files after your keywords as well.

Site Navigation Scheme

Another element of site architecture is a site's navigation scheme. There are several kinds of navigation schemes; some are more spider-friendly than others. For example, a set of navigation buttons is often more spider-friendly than a DHTML pull-down menu, and a set of hypertext links is often more spider-friendly than a set of navigation buttons.

Many designers use so-called "breadcrumbs" to show visitors where they are at the moment and how they can get back to where they just were. Breadcrumbs are simply a sequence of text links, usually placed at the top of a page, that spells out the logical path from the root page to the page the visitor is on now. For example:

Home > biking widgets > red biking widgets

In this case, users know they are currently browsing the category "red biking widgets"; "Home" and "biking widgets" are links to the parent categories at different levels. Using breadcrumbs is an excellent idea, especially on large Web sites that use a dynamic technology; it strengthens your link structure.

Types of Web Pages

According to some usability professionals, there are seven types of Web pages. Others claim there are 11.
Regardless of the number, it's important for SEMs to understand that different types of Web pages do exist. How you write, design, optimize, and promote a Web page depends on the page type. Some of the most common Web page types are:
- Home page
- Category/gallery page
- Product page
- Form page
- News/media page
- Services page
- Advertising (landing) page
- Search/search results page
- Credibility page
Optimization and design strategies differ for these kinds of pages because their calls to action are not the same.

Entry pages

Any page that is meant to bring you traffic (e.g. the home page, credibility pages, landing pages of advertising campaigns) can be called an "entry page" because these are the points where visitor flow enters your Web site. We recommend that you optimize and submit each of your entry pages first. Make them stand-alone, like your home page. When a visitor comes to your site through an entry page, take care to let them know where they are (a good place to use breadcrumbs), who you are, and what the page is about. Include full navigation on all entry pages and make it obvious what the page and the site are about. Visitors can find any of your entry pages through a search engine, so be aware that much of your traffic will arrive at places other than your home page. If your visitors come in through your "contact us" page, for example, and all they see is a form, that doesn't tell them where they are or what the page or site is about.
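
One way to tell visitors where they are on an entry page is to derive a breadcrumb trail from the page's own URL. The sketch below is an illustration only, written in Python under stated assumptions: the function names `breadcrumb_trail` and `render_text` are hypothetical, directory names are assumed to be hyphen-separated keywords, the URL is assumed to be absolute, and a last path segment containing a dot is assumed to be a file.

```python
from urllib.parse import urlparse


def breadcrumb_trail(url: str, home_label: str = "Home"):
    """Return (label, href) pairs from the home page down to the page's directory."""
    path = urlparse(url).path
    segments = [s for s in path.split("/") if s]
    # Drop the file name (e.g. index.htm) so only directories remain.
    if segments and not path.endswith("/") and "." in segments[-1]:
        segments = segments[:-1]
    trail = [(home_label, "/")]
    href = "/"
    for segment in segments:
        href += segment + "/"
        # Assumption: hyphens separate the keywords in a directory name.
        trail.append((segment.replace("-", " "), href))
    return trail


def render_text(trail, separator=" > "):
    """Join the breadcrumb labels into a single plain-text line."""
    return separator.join(label for label, _ in trail)


if __name__ == "__main__":
    page = "http://www.yoursite.com/biking-widgets/red-biking-widgets/index.htm"
    print(render_text(breadcrumb_trail(page)))
```

Each pair carries the link target as well as the label, so a page template could render the labels as text links rather than plain text.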