Importance of site maps
A site map can serve as a page that search engine crawlers find easy to follow.
Create an HTML page and fill it with all the links on your site (see ed-u.com's site map for an example).
Linking to the site map page from your homepage helps search engine spiders find and index all the pages of your site more easily. It has the added benefit of getting a lost visitor back on track.
If you have hundreds of pages, you could split the site map into several pages of no more than 50 URLs (Uniform Resource Locators / web addresses) each, and make sure all these pages link to each other. See also how Wired.com have utilised crawl pages at www.wired.com/news/archive - click on any one of their links to understand the structure.
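As a rough illustration of the splitting idea, the sketch below cuts a list of URLs into site map pages of 50 links each and gives every page links to all the others. The function name `build_site_map_pages`, the `sitemapN.html` file names and the example.com addresses are all made up for this example; treat it as one possible approach rather than a finished tool.

```python
from html import escape


def build_site_map_pages(urls, per_page=50):
    """Split urls into chunks and return {filename: html} for each site map page."""
    chunks = [urls[i:i + per_page] for i in range(0, len(urls), per_page)]
    # Hypothetical naming scheme: sitemap1.html, sitemap2.html, ...
    names = [f"sitemap{n}.html" for n in range(1, len(chunks) + 1)]
    pages = {}
    for name, chunk in zip(names, chunks):
        links = "\n".join(
            f'<li><a href="{escape(u, quote=True)}">{escape(u)}</a></li>'
            for u in chunk
        )
        # Every page links to all the others so a spider can reach each one.
        nav = " | ".join(
            f'<a href="{other}">{other}</a>' for other in names if other != name
        )
        pages[name] = (
            "<html><head><title>Site map</title></head><body>\n"
            f"<ul>\n{links}\n</ul>\n<p>{nav}</p>\n</body></html>\n"
        )
    return pages


# Example: 120 made-up URLs produce three pages (50 + 50 + 20 links).
example_urls = [f"http://www.example.com/page{n}.html" for n in range(1, 121)]
site_map = build_site_map_pages(example_urls)
```

Each value in `site_map` is a complete HTML page that could be saved under its key as the file name.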
Notes:
If your links are embedded in JavaScript, search engines have problems following them.
Avoid using image maps; most search engines cannot crawl them.
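To see why these notes matter, here is a small sketch of how a very simple spider might pick links out of a page. It is an assumption about a naive crawler, not a description of how any particular search engine works; the class name `LinkExtractor` and the sample markup are invented for the example. Only the plain HTML link is found, while the JavaScript links and the image map reference are missed.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from plain <a> tags, as a simple spider might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            # Skip javascript: pseudo-links, which lead nowhere for this crawler.
            if href and not href.lower().startswith("javascript:"):
                self.links.append(href)


# Made-up page with one plain link, two script-driven links and an image map.
SAMPLE_PAGE = """
<a href="about.html">About</a>
<a href="javascript:openPage('contact.html')">Contact</a>
<span onclick="window.location='products.html'">Products</span>
<img src="nav.gif" usemap="#nav">
"""

parser = LinkExtractor()
parser.feed(SAMPLE_PAGE)
found_links = parser.links  # only the plain link to about.html is collected
```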
The Robots.txt File
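The robots.txt file can tell search engine robots which areas of your site not to index.
It is particularly useful if you want to keep some areas of your web-site out of search engine listings, such as password protected pages or "office use only" pages. Bear in mind that the file is only a request that well-behaved robots honour, and that anyone can read it, so it does not by itself hide pages from visitors.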
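A minimal sketch of what such a file might contain, and how a robot could interpret it, is shown here using Python's standard `urllib.robotparser` module. The `/office/` and `/members/` paths, the robot name `ExampleBot` and the example.com addresses are assumptions chosen purely for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are examples only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /office/
Disallow: /members/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

# Ask whether a robot calling itself "ExampleBot" may request each page.
can_index_home = rules.can_fetch(
    "ExampleBot", "http://www.example.com/index.html"
)
can_index_office = rules.can_fetch(
    "ExampleBot", "http://www.example.com/office/rota.html"
)
```

With these rules the home page is allowed and the page under `/office/` is not. The real file would be saved as robots.txt in the top-level directory of the site.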
A robot is a small software program that visits sites and automatically requests documents from them for indexing. Robots use the hypertext links found within documents to travel to their next destination. A search engine is built on a database of HTML pages gathered by a robot. Robots are also known as webbots, bots, web crawlers or spiders.
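The way a robot travels from link to link can be sketched with a tiny in-memory "site". Everything here is hypothetical: the `SITE` dictionary stands in for real pages, and the `crawl` function is a simplified breadth-first walk rather than the method of any actual search engine.

```python
from collections import deque

# A made-up site: each page maps to the links found on it.
SITE = {
    "/index.html": ["/sitemap.html", "/about.html"],
    "/sitemap.html": ["/index.html", "/about.html", "/products.html"],
    "/about.html": [],
    "/products.html": ["/index.html"],
    "/orphan.html": [],  # nothing links here, so the robot never finds it
}


def crawl(start):
    """Visit pages breadth-first by following links, returning the visit order."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in SITE.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


indexed = crawl("/index.html")
```

In this sketch `/products.html` is reached only through the site map page, and `/orphan.html` is never reached at all because no page links to it.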
More on the robots.txt file:
Robots Exclusion
Database of Robots
Searchtools.com Robots
Internet Tips Robots
Robots Check
The Web Robots Database
Come into my parlour said the spider...
Cloaking and Spiders
Below are some web-sites that contain information about the risky subject of page cloaking and hiding your HTML from the search engines (not for the novice).
Spider Food
Spiderhunter
Search Engine Base
Search Engine Forums
Fantomaster Cloaking FAQ
Beware: even though you may think you are cloaking ethically, you might make a mistake and be flagged as a spammer. Keeping up to date with changing spider IP addresses is a further problem.
MSN say that they will ban sites that use cloaking technology and will share information about companies they've caught with Inktomi (the search provider for many major portals) and with other search engines. Google.com calls cloaking "egregious" and says "it goes against our ideal of us seeing the same thing a search engine sees".