Defining and Eliminating Search Engine Roadblocks
When launching a website, you need to be sure it is search engine friendly. Some websites look nice and pleasing to users but not to search engines. For the optimization process, it is vital to ensure there are no roadblocks for the crawlers.
Here is a list of things to check:
- Ensuring search engine visibility: It is important to make sure your website is open to search engines. I have worked with developers who accidentally blocked search engines and crawlers, which caused the client websites to disappear from the search engine result pages. To make sure you do not have that problem, check the robots META tag on all your pages. If it is present, it belongs in the <head> section of the page and should read: <meta name="robots" content="index,follow">. A value such as noindex tells search engines to keep the page out of their results.
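As a rough way to audit this across many pages, the check can be scripted. The following is a minimal sketch in Python using only the standard library; the names RobotsMetaParser and blocks_indexing are made up for this illustration, and it assumes you already have each page's HTML as a string.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def blocks_indexing(html: str) -> bool:
    """Return True if the robots meta tag asks engines not to index the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in parser.directives or "none" in parser.directives
```

A page with no robots meta tag at all is reported as not blocked, since indexing is the default behavior when the tag is absent.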
- Checking the robots.txt file: Go to www.yourwebsiteurl.com/robots.txt to make sure it exists. If you see a line that reads Disallow: / under User-agent: *, it blocks crawlers from the whole site, so consult your website developer and get it fixed.
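If you prefer to test the rules rather than read them by eye, Python's standard urllib.robotparser module can evaluate a robots.txt file. This is a small sketch; crawler_allowed is a hypothetical helper name, and the robots.txt content is assumed to be already downloaded into a string.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A file that shuts out every crawler from the whole site.
blocked = "User-agent: *\nDisallow: /\n"
print(crawler_allowed(blocked, "http://www.yourwebsiteurl.com/products"))
```

Run the same function against your important URLs after every deployment, since this is exactly the kind of setting that gets carried over from a staging site by accident.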
- Providing browsing options: Search engines can’t submit forms, so they have trouble reaching pages that are visible only after submitting a form or selecting a category. Include a static HTML sitemap on the website, or a directory of all products sorted into their correct sections, a location database that can be accessed by clicking links, etc.
- Eliminating registration forms: Some websites use a registration form to block users from reading the content on their pages and force them to subscribe. This hurts you in two ways:
- It reduces the number of people who will read that content, when you need the opposite.
- Search engines can’t fill out forms, so they cannot reach the page with the quality content to cache it.
- Eliminating login forms: For the same reasons, login forms also cause problems for search engines. Avoid them unless it’s a secured page that should not be cached by search engines.
- Using AJAX and DHTML: These are quite popular techniques for websites that need to provide some interactivity or load content from other pages or a database without reloading the page. They are catchy and look pleasing to the user. But what about the search engines? Unfortunately, both techniques rely on JavaScript, which search engines may not read reliably. There are ways around this, though, so you can still have search engine friendly AJAX or DHTML. Ask your developers to ensure the content these scripts load is also reachable and readable by search engines.
- Avoiding all-Flash pages: Adobe Flash is one of the best ways to create interactive and pleasing animations, but unfortunately search engines can’t read its content either, so make sure your developers don’t rely heavily on Flash. Another weakness of Flash is that it does not work on a number of mobile devices.
- Avoiding client-side redirects: Webmasters used to redirect client browsers from page to page, which also redirected the search engines. As a result, the big search providers don’t like to see any kind of client-side redirect. Use server-side 301 or 302 redirects instead. Search engines prefer a 301 redirect for pages that have moved for good, since it tells them the page has moved permanently and they can stop visiting the old URL.
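To make the difference concrete: a client-side redirect is done in the page itself (a meta refresh or JavaScript), while a server-side redirect is an HTTP status code sent before any page content. Below is a minimal sketch of the server-side kind as a Python WSGI application. The paths in MOVED_PERMANENTLY are invented for the example, and your own site will most likely configure redirects in the web server or framework instead.

```python
# Hypothetical mapping of retired paths to their new locations.
MOVED_PERMANENTLY = {
    "/old-products.html": "/products/",
}


def app(environ, start_response):
    """Minimal WSGI app that answers moved paths with a server-side 301."""
    path = environ.get("PATH_INFO", "/")
    target = MOVED_PERMANENTLY.get(path)
    if target is not None:
        # The redirect happens in the HTTP response, not in the page body.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Page content goes here.\n"]
```

For a quick local trial, the app can be served with wsgiref.simple_server.make_server from the standard library.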
- Avoiding duplicated content: There are several reasons to avoid duplicated content:
- Duplication used to be another method to fool the search engines.
- Duplicated content confuses search engines about which version to show.
- Other websites linking to your content might link to either version, splitting the link value that a single version would collect.
You can search for duplicated content via Google. Here are the steps to do it:
- Go to Google and type in site:www.yourwebsiteurl.com
- Click through all the result pages
- Click “Repeat the Search with the Omitted Results Included”
- Linking to homepage: There may be different texts and images linking to your homepage. Make sure they all link to one and the same URL rather than to a mix of variants like these:
- www.yourwebsiteurl.com
- yourwebsiteurl.com
- www.yourwebsiteurl.com/index.php
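One way to keep internal links consistent is to pass every homepage link through a small normalizing function before it is written into a template. The sketch below is only an illustration: the preferred host, the http scheme, and the list of index file names are assumptions you would replace with your own choices.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this example: the "www" host is the preferred one.
CANONICAL_HOST = "www.yourwebsiteurl.com"
BARE_HOST = "yourwebsiteurl.com"
INDEX_FILES = ("index.php", "index.html", "index.htm")


def canonical_url(url: str) -> str:
    """Map the common homepage variants onto one canonical URL."""
    if "://" not in url:
        url = "http://" + url  # assumed scheme when none is given
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = CANONICAL_HOST
    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the trailing slash
            break
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


for variant in ("www.yourwebsiteurl.com",
                "yourwebsiteurl.com",
                "www.yourwebsiteurl.com/index.php"):
    print(canonical_url(variant))
```

All three variants from the list above come out as the same address. The same mapping should also be enforced on the server with 301 redirects, so that links from other websites are consolidated too.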
- Using consistent URLs: It is important to use URLs that do not automatically generate duplicated content; this should be handled by the developers. It is also important to use canonical URLs, which are easier for search engines to understand.
- Dealing with broken links: Every time you change or delete a page on your site, the search engines might still have its old URL cached, or other people might have it bookmarked. So if you change the URL of a page or delete it completely, make sure you redirect the old URL to the new one, or at least serve a proper 404 error page.
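A simple bookkeeping check can catch old URLs that were forgotten. The sketch below assumes you keep three things yourself: a list of URLs you have published in the past, the set of pages that currently exist, and your redirect table. The function name and sample paths are hypothetical.

```python
def find_dead_urls(previously_published, live_pages, redirects):
    """Return old URLs that are neither live nor redirected to a live page."""
    dead = []
    for url in previously_published:
        if url in live_pages:
            continue  # the page still exists at this address
        target = redirects.get(url)
        if target is None or target not in live_pages:
            dead.append(url)  # no redirect, or the redirect points nowhere
    return dead


published = ["/old-shoes.html", "/about.html", "/sale-2010.html"]
live = {"/shoes/", "/about.html"}
redirects = {"/old-shoes.html": "/shoes/"}
print(find_dead_urls(published, live, redirects))
```

Every URL this reports needs either a 301 redirect to its replacement or a deliberate decision to let it return a 404.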
- Removing code bloat: Search engines like to see clean websites with a high ratio of content to code. So ask your developers to remove any useless code from the website to raise that ratio.
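There is no official formula for this ratio, but a rough estimate is easy to compute: the length of the visible text divided by the length of the whole HTML source. The following Python sketch does that with the standard library; treat the number as a relative indicator for comparing your own pages, not as a score any search engine publishes.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Gathers text that is not inside <script> or <style> elements."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def content_ratio(html: str) -> float:
    """Return visible-text length divided by total HTML length (0.0 to 1.0)."""
    if not html:
        return 0.0
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not count as content.
    visible = " ".join("".join(parser.chunks).split())
    return len(visible) / len(html)
```

Comparing the ratio before and after a cleanup gives your developers a concrete target to work toward.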
- Code validation: Make sure your website validates against the W3C standards for CSS and XHTML.
- Removing inline JavaScript and CSS: JavaScript and CSS should be placed in separate files. This makes your site cleaner and lets pages load faster.
- Structure of the website: From the very first days of website development and content structuring, it is important to create content clusters that are friendly to both users and search engines.
- Breadcrumbs: It is important to have breadcrumbs. This is a line of text that shows users where they are located, with the option to click on the parent category of that page and go back. Breadcrumbs are also connected to the canonical URLs of the website.
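If your URLs mirror your category structure, a breadcrumb trail can be derived from the URL path itself. This is a small sketch under that assumption; the function name, the "Home" label, and the example path are invented for illustration.

```python
def breadcrumb_trail(path: str, home_label: str = "Home"):
    """Return (label, url) pairs from the homepage down to the given path."""
    trail = [(home_label, "/")]
    url = "/"
    for segment in path.strip("/").split("/"):
        if not segment:
            continue  # skip empty segments, e.g. for the path "/"
        url += segment + "/"
        label = segment.replace("-", " ").title()
        trail.append((label, url))
    return trail


print(breadcrumb_trail("/shoes/blue-suede-shoes/"))
```

Each pair can then be rendered as a text link, with the last one shown as the current page.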
- Text links: Refrain from using “Click here” or similar phrases to send your visitors and search engines to a page about your product. It’s better to link keywords and phrases to specific pages.
- Keyword-rich URLs: Use the text that you used for linking in the URL of that specific page as well.
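Turning link text into a URL segment is usually done with a small "slug" function. Here is one possible version in Python; the exact rules (lowercase, hyphens as separators, accents stripped) are a common convention assumed for this sketch, not a requirement of any search engine.

```python
import re
import unicodedata


def slugify(link_text: str) -> str:
    """Turn link text into a lowercase, hyphen-separated URL segment."""
    # Split accented letters into base letter + accent, then drop non-ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", link_text)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Replace every run of other characters with a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


print(slugify("Blue Suede Shoes"))
```

The link text "Blue Suede Shoes" would then point to a page whose URL ends in blue-suede-shoes.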
- Having a semantic outline: Search engines look for a semantic outline that conforms to XHTML and HTML conventions. It is best to use your page’s keywords in each of these elements as well:
- Title Tag
- H1 heading
- H2, H3 and so on
- Paragraph text
- Links, captions, etc.
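To see the outline a crawler would find, you can pull the title and headings out of a page and check which of them mention your keyword. The sketch below uses Python's standard html.parser; it covers only the title and H1 to H6 elements from the list above, and the helper names are hypothetical.

```python
from html.parser import HTMLParser

OUTLINE_TAGS = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}


class OutlineParser(HTMLParser):
    """Records (tag, text) for the title and each heading, in page order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._current = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in OUTLINE_TAGS:
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = " ".join("".join(self._buffer).split())
            self.outline.append((tag, text))
            self._current = None


def page_outline(html: str):
    parser = OutlineParser()
    parser.feed(html)
    parser.close()
    return parser.outline


def missing_keyword(html: str, keyword: str):
    """Return the outline tags whose text does not mention the keyword."""
    keyword = keyword.lower()
    return [tag for tag, text in page_outline(html) if keyword not in text.lower()]
```

Not every heading needs the keyword, so read the result as a list of places to review rather than a list of errors.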
- Trust optimization: Lately the search engines have started to include trust in their algorithms. This concept is called TrustRank. Here are a few factors influencing TrustRank:
- Domain age
- Security service evaluation (Verisign Secured)
- Physical address and phone number on every page
- Reasonable security precautions taken across the site
- No links to sites that might have a low TrustRank