Common Technical Problems with SEO and How to Fix Them

Posted by Avasoft Team | Posted in Website Technical Issues | Posted on 24-09-2012

SEO is about much more than just utilizing the right keywords on your site. There are also a number of technical issues that must be corrected if you want your site to rank in the search engines. Here are 10 of the most common technical problems you may run across in the world of SEO:

  1. Query parameters – Database-driven eCommerce sites are especially prone to this problem, although it can pop up on any website. For example, if site users can filter products by color, you could end up with a URL like www.example.com/product-category?color=12. Once more than one query parameter appears in a URL, duplicate content becomes an issue, because several URL variations (the same parameters in different orders) all return the same page. Google will also only crawl a certain amount of your site, depending on your PageRank, so those parameter variations can use up that allowance and leave a lot of pages un-crawled. To fix the problem, first determine which keywords you want to target, then figure out which attributes users are searching for with those keywords. For example, they may be searching for a specific brand. Create a landing page with a URL that returns that brand's page without using a query string, and block the unwanted parameter-driven URL patterns in your robots.txt file. However, if your site has been around for a while, Google will already have indexed some of those parameter URLs, so robots.txt alone won't fix the problem. Add a rel=canonical tag to the URLs that don't need to be indexed, pointing the search engines at the URL you do want indexed (see the first sketch after this list).
  2. More than one homepage version – Although this issue most often affects .NET sites, it can occur on others as well. URLs like www.example.com/default.aspx or www.example.com/index.html are duplicate versions of your homepage as far as the search engines are concerned. To correct the problem, run a crawl on your site, export that crawl to a CSV file and then filter on the META column so that you can easily see every version of your home page. Then use 301 redirects on all of the duplicates to send users and link equity to the right version (see the redirect sketch after this list).
  3. Lowercase / uppercase URLs – Sites running on .NET often have this issue with their URLs. The server tends to be configured to serve URLs containing uppercase letters rather than redirecting them to the lowercase version, which leaves two copies of the same page. To fix this, use a URL rewrite rule, which will handle the redirect on IIS 7 servers (see the rewrite-rule sketch after this list).
  4. 302 redirects – This type of redirect is temporary, so search engines expect the old page to come back at some point and keep it in their index. A 301 redirect, on the other hand, is permanent, so link equity passes to the new page. IIS SEO Toolkit and Screaming Frog are both great crawling programs that will let you hunt down 302 redirects, which you can then change to 301s where the move really is permanent (a quick status-code check is sketched after this list).
  5. Soft 404 – A soft 404 shows users a page saying that what they asked for can't be found, but it sends a 200 code to the search engines. That code tells them the page is working properly, so the error page ends up being crawled and indexed. To locate soft 404s on your site, use Google Webmaster Tools; Web Sniffer is another helpful tool for checking the headers a page returns. Once you've found the error pages that return a 200 code, configure them to return a proper 404 code instead (a quick check is sketched after this list).
  6. Robots.txt file problems – Sometimes you place a command in your robots.txt file indicating that you want a certain page blocked, but it gets crawled anyway because the combination of commands you used doesn't do what you expected. Use the robots.txt testing feature in Google Webmaster Tools to see how Googlebot will interpret your file, then adjust it as needed (a common gotcha is sketched after this list).
  7. Outdated XML sitemaps – Sitemaps help search engines find the pages you want them to crawl, but many sites generate the map only once, so it doesn't take long for it to become outdated as new pages are added and old pages change. To fix the problem, use a tool like Map Broker to find the broken links in your map, then set the sitemap to regenerate on a regular schedule, however often you need it to (a minimal sitemap entry is shown after this list).
  8. Base64 URLs – Once in a while you might discover that Webmaster Tools is reporting numerous 404 codes, but when you visit the reported URLs you can't see anything wrong. If your site runs on Ruby on Rails, there's a chance the framework is generating authentication tokens to prevent cross-site request forgery, and when Google tries to crawl the URLs carrying those tokens it receives a 404. Because the tokens are generated on the fly and each one is unique, you won't be able to reproduce the 404s that Webmaster Tools is reporting. Adding a wildcard pattern to your robots.txt file should get Google to quit crawling the token URLs (see the robots.txt sketch after this list).
  9. Invisible characters in your robots.txt file – Occasionally you may see a “Syntax not understood” warning in Google Webmaster Tools even though the file looks fine when you open it. Pull the file up from the command line, though, and you'll find an invisible character, such as a byte order mark, that your editor didn't display. To correct the problem, rewrite your robots.txt file and check it from the command line again to make sure the character is gone (a quick byte-level check is sketched after this list).
  10. Servers that are misconfigured – Sometimes a site's homepage stops ranking in the search engines even though it ranked fine before. Browsers send an “Accept” header that lists the content types they can understand, and a misconfigured server may simply echo back a Content-Type that mirrors the first entry in that header. Because Googlebot's request headers differ from a browser's, it can end up receiving the wrong content type and failing to index the page. If this appears to be the problem, switch your user agent over to Googlebot and compare what the server returns (see the last sketch after this list).
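
For item 1, here is a minimal sketch of the two pieces described above, assuming a hypothetical ?color= parameter: a rel=canonical tag placed on the parameter-driven page, and a robots.txt pattern that keeps crawlers out of the parameter URLs.

    <!-- In the <head> of www.example.com/product-category?color=12 -->
    <link rel="canonical" href="http://www.example.com/product-category" />

    # robots.txt (Google supports the * wildcard in Disallow patterns)
    User-agent: *
    Disallow: /*?color=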
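
For item 2, this is one way the 301s might look on an Apache server, as a sketch using .htaccess and hypothetical filenames; an IIS site would use a URL Rewrite rule like the one shown for item 3 instead.

    # .htaccess sketch: permanently redirect duplicate homepage URLs to the root.
    # THE_REQUEST is matched so the internal DirectoryIndex lookup doesn't loop.
    RewriteEngine On
    RewriteCond %{THE_REQUEST} ^[A-Z]+\s/+(index\.html|default\.aspx)[\s?] [NC]
    RewriteRule ^ / [R=301,L]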
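
For item 3, assuming the IIS URL Rewrite module is installed, the lowercase redirect is typically a single rule in web.config, along these lines.

    <rewrite>
      <rules>
        <!-- 301 any URL containing an uppercase letter to its lowercase version -->
        <rule name="LowercaseUrls" stopProcessing="true">
          <match url=".*[A-Z].*" ignoreCase="false" />
          <action type="Redirect" url="{ToLower:{URL}}" redirectType="Permanent" />
        </rule>
      </rules>
    </rewrite>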
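
For item 4, a crawler will find 302s at scale, but you can spot-check a single URL with a few lines of Python (a sketch using the third-party requests library and a hypothetical URL).

    import requests

    # Fetch without following redirects so the original status code is visible.
    response = requests.get("http://www.example.com/old-page",
                            allow_redirects=False, timeout=10)
    if response.status_code == 302:
        print("Temporary redirect to", response.headers.get("Location"),
              "- consider changing it to a 301")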
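
For item 5, the quickest check is to request a URL that cannot possibly exist and look at the status code the server sends back (again a sketch with the requests library; Web Sniffer or Google Webmaster Tools will show the same thing).

    import requests

    # A made-up URL should come back as 404; a 200 here points to soft-404 handling.
    response = requests.get("http://www.example.com/no-such-page-123456", timeout=10)
    print(response.status_code)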
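
For item 6, a common example of a rule combination that doesn't behave as intended: Googlebot only obeys the most specific user-agent group that matches it, so in the file below it ignores the general Disallow entirely.

    User-agent: *
    Disallow: /private/

    # Googlebot matches this group instead, so /private/ is NOT blocked for it.
    User-agent: Googlebot
    Disallow: /checkout/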
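
For item 7, a sitemap entry is small enough to regenerate automatically whenever content changes; the lastmod value is what goes stale first when the file is only built once (a minimal example with a hypothetical URL).

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.example.com/product-category</loc>
        <lastmod>2012-09-24</lastmod>
      </url>
    </urlset>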
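
For item 8, robots.txt does not support full regular expressions, but Googlebot does honour the * wildcard, which is usually enough to exclude token-carrying URLs. The sketch assumes the token shows up as a hypothetical authenticity_token query parameter.

    User-agent: *
    # Keep crawlers away from any URL carrying a Rails authenticity token.
    Disallow: /*authenticity_token=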
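
For item 9, the invisible character is often a byte order mark at the very start of the file; a few lines of Python will reveal it (a sketch that assumes robots.txt sits in the current directory).

    # Read the raw bytes so nothing is hidden by the text editor.
    with open("robots.txt", "rb") as f:
        head = f.read(8)

    if head.startswith(b"\xef\xbb\xbf"):
        print("UTF-8 byte order mark found - rewrite the file without it")
    else:
        print("First bytes:", head)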
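
For item 10, you can imitate Googlebot's request and compare the Content-Type the server returns against what your browser receives (a sketch with the requests library; the exact headers Googlebot sends may differ).

    import requests

    headers = {
        "User-Agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
        "Accept": "text/html,application/xhtml+xml",
    }
    response = requests.get("http://www.example.com/", headers=headers, timeout=10)
    print(response.status_code, response.headers.get("Content-Type"))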
