5 Magento SEO Questions
Session IDs are very common with Magento and can cause severe duplicate content issues, as there's no limit to the number of duplicated URLs that can be generated. Session IDs are used to track a user's session and are usually generated via the checkout, to track which items have been added to the cart. The session ID is appended to the end of the URL, for example: domain.com/page?__SID=df23n54jtklg.
The best way to deal with this is to properly fix the issue at the source, which will require development resources. If you're not in a position to fix it, I would recommend blocking the URLs via the robots.txt file or assigning meta robots rules (noindex, follow). I would also recommend providing instructions to Google via the parameter handling options in Google Webmaster Tools.
You should also block your checkout pages so that crawlers won't access pages with the session ID appended to the end of the URL.
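As a rough sketch (assuming __SID is the only session parameter in play and that /checkout/ is your checkout path), the robots.txt rules might look something like this:

User-agent: *
Disallow: /*?__SID=
Disallow: /*&__SID=
Disallow: /checkout/

Wildcard support varies between crawlers, so it's worth testing rules like these in Google Webmaster Tools before relying on them.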
In the SEO configuration settings (catalog > search engine optimisation) you can choose whether to include category paths in product URLs – this is the main thing you need to consider for product URLs. I would recommend using top-level product URLs (e.g. domain.com/product.html rather than domain.com/category/sub-cat/product.html), as this will prevent the product being duplicated if it's featured in more than one category.
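If any category-path versions of a product page remain accessible (from old links or previously generated rewrites), a canonical tag pointing at the top-level URL will help consolidate them. A minimal sketch, assuming domain.com/product.html is the preferred version:

<link rel="canonical" href="http://domain.com/product.html" />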
I personally really like BazaarVoice, because it provides a lot of customisation options and some genuinely advanced functionality. That said, BazaarVoice is quite expensive and it's subscription-based.
The default Magento module is actually quite good, although you might want to use an additional module or work with your developer to get more from it.
BazaarVoice also has schema support, making it easier to get the ratings schema appearing on your product listings.
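To illustrate the kind of markup involved (this is a generic schema.org sketch with placeholder values, not BazaarVoice's exact output), product ratings are typically exposed as an AggregateRating on the Product:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Product",
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "87"
  }
}
</script>

Once Google has picked this up, the star rating can appear alongside your product listings in search results.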
Layered / faceted navigation is probably the most common Magento SEO issue and it’s also probably the most detrimental.
I would recommend instead using the noindex, follow meta robots tag, which tells search engines not to index the pages but to still follow the links within them. You should still have the canonical tag in place; however, I've seen very few cases where it's prevented over-indexation issues when it wasn't added from the start, and I've also seen plenty of cases where the canonical tag was implemented from day one of an ecommerce launch and the pages were still indexed, leading to duplicate content issues.
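A minimal sketch of the tag itself, which would be output in the head of filtered / layered navigation pages (usually via an SEO extension or template logic rather than by hand):

<meta name="robots" content="noindex, follow" />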
In the event that you're having issues with crawl budget, I would recommend using the robots.txt file to block these pages.
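As a sketch (the parameter names here are hypothetical and depend on your layered navigation attribute codes), the rules might look like:

User-agent: *
Disallow: /*?price=
Disallow: /*&price=
Disallow: /*?color=
Disallow: /*&color=

Bear in mind that robots.txt prevents crawling rather than indexing, so URLs that are already indexed or linked to externally may still appear in search results.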