Technical SEO is a foundational part of optimising websites for search engines. Fundamentally, pages should be crawlable and indexable (without errors) for the best chance to rank.
Elements of a standard technical SEO strategy include reviewing how a site is crawled and indexed, among the other areas covered below.
Some activities have a greater impact than others.
Content, both on-page and off-page, has the greatest impact when published on a site with good technical SEO; a sound technical foundation amplifies that content's visibility in search engines.
What is technical SEO?
Technical SEO is search engine optimisation that focuses on ensuring your website’s pages can be crawled, understood, and indexed well, with the ultimate aim of increasing visibility and rankings.
What does crawling a website mean?
Like a spider crawls a web, search engine bots crawl websites. The best known is Googlebot, which crawls pages so they can be indexed on Google.
Search engine bots like Googlebot read the content of pages – including specialisms, meet-the-team consultant profile pages, job pages and media hub pages – and use the links within them to find more pages. There are a number of ways you can control what gets crawled on your site.
One of the main ways to signpost to search engines where they can and can't go on your website is through a robots.txt file. A reference to your sitemap location should be included here too, which all our recruitment sites have as part of their pre-go-live tech checks at set-up.
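As an illustration, a minimal robots.txt file that blocks crawling of a private area and references the sitemap might look like this (the domain and the /dashboard/ path are placeholders, not a real configuration from our sites):

```
# Hypothetical robots.txt for a recruitment site
User-agent: *
Disallow: /dashboard/

Sitemap: https://www.example.com/sitemap.xml
```

The Sitemap line can sit anywhere in the file and accepts a full URL, so it works regardless of which user agent the preceding rules target.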
However, it's important to note that Google crawls a site at its discretion, and may still index a page that is disallowed in your robots.txt file if other pages link to it.
A stricter way of controlling the behaviour of crawl bots is at server level (for example, a .htaccess file). This is an advanced process.
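On an Apache server, for example, a .htaccess rule using mod_rewrite can refuse requests from a crawler outright rather than merely asking it to stay away ("BadBot" is a hypothetical user-agent string used for illustration):

```
# Return 403 Forbidden to any request whose user agent contains "BadBot"
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule .* - [F,L]
```

Unlike robots.txt, this is enforced by the server itself, so the crawler cannot ignore it – which is also why mistakes here can accidentally block legitimate traffic.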
Many crawlers support a crawl-delay directive in the robots.txt file, which allows you to set how frequently they can request pages. Google doesn't respect this directive, though you can file a special request to reduce the crawl rate in Search Console.
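For crawlers that do honour it (Bingbot, for instance), the directive is a single line in robots.txt; the 10-second value below is an arbitrary example:

```
# Ask supporting crawlers to wait 10 seconds between requests (ignored by Googlebot)
User-agent: Bingbot
Crawl-delay: 10
```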
Access restrictions also affect your site's crawlability. Whether it's a login system, HTTP authentication or IP whitelisting, you can allow a group of users to access certain pages while search engines cannot reach them and therefore won't index them. Examples include 401 status codes, which have been applied to dashboard pages on our recruitment sites.
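HTTP basic authentication is one such restriction. On Apache it can be configured in a .htaccess file like the sketch below (the .htpasswd path is a placeholder); unauthenticated requests, including those from search engine bots, receive a 401 response:

```
# Require a valid username/password; everyone else gets a 401
AuthType Basic
AuthName "Restricted area"
AuthUserFile /path/to/.htpasswd
Require valid-user
```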
You can see Google’s crawl activity for your site in the ‘Crawl stats’ report in Search Console. This will show you more information on how and what Google is crawling.
A more advanced form of analysing the crawl activity on your website can be done through accessing server logs and using data analysis tools.
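As a simple sketch of what server-log analysis involves, the following Python snippet counts which URLs a given bot has requested, based on the user-agent string in standard combined-format access log lines (the log lines and IP addresses here are made up for illustration):

```python
import re
from collections import Counter

# Hypothetical access log lines in Apache/Nginx combined format.
log_lines = [
    '66.249.66.1 - - [10/May/2024:10:00:00 +0000] "GET /jobs/ HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [10/May/2024:10:00:05 +0000] "GET /media-hub/ HTTP/1.1" 404 310 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [10/May/2024:10:00:09 +0000] "GET /jobs/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

def crawled_urls(lines, bot="Googlebot"):
    """Count URLs requested by lines whose user agent mentions the given bot."""
    request_pattern = re.compile(r'"(?:GET|POST) (\S+) HTTP/[\d.]+"')
    counts = Counter()
    for line in lines:
        if bot in line:
            match = request_pattern.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts

print(crawled_urls(log_lines))
```

In practice you'd feed in real log files and cross-check response codes too – a bot repeatedly hitting 404s, as in the second line above, is exactly the kind of pattern log analysis surfaces.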
After pages are crawled, they’re rendered and then sent to the index. The index is a master list of pages, like a database, that can be returned as results from search queries.
How does indexing work?
There are various ways to signpost to Google and other search engines that you’d like a page to be indexed.
Robots directives give guidance to search engines on how to crawl or index a particular page. The robots meta tag is an HTML snippet added to the <head> section of a page.
By default, if a <meta name="robots" content="noindex" /> tag is omitted, a crawler will attempt to index the page. Adding the nofollow value tells search engines not to follow the links on the page as well. A full list is available in Google's robots meta tag specifications.
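Put together, a page you want kept out of the index and whose links you don't want followed would carry a tag like this in its <head>:

```html
<meta name="robots" content="noindex, nofollow">
```

Multiple directives are comma-separated in a single content attribute, and the tag can also target a specific crawler by using its name (e.g. googlebot) instead of robots.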
Canonicalisation signposts to search engines that one specific URL is the version that should be shown in their search results. Where there can be multiple versions of the same type of ...