Googlebot is not a person; it is a robot that crawls the web via links. It finds and reads content, analyses it in many different ways and, if it deems it relevant, adds it to its index.
Google’s sole aim is to provide an answer to a query that is so relevant to the intent of the user that there is no need to go anywhere else. More often than not these days, the searcher doesn’t even have to click through to the web page as the answer is presented in the form of a featured snippet or in the Knowledge Graph.
What is a crawl budget?
Crawl budget is the number of pages that Google will crawl on your site in any one day. It depends on factors that include the number of links to your site (how popular you are), the size of your site and generally how healthy it is. As detailed in the sitemap blog, if Google is trying to crawl your site and it has little or irrelevant content on it, it’s highly likely that Googlebot will get bored and won’t stick around to crawl the really useful content you would like to rank.
How to optimise your crawl budget
This isn’t necessarily going to be an easy job, but it’s worth doing for the results. First of all, you need to reduce as many errors on your site as possible. Using a combination of your server logs, Search Console and tools such as Screaming Frog, make sure that all the pages you want to be crawled return codes such as 200 (this page is OK) or 301 (this page is OK but it has moved permanently; please go there instead).
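If you already have a list of URLs to hand (for example, exported from your sitemap or a Screaming Frog crawl), a quick script can flag the status codes before you dig into Search Console. Below is a minimal sketch in Python using the requests library; the URLs shown are hypothetical placeholders, and you would swap in your own list.

```python
# Minimal sketch: report the HTTP status code for each URL in a list.
# The URLs below are placeholders for illustration only.
import requests

urls = [
    "https://www.example.com/",
    "https://www.example.com/old-page",
]

for url in urls:
    # allow_redirects=False so a 301 is reported as a 301,
    # rather than being silently followed to its destination.
    response = requests.head(url, allow_redirects=False, timeout=10)
    print(f"{response.status_code}  {url}")
```

Anything returning a 404 or a 5xx code in a run like this is a candidate for fixing or redirecting, so that Googlebot isn’t wasting its visit on dead ends.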
Secondly, make sure that any pages that may be eating into your crawl budget but are of no use to your visitors or to search engines are blocked from crawling. This frees Googlebot up to spend your allocated crawl budget on the other, more important areas of your site.
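The standard way to block crawling is with a robots.txt file at the root of your domain. Here is an illustrative sketch; the paths and sitemap URL are hypothetical, and you would replace them with the thin or low-value sections of your own site.

```
# Hypothetical example: keep crawlers out of low-value sections
User-agent: *
Disallow: /internal-search/
Disallow: /print-versions/

# Point crawlers at the pages you do want crawled
Sitemap: https://www.example.com/sitemap.xml
```

One caveat worth knowing: robots.txt stops crawling, not indexing, so it’s a tool for steering your crawl budget rather than for removing pages from Google’s index.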
For more information on how crawling and indexing work, please come back to the blog soon. In the meantime, please get in touch if you have any questions or if you would like me to have a look at your site.