Although its introduction was never officially announced, Google’s webmaster documentation confirms that Googlebot has a 15MB limit when crawling websites. This limit is in place to prevent Googlebot from overloading websites with too much traffic and consuming too much of their bandwidth. While this can be helpful for site performance, the limit can have a negative impact on some websites’ SEO. Here, we explain what Googlebot is and what its crawl limit means for websites.
What is Googlebot?
Googlebot is the web crawler used by Google to index and rank websites in its search results. Its function is to crawl as many web pages as possible and gather information about their content, structure and links. This information is then used by Google’s search algorithms to determine which pages should be included in search results and in what order they should be ranked.
For several years now, Googlebot has had a maximum crawl limit of 15MB. This is the maximum amount of content that Googlebot will download from any single file it fetches during a crawl. The search engine’s intention is to prevent Googlebot from putting too much stress on a site’s server or swallowing up too much of its bandwidth.
It is important to note that the 15MB limit applies to each file individually: Googlebot will download up to 15MB of a page’s HTML, while resources such as images, CSS and JavaScript are fetched separately, each subject to its own limit. The limit does not restrict the number of pages Googlebot will crawl or how frequently it will crawl them. Google will continue to crawl a website as often as necessary to keep its index up to date.
How does the 15MB limit affect SEO?
When Googlebot crawls a website, it first downloads the page’s HTML and then follows any links on the page to other pages on the site. As it downloads, it keeps track of how much data it has retrieved. Once the 15MB limit is reached, Googlebot stops downloading, and only the content within those first 15MB is passed on for indexing.
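The cut-off behaviour can be illustrated with a minimal sketch. The 15MB figure comes from Google’s documentation, but the function and its name are hypothetical, purely for illustration – this is not Googlebot’s actual code:

```python
# Hypothetical illustration of a fetch-size cap like Googlebot's.
CRAWL_LIMIT = 15 * 1024 * 1024  # the documented 15MB figure, in bytes

def visible_to_crawler(raw_html: bytes, limit: int = CRAWL_LIMIT) -> bytes:
    """Return only the bytes a size-capped crawler would ever see."""
    return raw_html[:limit]

# A 20MB page loses its final 5MB: that content is never indexed.
page = b"x" * (20 * 1024 * 1024)
seen = visible_to_crawler(page)
print(len(seen))  # 15728640, i.e. exactly 15MB
```

Everything after the first 15MB simply never reaches the indexing stage, which is why page size matters.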
From an SEO perspective, the 15MB crawl limit can have a significant impact on a website’s search engine visibility. If a website has a page with more than 15MB of content, Googlebot may be unable to crawl the entire page. As a result, any content that is missed out will remain unindexed by Google.
If it is not indexed, Google will not know the content is there. If someone searches for that content, the page it is on will not be considered for ranking by Google’s algorithm and will not appear in search results. In effect, the website could experience a decrease in search engine visibility and a drop in organic traffic.
How to avoid being affected
To have an entire page and all of its content indexed, website owners need to keep each page’s HTML smaller than 15MB. Cutting content simply to make the page shorter is not the ideal solution, nor is it Google’s intention – unless, of course, there is so much information on one page that it would be better divided into smaller, more readable chunks.
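As a quick sanity check, a page’s HTML size can be measured and compared against the limit. The sketch below uses only Python’s standard library; the function names are our own invention, and it fetches the raw HTML only, since images, CSS and JavaScript are fetched (and capped) separately:

```python
import urllib.request

CRAWL_LIMIT = 15 * 1024 * 1024  # Googlebot's documented 15MB cap

def within_crawl_limit(n_bytes: int, limit: int = CRAWL_LIMIT) -> bool:
    """Pure check: does a payload of n_bytes fit under the limit?"""
    return n_bytes <= limit

def check_page(url: str) -> tuple[int, bool]:
    """Hypothetical helper: download a page's HTML and report
    (size_in_bytes, fits_under_limit)."""
    with urllib.request.urlopen(url) as resp:
        size = len(resp.read())
    return size, within_crawl_limit(size)

# Example (requires network access):
# size, ok = check_page("https://example.com/")
# print(f"{size} bytes - {'OK' if ok else 'over the 15MB limit'}")
```

In practice, most HTML files are well under this threshold, so a check like this is mainly useful for unusually heavy pages, such as those with large amounts of inlined data.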
Website owners should also ensure that their internal linking structure is properly optimised. Internal links are important because they help Googlebot navigate a website and understand the relationship between pages. They also enable other pages to be found and indexed. When internal links are organised in a clear and logical way, Googlebot can crawl a site more effectively and index all of its content. It is important to remember that if a page is more than 15MB in size, a link that appears after the cut-off point, towards the bottom of the page, will not be seen or followed. If that is the only link on the site to the page it points to, then that page is unlikely to be indexed at all.
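To see why a late link can be missed, consider this sketch: it parses only the first N bytes of a page’s HTML, just as a size-capped crawler would, and collects the links it finds. The class and function names are hypothetical, and the byte limit is shrunk for demonstration:

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects href values from <a> tags as the HTML is parsed."""
    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def links_seen_by_crawler(html: str, byte_limit: int) -> list[str]:
    """Parse only the first byte_limit bytes, like a size-capped crawler."""
    truncated = html.encode("utf-8")[:byte_limit].decode("utf-8", "ignore")
    parser = LinkCollector()
    parser.feed(truncated)
    return parser.links

# A link placed after heavy content never reaches the crawler.
page = '<a href="/early">top</a>' + "x" * 100 + '<a href="/late">bottom</a>'
print(links_seen_by_crawler(page, byte_limit=50))      # only /early survives
print(links_seen_by_crawler(page, byte_limit=10_000))  # both links seen
```

The practical takeaway is the same as in the paragraph above: place important links early in the page, and make sure no page is reachable only through a link buried near the bottom of a very large page.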
Googlebot is an important tool used by Google to index and rank websites in its search results. The 15MB crawl limit can affect a website’s search engine visibility if a page’s content goes beyond that limit. To prevent this from happening, website owners should keep each page’s HTML under 15MB and ensure that their internal linking structure is clear and well organised.