How to check that your site is properly indexed by Google?
There are currently more than 1.6 billion websites in the world, a number that keeps growing, with over 2 million blog posts published every day. With such numbers, reaching the first page of Google results is no easy goal. But should you give up when you put your website online? Absolutely not! The first thing to do is to make sure that your website appears in the results at all. In technical terms, this means checking that it has been properly indexed by search engines. This step is usually handled automatically by robots and you often don't have to do anything. However, issues sometimes arise, such as the indexing of unnecessary or outdated pages, which can hurt your performance. In addition to showing you how to perform this check, I give you some good practices to follow.
What does indexing mean?
Indexing is the process by which a search engine program, or crawler, visits a website, browses its pages and stores its content. So when we say that your site has been indexed, it means that the search engine's robots have visited it and copied its content into the search engine's servers.
Note that the term "indexing" as I use it here refers to Google indexing with its Googlebot crawler, since Google is the most widely used search engine, with a worldwide market share of 90.6%. However, keep in mind that the principle is quite similar for other search engines such as Yahoo, Bing and many others.
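Before worrying about rankings, you can run a quick programmatic check of whether a page is even eligible for indexing. The sketch below is my own illustration, not an official Google tool: it looks for the two common "noindex" blockers, the `X-Robots-Tag` HTTP header and the robots meta tag.

```python
import re

def indexability_blockers(html, headers):
    """Return a list of reasons a page may be excluded from Google's index.

    html    -- the page's HTML source as a string
    headers -- a dict of the HTTP response headers
    """
    blockers = []
    # An X-Robots-Tag header containing "noindex" tells crawlers to skip the page.
    robots_header = headers.get("X-Robots-Tag", "")
    if "noindex" in robots_header.lower():
        blockers.append("X-Robots-Tag header contains 'noindex'")
    # A <meta name="robots" content="noindex"> tag does the same in the HTML.
    for match in re.finditer(
        r'<meta[^>]+name=["\']robots["\'][^>]*>', html, re.IGNORECASE
    ):
        if "noindex" in match.group(0).lower():
            blockers.append("meta robots tag contains 'noindex'")
    return blockers

# Example: a page whose CMS has marked it as hidden from search engines.
page = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
print(indexability_blockers(page, {"X-Robots-Tag": "all"}))
```

An empty result does not guarantee indexing, but any blocker it reports will reliably keep a page out of the index.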
How the Google index works
The main objective of the crawl, or scanning, by the Google robot is to understand the nature, content and quality of a website. This then allows it to associate each page with one or more search intents.
But when it comes to ranking pages for search queries or intents, note that this depends on a large number of criteria defined by Google's algorithms.
In addition, there are two other types of crawling robots besides search engine robots. You do not necessarily need to pay attention to these, since the first kind is already enough to help you improve your visibility. These are:
- Web service robots: they crawl a particular type of page in order to extract specific data, for example backlinks for majesticseo.
- Hacking robots: they test the websites they browse for security vulnerabilities and try to break into them.
The size of the Google index and the crawl budget
The size of the Google index
In 2016, Google announced that its index had passed the 130 trillion URL mark. This means the firm has indexed that many web pages, and although there are no statistics for this year, it is easy to guess that this figure has since been greatly exceeded.
It is important to note that Google only indexes "indexable" pages, not the entire web, which is far too large. Moreover, with such impressive statistics, the firm has to manage a daily budget dedicated to crawling new pages: the crawl budget.
The crawl budget of a website
The crawl budget designates the number of pages Google will crawl per day on a given site, and it depends on the site's importance. Knowing this budget is very useful for positioning the most relevant pages of your website.
For this, you should use a very powerful tool developed by the firm: Google Search Console. If you don't know how to set up Search Console, you can read this article, which also introduces another powerful tool: Google Analytics.
Once your account is configured, simply open the "Crawl" tab to view the data. Keep in mind that the crawl budget changes constantly, but you can still work out an average that will be useful in your analyses.
In addition, there are two types of crawl involved in Google's indexing of a website, namely:
- The light crawl: performed daily, it only covers the most important pages, such as the homepage. Its check is superficial, which is why it is known as the light version;
- The deep crawl: performed roughly every month, it is allocated a much larger budget and also takes new pages into account, making it the heavy version.
Apart from these types of crawls, it is important to know that Google has preferences when it comes to indexing web pages.
Identify the sources of 404 errors
Sometimes pages, or media such as videos and images, remain in the Google index even after you have deleted them from your website. These pages or media are likely to generate 404 errors, which you should resolve quickly.
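One practical way to find where those 404 errors come from is to scan your server's access log for 404 responses and note which referring pages still link to the dead URLs. Here is a minimal sketch, assuming a standard combined log format (the function name and the sample lines are my own illustration):

```python
import re

# Combined log format: ... "GET /path HTTP/1.1" status size "referer" ...
LOG_LINE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)

def find_404_sources(log_lines):
    """Map each URL that returned a 404 to the referring pages linking to it."""
    sources = {}
    for line in log_lines:
        m = LOG_LINE.search(line)
        if m and m.group("status") == "404":
            sources.setdefault(m.group("path"), set()).add(m.group("referer"))
    return sources

sample = [
    '1.2.3.4 - - [10/Oct/2023:13:55:36 +0000] "GET /old-page HTTP/1.1" 404 162 "https://example.com/blog" "Mozilla/5.0"',
    '1.2.3.4 - - [10/Oct/2023:13:55:40 +0000] "GET /about HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]
print(find_404_sources(sample))
```

Each entry in the result tells you both which deleted URL is still being requested and which page to fix or redirect.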
Good content structure
This factor allows both your visitors and the crawlers to navigate your website smoothly and logically. So build your structure carefully, making sure to highlight your most relevant pages and recent articles in order to lighten the robots' task.
Impeccable internal linking
Your content must be accessible in order to be indexed. Pages whose links do not appear directly on your homepage are likely to be indexed more slowly by the Google robot.
Indeed, Google can sometimes take a long time to pick up your various changes. By improving your internal linking, you can easily guide it toward them, and in this case I recommend:
- To set up a system of related articles or associated products;
- To create a “Site map” page which lists all your content;
- To create a module showing the most commented posts or featured articles, etc.
Also, don't forget to add links to your old articles from your new ones, especially when it is relevant to do so. This will bring them back to life, especially if they are buried deep in your website.
Creating a sitemap file
This file gathers all the URLs of your website, and by submitting it to Google you make it much easier for Google to index all your pages, since it will know every one of them and can access them automatically.
Note, however, that this file does not replace a good structure and good internal linking. On the other hand, combining all three is likely to be very effective, especially when you have a large number of web pages.
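A sitemap is a plain XML document following the sitemaps.org protocol. A minimal example looks like this (the domain and paths are placeholders for your own URLs):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2023-10-10</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/blog/my-article/</loc>
    <lastmod>2023-09-28</lastmod>
  </url>
</urlset>
```

You can then submit the file in Google Search Console and reference it from your robots.txt with a `Sitemap:` line, so crawlers find it on their own.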
Good external links
Even if you have followed the previous practices, I invite you to add this one. It consists of getting links created to the pages of your website in order to benefit from:
- Better indexing;
- Traffic, and therefore customers.
Some factors that can block the indexing of your website
Even if you have made all the necessary configurations, your website may still not be indexed. Indeed, some settings can prevent search engines from indexing your site or some of its pages.
The WordPress search engine visibility option
Check that you are not preventing search engines from indexing your website through the WordPress options. To do this, log in to your dashboard and click on the "Settings" section in the left sidebar.
Then select the "Reading" option and check whether the "Discourage search engines from indexing this site" box is ticked. If it is, untick it to allow indexing.
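When that option is active, WordPress injects a robots meta tag into every page, which is exactly the kind of blocker crawlers respect. The tag looks roughly like this (the exact attribute values vary between WordPress versions):

```html
<!-- Inserted by WordPress when the search engine visibility
     option is enabled in the reading settings -->
<meta name="robots" content="noindex,nofollow">
```

If you see a tag like this in your page source after unticking the box, clear your cache and check again.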
A website under maintenance
Some plugins, such as "coming soon" plugins, offer options that prevent your website from being indexed by search engines while it is under maintenance. As long as maintenance continues, indexing cannot happen.
So if you use such plugins, do not hesitate to deactivate them or configure them correctly once the site is ready.
Other factors that prevent indexation
You may also have modified certain files in a way that blocks robots, for example:
- a robots.txt file that blocks crawler access;
- an .htaccess file that denies crawler access.
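For instance, a robots.txt at the root of your site can shut out every crawler with just two lines, while a more typical configuration only protects private areas. Both variants are shown below (the paths and domain are placeholders):

```txt
# This configuration blocks ALL crawlers from the entire site --
# remove or relax it if you want the site indexed:
User-agent: *
Disallow: /

# A more typical configuration only blocks a private area and
# points crawlers to the sitemap:
User-agent: *
Disallow: /wp-admin/
Sitemap: https://www.example.com/sitemap.xml
```

Checking `https://yourdomain.com/robots.txt` in a browser is a quick way to see which rules are actually live.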
If these files are intact, your site should have no difficulty being indexed, and you just have to wait a few more days for all the pages of your website to be correctly indexed by the search engines.
When you create a website to gain more visibility, it is very important that it appears in the first results of the search engines. But for that, it first has to appear in the results at all. This is what indexing ensures, and it is handled automatically by the search engines' robots. However, it is very useful to check it frequently, especially to spot possible problems or to make sure that only the most relevant pages of your website have been indexed. And if your ambition is to reach the first results, you should not neglect anything.