Why is it Important to Regularly Check the Page Indexing Report?
Checking the Page indexing report in Google Search Console every week can reveal problems, such as soft 404s and spider traps, that you would otherwise miss.
Tricia: I got the “Not found (404)” in Google Search Console. Another thing it says is “Soft 404.” And that one has a ton, but I noticed that if I go in and look at one, it’s not actually going to a 404. It’s going someplace else because it’s a page with a lot of coding or something after it.
David: So, the soft 404 is another category of page. And you’re looking at a report that’s called “Why pages aren’t indexed,” right? So, Google will not index a 404 page. The weekly report you received gets its numbers from the 404 page, because it’s looking for the words “page not found” in your title tags and telling you how many page views were of “page not found” pages. It’s important to look at Search Console on a weekly basis, and this is one of the places you’re going to want to look at every week. But soft 404 means, basically, the page loads, but there’s not enough content on it for Google to care about it. A soft 404 is Google telling you, “We think this page shouldn’t exist, so we’re not going to put it in the index.” So, if it’s a page with a lot of code, that’s almost a structural page or a template or something like that. Sometimes those end up in there. Yeah, good. Thank you for not indexing that page, Google. Now, we don’t want to take it out. We want to make it private or draft or whatever. But God forbid one of your soft 404 pages is a service page or a product you’re trying to offer. That’s terrible. That means Google says there’s not enough on this page for Google to even care about it.
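As a rough illustration of that last check, here is a minimal Python sketch that compares a list of soft 404 URLs against the pages you actually want indexed. It assumes you have exported the URLs from Search Console yourself; the example.com URLs, the `important_paths` list, and the function name are all hypothetical.

```python
from urllib.parse import urlsplit


def flag_important_soft_404s(soft_404_urls, important_paths):
    """Return the soft 404 URLs whose path is one we want indexed."""
    # Normalize trailing slashes so "/classes" and "/classes/" compare equal.
    wanted = {p.rstrip("/") or "/" for p in important_paths}
    flagged = []
    for url in soft_404_urls:
        path = urlsplit(url).path.rstrip("/") or "/"
        if path in wanted:
            flagged.append(url)
    return flagged


# Hypothetical data: URLs copied from the "Soft 404" list in Search Console.
soft_404_urls = [
    "https://example.com/events/?display_date=2019-04-03",
    "https://example.com/services/personal-training/",
    "https://example.com/template-part/footer/",
]
# Hypothetical list of pages the business wants Google to index.
important_paths = ["/services/personal-training", "/classes"]

flagged = flag_important_soft_404s(soft_404_urls, important_paths)
print(flagged)
```

Anything this prints is a page worth fixing first. The template and calendar URLs can usually wait.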
Tricia: One place where I see it is, as I said, they’re a gym, which means that they have classes. And so, they had their schedule. So, when I look at this, it’s got the page and then the post. And it says, “Display Event Date,” 2019, 2018. So, when you click on the actual thing to get there, it says “Events for,” and then it says “No event scheduled for April 3, 2019.” So, is that something that people could have been clicking on back in April 2019?
David: Well, remember what Search Console is doing. It’s not people. It’s Googlebot. So, this is why looking at this report on a weekly basis is super important, because what you’re discovering is that the way the calendar widget is set up for them, it will literally go on forever. You might be able to reach May 1, 3075, if Googlebot keeps crawling far enough. You might be able to go back to 1865 if Googlebot can crawl back far enough. So, what you have is what we call a spider trap, meaning Googlebot can keep going into stuff that… Don’t go there, Googlebot. Don’t look at 1865. We know we did not have a class on November 6, 1865. Right? And we probably won’t have a class in the year 3000 either. So sometimes these widgets, especially calendar widgets, can accidentally and automatically create pages that Googlebot will start following and keep following until, well, frankly, your website crashes because the number is too big for WordPress to handle. So, that’s telling you there’s a spider trap on your website that you need to fix to prevent Google from crawling indefinitely through all time and space. Because when Google visits your website, it doesn’t crawl every page of your site. It only crawls a few of the pages of your site. So, if it’s spending time crawling to the year 3000, that’s 1,000 pages it’s visiting that you don’t want Google to see. Meanwhile, it’s not even looking at the pages you want it to see. So, it’s worth investigating how you can prevent this event widget from creating pages too far in the future. Or maybe you want to tell Google, don’t bother to look at the events. Humans will still be able to look at them. Maybe you block it in robots.txt or something like that, discouraging Google from going that far. There are strategies to do that, because you very well could have a spider trap. We call it that because the Google spider gets caught in it while crawling the site.
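One of the strategies David mentions is telling Google not to bother with the event pages, which is typically done with a robots.txt rule. The sketch below uses Python’s standard `urllib.robotparser` to check what a draft rule would block before you publish it. The `/events/` path and the example.com URLs are assumptions for illustration; the real calendar URLs on the site may look different. Python’s parser only does simple prefix matching, so treat this as a sanity check, not as a guarantee of how Googlebot will interpret the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft robots.txt: keep all crawlers out of the calendar pages.
robots_txt = """\
User-agent: *
Disallow: /events/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the draft rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: two calendar pages and one page we do want crawled.
for url in [
    "https://example.com/events/2019-04-03/",
    "https://example.com/events/3075-05-01/",
    "https://example.com/classes/",
]:
    print(url, "crawlable" if is_crawlable(url) else "blocked")
```

Blocking the calendar this way keeps the crawler out of the trap while leaving the pages available to human visitors.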
Tricia: Okay, well, that’s really good information. Because I will tell you just by looking at this, if I just go to soft 404s, it’s 2,205 pages. So that is a big spider trap.
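With that many URLs, it can help to count how many of them come from the calendar and which years they point to. This is a small sketch, assuming the exported URLs contain a date in `YYYY-MM-DD` form; that URL shape, the sample URLs, and the year window are all hypothetical.

```python
import re
from collections import Counter

# Assumes dates appear in the URL as YYYY-MM-DD (a hypothetical URL shape).
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def years_in_urls(urls):
    """Count how many URLs mention each year."""
    counts = Counter()
    for url in urls:
        match = DATE_PATTERN.search(url)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def outside_window(counts, first_year, last_year):
    """Keep only the years outside the range where real events could exist."""
    return {
        year: n
        for year, n in counts.items()
        if year < first_year or year > last_year
    }


# Hypothetical sample of exported soft 404 URLs.
urls = [
    "https://example.com/events/2019-04-03/",
    "https://example.com/events/2018-11-06/",
    "https://example.com/events/3075-05-01/",
    "https://example.com/events/1865-11-06/",
    "https://example.com/classes/yoga/",
]

counts = years_in_urls(urls)
suspicious = outside_window(counts, first_year=2018, last_year=2030)
print(suspicious)
```

Years far outside the window where the gym could plausibly have had classes are a strong hint that the widget is generating pages on its own.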
David: Yeah. Well, if you think about it, it’s probably not even telling you all of them.
Tricia: Yeah, because Google just stopped at one point and said…
David: I’m glad you’re bringing this up because, really, you should look at the page indexing report once a week. The goal of looking at this report once a week is to work your way through it a little bit at a time. Eventually, you’ll get to the point where, okay, I know it’s doing this, I know it’s doing that. Okay. That’s fine. And now I fixed everything. And I’m not surprised by what I find in there.
Tricia: Going down this, if there is a page with a redirect, is that something I need to be concerned about? Or they’ve got it on redirect… I’m just kind of looking at the page index.
David: Yeah. So, there are a lot of little pieces of information here. If you click on them specifically, you can get to Google’s documentation describing what they are. Sometimes these are things that you should be aware of, but frankly, if it’s a redirect, it’s telling you the page is redirected, so Google has not put it in the index. Which is what you want it to do. But you might check to make sure a page hasn’t been redirected that you don’t want redirected. Or “Excluded by ‘noindex’ tag.” Good. I don’t want you on that page, Google.
Tricia: There are some of those. Those are good.
David: But make sure that you’re not “noindexing” pages that you want Google to see. So go through those, click on each of them in Search Console, read up on them, and ask yourself, is this what I want Google to do with pages like that? And if it’s not, then I need to take other actions.
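To spot-check a page for an accidental noindex, you can look for a robots meta tag in its HTML. The following is a minimal sketch using Python’s standard `html.parser`; it works on an HTML string you supply, and the class and function names are made up for this example. It does not look at the `X-Robots-Tag` HTTP header, which can also carry a noindex directive.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> or <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr_map.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """Return True if the page's meta tags ask search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in parser.directives or "none" in parser.directives


# Hypothetical page sources.
blocked_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
normal_page = '<html><head><meta name="description" content="Gym classes"></head></html>'

print(has_noindex(blocked_page))
print(has_noindex(normal_page))
```

If this returns True for a service page you want in the index, that is one of the cases where you need to take other actions.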
Tricia: And I think for this client, I’m probably going to go through and look at some of the 404s and give them my opinion on those. And then send them the page from Google Search Console and tell them to go through the rest of it, because this is a little bit out of my scope as far as what I’m doing for them. As for all the other stuff, I feel like they had somebody update their website, and that person needs to go back and finish fixing things.
David: Yeah. You’re bringing up a really good point, which is scope.
Tricia: Yeah. And that’s like, here are two big problems I see. You all work and figure it out because their website in and of itself is really not my specific responsibility. I want to learn and know this so that I know these are the problems. Okay, client, you work with whoever did these changes because I believe there were rather recent changes to the site.
David: That’s probably why you saw the increase in errors, and the whole purpose of this is to catch that so you can delve in. Remember, this is not a slight to developers, but the developer’s job is to make it work. When they removed the page, they did what the client wanted, and they walked away. If we’re worried from an SEO perspective, we need to fix that by adding a redirect or taking some other action. If you’re not being paid to do SEO for them, well, maybe this is an opportunity to say, by the way, your developer is doing their job. They did what you said. The problem is the developer didn’t understand the SEO implications, and if you care about that, you might need to hire me for some SEO work.