What Should You Do When a Webpage Has Conflicting Canonical Tags?

Canonical tags and canonical URLs can be very confusing. Here is an example of how to troubleshoot canonical tag contradictions.

Video Transcript

David: Tricia is asking a question about a very confusing topic, canonical tags and canonical URLs. She has a problem where the client doesn’t understand it, the developer is unclear on it, and she’s trying to coach everybody on what the problem even is, let alone how to solve it. Does that about sum it up?

Tricia: Yes, yes.

David: Okay, so it sounds to me like what’s happening on this website for the homepage is that the homepage has one URL, and the canonical tag is set to be example.com/16/home.htm or something like that.

Tricia: Yeah.

David: But when you try to access 16/home.htm, it actually redirects to example.com.

Tricia: Correct, and when you look at the source, it says canonical equals that 16/home.htm.

David: Okay, so just to be clear, you are able to view the domain example.com?

Tricia: Correct.

David: And when you view the source code on example.com, the canonical tag is not set to example.com. It’s set to example.com/16/home.htm.

Tricia: Correct.

David: Okay, great.

Tricia: And to add a little bit to the mix, I don’t know if this matters, but the site is also not responsive, and there’s a separate m-dot site for mobile, which is done the same way. I don’t know if that impacts the answer, but just so you know, there are basically two sites that are done the same way.

David: Right, right. So, we’ll get there eventually, but let’s just talk about the main site canonical. So, a canonical tag is an invisible tag in the source code that tells Google: if you visit this page, and it can be viewed from any other permutation of the URL, I want you, Google, to consider this page to be whatever is within the canonical tag. So, in this case, what’s happening is whenever Google gets to the homepage of this website, which we’re calling example.com, Google reads the page and sees a canonical tag in there. So, whatever URL is in the address bar at the top, the canonical tag is saying, forget about that. Instead, I want you to consider the real version of this page to be example.com/16/home.htm. So, what’s happening is we are sending Google a contradictory message. One is, hey, you landed on example.com, and you would think that that would be the main homepage for the site, right? It’s not always. There are reasons why it wouldn’t be. But we’re saying, Google, hey, forget that example.com exists. Instead, we want you to consider 16/home.htm to be the actual canonical, or official, version of the page. Okay, so the implications are that, theoretically, Google should not index or send people to example.com. Instead, whenever someone asks for the homepage of this website, Google should send them to 16/home.htm on this domain. However, there’s a redirect set up, so neither Google nor anybody else can actually ever get to 16/home.htm. So, Google says, wait, you want me to forget about the homepage, and you want me to send everybody to 16/home.htm, but you don’t let me get there, right? So, what we’d find in search console is probably a message saying something like “canonical tag not selected,” or I forget how they phrase it in search console. But if you have access to the search console data, it’s worth looking in there because, basically, Google is going to say, you want me to go here, but you won’t let me go there. So, I’m not going to consider the canonical tag.

Tricia: Okay.

David: So, that probably means that Google is ignoring the canonical tag in this case because it can’t even get to that page.

Tricia: Okay.

David: However, we don’t want that confusion. We want one and only one version of the homepage. So, we really want to go in and update the canonical tag to reflect the true homepage, which is example.com.
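The contradiction David describes, a page whose canonical tag points somewhere other than the page itself, can be spotted with a few lines of code. What follows is only a minimal sketch in Python using the standard library. The names CanonicalFinder, normalize, and canonical_mismatch are made up for illustration, the sample HTML is an assumption modeled on the conversation, and the comparison ignores only the scheme and a trailing slash.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"])


def normalize(url):
    """Reduce a URL to (host, path), ignoring scheme and trailing slash."""
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.netloc.lower()
    path = parts.path.rstrip("/") or "/"
    return host, path


def canonical_mismatch(page_url, html):
    """Return the canonical URLs that do not match the page they appear on."""
    finder = CanonicalFinder()
    finder.feed(html)
    return [c for c in finder.canonicals if normalize(c) != normalize(page_url)]


# Hypothetical page source, modeled on the homepage discussed above.
sample_html = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/16/home.htm">'
    '</head><body>Home</body></html>'
)

print(canonical_mismatch("https://example.com/", sample_html))
# → ['https://example.com/16/home.htm']
```

An empty list would mean the canonical tag agrees with the URL being viewed, which is the state David recommends for the homepage.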

Tricia: The weird thing is when I put in example.com in redirect detective because that’s where I was looking, it shows me… Let me make sure. I’m just pulling it up to make sure. So, example.com says it is a permanent redirect.

David: Okay, 301 redirect.

Tricia: Yeah. Let me try this again because I think I’m confusing myself. So, I put in the page that actually comes up, which is example.com. I put that into redirect detective. And it says there’s a 301 permanent redirect to example.com/16/home.htm. That’s the final destination.

David: Oh, interesting. So, I guess the thing is, humans get redirected to the homepage, but bots do not.

Tricia: Yeah. Is that what’s happened? Because when I actually go to example.com, that page comes up even though it’s not canonical.

David: Right.

Tricia: And when I put it in redirect detective, it says that’s been permanently redirected to this home.htm. But when you go to that, let me see if I go to that home. When I go to the canonical, it actually brings up example.com. It’s very confusing.

David: So, let me pause you for a moment here. Enter whatever the canonical full URL is and put that into redirect detective. What do you go to?

Tricia: Okay, let me put that in.

David: So, redirectdetective.com is a free redirect checker that will show you the path that any redirects may pass through. It’s super helpful when diagnosing these things.

Tricia: This is very bizarre. It says no redirects found.

David: There you go.

Tricia: But when I go…

David: No, you’re not crazy.

Tricia: But when I click and go to that, it goes back.

David: I know, I know.

Tricia: Okay, so what does that mean? That’s what’s confusing me so much.

David: Okay, okay, so here’s probably what’s going on. So, there are several ways you can redirect a page. And when we say redirect, it means you ask the browser to go here, and it sends you there. Okay? And so, we can do that on the server side, meaning the server that hosts the web files can say, I want you to go here instead of there. That is usually done in the Nginx configuration or the .htaccess file, and it’s basically a 301, or a 302, or a 307, or a 308, or whatever redirect. That’s what redirect detective is picking up: those server redirects. However, there’s an old-school way of doing redirects, which is called a JavaScript redirect.
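To make the server-side half of that concrete, here is a rough sketch of what a redirect checker does: request a URL, read the status code and the Location header, and repeat until the response is no longer a redirect. This is not how redirect detective is actually implemented. The function names fetch_status and trace_redirects are hypothetical, and the fake responses at the bottom are assumptions modeled on what Tricia reports seeing.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes that tell the client to go to the URL in the Location header.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_status(url):
    """Send one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_status, max_hops=10):
    """Follow server-side redirects and return [(url, status), ...]."""
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be a relative path
    return chain


# Fake responses (assumed), so the example runs without touching a network.
fake_responses = {
    "https://example.com/": (301, "/16/home.htm"),
    "https://example.com/16/home.htm": (200, None),
}

for hop_url, hop_status in trace_redirects(
    "https://example.com/", fetch=lambda u: fake_responses[u]
):
    print(hop_status, hop_url)
# → 301 https://example.com/
# → 200 https://example.com/16/home.htm
```

A tracer like this only sees redirects announced in HTTP headers, which is why it reports "no redirects" for a page that a browser nonetheless bounces somewhere else.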

Tricia: Okay.

David: A JavaScript redirect is in the JavaScript code on a page and says, okay, you viewed this page, but I’m going to send you to this page. And so, what’s happening is there’s a JavaScript redirect that redirect detective cannot follow, which is why you don’t want JavaScript redirects. Google historically did not respect JavaScript redirects. So, in other words, it would have been set up right for Google, ironically. Well, actually, nowadays Google can follow the JavaScript redirect. It didn’t use to, but now it does. It’s just not good practice for SEO. So, that’s why you are experiencing going one place, and redirect detective is saying, no, it’s going to another, which actually ends up making it a worse problem for Google because then Google says, oh, canonical tag, the server is telling me to go here, which is the same as the canonical tag. But Google is also detecting a JavaScript redirect, and it can follow that. So, that means Google’s getting a contradictory message.
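Because a JavaScript redirect lives in the page source rather than in the HTTP headers, one rough way to look for it is to scan the HTML for scripts that set the browser’s location. The sketch below is a heuristic only: find_js_redirects is a hypothetical name, the sample markup is an assumption, and it catches just the simplest patterns with a literal URL. Scripts that build the destination dynamically would slip past it.

```python
import re

# location = "..." or location.href = "..." (optionally prefixed by window. etc.)
ASSIGN = re.compile(r"""\blocation(?:\.href)?\s*=\s*["']([^"']+)["']""", re.IGNORECASE)

# location.replace("...") or location.assign("...")
CALL = re.compile(r"""\blocation\.(?:replace|assign)\(\s*["']([^"']+)["']""", re.IGNORECASE)


def find_js_redirects(html):
    """Return literal URLs that the page's scripts appear to redirect to."""
    return ASSIGN.findall(html) + CALL.findall(html)


# Hypothetical source for a page like example.com/16/home.htm.
sample_page = """
<html><head>
<script>window.location.href = "https://example.com/";</script>
</head><body>Home</body></html>
"""

print(find_js_redirects(sample_page))
# → ['https://example.com/']
```

Run against the source of the canonical URL, a non-empty result would be consistent with what Tricia sees: no server redirect reported, yet the browser still ends up back at example.com.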

Tricia: Yeah.

David: And that’s what we want to avoid. So, there are many reasons why someone might have done this. The JavaScript redirect probably was just to make it clean because no one wants to view a homepage called 16/home.htm. And so the developer fixed it, but they didn’t fix it in a Google-friendly way. So, someone who didn’t know would say yeah, it works, it goes to example.com. But in reality, this JavaScript redirect was probably set up at a time when Google didn’t respect JavaScript redirects because that’s consistent with this whole mobile website thing.

Tricia: I believe it’s saying that it’s been like this since 2011.

David: Yep, that’s ancient history in the web world.

Tricia: Well, when I saw that, I thought, whoa, that was a long time ago. Technology has changed significantly.

David: Oh yeah, oh yeah. There were definitely responsive websites in 2011. So, the question becomes, how worth it is it to fix this?

Tricia: How worth it is, and how would you fix it?

David: So, the fact is, the way the website was built, it might have to be there.

Tricia: I think that you might be right.

David: I’m pulling from, like, ancient website-building history: sometimes you would create the structure of files and folders and stuff like that, and then lay the friendliness on top of it. And that’s kind of what’s going on here. On top of that, there’s the mobile site problem, which we’ve talked about in the past: ideally, you don’t manage two sites. You have one website that renders the same for mobile and desktop because right now, in the Google climate, 13 years after this website was built, the algorithm is mobile first. So, it’s not even necessarily looking at the main homepage; it’s looking at the m-dot.

Tricia: Which is doing the same exact thing.

David: Right. So, search console is your friend here because it’s telling you how Google is interpreting the site. Okay, so what I would do is go into the search console and type in the homepage that you view, example.com, in the search bar at the top.

Tricia: Yes.

David: This is a great tip for anybody who’s like, I wonder what Google thinks of my page. In search console, you can literally just type in any URL that you have permission to view. Right? That’s why you verify search console and see what it says. Can you do that for us and tell us what it says?

Tricia: Yes, if I can.

David: Do it without the 16. All you have to do is go to the very top. There’s a search box, and you can enter your URL there.

Tricia: Okay, now, am I doing just example.com, or am I doing the full example with the HTML?

David: In this case, do it without the 16/home.htm.

Tricia: Okay, just the regular.

David: They’re trying to avoid using the client’s domain name to avoid their embarrassment.

Tricia: This says the URL is not on Google, the page is not indexed, and it says “Page with redirect.”

David: There you go. So, the homepage is not indexed without 16/home.htm. So, now add 16/home.htm and see what happens.

Tricia: Okay, the URL is on Google, page is indexed.

David: Good. Does it say how it found it? It usually should say something about that, like a referring page.

Tricia: Yes. It has a page on Squarespace. It’s unusual. It says referring page, and it has two. It has one page on Squarespace and then it has the http://www.example.com. That’s it.

David: Right. So that’s not the canonical tag. So, the homepage is indexed. People can find it. It’s just a weird URL. And if someone types in example.com, they will get to view the homepage. It’s just not ideal.

Tricia: Okay.

David: So, this is not ideal. It should be cleaned up, but it’s probably not a priority because Google’s able to index it.

Tricia: Yeah. So, I guess to me, basically, as far as SEO, while it’s not ideal, it sounds like it’s working now. The only thing to probably do is to look at redoing their website in the future to use a platform that is up to date. Is that okay?

David: That’s probably not built on a platform, to be honest. It’s probably custom.

Tricia: Probably part of the message was that you know how the platform is coded. This is how the canonical URLs are created and specified.

David: Yep, yep, exactly. And the fact is it’s got a .htm file. That’s a file extension. Right. So that means there’s probably a directory, directory number 16, that contains a home.htm file. That .htm file is where the actual content of the page is.

Tricia: There’s probably, yeah, it’s probably something like that. And it says crawled as Googlebot smartphone. So, they’re finding out.

David: So, then do the m-dot. Do they have an SSL certificate on the site?

Tricia: Oh, let me do this because it didn’t come up right. I’m thinking, oh, you know what? When they gave this to me, they did not give it to me with DNS. They did it by property. And so, it’s saying, you don’t have access to this.

David: Right. There you go.

Tricia: And that was why I had a hard time finding it when I was first looking at it: because it wasn’t in order, or it looked weird.

David: That’s a good reason why. And a friendly reminder to all of us: we want the full data for the full site. If we had verified search console at the registrar level, we would have been able to get to the m-dot version of this site. All we can get is one version of the site, with or without HTTPS, with or without triple w. So, what we don’t know is how Google interprets the non-triple-w version, or the m-dot version, or the version with or without SSL. So that’s why using DNS verification with search console gives us a fuller picture. Google has indexed this, but we don’t know if the smartphone version is even working, frankly, because we can’t see if Google’s even able to index the m-dot.
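To see why a single verified version leaves blind spots, it can help to list every address the same site might answer on. The short sketch below does only that. The function name site_variants is hypothetical, and the "m." prefix is an assumption based on the m-dot site mentioned in this conversation.

```python
from itertools import product


def site_variants(domain, prefixes=("", "www.", "m.")):
    """List every scheme and subdomain combination a site might answer on."""
    return [
        f"{scheme}://{prefix}{domain}/"
        for scheme, prefix in product(("http", "https"), prefixes)
    ]


for variant in site_variants("example.com"):
    print(variant)
```

Each printed address is a separate version of the site that Google could crawl, so each is worth running through a redirect checker and through search console when access allows.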

Tricia: Yeah.

David: It could be that smartphones are being sent to the main site, which is not responsive and very hard to read.

Tricia: Ah, okay.

David: But until we have the DNS…

Tricia: Yeah, I have a call with them on Friday to work on Google Tag Manager as well. So, I’m going to put this on my list to do while I’m screen sharing, so I can just get it done for them.

David: So, the short version is not ideal, but probably not worth fixing because it’s basically inherent in the site, and Google can read the homepage.

