&amp; in sitemap links, are they correct?

by Marco Demaio   Last Updated June 25, 2016 08:01 AM

Simple question, I'm asking just to make sure.

A Google sitemap generator generated a sitemap.txt file with links written like this:

http://www.domain.com/category.htm?name=some-name&amp;cat_id=8

Is it correct to use &amp; in these links in place of &, or is it just an error made by the sitemap generator?

Thanks.

Tags : sitemap links


Answers 5


That is correct. &amp; is the HTML entity for an ampersand (&) and is the proper way to represent it in a properly encoded URL. Ampersands (&), as well as < and >, are special characters in XML and HTML and need to be written using their character entities (&amp;, &lt; and &gt;).

John Conde
October 21, 2010 18:40 PM

Your Sitemap file must be UTF-8 encoded (you can generally do this when you save the file). As with all XML files, any data values (including URLs) must use entity escape codes for the characters.

This may help out, http://sitemaps.org/protocol.php
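As a rough illustration of the point above, here is a minimal Python sketch that builds a one-entry sitemap with the standard library's xml.etree.ElementTree. The helper name build_sitemap is made up for this example, and the URL is the one from the question. It assumes Python 3.8 or later. The serializer applies the entity escaping on its own, and parsing the file gives back the plain URL:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Hypothetical helper: return a UTF-8 encoded sitemap for the given URLs."""
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        loc = ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc")
        loc.text = url  # plain text in; escaping happens on serialization
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


raw_url = "http://www.domain.com/category.htm?name=some-name&cat_id=8"
sitemap_bytes = build_sitemap([raw_url])

# Reading the file back with an XML parser undoes the escaping.
parsed = ET.fromstring(sitemap_bytes)
parsed_loc = parsed.find(f"{{{SITEMAP_NS}}}url/{{{SITEMAP_NS}}}loc").text

print(sitemap_bytes.decode("utf-8"))
print(parsed_loc)
```

In the serialized bytes the ampersand appears as &amp;, while parsed_loc holds the original URL with a bare &.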

Jeremy
October 21, 2010 18:41 PM

URL-encoding and XML entity encoding are not the same thing. You need URL-encoding to replace special characters in URLs, such as &, which can only be used unencoded to separate query parameters. XML entity encoding is for encoding special characters in XML (and also XHTML). This means that if you have a URL in an XML (or XHTML) file, and this URL includes some & characters, you have to entity-encode each of them as &amp;. So in a sitemap.xml you will have URLs like the one in the question from Marco Demaio.
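The two layers can be sketched in Python using only the standard library. The parameter value "rock & roll" is an invented example, not something from the question. URL-encoding is applied to the parameter values first, and XML escaping is applied afterwards to the whole URL:

```python
from urllib.parse import urlencode
from xml.sax.saxutils import escape, unescape

# Layer 1: URL-encoding. An & inside a *value* becomes %26, while the &
# that separates parameters stays a literal &.
params = {"name": "rock & roll", "cat_id": 8}  # illustrative values
query = urlencode(params)
url = "http://www.domain.com/category.htm?" + query

# Layer 2: XML entity encoding. The remaining separator & becomes &amp;
# so the URL can be placed inside an XML document.
xml_safe = escape(url)

print(url)
print(xml_safe)

# An XML parser reverses layer 2 only; the %26 from layer 1 is left alone.
assert unescape(xml_safe) == url
```

The URL that a crawler should actually request is the value of url, not xml_safe. The &amp; form exists only inside the XML file.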

bdadam
October 26, 2010 11:58 AM

You can also convince yourself by checking

You can't really argue against the official xml sitemaps protocol page :)

Tom
February 03, 2011 23:41 PM

Google rejects the sitemap as broken if it has an & character in a URL. It accepts it when you replace & with &amp;.

BUT: if you later check the list of crawling errors in the Google webmasters tool, it will report this URL of the sitemap file as broken, because it contains &amp; instead of &.

Thus the correct solution is to change the URL such that it does not contain &. Or report this as a bug to Google.

Klaus Hartnegg
June 10, 2014 14:12 PM
