How to Create Sitemaps

Caroline Danielsson

Sitemaps have been around for what feels like forever. They are an important part of your SEO work and come with several advantages, provided they are built correctly. With the help of a sitemap, you can make things easier for Googlebot and speed up indexing. In this article, we’ll help you understand why sitemaps matter and how to create them.

In Swedish, they are sometimes called webbplatskartor, but within SEO it’s more common to use the English term, sitemaps. Therefore, we’ll use “sitemaps” throughout this article.

Different Types of Sitemaps – XML and HTML

First, we should explain that this article is about XML sitemaps. There are also HTML sitemaps, which are located directly on the site. An HTML sitemap is a page that collects links to all of the site’s pages for visitors to use. An XML sitemap is not shown to visitors; it is intended only for search engines and can look something like this:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc><!-- page URL --></loc>
</url>
<url>
<loc><!-- page URL --></loc>
</url>
<url>
<loc><!-- page URL --></loc>
</url>
</urlset>

In short, an XML sitemap is a file in XML format that lists the pages on your site that you consider important and want indexed. When Googlebot finds a sitemap on your site, it can crawl much more efficiently, because it sees directly which pages you want indexed. A sitemap is just a flat list, so Googlebot does not have to follow internal links from the top level downward, as it does when crawling the site itself.

Googlebot allocates only a limited number of links to each crawl of a site before it moves on to the next site. This is the so-called crawl budget. If you have a large site with thousands of links, the entire site may not be crawled before Googlebot moves on, meaning it might miss the latest content you have worked so hard to create.

How to Create a Sitemap

First, decide which pages should be included. Here, you should exclude pages that you need but do not want to rank in the organic search results. These could be login pages, the shopping cart, duplicate content that is needed but should not rank, and similar pages. Also exclude 404 pages and pages blocked by robots.txt.

A sitemap has certain restrictions:

  • It may not contain more than 50,000 URLs.
  • The file size may not be larger than 50 MB (uncompressed).

If your sitemap exceeds either of these limits, you need to split it into several smaller sitemaps. Split it sensibly, for example into one sitemap for products and one for categories. It is rare for a site to be that large, but if you do split your sitemap, give each file a unique name so that the files do not appear to be duplicates.
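
When a sitemap is split, the sitemaps.org protocol also defines a sitemap index file that lists the individual files in one place. The following is a minimal sketch, assuming two hypothetical files named sitemap-products.xml and sitemap-categories.xml on the example domain www.minsida.com; the file names and dates are placeholders, not recommendations.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sitemap index: lists the individual sitemap files (file names are hypothetical) -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.minsida.com/sitemap-products.xml</loc>
    <lastmod>2005-01-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://www.minsida.com/sitemap-categories.xml</loc>
    <lastmod>2005-01-01</lastmod>
  </sitemap>
</sitemapindex>
```

Each file listed in the index is itself an ordinary sitemap, so the 50,000 URL and 50 MB limits apply to each of them separately.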

There are several free tools available for creating sitemaps, or you can create one yourself. If you use WordPress, plugins like Yoast SEO can handle sitemap creation automatically. We recommend finding a trustworthy tool that generates the sitemap file for you. If you want to create one manually from scratch, you need to be comfortable writing XML. You can write the file in a text editor like Notepad and save it with an .xml extension, but it may not be worth the effort compared with using a free generator or plugin.

Some things to consider if you choose to create your sitemap yourself:

  • Google crawls the URLs exactly as they are written. Therefore, you should:
      • List the canonical URLs.
      • Use either HTTP or HTTPS for all URLs, depending on which one your site uses, but not both.
      • Use either www or non-www consistently for all URLs; do not mix them.
  • A sitemap file must be encoded in UTF-8.
  • URLs in a sitemap can only contain ASCII characters; other characters must be percent-encoded, and characters such as & must be written as XML entities (&amp;).
  • If you have separate URLs for mobile and desktop, only include one version.
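
The UTF-8 and ASCII points above fit together as follows in the sitemaps.org protocol: the file itself is saved as UTF-8, while non-ASCII characters in a URL are percent-encoded and characters such as & are entity-escaped. The sketch below assumes a hypothetical page at http://www.minsida.com/ümlat.html?a=1&b=2 and shows how its entry could be written.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <!-- Hypothetical URL: "ü" is percent-encoded as %C3%BC and "&" is written as an entity -->
    <loc>http://www.minsida.com/%C3%BCmlat.html?a=1&amp;b=2</loc>
  </url>
</urlset>
```

Most generators and plugins handle this escaping automatically, so it mainly matters when you write the file by hand.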

If your site is available in multiple languages, you can add these using an xhtml:link tag in your sitemap. An example of how it looks is shown below:

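<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<!-- Each language version gets its own entry listing all versions, itself included -->
<url>
<loc>http://www.minsida.com/en/</loc>
<xhtml:link rel="alternate" hreflang="en" href="http://www.minsida.com/en/"/>
<xhtml:link rel="alternate" hreflang="de" href="http://www.minsida.com/de/"/>
</url>
<url>
<loc>http://www.minsida.com/de/</loc>
<xhtml:link rel="alternate" hreflang="en" href="http://www.minsida.com/en/"/>
<xhtml:link rel="alternate" hreflang="de" href="http://www.minsida.com/de/"/>
</url>
</urlset>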

Include a Tag for Changes

Each URL in your sitemap file should have a so-called lastmod tag. Google can read this metadata to see when the page was last changed. If you have recently created new content, Googlebot can see from the sitemap that something has changed and therefore has more reason to crawl the page again. Such a tag can thus speed up both indexing and deindexing on your site.

You can, for example, temporarily include URLs marked “noindex” in your sitemap. When Googlebot sees that your sitemap has changed, there is a higher chance that it will crawl those URLs again, find the “noindex” directive, and deindex them. However, these “noindex” URLs should not stay in the sitemap permanently. The idea is to add them and then remove them once they have been deindexed.

A lastmod tag looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.example.com/</loc>
<!-- Note: the date must be in W3C Datetime format, for example YYYY-MM-DD. -->
<lastmod>2005-01-01</lastmod>
</url>
</urlset>
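
For reference, the sitemaps.org protocol defines lastmod in W3C Datetime format, which covers the date-only form used above and also allows a full timestamp with a time zone offset. The following is a minimal sketch with a placeholder value; whether the extra precision is useful depends on how often your pages change.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <!-- Date plus time and time zone offset (placeholder value) -->
    <lastmod>2005-01-01T12:30:00+01:00</lastmod>
  </url>
</urlset>
```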

When You Upload Your Sitemap

Now you have hopefully created a perfect sitemap for your site! What should you do with this file?

First, upload it to your site. Place the XML file in the root directory of your site. A sitemap placed in a subdirectory only covers URLs at that level and below, not the whole domain. According to Google’s John Mueller in an older video, Googlebot does not read URLs that sit above the sitemap’s own directory.

If your sitemap is submitted in Google Search Console, this should not be a problem, but to be safe, it is always best to place it in the root directory.

Speaking of Google Search Console, you should also submit your sitemap there, regardless of where it is located on the site. Search Console will tell you whether the sitemap is readable and correct, and give you a warning if it is not. After submitting your sitemap, you can also request indexing from Google to speed up the process. It is not a guarantee, but at least you have done your part to get the pages in your new sitemap indexed.

Then you can add your sitemap in robots.txt if you want. It is written like this:

Sitemap: http://www.example.com/sitemap.xml

If you have multiple sitemaps, all of them should of course be uploaded to the root directory, submitted in Google Search Console, and listed in robots.txt. This is why it is important that they have unique names.

Remember that every time you change your site, you need to update your sitemap. If you made one yourself, you need to create a new one and upload it the same way as above. If you have a plugin or similar, the program can often handle uploading and updating for you.

Caroline Danielsson Head of SEO

Caroline is one of our senior SEO specialists at our Örnsköldsvik office, and the Head of SEO.