Sitemap & SEO for Static Sites

DodaTech Updated 2026-06-22 9 min read

This tutorial covers sitemaps, robots.txt, canonical URLs, and structured data for static sites, with Hugo templates and common pitfalls for each.

Together, these settings help search engines discover, crawl, and index every page efficiently, which is the technical foundation for organic traffic and rich results.

Why It Matters

Technical SEO is the foundation of organic search visibility. An incorrectly configured sitemap can leave thousands of pages undiscovered. Missing canonical URLs let duplicate URLs compete with each other and split ranking signals. Absent structured data means missing rich results in SERPs. For static sites deployed on CDNs, proper SEO configuration is particularly important because there is no server-side logic to dynamically generate sitemaps, redirects, or meta tags: everything has to be produced at build time. At DodaTech, our sitemap includes all 2,900+ pages with proper lastmod dates, and our structured data generates rich results for tutorials and FAQs in Google Search.

Real-World Use

A documentation site discovers that 40% of its pages are not indexed because the sitemap excludes paginated pages. An e-commerce site loses rankings after a domain migration because canonical URLs were not updated. A recipe blog gains 200% more organic traffic after adding Recipe structured data that generates rich results with cooking time and ratings.

SEO Architecture for Static Sites

flowchart LR
  A[Hugo Build] --> B[Sitemap XML]
  A --> C[Robots.txt]
  A --> D[Canonical URLs]
  A --> E[Structured Data JSON-LD]
  A --> F[Open Graph Meta]
  A --> G[Twitter Cards]
  B --> H[Google Search Console]
  D --> H
  E --> I[Rich Results SERP]
  F --> J[Social Media Previews]
  G --> J
  style A fill:#f90,color:#fff

Sitemap Configuration

A sitemap is an XML file that lists all URLs on your site with metadata about each page's importance and last update time.

Hugo Sitemap Template

{{ $pages := .Site.Pages -}}
{{- /* html/template escapes a literal "<?xml", so emit the declaration through safeHTML */ -}}
{{ printf "<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"yes\"?>" | safeHTML }}
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  {{- range $pages -}}
  {{- /* Skip pages without a URL and pages that set sitemap_exclude in front matter */ -}}
  {{- if and .Permalink (not .Params.sitemap_exclude) -}}
  <url>
    <loc>{{ .Permalink }}</loc>
    {{- if not .Lastmod.IsZero -}}
    {{- /* safeHTML keeps the "+" of a positive UTC offset from being entity-escaped */ -}}
    <lastmod>{{ safeHTML (.Lastmod.Format "2006-01-02T15:04:05-07:00") }}</lastmod>
    {{- end -}}
    {{- with .Params.sitemap_priority -}}
    <priority>{{ . }}</priority>
    {{- else -}}
    <priority>{{ if .IsHome }}1.0{{ else if .IsSection }}0.8{{ else }}0.5{{ end }}</priority>
    {{- end -}}
    {{- if .IsHome -}}
    <changefreq>daily</changefreq>
    {{- else -}}
    <changefreq>weekly</changefreq>
    {{- end -}}
    {{- /* hreflang sets must include the page itself, so use .AllTranslations */ -}}
    {{- if .IsTranslated -}}
    {{- range .AllTranslations -}}
    <xhtml:link rel="alternate" hreflang="{{ .Language.Lang }}" href="{{ .Permalink }}"/>
    {{- end -}}
    {{- end -}}
  </url>
  {{- end -}}
  {{- end -}}
</urlset>

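Expected behavior: The sitemap includes all pages with their full URL, last modified date, priority (home=1.0, sections=0.8, pages=0.5), change frequency, and hreflang links for multilingual pages. Pages with sitemap_exclude: true in front matter are excluded. Note that Google documents that it ignores priority and changefreq and relies on lastmod only when it is consistently accurate; other crawlers may still read all three fields.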
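To check that a generated sitemap actually matches the expected behavior, the file can be parsed after the build. The following is a minimal sketch in Python using only the standard library; the function names, the sample XML, and the choice of checks (off-site URLs and duplicates) are assumptions for illustration, not part of Hugo.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemap protocol; needed for every element lookup.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def read_sitemap(xml_text):
    """Return a list of (loc, lastmod) tuples; lastmod is None when absent."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        entries.append((loc, lastmod))
    return entries


def find_problems(entries, base_url):
    """Flag URLs that do not start with base_url and URLs listed twice."""
    problems = []
    seen = set()
    for loc, _ in entries:
        if not loc.startswith(base_url):
            problems.append(f"off-site or relative URL: {loc!r}")
        if loc in seen:
            problems.append(f"duplicate URL: {loc!r}")
        seen.add(loc)
    return problems


# Hypothetical sample; in practice read public/sitemap.xml after `hugo`.
SAMPLE = """<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://tutorials.dodatech.com/</loc><lastmod>2026-06-22T00:00:00+00:00</lastmod></url>
  <url><loc>https://tutorials.dodatech.com/static-sites/</loc></url>
  <url><loc>/relative/</loc></url>
</urlset>"""

if __name__ == "__main__":
    for problem in find_problems(read_sitemap(SAMPLE), "https://tutorials.dodatech.com/"):
        print(problem)
```

A relative URL in the output usually points to a missing or wrong baseURL in the site configuration.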

Hugo Sitemap Config

# hugo.toml -- Sitemap configuration
baseURL = "https://tutorials.dodatech.com"

[sitemap]
  changefreq = "weekly"
  filename = "sitemap.xml"
  priority = 0.5

[params]
  sitemap_exclude_kinds = ["404", "robotsTXT"]

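Expected behavior: Hugo generates /sitemap.xml at the site root. The [sitemap] values set the default change frequency (weekly) and priority (0.5) that templates can read as .Sitemap.ChangeFreq and .Sitemap.Priority; a custom template that hard-codes these values, like the one above, ignores them. Hugo does not list the 404 page or robots.txt in .Site.Pages, so neither appears in the sitemap. sitemap_exclude_kinds is a custom parameter and has no effect unless your template reads it. Submitting the sitemap to Google Search Console is a separate, manual step.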

Robots.txt Configuration

Robots.txt tells search engines which URLs to crawl and which to avoid.

Hugo Robots.txt Template

User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/
Disallow: /pagefind/
Disallow: /tags/
Disallow: /categories/
Disallow: /*/page/
Crawl-delay: 10

{{ if hugo.IsProduction -}}
Sitemap: {{ "sitemap.xml" | absURL }}
{{ end -}}

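Expected behavior: Hugo renders this template to /robots.txt when enableRobotsTXT = true is set in the site configuration. Compliant crawlers fetch /robots.txt before crawling and follow its rules. The sitemap URL is advertised for discovery in production builds. Admin pages, API endpoints, and thin content pages (tags, categories, paginated archives) are excluded from crawling. Two caveats: Googlebot ignores Crawl-delay, and a Disallow rule stops crawling but does not by itself keep a URL that is linked from elsewhere out of the index.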
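Rules like these can be spot-checked locally before deployment. The sketch below uses Python's standard urllib.robotparser on a simplified, hypothetical rule set. One caveat that shapes the sample: this parser applies the first rule whose path is a prefix of the URL and does not implement the * path wildcard, so the sample leaves out Allow: / and the wildcard pagination rule. Google's own matching (longest rule wins) differs, so treat the result as a rough check only.

```python
from urllib.robotparser import RobotFileParser

# Simplified rules for illustration; prefix rules only, no wildcards.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /api/
Disallow: /tags/
Crawl-delay: 10
Sitemap: https://tutorials.dodatech.com/sitemap.xml
"""


def build_parser(text):
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

if __name__ == "__main__":
    base = "https://tutorials.dodatech.com"
    for path in ("/static-sites/", "/admin/login", "/tags/hugo/"):
        verdict = "allowed" if parser.can_fetch("*", base + path) else "blocked"
        print(path, verdict)
    print("sitemaps:", parser.site_maps())
```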

Canonical URLs

Canonical URLs tell search engines which version of a page is the authoritative one, preventing duplicate content issues.

Canonical URL Implementation

{{ if .Params.canonicalURL -}}
  <link rel="canonical" href="{{ .Params.canonicalURL }}">
{{ else -}}
  <link rel="canonical" href="{{ .Permalink }}">
{{ end -}}

Expected behavior: Every page includes a <link rel="canonical"> tag pointing to its own URL. If a page has a canonicalURL frontmatter parameter (useful for syndicated content), that takes precedence.
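A quick way to confirm this on built pages is to extract the canonical link from the HTML and make sure there is exactly one. This is a minimal sketch with Python's standard html.parser; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonicals.append(attr.get("href"))


def canonical_urls(html):
    """Return all canonical hrefs found in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


if __name__ == "__main__":
    page = '<head><link rel="canonical" href="https://tutorials.dodatech.com/a/"></head>'
    found = canonical_urls(page)
    if len(found) != 1:
        print("expected exactly one canonical tag, found", len(found))
    else:
        print("canonical:", found[0])
```

Zero results means the partial was not included in the page layout; more than one usually means a theme and a custom partial both emit the tag.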

Structured Data (JSON-LD)

Structured data helps search engines understand your content and generate rich results.

Article Schema for Tutorials

{{ if eq .Kind "page" -}}
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "{{ .Title }}",
  "description": "{{ .Description }}",
  "datePublished": "{{ .Date.Format "2006-01-02" }}",
  "dateModified": "{{ .Lastmod.Format "2006-01-02" }}",
  "author": {
    "@type": "Organization",
    "name": "DodaTech",
    "url": "https://dodatech.com"
  },
  "publisher": {
    "@type": "Organization",
    "name": "DodaTech",
    "logo": {
      "@type": "ImageObject",
      "url": "{{ "images/logo.png" | absURL }}"
    }
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "{{ .Permalink }}"
  },
  "image": {
    "@type": "ImageObject",
    "url": "{{ with .Params.image }}{{ . | absURL }}{{ else }}{{ "images/default-og.png" | absURL }}{{ end }}"
  },
  "keywords": "{{ with .Params.tags }}{{ delimit . ", " }}{{ end }}"
}
</script>
{{ end -}}

Expected behavior: Google reads the JSON-LD and may display rich results including the headline, description, author, publish date, and image in search results. Valid Article markup makes a page eligible for features such as article carousels, but Google does not guarantee that any rich result is shown.

BreadcrumbList Schema

{{ if .Ancestors -}}
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {{ range $i, $page := .Ancestors.Reverse }}
    {
      "@type": "ListItem",
      "position": {{ add $i 1 }},
      "name": "{{ $page.Title }}",
      "item": "{{ $page.Permalink }}"
    }{{ if ne (add $i 1) (len $.Ancestors) }},{{ end }}
    {{ end }}
  ]
}
</script>
{{ end -}}

Expected behavior: Google displays breadcrumb paths in search results, helping users understand the site hierarchy and improving click-through rates.
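Because these JSON-LD blocks are assembled by hand in templates, a stray bracket or missing quote produces invalid JSON that search engines silently discard. The sketch below, using only the Python standard library, pulls every application/ld+json script out of a rendered page and tries to parse it. The class and function names are hypothetical, and the check covers JSON syntax only, not schema.org validity.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of each <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def check_jsonld(html):
    """Return (parsed_objects, error_messages) for all JSON-LD blocks."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    parsed, errors = [], []
    for raw in extractor.blocks:
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError as exc:
            errors.append(str(exc))
    return parsed, errors


if __name__ == "__main__":
    page = '<script type="application/ld+json">{"@type": "BreadcrumbList"}</script>'
    objects, problems = check_jsonld(page)
    print(len(objects), "valid block(s),", len(problems), "invalid")
```

Google's Rich Results Test remains the authoritative check; a local parse like this only catches syntax errors early in the build.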

SEO Tool Comparison

Feature	Hugo Built-in	Yoast SEO (WordPress)	Rank Math (WordPress)
Sitemap	Built-in	Yes	Yes
Canonical URLs	Manual template	Automatic	Automatic
Structured data	Manual template	Automatic	Automatic
Meta tags	Manual frontmatter	Automatic	Automatic
Open Graph	Manual template	Automatic	Automatic
Robots.txt	Built-in	Yes	Yes
Breadcrumbs	Manual template (.Ancestors)	Yes	Yes
Content analysis	No	Yes	Yes
Schema generator	No	Limited	Advanced

Common Errors

1. Sitemap Exceeding 50,000 URLs

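The sitemap protocol limits a single sitemap file to 50,000 URLs and 50 MB uncompressed, and search engines reject or only partially read files that exceed those limits. For larger sites, create a sitemap index file that references multiple sitemaps divided by section or content type.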
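The splitting logic is simple enough to sketch. The example below, in Python with the standard library, chunks a URL list into files of at most 50,000 entries and builds the index that references them. The sitemap-N.xml naming scheme and the function names are assumptions for illustration; Hugo does not produce these names itself.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit from the sitemap protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of urls, each at most `size` long."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_urlset(urls):
    """Serialize one sitemap file containing the given URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = loc
    return ET.tostring(root, encoding="unicode")


def build_index(base_url, file_count):
    """Serialize a sitemap index pointing at sitemap-1.xml .. sitemap-N.xml."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for n in range(1, file_count + 1):
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap-{n}.xml"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    base = "https://tutorials.dodatech.com"
    all_urls = [f"{base}/page-{i}/" for i in range(120_000)]
    parts = list(chunk(all_urls))
    print(len(parts), "sitemap files")
    print(build_index(base, len(parts))[:120], "...")
```

Only the index URL then needs to be submitted or listed in robots.txt; crawlers follow it to the individual files.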

2. Missing `lastmod` or Incorrect Dates

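Accurate lastmod dates help crawlers prioritize recrawling pages that changed; Google states that it uses lastmod only when the value is consistently accurate. Incorrect dates (for example, showing yesterday for a page that has not changed in years because every page is stamped with the build time) teach crawlers to distrust the field and waste crawl budget. Ensure the sitemap reflects real last-modified timestamps, for instance by setting lastmod in front matter or by enabling enableGitInfo = true so Hugo derives it from Git history.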
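Two symptoms of unreliable lastmod values are easy to detect automatically: dates in the future, and nearly every URL sharing one timestamp (typical of build-time stamping). The following is a rough sketch in Python; the function names and the 90% threshold are arbitrary assumptions, and the parser expects numeric UTC offsets such as +00:00 rather than a trailing Z.

```python
from collections import Counter
from datetime import datetime, timezone


def parse_lastmod(value):
    """Parse values like 2026-06-22 or 2026-06-22T10:00:00+00:00."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        # Date-only values carry no offset; treat them as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def lastmod_warnings(lastmods, now, share_threshold=0.9):
    """Return warnings for future dates and for one value dominating."""
    warnings = []
    parsed = [parse_lastmod(value) for value in lastmods]
    if any(moment > now for moment in parsed):
        warnings.append("lastmod in the future")
    if len(parsed) > 1:
        _, count = Counter(lastmods).most_common(1)[0]
        if count / len(parsed) >= share_threshold:
            warnings.append("most URLs share one lastmod value")
    return warnings


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample = ["2026-06-22T09:00:00+00:00"] * 9 + ["2024-01-05T00:00:00+00:00"]
    for warning in lastmod_warnings(sample, now):
        print(warning)
```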

3. `noindex` on Important Pages

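Accidentally adding <meta name="robots" content="noindex"> to important pages prevents them from appearing in search results. Always verify that indexable pages are not marked as noindex. Conversely, a noindex tag only takes effect if crawlers can fetch the page, so do not also block that page in robots.txt.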

{{ if .Params.noindex -}}
  <meta name="robots" content="noindex, nofollow">
{{ else -}}
  <meta name="robots" content="index, follow, max-image-preview:large">
{{ end -}}
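An accidental noindex is easy to miss in review, so it can help to scan the built HTML for it. Below is a minimal sketch in Python, assuming a hypothetical mapping of URL paths to rendered HTML and a hand-maintained list of paths that must stay indexable; none of these names come from Hugo.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def unexpected_noindex(pages, must_index):
    """Return the paths in must_index whose HTML carries a noindex directive."""
    flagged = []
    for path in must_index:
        finder = RobotsMetaFinder()
        finder.feed(pages.get(path, ""))
        if "noindex" in finder.directives:
            flagged.append(path)
    return flagged


if __name__ == "__main__":
    pages = {
        "/": '<meta name="robots" content="index, follow, max-image-preview:large">',
        "/pricing/": '<meta name="robots" content="noindex, nofollow">',
    }
    print(unexpected_noindex(pages, ["/", "/pricing/"]))
```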

4. Blocking CSS and JS in Robots.txt

Blocking CSS and JS files prevents Google from rendering the page correctly, potentially hurting rankings. Limit Disallow rules to non-content paths such as admin pages, API endpoints, and thin archive pages; never block static assets.

5. Missing Hreflang Tags for Multilingual Sites

Without hreflang annotations, Google may show the wrong language version in search results. Use the xhtml:link element in the sitemap and <link rel="alternate"> tags in the page head.

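{{ if .IsTranslated -}}
  {{/* .AllTranslations includes the current page, so every version lists itself and its siblings */}}
  {{ range .AllTranslations -}}
  <link rel="alternate" hreflang="{{ .Language.Lang }}" href="{{ .Permalink }}">
  {{ end -}}
  {{/* x-default points at the site root here; prefer the page's fallback-language URL if one exists */}}
  <link rel="alternate" hreflang="x-default" href="{{ .Site.BaseURL }}">
{{ end -}}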
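Search engines generally expect hreflang annotations to be reciprocal: each language version lists itself and every sibling, and each sibling links back. The check below is a small Python sketch of that rule, assuming the annotations have already been collected into a dictionary; the data shape and function name are hypothetical.

```python
def hreflang_problems(alternates):
    """Check self-reference and return links.

    alternates maps a page URL to the {hreflang: target URL} pairs
    declared on that page.
    """
    problems = []
    for page, links in alternates.items():
        # x-default is a fallback and is not expected to be reciprocal.
        targets = {url for lang, url in links.items() if lang != "x-default"}
        if page not in targets:
            problems.append(f"{page} does not list itself")
        for target in sorted(targets - {page}):
            back_links = alternates.get(target, {})
            if page not in back_links.values():
                problems.append(f"{target} does not link back to {page}")
    return problems


if __name__ == "__main__":
    declared = {
        "/en/guide/": {"en": "/en/guide/", "fr": "/fr/guide/"},
        "/fr/guide/": {"fr": "/fr/guide/"},
    }
    for problem in hreflang_problems(declared):
        print(problem)
```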

6. Title Tags That Are Too Long or Too Short

Title tags longer than roughly 60 characters are usually truncated in SERPs (the actual cutoff depends on pixel width, not character count). Tags that are too short miss keyword opportunities. Ideally, keep titles between 50 and 60 characters with the primary keyword near the beginning.
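The length guideline can be enforced with a trivial check at build time. This is a minimal Python sketch using the 50 to 60 character range from the text as defaults; the function name is hypothetical, and character count is only a proxy for the rendered width.

```python
def title_issue(title, min_len=50, max_len=60):
    """Return a short description of the length problem, or None if fine."""
    length = len(title.strip())
    if length > max_len:
        return f"too long ({length} characters)"
    if length < min_len:
        return f"short ({length} characters)"
    return None


if __name__ == "__main__":
    for title in (
        "Sitemap & SEO for Static Sites | DodaTech",
        "Sitemap & SEO for Static Sites: Hugo Templates | DodaTech",
    ):
        print(repr(title), "->", title_issue(title))
```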

Practice Questions

1. How many URLs can a single sitemap contain and what should you do for larger sites?

A single sitemap can contain a maximum of 50,000 URLs. For larger sites, create a sitemap index file that references multiple sitemap files, optionally divided by section or content type.

2. What is the purpose of the hreflang attribute in sitemaps and page headers?

hreflang tells search engines which language version of a page to show in search results for a given locale. Without it, multilingual sites risk showing the wrong language version to users.

3. How does a canonical URL prevent duplicate content problems?

When identical or very similar content appears at multiple URLs, the canonical tag tells search engines which URL is the authoritative version. Search engines treat it as a strong hint and consolidate ranking signals (links, engagement) on the canonical URL instead of splitting them across duplicates.

4. What is the difference between index, follow and noindex, nofollow robots directives?

index, follow allows the page to be indexed and links on the page to be followed. noindex, nofollow prevents indexing and link following. Use noindex for thin content pages, admin pages, and duplicate content.

5. Challenge: Set up Google Search Console for a Hugo site and verify the sitemap is processed correctly.

Add the site to Google Search Console, verify ownership via DNS TXT record or HTML file, submit the sitemap URL, and monitor the Page indexing report (formerly Coverage) for indexing errors.

Mini Project: Comprehensive SEO Setup for a Hugo Static Site

Implement a complete technical SEO foundation for a Hugo site:

Sitemap: Customize the sitemap template to include priority, changefreq, lastmod, and hreflang for all pages
Robots.txt: Create a robots.txt that allows crawling of content while excluding admin and thin content
Canonical URLs: Add canonical link tags to every page with a frontmatter override option
Structured data: Add Article schema to all tutorial pages and BreadcrumbList schema globally
Open Graph: Add Open Graph and Twitter Card meta tags for social media previews
Meta tags: Create a partial for dynamic title, description, and robots meta tags based on frontmatter
Verification: Add Google Search Console and Bing Webmaster Tools verification meta tags

{{/* layouts/partials/seo.html - Complete SEO partial */}}
{{/* Title */}}
<title>{{ if .Title }}{{ .Title }} | {{ end }}{{ .Site.Title }}</title>
<meta name="description" content="{{ .Description }}">

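{{/* Canonical: a canonicalURL set in front matter overrides the page's own permalink */}}
<link rel="canonical" href="{{ with .Params.canonicalURL }}{{ . }}{{ else }}{{ .Permalink }}{{ end }}">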

{{/* Robots */}}
{{ if .Params.noindex -}}
  <meta name="robots" content="noindex, nofollow">
{{ else -}}
  <meta name="robots" content="index, follow, max-image-preview:large">
{{ end -}}

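{{/* Open Graph */}}
{{/* Preview image: front matter "image", else the site default also used in the Article schema */}}
{{ $image := "images/default-og.png" | absURL -}}
{{ with .Params.image }}{{ $image = . | absURL }}{{ end -}}
<meta property="og:title" content="{{ .Title }}">
<meta property="og:description" content="{{ .Description }}">
<meta property="og:url" content="{{ .Permalink }}">
<meta property="og:type" content="{{ if .IsPage }}article{{ else }}website{{ end }}">
<meta property="og:site_name" content="{{ .Site.Title }}">
<meta property="og:image" content="{{ $image }}">

{{/* Twitter Cards: summary_large_image needs an image to render the large preview */}}
<meta name="twitter:card" content="summary_large_image">
<meta name="twitter:title" content="{{ .Title }}">
<meta name="twitter:description" content="{{ .Description }}">
<meta name="twitter:image" content="{{ $image }}">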

{{/* Structured Data */}}
{{ partial "schema.html" . }}
{{ partial "breadcrumb-schema.html" . }}

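{{/* Verification: emit each tag only when the site parameter is set */}}
{{ with .Site.Params.googleVerification -}}
<meta name="google-site-verification" content="{{ . }}">
{{ end -}}
{{ with .Site.Params.bingVerification -}}
<meta name="msvalidate.01" content="{{ . }}">
{{ end -}}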

Test the implementation by:

Running hugo server and inspecting the HTML source for all SEO tags
Submitting the sitemap to Google Search Console
Using Google's Rich Results Test tool to verify structured data
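Checking the robots.txt at /robots.txt (hugo server runs in the development environment by default, so the production-only Sitemap line appears only in a regular hugo build)
Verifying social media previews with a tool such as the Facebook Sharing Debugger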

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
