A sitemap is a file where you provide information about your site's pages, videos, and other files, and the relationships between them.
A sitemap helps search engines like Google crawl a site more efficiently.
The sitemap is usually added to the website’s root directory or a sitemap directory. It’s just an XML file containing all possible website routes.
If you are unfamiliar with sitemaps, visit this article to learn more.
At Canopas, we wanted a sitemap for our job posting website, in which each job has a separate page.
Initially, we tried several solutions, including plugins and open-source repositories, but they either had not been updated recently or did not meet our requirements, even though those requirements are very simple.
So eventually we decided to create an API for generating a sitemap and to call it with a shell script when building the web app. The script puts the sitemap.xml file in the website’s public directory.
Go’s encoding/xml package can generate XML output from structs, so we created the API in Go.
Here is the XML structure of a simple sitemap. It’s simple! Basically, a sitemap is a set of URLs.
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://canopas.com</loc>
    <changefreq>monthly</changefreq>
    <lastmod>2022-03-01T00:00:00.000Z</lastmod>
    <priority>1</priority>
  </url>
  <url>
    <loc>https://canopas.com/jobs</loc>
    <changefreq>monthly</changefreq>
    <lastmod>2022-03-01T00:00:00.000Z</lastmod>
    <priority>1</priority>
  </url>
  <url>
    <loc>https://canopas.com/contact</loc>
    <changefreq>monthly</changefreq>
    <lastmod>2022-03-01T00:00:00.000Z</lastmod>
    <priority>0.9</priority>
  </url>
  <url>
    <loc>https://blog.canopas.com</loc>
    <changefreq>monthly</changefreq>
    <lastmod>2022-03-01T00:00:00.000Z</lastmod>
    <priority>0.8</priority>
  </url>
</urlset>
I’m assuming you have basic knowledge of gin, routers, databases, and CI/CD configuration.
To start, create a new Go project and add the database and gin engine configuration in the main.go file.
Now, let’s add code to generate an XML sitemap using an API with the following steps.
All sitemap URLs contain loc, changefreq, lastmod and priority fields, so we will create a URL structure that contains all of them.
We also have to give this structure the name of its XML element, which is url. For that, we add an XMLName field to the URL structure.
package sitemap

import "encoding/xml"

// URL represents a single <url> entry of the sitemap.
type URL struct {
	XMLName    xml.Name `xml:"url"`
	Loc        string   `xml:"loc"`
	ChangeFreq string   `xml:"changefreq"`
	LastMod    string   `xml:"lastmod"`
	Priority   string   `xml:"priority"`
}
There is one more node, the parent urlset. We have to create a structure for that as well.
type URLSet struct {
	XMLName xml.Name `xml:"urlset"`
	XMLNS   string   `xml:"xmlns,attr"`
	URL     []URL    `xml:"url"`
}
We can add an XML node’s attributes, like XMLNS, using the attr option in the struct tag, as shown in the structure. URLSet holds a slice of the website’s URLs inside the urlset node.
We configured the database earlier, so now it’s time to get all jobs for the website.
package jobs

import (
	"db"

	log "github.com/sirupsen/logrus"
)

type Job struct {
	Id          int    `json:"id"`
	Title       string `json:"title"`
	Description string `json:"description"`
}

func GetJobs() (jobs []Job, err error) {
	err = db.Select(&jobs, `SELECT id, title, description FROM jobs WHERE is_active = 1`)
	if err != nil {
		log.Error(err)
		return nil, err
	}
	return jobs, nil
}
The API receives the website’s base URL as the baseUrl query parameter.
baseUrl := c.Query("baseUrl")
Form all required static and dynamic URLs for the sitemap as shown below.
changefreq and lastmod will be the same for all URLs, so we will add them after forming the array of all static and dynamic URLs.
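jobsUrl := baseUrl + "/jobs"
// Init all static urls for sitemap
sitemapUrls := []URL{
	{Loc: baseUrl, Priority: "1"},
	{Loc: jobsUrl, Priority: "1"},
}
// Init all dynamic urls for sitemap
jobs, err := GetJobs()
if err != nil {
	c.AbortWithStatus(http.StatusInternalServerError)
	return
}
for i := range jobs {
	// Id is an int, so convert it before concatenating (requires the "strconv" import)
	sitemapUrls = append(sitemapUrls, URL{Loc: jobsUrl + "/" + strconv.Itoa(jobs[i].Id), Priority: "0.9"})
}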
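The URL-forming step can also be pulled out of the handler into a plain function, which makes it easy to check on its own. The following is a sketch under assumptions: buildSitemapURLs is a hypothetical name, the URL struct is reduced to the two fields used here, and the job IDs are passed in directly instead of coming from the database.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// URL is a reduced version of the sitemap entry, for illustration only.
type URL struct {
	Loc      string
	Priority string
}

// buildSitemapURLs (hypothetical helper) returns the static pages first,
// followed by one entry per job ID.
func buildSitemapURLs(baseURL string, jobIDs []int) []URL {
	// Trim a trailing slash so we never produce "https://example.com//jobs".
	baseURL = strings.TrimRight(baseURL, "/")
	jobsURL := baseURL + "/jobs"

	urls := []URL{
		{Loc: baseURL, Priority: "1"},
		{Loc: jobsURL, Priority: "1"},
	}
	for _, id := range jobIDs {
		// Job IDs are ints, so they must be converted before concatenation.
		urls = append(urls, URL{Loc: jobsURL + "/" + strconv.Itoa(id), Priority: "0.9"})
	}
	return urls
}

func main() {
	for _, u := range buildSitemapURLs("https://example.com/", []int{7, 42}) {
		fmt.Println(u.Loc, u.Priority)
	}
}
```

Keeping this logic free of gin and database calls is a design choice for testability; the handler then only has to fetch the jobs and pass their IDs in.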
Now it’s time to add lastmod and changefreq.
We treat changefreq as monthly for every URL, so we set lastmod to the first day of the current month.
// Add changefreq and lastmod to all urls
year, month, _ := time.Now().UTC().Date() // current year and month in UTC
// The time is always midnight, so the literal zeros in the layout match the actual value
lastmod := time.Date(year, month, 1, 0, 0, 0, 0, time.UTC).Format("2006-01-02T00:00:00.000Z")
for i := range sitemapUrls {
	sitemapUrls[i].ChangeFreq = "monthly"
	sitemapUrls[i].LastMod = lastmod
}
Set the XML content type and return the XML response from the API.
urlset := URLSet{URL: sitemapUrls, XMLNS: "http://www.sitemaps.org/schemas/sitemap/0.9"}
c.Header("Content-Type", "application/xml")
c.XML(http.StatusOK, urlset)
Here is the full source code of sitemap generation in Go.
Create an API route using this function and call it when building the website.
router := gin.Default()
router.GET("/sitemap", GenerateSitemap)
The next step is to create a sitemap.sh file in the website project and add the following commands to it.
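#!/bin/bash
set -e
BASE_URL=$1
BASE_API_URL=$2
API_URL="$BASE_API_URL/sitemap?baseUrl=$BASE_URL"
# -f makes curl exit non-zero on HTTP errors, so set -e stops the build
xml=$(curl -fsS -X GET --header "Accept: */*" "$API_URL")
# Overwrite instead of appending, so reruns do not leave two documents in one file
echo "$xml" > public/sitemap.xml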
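In the above snippet:
BASE_URL is the website’s base URL, passed as the first argument.
BASE_API_URL is the base URL of the API, passed as the second argument.
API_URL is the full URL of the sitemap endpoint, with baseUrl as a query parameter.
echo writes the fetched XML to public/sitemap.xml.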
Run this shell script in your CI/CD configuration, just before npm run build.
sh sitemap.sh https://example.com http://localhost:8080
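If you would rather keep the build step in Go as well, the same fetch-and-write flow can be sketched with the standard library. This is an illustrative alternative, not what we run: the function names are hypothetical, the baseUrl value is query-escaped (which the shell script does not do), and the response is parsed before being written so that an error page never ends up in public/sitemap.xml.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strings"
)

// sitemapRequestURL builds <apiBase>/sitemap?baseUrl=<siteBase> with the
// query value escaped.
func sitemapRequestURL(apiBase, siteBase string) (string, error) {
	u, err := url.Parse(apiBase)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/sitemap"
	u.RawQuery = url.Values{"baseUrl": {siteBase}}.Encode()
	return u.String(), nil
}

// countSitemapURLs parses the document and returns the number of <url>
// entries; it fails if the root element is not <urlset>.
func countSitemapURLs(data []byte) (int, error) {
	var set struct {
		XMLName xml.Name `xml:"urlset"`
		URLs    []struct {
			Loc string `xml:"loc"`
		} `xml:"url"`
	}
	if err := xml.Unmarshal(data, &set); err != nil {
		return 0, err
	}
	return len(set.URLs), nil
}

// run fetches the sitemap, validates it, and writes it to outPath.
func run(siteBase, apiBase, outPath string) error {
	reqURL, err := sitemapRequestURL(apiBase, siteBase)
	if err != nil {
		return err
	}
	resp, err := http.Get(reqURL)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	n, err := countSitemapURLs(data)
	if err != nil {
		return err
	}
	fmt.Printf("writing %d URLs to %s\n", n, outPath)
	return os.WriteFile(outPath, data, 0o644)
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: sitemap <site-base-url> <api-base-url>")
		return
	}
	if err := run(os.Args[1], os.Args[2], "public/sitemap.xml"); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

It takes the same two arguments, in the same order, as sitemap.sh.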
Cheers, we have automated dynamic sitemap generation with CI/CD.
We hope this helps you get started with automating sitemap generation. This is a simple way to generate a sitemap without using any external sitemap library. Once you have set it up, you will not have to generate the sitemap by hand again.
We’re grateful to have you with us on this journey!
Suggestions and feedback are more than welcome!
Please reach us at Canopas Twitter handle @canopas_eng with your content or feedback. Your input enriches our content and fuels our motivation to create more valuable and informative articles for you.