What are an XML Sitemap and robots.txt?
XML Sitemap:
- “XML” stands for “Extensible Markup Language”.
- A “Sitemap” or “XML Sitemap” is a file on a website that helps search engines discover, crawl, and index the site's content.
- It lists the important pages of the website.
- Search engines such as Google, Yahoo, and Bing use the sitemap to find the pages of a website.
- It plays an important role in SEO.
- It tells Google which pages are available to crawl and what kind of information they hold. It is a hint to the crawler, so it does not guarantee that every listed page is indexed.
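To give a sense of what a sitemap looks like and how a crawler might read it, here is a minimal sketch. The sample file, its URLs, and the `lastmod` date are placeholders, and `list_sitemap_urls` is a hypothetical helper name; only the `urlset`/`url`/`loc` structure and the namespace come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample sitemap; the URLs and the date are placeholders.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.domain.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.domain.com/blog/post-title/</loc>
  </url>
</urlset>
"""

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(xml_text):
    """Return the <loc> value of every <url> entry in a sitemap."""
    # Encode first: the text carries its own encoding declaration.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


for page in list_sitemap_urls(SAMPLE_SITEMAP):
    print(page)
```

Each `<url>` entry needs a `<loc>`; `<lastmod>` is optional.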
Types of XML Sitemap:
There are four types of XML sitemap:
- Image Sitemap
- Video Sitemap
- News Sitemap
- Normal Sitemap
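The specialised types extend a normal sitemap with extra tags from their own namespace. As a rough sketch of the image variant only, the code below attaches image URLs to a page entry using Google's image sitemap namespace. The function name `build_image_sitemap` and the example URLs are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialise with a default namespace plus an "image:" prefix.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(page_url, image_urls):
    """Return sitemap XML for one page and the images that appear on it."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    for image_url in image_urls:
        image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
        ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs, not taken from a real site.
print(build_image_sitemap(
    "https://www.domain.com/gallery/",
    ["https://www.domain.com/img/one.jpg", "https://www.domain.com/img/two.jpg"],
))
```

Video and news sitemaps follow the same idea with their own namespaces and tags, which are not shown here.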
How to create an XML Sitemap for a website?
- For WordPress sites, the Yoast SEO plugin is a popular way to create a sitemap because it updates the sitemap automatically.
- Many online tools can also create a sitemap.
- The steps below describe a free and simple way to create one.
Step-1: Search for “XML Sitemap” on Google.
Step-2: Click the first link, “www.xml-sitemaps.com”.
Step-3: Enter your website URL.
Step-4: The tool creates an XML file that you can download.
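If you would rather generate the file yourself than use an online tool, the same result can be sketched in a few lines of Python. This is an illustration under assumptions, not what any particular generator does: `build_sitemap` is a hypothetical helper, and the page list and date are placeholders.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(page_urls, last_modified=None):
    """Return sitemap XML (as a string) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in page_urls:
        url = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as "&" in the URL text.
        ET.SubElement(url, "loc").text = page_url
        if last_modified is not None:
            ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages; replace them with the real URLs of the site.
pages = [
    "https://www.domain.com/",
    "https://www.domain.com/blog/post-title/",
]
print(build_sitemap(pages, last_modified=date(2024, 1, 15)))
```

Save the returned text as `sitemap.xml` with UTF-8 encoding so it matches the declaration on the first line.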
How to submit a sitemap in Google Search Console?
Step-1: Download the XML file for your website, created as explained above.
Step-2: Upload the XML file to the root folder of your website.
Step-3: Check your sitemap by opening “domain.com/sitemap.xml” in a browser.
Step-4: Go to Google Search Console > Sitemaps and paste your sitemap link.
Step-5: Click the “SUBMIT” button.
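Before submitting, it can help to run a quick sanity check on the file. The sketch below is not an official validator; it checks only a few things the sitemaps.org protocol requires (well-formed XML, a `urlset` root, a `<loc>` in every entry, and at most 50,000 URLs per file). The name `check_sitemap` and the sample data are assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file limit in the sitemaps.org protocol


def check_sitemap(xml_bytes):
    """Return a list of problems; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        return [f"unexpected root element: {root.tag}"]

    problems = []
    entries = root.findall(f"{{{SITEMAP_NS}}}url")
    if len(entries) > MAX_URLS:
        problems.append(f"too many URLs: {len(entries)}")
    for index, entry in enumerate(entries, start=1):
        loc = entry.find(f"{{{SITEMAP_NS}}}loc")
        text = (loc.text or "").strip() if loc is not None else ""
        if not text:
            problems.append(f"entry {index} has no <loc>")
        elif not text.startswith(("http://", "https://")):
            problems.append(f"entry {index} is not an absolute URL: {text}")
    return problems


# Placeholder content standing in for the uploaded sitemap.xml.
sample = (
    b'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    b"<url><loc>https://www.domain.com/</loc></url>"
    b"<url><loc>/relative/path</loc></url>"
    b"</urlset>"
)
print(check_sitemap(sample))
```

In practice you would read the bytes from the uploaded file or from the live URL instead of the inline sample.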
robots.txt:
- The robots.txt file is the file used by the robots exclusion protocol.
- It is a text file that instructs web robots how to crawl the pages of a website.
- It tells web robots which pages to crawl and which pages not to crawl.
- Basic syntax of a robots.txt file:
    User-agent: *
    Disallow: /

    User-agent: Googlebot
    Allow: /
- Basic format of a robots.txt file:
    Sitemap: https://www.domain.com/sitemap.xml
    User-agent: *
    Disallow: /blog/
    Allow: /blog/post-title/
- To view the robots.txt file of a website, add “/robots.txt” to its domain:
https://www.domain.com/robots.txt
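To see how a crawler interprets such rules, here is a small sketch using Python's standard `urllib.robotparser` module. It parses the basic syntax example (everything blocked for all robots, everything allowed for Googlebot) from an inline string, so no network access is needed. `SomeOtherBot` is a made-up user agent used only for comparison.

```python
from urllib.robotparser import RobotFileParser

# The rules from the basic syntax example, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://www.domain.com/blog/"
print(parser.can_fetch("Googlebot", url))     # True: the Googlebot group allows everything
print(parser.can_fetch("SomeOtherBot", url))  # False: falls back to the "*" group
```

To check a live site, call `parser.set_url("https://www.domain.com/robots.txt")` followed by `parser.read()` instead of `parse()`. One caveat: this module applies the first rule that matches, while Google applies the most specific (longest) matching rule, so results can differ when Allow and Disallow rules overlap.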
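Why do we need a robots.txt file?
- To prevent the crawling of duplicate content.
- To prevent server overload from crawler requests.
- To keep crawlers out of sections of the website that should stay out of search results. It is not an access control, so truly private content still needs authentication.
- To specify the location of the sitemap.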
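These goals translate into only a few lines of robots.txt. As a minimal sketch, the hypothetical helper below assembles a file that blocks some paths for every robot and points to the sitemap. The paths `/search/` and `/private/` are invented for illustration.

```python
def build_robots_txt(disallowed_paths, sitemap_url=None, user_agent="*"):
    """Return robots.txt text with one rule group and an optional sitemap line."""
    lines = [f"User-agent: {user_agent}"]
    if disallowed_paths:
        lines.extend(f"Disallow: {path}" for path in disallowed_paths)
    else:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    if sitemap_url:
        lines.extend(["", f"Sitemap: {sitemap_url}"])
    return "\n".join(lines) + "\n"


# Hypothetical paths: duplicate search-result pages and a section kept out of search.
print(build_robots_txt(
    ["/search/", "/private/"],
    sitemap_url="https://www.domain.com/sitemap.xml",
))
```

The generated text would be saved as `robots.txt` in the root folder of the website, next to `sitemap.xml`.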