A sitemap is a model of a website's content, designed to help users and search engines navigate the site. It can be a hierarchical list of pages (with links) organized by topic, an organizational chart, or an XML document that lists the site's URLs for search engine crawlers.
To analyze the URLs listed in an XML sitemap with PHP, we first need to read the sitemap.xml file.

Code

PHP has no sitemap-specific reader, but a sitemap is ordinary XML, so the built-in SimpleXML extension can parse it and we do not need to read the file manually with fopen. The function we need is simplexml_load_file.

This program loads the sitemap.xml file and then collects the value of every <loc> element into an array.

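$xml = simplexml_load_file('sitemap.xml'); // returns false on failure
if ($xml === false) {
    exit('Could not read or parse sitemap.xml');
}
$urls = array();
foreach ($xml->url as $url) {
    // Cast to string: $url->loc is a SimpleXMLElement object, not plain text.
    $urls[] = (string) $url->loc;
}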

The array $urls now contains the address of every page listed in the sitemap.
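
Once the addresses are in an array, a simple analysis becomes possible. The sketch below is one way this might look, not a definitive implementation. It assumes a small inline sample sitemap with example.com addresses (hypothetical data, parsed with simplexml_load_string so the example runs without a file), a hypothetical helper named extractSitemapUrls, and an illustrative analysis that counts pages per first path segment.

```php
<?php
// Hypothetical sample data; a real sitemap would be loaded from a file instead.
$sample = <<<XML
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/first-post</loc></url>
  <url><loc>https://example.com/blog/second-post</loc></url>
</urlset>
XML;

// Hypothetical helper: returns the <loc> values as plain strings,
// or an empty array when the input is not well-formed XML.
function extractSitemapUrls(string $xmlText): array
{
    // Collect libxml parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($xmlText);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($xml === false) {
        return [];
    }

    $urls = [];
    foreach ($xml->url as $url) {
        // Cast each SimpleXMLElement to a string and drop stray whitespace.
        $urls[] = trim((string) $url->loc);
    }
    return $urls;
}

$urls = extractSitemapUrls($sample);

// Illustrative analysis: count how many URLs fall under each first path segment.
$sections = [];
foreach ($urls as $u) {
    $path = parse_url($u, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = '/'; // no path component, or a URL parse_url could not handle
    }
    $segments = explode('/', trim($path, '/'));
    $key = $segments[0] === '' ? '(root)' : $segments[0];
    $sections[$key] = ($sections[$key] ?? 0) + 1;
}

print_r($sections);
```

For the sample data, $sections holds one entry for the home page under "(root)" and two under "blog". Casting to string matters here because functions such as parse_url expect plain strings rather than SimpleXMLElement objects.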