PHP: Get Page URLs / Directory Listings
A simple function that processes a web page or file and returns an array of the URLs found. I created it mainly to make it easy to process “Directory Listings” pages, which show all the files and folders in a website's directory when there isn't a default/index page and directory listings are enabled.
I have used it to find images on another website (which has directory listings enabled) and display the found URLs / images on a different site. It is up to you how you use it, but I am not sure what its limits are in terms of URLs per page.
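function sr_get_urls($url){
    // Fetch the page (or read the file); give up if it cannot be read
    $file = file_get_contents($url);
    if($file === false) return false;
    // Match every <a ... href="...">...</a> tag; the href value ends up in group 2
    $regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>";
    if(preg_match_all("/$regexp/siU", $file, $matches)) {
        $return_array = array();
        // Starts at 1, so the first link found is skipped (on a directory listing this is usually the parent directory link)
        for($a=1; $a < sizeof($matches[2]); $a++){
            // Keep only links without a slash, i.e. files in this directory rather than sub-folders or full paths
            if(strpos($matches[2][$a], '/') === false) $return_array[] = urldecode($matches[2][$a]);
        }
        if(sizeof($return_array) <= 0) return false;
        return $return_array;
    }
    return false;
}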
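As a rough sketch of the image use case mentioned above, the array that sr_get_urls() returns could be filtered down to image files and turned back into full URLs. The helper name sr_filter_images, the extension list, and the example.com address are all assumptions made for illustration and are not part of the original function. A hard-coded array stands in for the function's result so the snippet runs without a network connection.

```php
<?php
// Hypothetical helper (not part of the original function): keeps only the
// entries that look like images and prefixes them with the listing's URL.
function sr_filter_images(array $urls, $base_url, array $extensions = array('jpg', 'jpeg', 'png', 'gif')) {
    $images = array();
    foreach ($urls as $name) {
        // Compare the lower-cased file extension against the allowed list
        $ext = strtolower(pathinfo($name, PATHINFO_EXTENSION));
        if (in_array($ext, $extensions, true)) {
            // rawurlencode() re-applies the escaping that urldecode() removed
            $images[] = rtrim($base_url, '/') . '/' . rawurlencode($name);
        }
    }
    return $images;
}

// Example input, standing in for the array sr_get_urls() would return
$found = array('photo one.jpg', 'notes.txt', 'logo.PNG');
$images = sr_filter_images($found, 'http://example.com/images/');

// Print one <img> tag per image found
foreach ($images as $src) {
    echo '<img src="' . htmlspecialchars($src) . '" alt="">' . "\n";
}
```

With a real listing, $found would come from sr_get_urls() called with the listing's URL. Since that function returns false when nothing is found, check the result before passing it to the helper.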