PHP CLI: Recursive Search and Replace
UniServer 5.0-Nano PHP CLI.
PHP CLI Recursive Search and Replace
The previous page covered search and replace in a single file. That is adequate for most applications, but occasionally you will need to search a folder and all its sub-folders. Not a problem: download one of the many classes that can be found on the Internet.
I find they come with too many bells and whistles, hence this page covers a basic PHP script that searches and replaces text in files in any folder and its sub-folders using a recursive function.
By now you will appreciate that I like working code examples; you can hack these around and tailor them to your own applications. In keeping with this, the following examples look at the problems associated with a recursive design in PHP.
Initial test setup
Edit our two test files Run.bat and test_1.php contained in folder UniServer to have the following content:
Run.bat

    TITLE CLI TEST BAT
    COLOR B0
    @echo off
    cls
    echo.
    usr\local\php\php.exe -n test_1.php
    echo.
    pause
Note 1: UniServer Mona users change paths as shown:
All scripts on this page will require the above change.
test_1.php

    <?php
    $sfolder = "./usr/local/mysql";     // Start folder
    $Array=recur_dir($sfolder);         // Retrieve file list
    foreach($Array as $line){           // Print list
      echo $line."\n";
    }

    //=== Recursive Directory ==============================================
    function recur_dir($dir){
      $dirlist = opendir($dir);                       // Open start directory
      while (($file = readdir($dirlist)) !== false){  // Iterate through list
        $newpath = $dir.'/'.$file;                    // Create path. Either dir or file
        $Array[]= $newpath;                           // Save full file path to array
      }
      closedir($dirlist);                             // Close handle
      return $Array;                                  // Return array of files for
    }                                                 // further processing
    //========================================== END Recursive Directory =====
    exit(0);
    ?>
Recursive function:
Run the batch file (double click on Run.bat). Result as follows:

    ./usr/local/mysql/.
    ./usr/local/mysql/..
    ./usr/local/mysql/bin
    ./usr/local/mysql/data
    ./usr/local/mysql/my.cnf
    ./usr/local/mysql/share

    Press any key to continue . . .
The output contains a mixture of folders (bin, data, share) and files (my.cnf in this listing). You will notice two special sub-directory names, [.] and [..]. These are normally hidden; note that every folder contains them. A single period [.] means "the current directory". Two periods [..] means "the directory which contains the current directory", also known as the parent directory. They are useful for navigation, however they will cause problems if not removed from a directory listing: a recursive function that follows them would loop back into the same folders forever.
Note: The above script is only a skeleton and needs refining.
Refine - Remove and separate
The two special sub-folders need to be removed from any listings. Files and folders require separation. The following example is the next step before looking at recursion.
Edit test_1.php to have the following content:
test_1.php

    <?php
    $sfolder = "./usr/local/mysql";     // Start folder
    $Array=recur_dir($sfolder);         // Retrieve file list
    foreach($Array as $line){           // Print list
      echo $line."\n";
    }

    //=== Recursive Directory ==============================================
    function recur_dir($dir){
      $dirlist = opendir($dir);                       // Open start directory
      while (($file = readdir($dirlist)) !== false){  // Iterate through list
        if ($file != '.' && $file != '..'){           // Skip if . or ..
          $newpath = $dir.'/'.$file;                  // Create path. Either dir or file
          if (is_dir($newpath)){                      // Is it a folder
            recur_dir($newpath);                      // yes: Repeat this function
          }                                           //      for that new folder
          else{                                       // no: It's a file
            $Array[]= $newpath;                       // Save full file path to array
          }
        }
      }
      closedir($dirlist);                             // Close handle
      return $Array;                                  // Return array of files for
    }                                                 // further processing
    //========================================== END Recursive Directory =====
    exit(0);
    ?>

Run the batch file (double click on Run.bat). Result as follows:

    ./usr/local/mysql/my.cnf

Well, what a pain! What happened to the recursion? After all, the function is calling itself.
Modification
    if (is_dir($newpath)){                      // Is it a folder
      recur_dir($newpath);                      // yes: Repeat this function
      echo "Path = ".$newpath."\n";
    }                                           //      for that new folder

The added echo line prints $newpath for every folder the function finds. Run the batch file.
Result
    Path = ./usr/local/mysql/bin
    Path = ./usr/local/mysql/data/mysql
    Path = ./usr/local/mysql/data/phpmyadmin
    Path = ./usr/local/mysql/data
    Path = ./usr/local/mysql/share/charsets
    Path = ./usr/local/mysql/share/english
    Path = ./usr/local/mysql/share
    ./usr/local/mysql/my.cnf

    Press any key to continue . . .
Every sub-folder is listed, so the recursion itself works: the function has been called for each folder, and every file must have passed through the while loop, otherwise the script would still be stuck in it. A folder's Path line is printed only after the recursive call for that folder returns, which is why data/mysql and data/phpmyadmin appear before data. The script works down a folder chain until no more folders are found, then works back up the chain, finishing with the starting folder. So why are only the files in the starting directory listed?
Solutions
To answer the question: local variables, including local arrays, are not shared between function calls. Every time the function calls itself, the new call creates its own $Array, and because the script ignores the value the recursive call returns, everything stored in that array is lost when the call ends. One solution is a static array. PHP allows static variables inside ordinary functions as well as in classes, but a static array keeps its contents between separate top-level calls too, so it must be cleared before each new search.
Another solution is to use a global array. It's not neat because it is detached from the function, hence the need to remember its name and to clear it before use.
A neater solution is to pass the array when calling the function; this keeps the array alive and the data intact. See the next example.
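One further option, not covered above, is to keep the array local and merge in whatever each recursive call returns. The following is only a minimal sketch: the function name list_files() is hypothetical, and the small folder tree is created under the system temp directory purely for the demonstration.

```php
<?php
// Hypothetical helper (not one of the tutorial scripts): returns every file
// below $dir by merging the array returned from each recursive call.
function list_files($dir) {
    $files = array();                               // Local to this call
    $dirlist = opendir($dir);
    if ($dirlist === false) {
        return $files;                              // Unreadable folder: nothing to add
    }
    while (($file = readdir($dirlist)) !== false) { // Iterate through list
        if ($file === '.' || $file === '..') {
            continue;                               // Skip the special entries
        }
        $newpath = $dir . '/' . $file;
        if (is_dir($newpath)) {
            // Keep what the recursive call found instead of discarding it
            $files = array_merge($files, list_files($newpath));
        } else {
            $files[] = $newpath;
        }
    }
    closedir($dirlist);
    return $files;
}

// Demonstration tree: one file at the top level, one in a sub-folder
$demo = sys_get_temp_dir() . '/recur_demo_' . uniqid();
mkdir($demo . '/sub', 0777, true);
file_put_contents($demo . '/top.txt', 'a');
file_put_contents($demo . '/sub/deep.txt', 'b');

$found = list_files($demo);
sort($found);
foreach ($found as $line) {
    echo $line . "\n";
}
```

Merging copies arrays on every return, so for very large trees the pass-by-reference approach described above is likely to be cheaper.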
Refine - pass array back to function
The solution is to pass the array back to the function during a recursive call as follows:
Edit test_1.php to have the following content:
test_1.php

    <?php
    $sfolder = "./usr/local/mysql";     // Start folder
    $Array=recur_dir($sfolder);         // Retrieve file list
    foreach($Array as $line){           // Print list
      echo $line."\n";
    }


    //=== Recursive Directory ==============================================
    function recur_dir($dir,&$Array){
      $dirlist = opendir($dir);                       // Open start directory
      while (($file = readdir($dirlist)) !== false){  // Iterate through list
        if ($file != '.' && $file != '..'){           // Skip if . or ..
          $newpath = $dir.'/'.$file;                  // Create path. Either dir or file
          if (is_dir($newpath)){                      // Is it a folder
            recur_dir($newpath,$Array);               // yes: Repeat this function
          }                                           //      for that new folder
          else{                                       // no: It's a file
            $Array[]= $newpath;                       // Save full file path to array
          }
        }
      }
      closedir($dirlist);                             // Close handle
      return $Array;                                  // Return array of files for
    }                                                 // further processing
    //========================================== END Recursive Directory =====
    exit(0);
    ?>

Result:

    Warning: Missing argument 2 for recur_dir(), called in ... test_1.php on line 3 and defined in ... test_1.php on line 10

Note: The warning is followed by a list of all files, including sub-folder files.

Generally warnings are not an issue, however this one is a pain and needs to be resolved. It occurs because of a parameter mismatch: the function is defined with two parameters but the initial call passes only one. (From PHP 7.1 onwards the same mismatch is a fatal ArgumentCountError rather than a warning.) We have seen this before; the solution is to give the second parameter a default value. Change this line:

    function recur_dir($dir,&$Array){

To:

    function recur_dir($dir,&$Array=false){

The initial call to the function passes a single parameter; recursive calls pass two parameters.

Essentially that's it for recursion. All that is required is to add some filtering; see the next section.
Refine - Add file filtering
Filtering files can be achieved using the function preg_match($pattern_regex, $string_to_search).
The pattern format was covered with preg_replace(): '/regex_pattern/'
Hence to filter files with a specific extension use something like this:

    '/(\.txt|\.cnf|\.conf)/'

The pattern is delimited using '/'. The alternatives are separated by the vertical bar (a special character meaning "or") and grouped between brackets. The period (full stop) is a special regex character, hence requires escaping with a backslash.
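One thing to be aware of: the pattern is not anchored, so it matches ".txt", ".cnf" or ".conf" anywhere in the path, including in the middle of a file name or in a folder name. The sketch below compares it with a stricter variant that only accepts the extension at the very end of the path. The stricter pattern and the sample paths are my own assumptions for illustration, not part of the original script.

```php
<?php
$loose  = '/(\.txt|\.cnf|\.conf)/';     // Pattern from the text: matches anywhere
$strict = '/\.(txt|cnf|conf)$/i';       // Assumed variant: extension at end, any case

// Made-up sample paths
$paths = array(
    './usr/local/mysql/my.cnf',
    './usr/local/apache2/conf/httpd.conf',
    './docs/notes.txt.bak',             // Contains ".txt" but is a .bak file
    './site.conf.d/index.php',          // ".conf" appears in a folder name
);

foreach ($paths as $path) {
    $l = preg_match($loose, $path)  ? 'yes' : 'no';
    $s = preg_match($strict, $path) ? 'yes' : 'no';
    echo str_pad($path, 40) . " loose=$l strict=$s\n";
}
```

The first two paths match both patterns; the last two match only the loose one. Whether that matters depends on the folders being searched.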
Final recursive file search
Edit test_1.php to have the following content:
test_1.php

    <?php
    $sfolder = "./usr/local";                // Start folder
    $File_list_array=recur_dir($sfolder);    // Retrieve file list
    foreach($File_list_array as $line){      // Print list
      echo $line."\n";
    }

    //=== Recursive Directory ==============================================
    function recur_dir($dir,&$Array=false){
      $f_str='/(\.txt|\.cnf|\.conf)/';                // Filter, required files
      $dirlist = opendir($dir);                       // Open start directory
      while (($file = readdir($dirlist)) !== false){  // Iterate through list
        if ($file != '.' && $file != '..'){           // Skip if . or ..
          $newpath = $dir.'/'.$file;                  // Create path. Either dir or file
          if (is_dir($newpath)){                      // Is it a folder
            recur_dir($newpath,$Array);               // yes: Repeat this function
          }                                           //      for that new folder
          else{                                       // no: It's a file
            if (preg_match($f_str, $newpath)){        // Filter extension. Required files
              $Array[]= $newpath;                     // Save full file path to array
            }                                         // includes file name
          }
        }
      }
      closedir($dirlist);                             // Close handle
      return $Array;                                  // Return array of files for
    }                                                 // further processing
    //========================================== END Recursive Directory =====
    exit(0);
    ?>

Note 1: The start folder was moved up a level, allowing more folders to be searched. Outside of the function $Array was changed to $File_list_array to avoid confusion.

Note 2: The array is passed to the function using the reference operator (&), as in &$Array=false. This is referred to as passing by reference. It looks a little odd because the array is not created until a value is assigned to it, and if it is not created it cannot be passed to the function for recursion. What the & operator does is bind the parameter $Array to the caller's variable, so every recursive call adds to the same array. On the first call no second argument is passed, so $Array starts out with the default value false and becomes an array when the first file path is assigned to it.

File filtering is performed using preg_match(); if a match is found the file path is saved to $Array:

    if (preg_match($f_str, $newpath)){
      $Array[]= $newpath;
    }

Complete: Essentially that completes the recursive file search template. You can now add replace code either externally to the function or convert it to perform both search and replace.
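Passing by reference with a default value can be seen in isolation in the sketch below. The function collect() is hypothetical and does nothing useful beyond showing the mechanism. It also uses an empty array as the default instead of false, which is my own variation: newer PHP releases (8.1 onwards) flag the automatic conversion of false to an array as deprecated, and an empty array is still safe to loop over when nothing was found.

```php
<?php
// Hypothetical function, for illustration only: appends a label to the array
// passed by reference, then calls itself until $depth reaches zero.
function collect($item, $depth, &$list = array()) {
    $list[] = $item . $depth;                 // Add to the shared array
    if ($depth > 0) {
        collect($item, $depth - 1, $list);    // Same array, passed by reference
    }
    return $list;
}

$result = collect('level', 2);   // Initial call: no array passed, default is used
print_r($result);
```

All three entries end up in one array because each recursive call writes to the array owned by the first call rather than to a private copy.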
Result of running the above script
    ./usr/local/apache2/conf/httpd.conf
    ./usr/local/apache2/conf/ssl.conf
    ./usr/local/apache2/LICENSE.txt
    ./usr/local/mysql/my.cnf

    Press any key to continue . . .
Search and replace example 1
Edit test_1.php to have the following content:
test_1.php

    <?php
    $s_str   = '/\nListen\s\d+/';            // String to search for
    $r_str   = "\nListen 8080";              // Replacement string
    $sfolder = "./usr/local";                // Start folder

    $File_list_array=recur_dir($sfolder);    // Retrieve file list

    foreach($File_list_array as $sfile){     // Scan file list
      $fh = fopen($sfile, 'r');                        // Open file for read
      $Data = fread($fh, filesize($sfile));            // Read all data into variable
      fclose($fh);                                     // Close file handle
      $Data = preg_replace($s_str, $r_str, $Data);     // Search and replace
      $fh = fopen($sfile, 'w');                        // Open file for write
      fwrite($fh, $Data);                              // Write to file
      fclose($fh);                                     // Close file handle
    }

    //=== Recursive Directory ==============================================
    function recur_dir($dir,&$Array=false){
      $f_str='/(\.txt|\.cnf|\.conf)/';                // Filter, required files
      $dirlist = opendir($dir);                       // Open start directory
      while (($file = readdir($dirlist)) !== false){  // Iterate through list
        if ($file != '.' && $file != '..'){           // Skip if . or ..
          $newpath = $dir.'/'.$file;                  // Create path. Either dir or file
          if (is_dir($newpath)){                      // Is it a folder
            recur_dir($newpath,$Array);               // yes: Repeat this function
          }                                           //      for that new folder
          else{                                       // no: It's a file
            if (preg_match($f_str, $newpath)){        // Filter extension. Required files
              $Array[]= $newpath;                     // Save full file path to array
            }                                         // includes file name
          }
        }
      }
      closedir($dirlist);                             // Close handle
      return $Array;                                  // Return array of files for
    }                                                 // further processing
    //========================================== END Recursive Directory =====
    exit(0);
    ?>
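Before letting the script loose on real files it can help to try the search pattern on a plain string. The sketch below does that with the same $s_str and $r_str; the sample text is made up and only loosely resembles an Apache configuration file.

```php
<?php
$s_str = '/\nListen\s\d+/';              // Search pattern from the script
$r_str = "\nListen 8080";                // Replacement string from the script

// Made-up sample text, not taken from a real configuration file
$Data = "# Sample\nListen 80\nServerName localhost\n#Listen 443\n";

// The optional fifth argument receives the number of replacements made
$Data = preg_replace($s_str, $r_str, $Data, -1, $count);

echo $Data;
echo "Replacements: " . $count . "\n";
```

Because the pattern starts with \n it only matches Listen at the beginning of a line, so the commented-out #Listen 443 is left alone. For the same reason a Listen directive on the very first line of a file, with no newline before it, would not be matched.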
To perform a global search and replace:
Search and replace example 2
The previous examples were designed to demonstrate certain concepts and potential issues in a recursive function design. Interestingly, when the function is made what I refer to as self-contained, most of the issues disappear.
There was no real need to return an array containing a list of files. Having found a matching file, why not just perform the string search and replace there and then? All that is required is to throw parameters at the function and let it get on with the job. This example does that; I have also changed a few names to make them more meaningful.
Edit test_1.php to have the following content:
test_1.php

    <?php
    $start_dir   = './usr/local';               // Start folder
    $file_type   = '/(\.txt|\.cnf|\.conf)/';    // Filter, required files
    $search_str  = '/\nListen\s\d+/';           // String to search for
    $replace_str = "\nListen 8080";             // Replacement string

    if(file_sr_global($start_dir,$file_type,$search_str,$replace_str)){
      echo "\n Search and replace complete\n";
    }

    //=== Recursive File Search and replace =======================================
    function file_sr_global($start_dir,$file_type,$search_str,$replace_str){
      $dirlist = opendir($start_dir);                 // Open start directory
      while (($file = readdir($dirlist)) !== false){  // Iterate through list
        if ($file != '.' && $file != '..'){           // Skip if . or ..
          $newpath = $start_dir.'/'.$file;            // Create path. Either dir or file
          if (is_dir($newpath)){                      // Is it a folder
                                                      // yes: Repeat this function
            file_sr_global($newpath,$file_type,$search_str,$replace_str);
          }                                           //      for that new folder
          else{                                       // no: It's a file
            if (preg_match($file_type, $newpath)){    // Filter extension. Required files
              $fh = fopen($newpath, 'r');                              // Open file for read
              $Data = fread($fh, filesize($newpath));                  // Read all data into variable
              fclose($fh);                                             // Close file handle
              $Data = preg_replace($search_str, $replace_str, $Data);  // Search and replace
              $fh = fopen($newpath, 'w');                              // Open file for write
              fwrite($fh, $Data);                                      // Write to file
              fclose($fh);                                             // Close file handle
              echo $newpath."\n"; //***** Delete this line ***************************
            }
          }
        }
      }
      closedir($dirlist);                             // Close handle
      return true;                                    // Return
    }
    //=================================== END Recursive File Search and replace ===
    exit(0);
    ?>

Since the line:

    $Array[]= $newpath;

has been replaced there is no feedback, hence this test line:

    echo $newpath."\n";

which displays the files that have been searched. After testing it can be removed.

To reverse the change, alter this line:

    $replace_str = "\nListen 8080";

To:

    $replace_str = "\nListen 80";

and run the script again. Alternatively you can open the file and change it.

There is one final step! Turn it into a finished function and use it.
Final recursive search and replace function
One thing that really annoys me is professional programmers who never document one line of code. How do they know where it all fits in a few years' time? I do tend to go overboard, but that's my personal preference. I know the above code is not perfect, but at least you have some idea what each line does. Similarly, when it is turned into a function I add extra information as follows:

    //=== Recursive File Search and replace ==========================================
    // Inputs: $start_dir    Absolute or relative path to starting folder. Do not
    //                       include a forward slash at the end e.g. c:/test or ./test
    //         $file_type    A regex pattern containing file types to be searched
    //                       e.g. $file_type = '/(\.txt|\.cnf|\.conf)/'
    //         $search_str   A regex pattern e.g. $search_str = '/\nListen\s\d+/'
    //         $replace_str  Replacement text e.g. $replace_str = "\nListen 8080"
    //
    // Output: Returns true --- Need to add error checking
    //
    // Notes : Searches for files of the specified type starting at $start_dir and
    //         includes all sub-folders. For each file found a search and replace
    //         is performed.
    //
    // -----------------------------------------------------------------------------------
    function file_sr_global($start_dir,$file_type,$search_str,$replace_str){
      $dirlist = opendir($start_dir);                 // Open start directory
      while (($file = readdir($dirlist)) !== false){  // Iterate through list
        if ($file != '.' && $file != '..'){           // Skip if . or ..
          $newpath = $start_dir.'/'.$file;            // Create path. Either dir or file
          if (is_dir($newpath)){                      // Is it a folder
                                                      // yes: Repeat this function
            file_sr_global($newpath,$file_type,$search_str,$replace_str);
          }                                           //      for that new folder
          else{                                       // no: It's a file
            if (preg_match($file_type, $newpath)){    // Filter by file extension.
              $fh = fopen($newpath, 'r');                      // Open file for read
              $Data = fread($fh, filesize($newpath));          // Read all data into variable
              fclose($fh);                                     // Close file handle
              $Data = preg_replace($search_str, $replace_str, $Data,-1,$count);// S & R
              if($count){                                      // Was a replacement made
                $fh = fopen($newpath, 'w');                    // yes: Open file for write
                fwrite($fh, $Data);                            // Write new $Data to file
                fclose($fh);                                   // Close file handle
                echo $newpath." Replaced ".$count."\n"; //***** Delete this line *******
              }
            }
          }//eof else
        }
      }//eof while
      closedir($dirlist);                             // Close handle
      return true;                                    // Return
    }
    //=================================== END Recursive File Search and replace ======

OK, it's true: I never practice what I preach.
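The header above notes that error checking still needs adding. One possible shape for it, at the level of a single file, is sketched below. The helper replace_in_file() is hypothetical and is not part of the original function; it swaps fopen/fread/fwrite for file_get_contents() and file_put_contents(), which also sidesteps reading zero bytes from an empty file. The demonstration runs on a throwaway file in the system temp folder.

```php
<?php
// Hypothetical helper, not part of the original function: performs the search
// and replace on one file and reports failure instead of carrying on silently.
// Returns the number of replacements made, or false on any error.
function replace_in_file($path, $search_str, $replace_str) {
    $Data = @file_get_contents($path);            // Read whole file (empty files are fine)
    if ($Data === false) {
        return false;                             // Could not read the file
    }
    $New = preg_replace($search_str, $replace_str, $Data, -1, $count);
    if ($New === null) {
        return false;                             // Regex error, e.g. invalid pattern
    }
    if ($count > 0 && file_put_contents($path, $New) === false) {
        return false;                             // Could not write the file
    }
    return $count;                                // 0 means nothing needed changing
}

// Demonstration on a throwaway file
$tmp = tempnam(sys_get_temp_dir(), 'sr_');
file_put_contents($tmp, "# test\nListen 80\n");

$made  = replace_in_file($tmp, '/\nListen\s\d+/', "\nListen 8080");
$after = file_get_contents($tmp);
unlink($tmp);

echo "Replacements: " . var_export($made, true) . "\n";
echo $after;
```

Callers need to compare the result with === false, since a return value of 0 is a success that simply found nothing to replace.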
Summary
To be honest I have never found any old-fashioned PHP code to perform the above, hence the reason for writing it; it seems the trendy thing is classes. To justify this everyone wants to add far more than is required. I like simple! It's less error prone.
Well, I am not biased in any way and will hack any code if it gets the job done. This series has resulted in the creation of some original code.
Conclusion
The true objective of this tutorial series was to give you an insight into UniServer 5.0-Nano's new control architecture. All examples in the tutorial can be found within this control architecture or its support scripts.
Esoteric batch files have been reduced to nothing more than interfaces to the PHP scripts. Uniform Server is now uniform regarding both scripting (control) and Web page language.
This, I hope, will allow you to tailor the server to meet any specific functionality you require without the need to compile any code.
MPG (Ric)