PHP CLI: Files Search and Replace

From The Uniform Server Wiki
Revision as of 09:56, 15 August 2009 by Ric

 

MPG UniCenter

UniServer 5.0-Nano
PHP CLI.

CLI Files Search and Replace

I think I have mentioned it before: never reinvent the wheel when you can obtain code from the Internet. That said, there are times when you do need to hack out a bit of new code, either because you are transferring useful scripts from one language to another or because you cannot find what you want. Square wheels!

Search and replace is, in some scenarios, not deterministic! A neat way to let your PC meet its maker is to run a recursive search and replace script written by some unknown author. My point is: use what you understand. Objects and classes for a simple application, why? Overkill with more bells and whistles than you can shake a stick at just does not make sense to me.

This page does not offer definitive solutions; it provides examples you can explore, which hopefully can be tailored to meet your own requirements.

Initial test setup

Edit our two test files Run.bat and test_1.php contained in folder UniServer to have the following content:

Run.bat
TITLE CLI TEST BAT
COLOR B0
@echo off
cls
echo.
usr\local\php\php.exe -n test_1.php
echo.
pause

test_1.php
<?php
// === Create test file ==================================
if (!is_dir("z_test")){         // Does test folder exist
  mkdir("z_test");              // no create it
}
$test_file = "usr/local/apache2/conf/httpd.conf";
$new_file = "z_test/httpd.conf";
copy($test_file,$new_file);      //Copy test file
// ============================== END Create test file ===

exit(0);
?>

Run the batch file (double click on Run.bat). Confirm that folder UniServer\udrive\z_test has been created and contains file httpd.conf.

Note 1: Each example uses this script-snippet to ensure a clean test file is used. Not that we are going to corrupt this file!

Note 2: UniServer Mona users change paths as shown:

  • udrive\usr\local\php\php.exe -n test_1.php
  • udrive/z_test
  • udrive/usr/local/apache2/conf/httpd.conf
  • udrive/z_test/httpd.conf

All scripts on this page will require the above change.


Basics

Before you can replace a string you need to find it. Two options are available: either a straight character search or a regular expression (regex). Many PHP search functions expect a regex, hence it is difficult to avoid. That said, PHP provides a useful function, preg_quote(), that converts straight text into regex format, suitable for any function requiring a regex.
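As a rough illustration of the two options, the sketch below runs both against a made-up sample string. Function strpos() is not used elsewhere on this page; it is shown here only as one possible way of doing a straight character search, and the variable names are purely illustrative.

```php
<?php
// Sample text standing in for file content (made up for illustration)
$data = "# Sample\nListen 80\nServerName localhost:80\n";

// Option 1: straight character search, no special characters to worry about
$literal_found = (strpos($data, "localhost:80") !== false);

// Option 2: regex search, matches "Listen" followed by any port number
$regex_found = (preg_match('/Listen\s\d+/', $data) === 1);

echo " Literal search found: " . ($literal_found ? "yes" : "no") . "\n";
echo " Regex search found  : " . ($regex_found ? "yes" : "no") . "\n";
```

The straight search needs the exact text, while the regex copes with any port number.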

preg_quote()

Modify test_1.php to have the following content:

test_1.php  
<?php
// === Create test file ==================================
if (!is_dir("z_test")){               // Does test folder exist
  mkdir("z_test");                    // no create it
}
$test_file = "usr/local/apache2/conf/httpd.conf";
$new_file = "z_test/httpd.conf";
copy($test_file,$new_file);           //Copy test file
// ============================== END Create test file ===

$sfile = "z_test/httpd.conf";         // File to search
$s_str = '"^\.ht"';                   // String to search for
$s_str_backup = $s_str;               // Original saved

$fh = fopen($sfile, 'r');             // Open file for read
$Data = fread($fh, filesize($sfile)); // Read all data into variable
fclose($fh);                          // close file handle

echo "\n Search string          : $s_str \n";  // Display string
//$s_str = preg_quote($s_str,'/');          // Convert to regex format
echo " Converted Search string: $s_str \n"; // Display string

if(preg_match("/$s_str/", $Data)){    // Search $Data for match  
   echo "\n String $s_str_backup found in file $sfile\n"; // found
}
else{                                                     // not found
   echo "\n String $s_str_backup not found in file $sfile\n"; 
}

exit(0);
?>


Run the batch file (double click on Run.bat)

Result: String "^\.ht" not found

However that string does exist in file httpd.conf line (415) <Files ~ "^\.ht">

Function preg_match() takes a regular expression as its parameter. To avoid the regex engine issuing an error message I purposely chose the string "^\.ht" because it is a valid regex.

A regex uses a small range of characters that perform a special function. Since we were trying to perform a straight character search these special characters require escaping, which is why the search failed.

The string contains three of these special characters as follows:

  • ^ Start of line (or logic not when used inside square brackets)
  • \ Escapes the next character or indicates a short form e.g. \d any digit
  • . Any single character

String after escaping looks like this: "\^\\\.ht"

The full range of regex special characters is: . \ + * ? [ ^ ] $ ( ) { } = ! < > | : (newer PHP versions also quote - and #)

If you want to perform only a straight character search use function preg_quote(), which automatically quotes any special characters in a string.

To see this in action un-comment the line shown below and re-run the batch file:

$s_str = preg_quote($s_str,'/'); // Convert to regex format

Note 1: Using function preg_quote() allows you to use standard strings as opposed to regular expressions. It does mean you lose the power of regexes. However, the next example shows why this is not really an issue.
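The effect of preg_quote() can also be seen without touching a file. The following is a minimal sketch that assumes the line from httpd.conf quoted above; the variable names $line, $quoted, $raw_match and $quoted_match are illustrative only.

```php
<?php
// Sample line as it appears in httpd.conf (copied from the text above)
$line  = '<Files ~ "^\.ht">';
$s_str = '"^\.ht"';                       // Plain text we want to find

$quoted = preg_quote($s_str, '/');        // Escape regex special characters
echo " Quoted string: $quoted\n";

$raw_match    = preg_match("/$s_str/", $line);   // String treated as a regex
$quoted_match = preg_match("/$quoted/", $line);  // String treated as plain text

echo " Unquoted match: $raw_match\n";
echo " Quoted match  : $quoted_match\n";
```

The second parameter of preg_quote() names the delimiter so that it is escaped too, should it occur in the search string.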



preg_replace() - standard text search

This example consolidates what we have covered; it opens a file, reads data into a variable, which is searched for a string and replaced with another if found. The new variable is written back to the same file. Hence we have a file search and replace.

Modify test_1.php to have the following content:

test_1.php  
<?php
// === Create test file ==================================
if (!is_dir("z_test")){          // Does test folder exist
  mkdir("z_test");               // no create it
}
$test_file = "usr/local/apache2/conf/httpd.conf";
$new_file = "z_test/httpd.conf";
copy($test_file,$new_file);      //Copy test file
// ============================== END Create test file ===

$sfile = "z_test/httpd.conf";         // File to search
$s_str = '"^\.ht"';                   // String to search for
$r_str = "This is a test";            // Replacement string

$fh = fopen($sfile, 'r');             // Open file for read
$Data = fread($fh, filesize($sfile)); // Read all data into variable
fclose($fh);                          // close file handle

$s_str = preg_quote($s_str,'/');      // Convert to regex format
$s_str = '/'.$s_str.'/';              // Create regex pattern

$Data = preg_replace($s_str, $r_str, $Data); // Search and replace

$fh = fopen($sfile, 'w');             // Open file for write
fwrite($fh, $Data);                   // Write to file
fclose($fh);                          // close file handle

echo "\n File updated check line 415\n";
exit(0);
?>


Run the batch file (double click on Run.bat)

Result: String "^\.ht" replaced with This is a test. See file UniServer\udrive\z_test\httpd.conf line 415.

  1. The first three lines set up the variables ($sfile, $s_str and $r_str) for file path, search string and replacement string.
  2. The next three lines read the content of the file we wish to search into a variable named $Data.
  3. We wish to perform a standard string search, hence any special regex characters are escaped using function preg_quote(). Function preg_replace() requires the search string to be delimited (here with ‘/’), hence a regex pattern is created.
  4. Variables ($s_str, $r_str, $Data) are passed to function preg_replace(), where the search and replace is performed; the result is assigned to variable $Data.
  5. The next three lines write this variable back to the file, overwriting the original content.
  6. Finally some information is displayed to the user.

Note 1: The file is written regardless of whether any changes were made; if no substitution was performed this is a waste of time.
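One possible way around this is to ask preg_replace() how many substitutions it made, using its optional fifth parameter, and to write the file only when that count is above zero. The sketch below is an assumption of how that might look, not part of the original script. It works on a scratch file created with tempnam() and uses file_get_contents() and file_put_contents() as a shorter equivalent of the fopen(), fread() and fwrite() calls above.

```php
<?php
// Hypothetical scratch file so the sketch runs on its own
$sfile = tempnam(sys_get_temp_dir(), "sr_");
file_put_contents($sfile, "<Files ~ \"^\\.ht\">\n");

$s_str = '/' . preg_quote('"^\.ht"', '/') . '/';   // Regex pattern
$r_str = "This is a test";                          // Replacement string

$Data  = file_get_contents($sfile);                 // Read all data
$count = 0;                                         // Set by preg_replace()
$New   = preg_replace($s_str, $r_str, $Data, -1, $count);

if ($New === null) {                 // preg_replace() returns null on error
    echo "\n Regex error, file left untouched\n";
} elseif ($count > 0) {              // Only write when something changed
    file_put_contents($sfile, $New);
    echo "\n File updated, $count replacement(s)\n";
} else {
    echo "\n No match, file not written\n";
}

unlink($sfile);                      // Remove scratch file
```

Checking for null before writing also avoids overwriting the file with nothing should the pattern be faulty.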

Note 2: As an alternative to preg_replace() older scripts used ereg_replace(). It is slower, was deprecated in PHP 5.3.0 and removed in PHP 7.0.0, so avoid it. Function preg_replace() is preferable because it is faster and uses Perl-compatible regular expression syntax.

Note 3: Although the above performs a standard string search and replace, it is easily converted to use pure regex.

Replace

$s_str = '"^\.ht"';

With

$s_str = '/"\^\\\\\.ht"/';  // In single quotes \\\\ gives the regex \\ (one literal backslash)

And delete these lines:

$s_str = preg_quote($s_str,'/');      // Convert to regex format
$s_str = '/'.$s_str.'/';              // Create regex pattern
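When writing such a pattern by hand, bear in mind that inside a single-quoted PHP string the sequence \\ collapses to a single backslash, so the regex \\ (one literal backslash) needs four backslashes in the PHP source. A quick sanity check, sketched below with illustrative variable names, is to compare the hand-written literal with what preg_quote() produces.

```php
<?php
$built = '/' . preg_quote('"^\.ht"', '/') . '/';   // Pattern built by PHP
$hand  = '/"\^\\\\\.ht"/';                         // Hand-written literal

echo " Built by preg_quote(): $built\n";
echo " Hand-written         : $hand\n";
echo " Identical            : " . ($built === $hand ? "yes" : "no") . "\n";
```

With only three backslashes at that point the string becomes /"\^\\.ht"/, which still matches here because the unescaped dot matches any character, but it is no longer an exact character search.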



preg_replace() - regex search

This example illustrates the use of regex. We have an Apache configuration file that has been changed by a user. He has tried to move the server to a different port, however not all configuration directives were changed accordingly. The objective is to move the server to port 8080.

The standard configuration directives are: 

  • Listen 80
  • ServerName localhost:80 in two locations

    

Unique features:

To target these directives we need a unique characteristic. “localhost:” is unique, however “Listen ” is not; it appears within the text body several times. Remember we read the file in as one complete serial string, so “Listen ” following a new-line character is unique.

Using unique features is not enough to target the configuration directives because they are followed by some unknown number of digits. Clearly these cannot be fully targeted using a “standard string search”. What is required are two regex search strings with variable digit targeting. Two new replacement strings are also required to complete the search and replace.

From the above we can start building the regex patterns ignoring the variable digit issue we have:

'/\nListen\s/'

The pattern needs to be enclosed in quotes with delimiters. A new-line character needs to be detected using “\n”; following this is the string “Listen”. A space is detected with the special regex character “\s”, which matches any white-space character.

'/localhost:/'

The pattern needs to be enclosed in quotes with delimiters. “localhost:” requires nothing special.

To complete the search patterns we need to detect a variable number of digits. To detect a digit using regex you can use either [0-9] or the special character \d; we need to detect one or more of these. The regex special character “+” performs this. Hence to detect a variable number of digits use either [0-9]+ or \d+

Search pattern 1

'/\nListen\s\d+/'

Replacement string (double quotes, so that \n is written as a real new-line character).

"\nListen 8080"

Search pattern 2

'/localhost:\d+/'

Replacement string.

"localhost:8080"
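Before letting patterns loose on a real file it can help to try them on a small piece of text. The sketch below uses the two patterns from the script that follows and runs them with preg_match_all() against a made-up fragment standing in for httpd.conf. The fragment and the variable names $m1 and $m2 are assumptions for illustration.

```php
<?php
// Made-up fragment standing in for httpd.conf
$Data = "#Listen 12.34.56.78:80\nListen 80\nServerName localhost:80\n";

preg_match_all('/\nListen\s\d+/', $Data, $m1);   // Search pattern 1
preg_match_all('/localhost:\d+/', $Data, $m2);   // Search pattern 2

// The commented-out line is skipped: "Listen" there does not follow a new-line
echo " Pattern 1 matches: " . count($m1[0]) . "\n";
echo " Pattern 2 matches: " . count($m2[0]) . "\n";
echo " Pattern 2 text   : " . $m2[0][0] . "\n";
```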

Modify test_1.php to have the following content:

test_1.php  
<?php
// === Create test file ==================================
if (!is_dir("z_test")){          // Does test folder exist
  mkdir("z_test");               // no create it
}
$test_file = "usr/local/apache2/conf/httpd.conf";
$new_file = "z_test/httpd.conf";
copy($test_file,$new_file);      //Copy test file
// ============================== END Create test file ===
$sfile = "z_test/httpd.conf";         // File to search

$fh = fopen($sfile, 'r');             // Open file for read
$Data = fread($fh, filesize($sfile)); // Read all data into variable
fclose($fh);                          // close file handle

$s_str = '/\nListen\s\d+/';           // String to search for
$r_str = "\nListen 8080";             // Replacement string
$Data = preg_replace($s_str, $r_str, $Data); // Search and replace

// Need to repeat with new regex
$s_str = '/localhost:\d+/';           // String to search for
$r_str = "localhost:8080";            // Replacement string
$Data = preg_replace($s_str, $r_str, $Data); // Search and replace

$fh = fopen($sfile, 'w');             // Open file for write
fwrite($fh, $Data);                   // Write to file
fclose($fh);                          // close file handle

echo "\n File updated check lines 124, 287 and 983\n";
exit(0);
?>


Run the batch file (double click on Run.bat)

Result: Apache port updated to 8080

  1. The first line sets the file path variable ($sfile).
  2. The next three lines read the content of the file we are performing the search and replace on into a variable named $Data.
  3. We wish to perform a regex search, so the variable $s_str is set accordingly, as is the replacement string variable $r_str.
  4. Variables ($s_str, $r_str, $Data) are passed to function preg_replace(), where the search and replace is performed; the result is assigned to variable $Data.
  5. We have a second search string, hence variables ($s_str, $r_str) are set again and passed to preg_replace() for processing. This updates variable $Data (file data).
  6. The next three lines write this variable back to the file, overwriting the original content.
  7. Finally some information is displayed to the user.
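A possible variation, not used by the script above, is to anchor on the start of a line with ^ and the m (multi-line) modifier instead of matching the preceding new-line character. The replacement string then no longer needs to put the new-line back, and a directive on the very first line of the file is also caught. The sketch below assumes a made-up fragment in place of httpd.conf.

```php
<?php
// Made-up fragment; note "Listen" is on the very first line
$Data = "Listen 80\n#Listen 12.34.56.78:80\nServerName localhost:80\n";

// ^ with the m modifier matches at the start of every line
$Data = preg_replace('/^Listen\s\d+/m', 'Listen 8080', $Data);
$Data = preg_replace('/localhost:\d+/', 'localhost:8080', $Data);

echo $Data;
```

Which form you use is a matter of taste; the new-line version keeps the pattern closer to how the file is read, as one serial string.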



preg_replace() - regex search array

If the number of search and replace strings is large it's probably easier to list these in a pair of arrays and pass these to the preg_replace() function.

Note: Array elements are processed in the order they were entered into the array. This may not match the index number order. To avoid any problems use function ksort() on each array prior to calling preg_replace(); this adjusts the array order to match the index numbers.
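The ordering problem is easiest to see with a deliberately awkward example. In the sketch below (made-up one-letter patterns, illustrative variable names) the search array is filled out of index order, so without ksort() the patterns are paired with the wrong replacements.

```php
<?php
// Elements entered out of index order on purpose
$s_str = array();
$r_str = array();
$s_str[1] = '/b/';  $s_str[0] = '/a/';   // Patterns
$r_str[0] = 'A';    $r_str[1] = 'B';     // Replacements

// Entry order pairs /b/ with 'A' and /a/ with 'B'
$unsorted = preg_replace($s_str, $r_str, 'ab');

ksort($s_str);                            // Order by index number
ksort($r_str);

// Index order pairs /a/ with 'A' and /b/ with 'B'
$sorted = preg_replace($s_str, $r_str, 'ab');

echo " Without ksort(): $unsorted\n";
echo " With ksort()   : $sorted\n";
```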


Modify test_1.php to have the following content:

test_1.php  
<?php
// === Create test file ==================================
if (!is_dir("z_test")){          // Does test folder exist
  mkdir("z_test");               // no create it
}
$test_file = "usr/local/apache2/conf/httpd.conf";
$new_file = "z_test/httpd.conf";
copy($test_file,$new_file);      //Copy test file
// ============================== END Create test file ===
$sfile = "z_test/httpd.conf";         // File to search

$s_str[0] = '/\nListen\s\d+/';        // String to search for
$r_str[0] = "\nListen 8080";          // Replacement string

$s_str[1] = '/localhost:\d+/';        // String to search for
$r_str[1] = "localhost:8080";         // Replacement string

ksort($s_str);                 // Array ordered by index number
ksort($r_str);                 // Array ordered by index number   

$fh = fopen($sfile, 'r');             // Open file for read
$Data = fread($fh, filesize($sfile)); // Read into variable
fclose($fh);                          // close file handle

$Data = preg_replace($s_str, $r_str, $Data); // Search & replace

$fh = fopen($sfile, 'w');             // Open file for write
fwrite($fh, $Data);                   // Write to file
fclose($fh);                          // close file handle

echo "\n File updated check lines 124, 287 and 983\n";
exit(0);
?>


Run the batch file (double click on Run.bat)

Result: Apache port updated to 8080

  1. The first line sets the file path variable ($sfile).
  2. Build the array search/replace pairs.
  3. To avoid problems order these arrays by index number using function ksort().
  4. The next three lines read the content of the file we are performing the search and replace on into a variable named $Data.
  5. The arrays $s_str and $r_str along with variable $Data are passed to function preg_replace() for processing. Each array element pair is read in turn and the search and replace performed on the result of the previous pair, until all array elements have been read. The final result is assigned to variable $Data.
  6. The next three lines write variable $Data back to the file, overwriting the original content.
  7. Finally some information is displayed to the user.
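Another way to keep each search string next to its replacement, offered here only as an alternative sketch, is a single array with the pattern as key and the replacement as value. Functions array_keys() and array_values() return the elements in the same order, so the pairs cannot drift apart and ksort() is not needed. The $map name and the sample text are assumptions.

```php
<?php
// Pattern => replacement pairs kept together in one array
$map = array(
    '/\nListen\s\d+/' => "\nListen 8080",
    '/localhost:\d+/' => "localhost:8080",
);

// Made-up fragment standing in for httpd.conf
$Data = "# Sample\nListen 80\nServerName localhost:80\n";

// Keys become the search array, values the replacement array
$Data = preg_replace(array_keys($map), array_values($map), $Data);

echo $Data;
```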


Summary

The above single file search and replace is adequate for most applications, although occasionally you will need to search a folder and its sub-folders for the files.

Recursive search and replace is covered on the next page.



MPG (Ric)