| 1. |
Solve : Exclamation Marks [!] in DOS text files cause 'FOR /F' processing problems? |
|
Answer» I'm using 'FOR /F' to change every occurrence of 'X' to 'Y' in every line of a text file.

ST, thanks again for the 2 solutions which you provided -- both of which preserve case sensitivity.

Yes, that is correct. The SET command used without quotes ignores trailing spaces, so to be sure that, if %outchar% should happen to be a space, it is added, the quotes are used as you saw. [Update] I tested the batch without quotes in that line, and it seems to work just the same. I think I just assumed you needed them.

ST, you warned me, in your initial reply, that "poison characters" embedded in text strings can cause problems when PROCESSING them in batch scripts. Sure enough, when my test text file was modified to contain ampersands [&], percent signs [%], and double quotes ["], in ADDITION to the dreaded exclamation marks [!], problems were encountered. Sorry to say, all 3 of your solutions -- which worked fine for exclamation marks [!] -- failed when processing the aforementioned additional rogue characters.

I think you might be interested in a solution, I have developed, which seems to cope with all of the above "poison characters". It is based on some of your conversion code (but after many experiments with combinations of enabling/disabling delayed expansion, and in or out of sub-routines). I have commented the script as a reminder of how different methods work, or don't work, in certain situations.

My input file [Test-myfile.txt] now contains...

Code:
Line 01 containing x but not X!
Line 02 containing A but not X!Watch this space!x
Line 03 containing X but not Y! and another xylophone.
Line 3A containing X and 1st percent % sign, 2nd percent % sign, and 3rd percent % sign.
Line 3B containing X and just 2 percent % signs. (Here's the 2nd percent % sign).
Line 3C containing X and 1st percent ^% sign, 2nd percent ^% sign, and 3rd percent ^% sign.
Line 3D containing X and just 2 percent ^% signs. (Here's the 2nd percent ^% sign).
Line 04 containing X and Y and another x and a word like axiom.
Line 4A containing x and an ampersand & and another & followed by one more & and a percent % sign.
Line 4B containing x and an ampersand & on its own.
Line 4C containing 2 ampersands -- this one & and this one & as well.
Line 05 containing Y but not x; also an upper-case 'X' in single quotes
Line 5A containing W but not X; also a lower-case "x" in double quotes
Line 06 containing Xxx (but not Yyy):
Line 07 containing B and Z and Y~
Line 08 containing Y but not X#
Line 09 containing X and Y and another Xylophone

My MS-DOS batch code is now as follows...

Code:
@ECHO OFF
REM ====================================================
REM Code to change every occurrence of 'X' to 'Y' --
REM in every line of 'Test-myfile.txt' -- without losing --
REM 'special characters'. e.g. [!][%][&][#][~]["]['].
REM Preserving upper/lower case is selectable.
REM ====================================================

REM Display/obtain case preservation options/choice...
ECHO.
ECHO When changing 'X' to 'Y' should the original case be preserved?
ECHO.
ECHO [0] Case-preservation is not important. (Default - Quicker)
ECHO [1] Preserve original case. (Slower)
ECHO.
SET /P CASE_CHOICE=Enter option...
ECHO.
IF "%CASE_CHOICE%" EQU "1" ECHO Original case will be preserved. (Slower option)
IF "%CASE_CHOICE%" NEQ "1" ECHO Original case may not be preserved. (Quicker option)
ECHO.
PAUSE

REM Initialise a count for lines processed.
SET LINES_COUNT=0

REM Create an empty Output File.
@ECHO OFF > newfile.txt

REM One at a time, read each complete line of the Input File into a single string.
FOR /F "tokens=1* delims=" %%a in (Test-myfile.txt) DO (
    REM ECHO Original line..."%%a"
    REM At this point, complete string is intact.
    REM Copy the complete intact string --
    REM enclosing in quotes to protect special characters (most).
    SET string1="%%a"
    REM At this point, string1 contains the complete intact quoted string --
    REM but appears to be empty/undefined if echoed.
    REM For each line of the Input File, string processing requires 'EnableDelayedExpansion' --
    REM which can only be executed a maximum of 16 times before reaching the --
    REM 'Maximum setlocal recursion level' (despite also executing 'DisableDelayedExpansion').
    REM Also, within 'EnableDelayedExpansion' %%a will lose exclamation marks --
    REM fortunately we no longer need %%a (for the current line) as it has been copied to string1.
    REM For these reasons, the main string processing is done in a subroutine...
    CALL :REPLACE_CHARS
)
ENDLOCAL
ECHO Done!
PAUSE
EXIT

:REPLACE_CHARS
REM Increment lines processed count...
SET /A LINES_COUNT=%LINES_COUNT%+1
REM Enable delayed environment variable expansion --
REM so that the value of certain variables can be dynamically redefined at run time --
REM using !Variable! instead of %Variable% (or a combination)...
SETLOCAL EnableDelayedExpansion
REM Display progress...
ECHO Original Line !LINES_COUNT! ...!string1!...
REM At this point, string1 still contains the complete intact quoted string.
REM Ascertain whether preservation of original case is required...
IF "!CASE_CHOICE!" EQU "1" GOTO PRESERVE_CASE
REM Original case is not required to be preserved...
REM Convert all 'X's in the string (line) to 'Y's --
REM upper and lower-case 'X's will be converted to upper-case 'Y's.
SET string2=!string1:X=Y!
GOTO WRITE_LINE
REM Original case must be preserved...
:PRESERVE_CASE
REM Initialise an index for the current character position --
REM and an empty string to construct the converted line...
set j=0
set string2=
:Loop
REM One at a time, isolate each individual character in the string (line)...
set inchar=!string1:~%j%,1!
IF "!inchar!END"=="END" GOTO :ExitLoop
REM Copy the current character for passing to the modified string (line)...
SET outchar=!inchar!
REM Convert 'X's to 'Y's -- preserving the original case...
IF "!inchar!"=="X" set outchar=Y
IF "!inchar!"=="x" set outchar=y
REM Construct the new string (line) by concatenating the current/modified character...
SET "string2=!string2!!outchar!"
REM Increment the index for the next character position, and repeat...
SET /A j=!j!+1
GOTO :Loop
:ExitLoop
:WRITE_LINE
REM Strip-off the leading/trailing quotes from the processed string (line) --
REM then write it to the Output File -- ensuring no spaces are added at end of line.
ECHO !string2:~1,-1!>> newfile.txt
REM Display progress...
ECHO Modified Line !LINES_COUNT! ...!string2!...
REM Disable delayed environment variable expansion.
SETLOCAL DisableDelayedExpansion
REM Return whence we came.
GOTO :EOF

Quote
REM For each line of the Input File, string processing requires 'EnableDelayedExpansion' --

I'm not sure what you mean by this - you only enable delayed expansion once, at any point before the loop or other structure. You don't have to re-enable it for each line that is read from a file! Isn't this better?

Sooner or later if you are doing SERIOUS textfile processing you are going to need to look at something else. Visual Basic Script is present in every Windows installation these days. Save as a .vbs file and run with Cscript //nologo yourname.vbs

Code:
Set fso = CreateObject("Scripting.FileSystemObject")
Const ForReading = 1, ForWriting = 2, ForAppending = 8
Const FormatSystemDefault = -2, FormatUnicode = -1, FormatASCII = 0
ReadfileName ="Input.txt"
WriteFileName="Output.txt"
Wscript.echo "Read file..."
' OpenTextFile takes (filename, iomode, create, format); the third argument is "create".
' True for the output file so that Output.txt is created if it does not exist yet.
Set InputFile = fso.OpenTextFile(ReadFileName, ForReading, False, FormatASCII)
Set OutputFile = fso.OpenTextFile(WriteFileName, ForWriting, True, FormatASCII)
Do While Not InputFile.AtEndOfStream
    InputLine = InputFile.ReadLine
    Wscript.Echo "Input " & InputLine
    TempLine = InputLine
    ' Replace is case-sensitive by default, so each case is handled separately.
    TempLine = Replace(TempLine, "X", "Y")
    TempLine = Replace(TempLine, "x", "y")
    OutputLine = TempLine
    Wscript.Echo "Output " & OutputLine
    OutputFile.WriteLine(OutputLine)
Loop
InputFile.Close
OutputFile.Close
Set fso = Nothing
|
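
For comparison, here is a minimal sketch of a commonly used batch pattern for the delayed-expansion problem discussed in this thread. It is not either poster's code. It assumes the file names used above (Test-myfile.txt, newfile.txt); the variable name `line` is made up for illustration. The idea is to copy each line while delayed expansion is disabled, so exclamation marks survive, then enable it only for the processing step and pair every SETLOCAL with an ENDLOCAL so the setlocal nesting level never grows.

```batch
@ECHO OFF
REM Sketch only: replace X with Y on every line of Test-myfile.txt.
REM The variable name "line" is an assumption made for this example.
SETLOCAL DisableDelayedExpansion

REM Create an empty output file.
TYPE NUL > newfile.txt

FOR /F "usebackq delims=" %%a IN ("Test-myfile.txt") DO (
    REM Delayed expansion is off here, so exclamation marks are copied intact.
    SET "line=%%a"
    REM Enable delayed expansion only while working on the copy.
    SETLOCAL EnableDelayedExpansion
    REM This substitution is case-insensitive: x and X both become Y.
    SET "line=!line:X=Y!"
    REM Redirection comes first so no trailing space is written.
    >>newfile.txt ECHO(!line!
    REM Pair every SETLOCAL with an ENDLOCAL to keep the nesting level flat.
    ENDLOCAL
)

ENDLOCAL
```

Two limits of this sketch should be kept in mind: FOR /F skips empty lines, and with the default `eol` setting it also skips lines that begin with a semicolon. Case-preserving replacement would still need a per-character loop like the one in the script above, or a different tool such as the VBScript version.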
|