| 1. |
Solve : Get selective columns from a dump file using MS DOS and make new dump? |
|
Answer» Hi: I've found a problem in the script. It doesn't recognize blank field. Please help me in this issue as well. Picked column 3, 4 and 6. For Example:Not a problem with the script. It is more of a problem with your data. The FOR /F command always sees consecutive delimiters as one. If you would have provided that information upfront, I could have coded for that instance. That is the main problem with providing obfuscated data. It is better to provide real world examples up front.Thanks for your feedback. As I've said the script works fine if all the fields have value like below: (real time example) SUBCOS|RATE|DATE|ACCSTATE|PREVBALANCE|ACTTIME|USERSTARE|RATEPLAN|EXPIRY| 2|30|20150830|0|0|20160226235959|11|3|20170101000000| 2|44|20140329|0|0|20140923235959|12|4|20140328003330| 2|20|20140722|2|0|20150118235959|11|3|20140518151446| But if few fields are NULL, i.e. in second ROW there is no RATE (3, 4 & 6 columns are picked) SL|RATE|DATE|ACCSTATE|PREVBALANCE|ACTTIME|USERSTARE|RATEPLAN|EXPIRY| 2|30|20150830|55|99|20160226235959|11|3|20170101000000| 2||20140329|33|90|20140923235959|12|4|20140328003330| 2|20|20140722|22|91|20150118235959|11|3|20140518151446| Output comes: ---------------- DATE|ACCSTATE|ACTTIME| 20150830|55|20160226235959| ----Ok 33|90|12| -----As RATE field is blank, insted of picking DATE column, it picks ACCSTATE as column 3 which actually is 4 20140722|22|20150118235959| -----Ok Please let me know if you need any information. Thanks, TanvirThis uses dbenham's ParseCSV.bat to reformat the file and then Squashman's code to output the data. Save both batch files into the same folder as file.csv and run this one. Code: [Select]echo off call ParseCSV.bat "/i:|" "/o:|" <file.csv >file.csv.tmp for /F "tokens=3,4,6 delims=|" %%G in (file.csv.tmp) do >>newfile.csv echo %%~G^|%%~H^|%%~I del file.csv.tmp ParseCSV.bat Code: [Select]if (X)==(Y) end /* harmless hybrid line that begins a JScrpt comment ::************ Documentation *********** ::parseCSV.bat version 1.0 ::: :::parseCSV [/option]... ::: ::: Parse stdin as CSV and write it to stdout in a way that can be safely ::: parsed by FOR /F. All columns will be enclosed by quotes so that empty ::: columns may be preserved. It also supports delimiters, newlines, and ::: quotes within quoted values. Two consecutive quotes within a quoted value ::: are converted into one quote. ::: ::: Available options: ::: ::: /I:string = Input delimiter. Default is a comma. ::: ::: /O:string = Output delimiter. Default is a comma. ::: ::: /E = Encode output delimiter in value as \D ::: Encode newline in value as \N ::: Encode backslash in value as \S ::: ::: /D = Escape exclamation point and caret for delayed EXPANSION ::: ! becomes ^! ::: ^ becomes ^^ ::: :::parseCSV /? ::: ::: Display this help ::: :::parseCSV /V ::: ::: Display the version of parseCSV.bat ::: :::parseCSV.bat was written by Dave Benham. Updates are available at the original :::posting site: http://www.dostips.com/forum/viewtopic.php?f=3&t=5702 ::: ::************ Batch portion *********** echo off if "%~1" equ "/?" ( setlocal disableDelayedExpansion for /f "delims=: tokens=*" %%A in ('findstr "^:::" "%~f0"') do echo(%%A exit /b 0 ) if /i "%~1" equ "/V" ( for /f "delims=:" %%A in ('findstr /bc:"::%~nx0 version " "%~f0"') do echo %%A exit /b 0 ) cscript //E:JScript //nologo "%~f0" %* exit /b 0 ************ JScript portion ***********/ var args = WScript.Arguments.Named, stdin = WScript.Stdin, stdout = WScript.Stdout, escape = args.Exists("E"), delayed = args.Exists("D"), inDelim = args.Exists("I") ? args.Item("I") : ",", outDelim = args.Exists("O") ? args.Item("O") : ",", quote = false, ln, c, n; while (!stdin.AtEndOfStream) { ln=stdin.ReadLine(); if (!quote) stdout.Write('"'); for (n=0; n<ln.length; n++ ) { c=ln.charAt(n); if (c == '"') { if (quote && ln.charAt(n+1) == '"') { n++; } else { quote=!quote; continue; } } if (c == inDelim && !quote) c='"'+outDelim+'"'; if (escape) { if (c == outDelim) c="\\D"; if (c == "\\") c="\\S"; } if (delayed) { if (c == "!") c="^!"; if (c == "^") c="^^"; } stdout.Write(c); } stdout.Write( (quote) ? ((escape) ? "\\N" : "\n") : '"\n' ); } I had found another post online about changing the DOUBLE PIPES to |#null#| and then running that output to another FOR /F command to un-delimit the output. But then you have to run that data through another SET command to remove the #null#. Would probably have issues with poison characters as well.Thanks to both of you. It returns the correct output now but each field comes with " like below: "2"|"147500"|"1906342342"|"2"|"20140830125243" "2"|"0"|"1903312025"|"197"|"20140327111829" "2"|"765952"|"1903312029"|"160"|"20140327174850" I've tried removing " from parser file but the output become wrong again. Just need to eliminate the ". As I'm very beginner with DOS need further help. Another help I need - How can I call multiple files instead of a single one here call ParseCSV.bat "/i:|" "/o:|" <file.csv >file.csv.tmp i.e. call ParseCSV.bat "/i:|" "/o:|" <*.csv >file.csv.tmp ---I tried but it doesn't work Please help. Thanks/ TanvirIs it OK if I ask a question? If this was a CSV file -- Why not just use a SPREADSHEET to parse it. Just asking. Quote from: Geek-9pm on September 08, 2014, 08:42:21 AM Is it OK if I ask a question?I think the OP's last post answers your question. They need to do it on multiple files. Would you want to open up multiple files and manually process all of them on a daily basis? Quote from: tanvir.khan on September 08, 2014, 08:27:03 AM Thanks to both of you. It returns the correct output now but each field comes with " like below:That is not the correct output. It has more than 3 fields. Nor would it be the correct input because you said you wanted columns 3, 4 & 6 and you only have 5 columns. I used your last data example Code: [Select]SL|RATE|DATE|ACCSTATE|PREVBALANCE|ACTTIME|USERSTARE|RATEPLAN|EXPIRY| 2|30|20150830|55|99|20160226235959|11|3|20170101000000| 2||20140329|33|90|20140923235959|12|4|20140328003330| 2|20|20140722|22|91|20150118235959|11|3|20140518151446|And this is the output I got. Code: [Select]DATE|ACCSTATE|ACTTIME 20150830|55|20160226235959 20140329|33|20140923235959 20140722|22|20150118235959Program is working for the 2nd time as you described your data. I used the following input which has surround quotes around each field Code: [Select]"2"|""|"147500"|"1906342342"|"2"|"20140830125243" "2"|""|"0"|"1903312025"|"197"|"20140327111829" "2"|""|"765952"|"1903312029"|"160"|"20140327174850"And the output does not have any surround quotes. I did not change any of the code that Foxidrive posted. Code: [Select]147500|1906342342|20140830125243 0|1903312025|20140327111829 765952|1903312029|20140327174850Dear Geek-9pm: Actually it's not a CSV, those are text files as I mentioned in my first post. Dear Squashman: No, files will come in a folder automatically with extension .txt; I've to parse those make a single dump (an auto job will run every 1 hr) and upload into a Oracle database using SQL Loader. So, I need to parse all a files of .txt extension an make a single dump. My Input files don't have quotes. But when I run the Parse Script the output comes with quote. These quotes appear because of this line in the parse script: if (c == inDelim && !quote) c='"'+outDelim+'"' ---I removed quotes from here, quotes disappear but returns wrong result Did you use the same parse script foxidrive provided and also parsed from text file? Need further help. Thanks/Tanvir Quote from: tanvir.khan on September 08, 2014, 10:01:29 AM My Input files don't have quotes. But when I run the Parse Script the output comes with quote. These quotes appear because of this line in the parse script:No. It is not because of that line. Quote from: tanvir.khan on September 08, 2014, 10:01:29 AM Did you use the same parse script foxidrive provided and also parsed from text file?Quote from: Squashman on September 08, 2014, 09:50:42 AM I did not change any of the code that Foxidrive posted. |
|