Solve: Sort a Phone List by Last Name?
Answer»
Quote from: Squashman on January 07, 2018, 03:48:34 PM
Can you do a quick test with the PowerShell code I posted? I think it will be around the same time as Dave's Jsort.

88,800 names
input: sorted Z-A
output: sorted A-Z
GNU sort       0.40 sec
Benham jsort   6.13 sec
PowerShell     8.49 sec
Batch method 664.99 sec (11 min 4.99 sec)

Technically this is a PowerShell one-liner. I broke it into 4 physical lines for readability. If you do type this in a PowerShell window, type all 4 lines as a single line and just keep typing when the line wraps. The interpreter will understand.

Code:
(Get-Content .\phone.txt) -replace '(.*?\d{3})\s(.*?)', '$1-$2' |
ConvertFrom-Csv -Delimiter ' ' -Header First,Last,Phone |
Sort-Object Last |
Format-Table * -AutoSize

The path and file name in the first line can be changed as needed.

It did it, but it echoed the sorted output to the console.

PowerShell has cmdlets for writing output to a file; in this case, however, redirection might be the simpler way to go.

Code:
(Get-Content .\Phone.txt) -replace '(.*?\d{3})\s(.*?)', '$1-$2' |
ConvertFrom-Csv -Delimiter ' ' -Header First,Last,Phone |
Sort-Object Last |
Format-Table * -AutoSize -HideTableHeaders > .\Phone.new

If you would rather have the headers in the output file, remove the -HideTableHeaders parameter from the Format-Table cmdlet.
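For anyone wondering what the -replace step in the PowerShell pipeline is for, here is a rough Python sketch of the same idea. It assumes each line looks like "First Last 555 1234", with a space inside the phone number; that layout is a guess, since the actual phone file isn't shown here. The sample rows and the helper names join_phone and sort_by_last are made up for illustration. The regex swaps the whitespace after a three-digit group for a hyphen, so the phone number stays in one field when the line is split on spaces.

```python
import re

# Same pattern as the PowerShell -replace: a lazy run ending in three
# digits, then one whitespace character, which is swapped for a hyphen.
PHONE_GAP = re.compile(r"(.*?\d{3})\s(.*?)")


def join_phone(line):
    """Replace the space inside the phone number with a hyphen."""
    return PHONE_GAP.sub(r"\1-\2", line)


def sort_by_last(lines):
    """Return [first, last, phone] rows sorted by the last-name field."""
    rows = [join_phone(line).split(" ") for line in lines if line.strip()]
    return sorted(rows, key=lambda row: row[1])


# Hypothetical sample data, not taken from the thread's phone file.
sample = [
    "Mary Young 555 0101",
    "Bob Adams 555 0199",
    "Ann Miller 555 0142",
]

for first, last, phone in sort_by_last(sample):
    print(first, last, phone)
```

One difference to keep in mind: Python's sorted() compares strings case-sensitively, whereas Sort-Object ignores case by default, so mixed-case last names could come out in a different order.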
Added a timer:

Code:
$t = Measure-Command {
(Get-Content .\Notabs_names-rev-sorted.names.txt) -replace '(.*?\d{3})\s(.*?)', '$1-$2' |
ConvertFrom-Csv -Delimiter ' ' -Header First,Last,Phone |
Sort-Object Last |
Format-Table * -AutoSize > out.txt
}
echo "Time: $t"

Result:

Code:
Time: 00:00:13.2834694

The script is clearly doing more work than just sorting: for example, it is justifying the columns (input file: 2.6 MB, output file: 7.8 MB).

Python, 88,800 names: 0.138 seconds!

Code:
python sortfile.py > sorted.txt
2018-01-10 21:24:18.494000
2018-01-10 21:24:18.632000

Code:
from __future__ import print_function
from datetime import datetime
import csv
import operator

tstart = datetime.now()
# Read the space-delimited rows and sort them on the second field (last name).
with open("Notabs_names-rev-sorted.names.txt") as src:
    rows = sorted(csv.reader(src, delimiter=" "), key=operator.itemgetter(1))
for row in rows:
    print(" ".join(row))
tend = datetime.now()
print(tstart)
print(tend)

Better Python (2.7)

Code:
from __future__ import print_function
from datetime import datetime
import csv
import operator

tstart = datetime.now()
with open("input.txt") as src:
    rows = sorted(csv.reader(src, delimiter=" "), key=operator.itemgetter(1))
# Write the sorted rows to a file instead of the console.
with open("output.txt", "w") as f:
    for row in rows:
        print(" ".join(row), file=f)
tend = datetime.now()
print("Elapsed", tend - tstart, "seconds")

88,800 names, 5 runs:

Code:
Elapsed 0:00:00.172000 seconds
Elapsed 0:00:00.156000 seconds
Elapsed 0:00:00.172000 seconds
Elapsed 0:00:00.157000 seconds
Elapsed 0:00:00.172000 seconds

Here are 88,800 names in random order:
[attachment deleted by admin to conserve space]

Here's the other file: 88,800 names sorted alphabetically by column 2 in reverse order. I notice that the sorted file compresses better.
[attachment deleted by admin to conserve space]

It's quicker to sort the reverse-sorted file than the randomly ordered file.

Code:
reverse 0:00:00.156000 seconds
random  0:00:00.265000 seconds
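A likely reason the reverse-sorted file sorts faster: CPython's built-in sort (Timsort) looks for runs that are already in order, including strictly descending runs, which it simply reverses. The sketch below tries to show the effect on synthetic data in Python 3. The names, the row layout, and the helper time_sort are invented for illustration, and the timings will differ from machine to machine.

```python
import random
import time


def time_sort(rows):
    """Sort rows on the second field; return (sorted rows, elapsed seconds)."""
    start = time.perf_counter()
    result = sorted(rows, key=lambda row: row[1])
    return result, time.perf_counter() - start


# Hypothetical stand-in for the 88,800-name files: synthetic last names,
# zero-padded so that string order matches numeric order.
names = ["Name%06d" % i for i in range(88800)]
reverse_rows = [("First", name, "555-0100") for name in reversed(names)]

# Same rows, shuffled with a fixed seed so the run is repeatable.
random_rows = reverse_rows[:]
random.seed(0)
random.shuffle(random_rows)

sorted_rev, t_rev = time_sort(reverse_rows)
sorted_rnd, t_rnd = time_sort(random_rows)

print("reverse %.6f seconds" % t_rev)
print("random  %.6f seconds" % t_rnd)
```

Both inputs end up in the same final order; only the amount of work the sort has to do differs. File reading and writing are left out here, so the absolute numbers are not comparable with the ones posted above.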