

Solve: If case insensitive VB.NET?
Answer» Is there a way in VB.NET to compare strings in an If statement without case sensitivity, like the /I switch in an MS-DOS batch file?

StrComp() iterates through each character in the string. The moment it finds a difference under the given comparison method (which likely involves a UCase/LCase or equivalent), it breaks out. This is important: the StrComp function is specially constructed to break out once it knows the answer, i.e. as soon as the characters differ.

Although it is just an assumption that the code breaks out on the first occurrence, I believe the Microsoft guys do break the loop on the first occurrence. How many bytes is one character? If it is 1 byte, comparing a-z in lower and upper case is as simple as toggling the 6th bit. If it is 2 bytes, then... Also, this VB.NET syntax reminds me of Java.

It's not an assumption; it's the only explanation for it running faster. Also, it would be dumb to do otherwise. Toggling the 6th bit won't work either, since that is simply adding or subtracting 32. What about non-alphabetic characters? Toggling the 6th bit of the exclamation mark, code 33, makes it 1, a control character. All strings in Visual Basic and the .NET languages are Unicode (UTF-16), and thus 2 bytes per character. They can be treated as ANSI, but dealing with the Unicode bytes directly via LenB, ChrB and AscB is fully possible. Case conversion is handled by various mapping routines. ASCII characters are easily converted to Unicode by using 0 in the high byte of the equivalent Unicode character; for example, ANSI A is Chr$(&H41), and Unicode A is ChrB$(&H41) & ChrB$(0), low byte first. Thankfully VB converts to and fro as necessary, which to be honest can be a pain, since in order to pass a string as an actual Unicode string to an API routine one needs to pass a Long pointer to the string instead. Oh well.

Quote
this vb.net syntax reminds me of java.

That's probably just the object-oriented-ness of them both.
In a similar vein, via my BCFile file library I can do this:

Code: [Select]
filesystem.CreateFile("C:\Windows\test.dat").OpenAsBinaryStream(GENERIC_WRITE, FILE_SHARE_DELETE + FILE_SHARE_READ, CREATE_NEW, FILE_FLAG_SEQUENTIAL_SCAN).WriteString("text!")

Granted, it's a little messier than using the FileSystemObject, but the FileSystemObject doesn't support showing the right-click Explorer menu like my library does, along with a few other things such as alternate data stream enumeration. Back on topic, however: the StrComp() function is preferable simply because it takes account of Unicode differences. For example, I believe that special characters that combine other characters, such as the ligature Æ, are compared equal to the expanded version.

BC, I bought that explanation. You have made a good point. I am going to use StrComp() from today onwards, especially in string-intensive worker loops.

It definitely speeds things up, that's for sure. I got lazy in a few comparisons in my expression evaluator and used UCase for them. After I replaced them, the routine, which is called over 10000 times to parse a complicated expression, took almost half its previous total time, even though I was already using StrComp() in most places. Also, it looks like .NET has a few overloads of "Compare" that allow for a locale ID. Another interesting string manipulation is counting the occurrences of a string within another string, in one line:

Code: [Select]
Public Function CountStr(ByVal SearchIn As String, ByVal SearchFor As String) As Long
    ' Removing every occurrence shortens the string by Len(SearchFor) each time.
    CountStr = (Len(SearchIn) - Len(Replace$(SearchIn, SearchFor, ""))) \ Len(SearchFor)
End Function

The Replace$() function is so useful.

So StrComp is the one I should use?

With VB.NET, StrComp() is just in there for compatibility. Use the static String.Compare or CompareTo() methods, as I show in my first post.
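For reference, here is a minimal sketch of what such a comparison can look like in VB.NET. The first post mentioned above is not reproduced in this thread, so the module name, variable names and sample strings are illustrative assumptions; the String.Compare and String.Equals overloads used are standard .NET methods.

```vbnet
Imports System

Module CaseInsensitiveCompareDemo
    Sub Main()
        ' Sample values, chosen only for illustration.
        Dim a As String = "Hello"
        Dim b As String = "HELLO"

        ' String.Compare returns 0 when the strings are considered equal.
        If String.Compare(a, b, StringComparison.OrdinalIgnoreCase) = 0 Then
            Console.WriteLine("Equal (String.Compare, OrdinalIgnoreCase)")
        End If

        ' String.Equals expresses the same test directly as a Boolean.
        If String.Equals(a, b, StringComparison.OrdinalIgnoreCase) Then
            Console.WriteLine("Equal (String.Equals, OrdinalIgnoreCase)")
        End If

        ' Older overload: the third argument True means "ignore case",
        ' using the current culture's comparison rules.
        If String.Compare(a, b, True) = 0 Then
            Console.WriteLine("Equal (String.Compare, ignoreCase True)")
        End If
    End Sub
End Module
```

Note that the instance method CompareTo() has no ignore-case parameter, so for a batch-style /I comparison the String.Compare or String.Equals forms above are the closer fit.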
The "old" functions are there purely for compatibility, and in fact the LCase() function as provided has a subtle bug, in that LCase$(Nothing) = Nothing, yet UCase(Nothing) = "". MS won't waste time fixing these bugs; nowadays EVERYTHING must be in an object. *Sigh* As much as I love object-oriented programming, some operations simply don't require instance data. I wonder if .NET Framework version 6 will have replaced all the static string methods with classes. I can see it now: classes like CLStringToUpper, ClStringToLower and so forth, all requiring the string to be passed in the constructor and the LCased/UCased value to be retrieved via a property.

Or model it like Batch's If statement with the /I switch, to not compare it by case.

Quote from: macdad- on March 31, 2009, 05:39:04 PM
Or model it like Batch's If statement with the /I switch, to not compare it by case.

wat.

No BC, the method you give is not faster. You are so very WRONG. I am disappointed in you. Because he had to read your post and come back, and you had to come back. That takes time. Going for a minor improvement in run-time execution speed is poor programming practice unless speed is a known issue. Speed of development and deployment and correct documentation take precedence over a millisecond in a project that takes days. Don't include speculation in your observations without saying that it is an apparent observation. With Microsoft libraries there is no easy way to know what they were thinking when they wrote the code. It is not open source. We are not allowed to see it. If you could see it, you might take a line from Bill Gates: "That's the dumbest thing I ever saw!"
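Whether one comparison approach is actually faster on a given machine can be measured rather than argued. The following is a minimal VB.NET timing sketch, not a rigorous benchmark; the sample strings, the iteration count and the module name are assumptions made for illustration, and the numbers it prints will vary between runs and runtimes.

```vbnet
Imports System
Imports System.Diagnostics

Module CompareTimingSketch
    Sub Main()
        ' Sample inputs and loop count are arbitrary assumptions.
        Dim a As String = "The Quick Brown Fox"
        Dim b As String = "the quick brown fox"
        Const Iterations As Integer = 1000000
        Dim matches As Integer = 0

        ' Approach 1: uppercase both strings, then compare with "=".
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            If a.ToUpperInvariant() = b.ToUpperInvariant() Then matches += 1
        Next
        sw.Stop()
        Console.WriteLine("ToUpperInvariant then '=': {0} ms", sw.ElapsedMilliseconds)

        ' Approach 2: let the framework do the case-insensitive comparison.
        sw = Stopwatch.StartNew()
        For i As Integer = 1 To Iterations
            If String.Equals(a, b, StringComparison.OrdinalIgnoreCase) Then matches += 1
        Next
        sw.Stop()
        Console.WriteLine("String.Equals, OrdinalIgnoreCase: {0} ms", sw.ElapsedMilliseconds)

        ' Use the counter so the loop bodies have an observable effect.
        Console.WriteLine("Matches: {0}", matches)
    End Sub
End Module
```

A single run like this only gives a rough indication; running it several times, in a release build, is the least one would want before drawing conclusions.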
Also, it's interesting to note that trying to duplicate the StrComp() function with this:

Code: [Select]
Private Function StrCompA(ByVal StringA As String, ByVal StringB As String, comparemethod As VbCompareMethod) As Integer
    ' For a text compare, fold both strings to upper case first.
    If comparemethod = vbTextCompare Then
        StringA = UCase$(StringA)
        StringB = UCase$(StringB)
    End If
    If StringA < StringB Then
        StrCompA = -1
    ElseIf StringA = StringB Then
        StrCompA = 0
    Else
        StrCompA = 1
    End If
End Function

is FUNDAMENTALLY broken. If Option Compare Text is used at the module level, you cannot use this StrComp() replacement to perform case-sensitive comparisons, because the < and = operators themselves then ignore case. Also, just as I said, the two calls to UCase and the use of several comparison operators drag performance down. Trust me: when it comes to core-level routines, microseconds count, because those microseconds add up.

Quote
It definitely speeds things up, thats for sure. I got lazy in a few comparisons in my Expression evaluator and used Ucase for a few comparisons; after replacing them the routine, although called over 10000 times to parse a complicated expression, decreased it's total time to almost half, even though I was using StrComp() in most places.

THIS was NOT speculation. I saw it with my own eyes. NOTHING else in my library changed except for an accidental "=" that I changed to StrComp(String, String, vbTextCompare) = 0. It is NOT speculation. Note that I've stripped the prolog from this disassembly, or so it seems, anyway.
Disassembly of StrComp in MSVBVM60.dll:

Code: [Select]
734E7A2D: 83 7E 14 00           cmp   dword ptr [esi+14h],0
734E7A31: 76 39                 jbe   734E7A6C
734E7A33: 8B 46 18              mov   eax,dword ptr [esi+18h]
734E7A36: 8B 04 B8              mov   eax,dword ptr [eax+edi*4]
734E7A39: 85 C0                 test  eax,eax
734E7A3B: 74 23                 je    734E7A60
734E7A3D: 8B 08                 mov   ecx,dword ptr [eax]
734E7A3F: 8D 55 0C              lea   edx,[ebp+0Ch]
734E7A42: 52                    push  edx
734E7A43: 53                    push  ebx
734E7A44: 50                    push  eax
734E7A45: FF 11                 call  dword ptr [ecx]
734E7A47: 85 C0                 test  eax,eax
734E7A49: 75 15                 jne   734E7A60
734E7A4B: 8B 45 0C              mov   eax,dword ptr [ebp+0Ch]
734E7A4E: 3B 45 FC              cmp   eax,dword ptr [ebp-4]
734E7A51: 75 07                 jne   734E7A5A
734E7A53: C7 45 F8 01 00 00 00  mov   dword ptr [ebp-8],1
734E7A5A: 8B 08                 mov   ecx,dword ptr [eax]
734E7A5C: 50                    push  eax
734E7A5D: FF 51 08              call  dword ptr [ecx+8]
734E7A60: 83 7D F8 00           cmp   dword ptr [ebp-8],0
734E7A64: 75 06                 jne   734E7A6C
734E7A66: 47                    inc   edi
734E7A67: 3B 7E 14              cmp   edi,dword ptr [esi+14h]
734E7A6A: 72 C7                 jb    734E7A33
734E7A6C: 8B 4D FC              mov   ecx,dword ptr [ebp-4]
734E7A6F: EB 9F                 jmp   734E7A10
734E7A71: 55                    push  ebp
734E7A72: 8B EC                 mov   ebp,esp
734E7A74: 83 EC 24              sub   esp,24h
734E7A77: 8B 45 24              mov   eax,dword ptr [ebp+24h]
734E7A7A: 53                    push  ebx
734E7A7B: 33 DB                 xor   ebx,ebx
734E7A7D: 56                    push  esi
734E7A7E: 3B C3                 cmp   eax,ebx
734E7A80: 57                    push  edi
734E7A81: 8B F1                 mov   esi,ecx
734E7A83: 74 06                 je    734E7A8B
734E7A85: C7 00 01 00 00 00     mov   dword ptr [eax],1
734E7A8B: 8B 06                 mov   eax,dword ptr [esi]
734E7A8D: 56                    push  esi
734E7A8E: FF 50 04              call  dword ptr [eax+4]
734E7A91: 8B 45 08              mov   eax,dword ptr [ebp+8]
734E7A94: 8B 4D 0C              mov   ecx,dword ptr [ebp+0Ch]
734E7A97: 8B 55 18              mov   edx,dword ptr [ebp+18h]
734E7A9A: 89 45 DC              mov   dword ptr [ebp-24h],eax
734E7A9D: 8B 45 10              mov   eax,dword ptr [ebp+10h]
734E7AA0: 89 4D E0              mov   dword ptr [ebp-20h],ecx
734E7AA3: 89 45 E4              mov   dword ptr [ebp-1Ch],eax
734E7AA6: 8B 45 14              mov   eax,dword ptr [ebp+14h]
734E7AA9: 83 F8 02              cmp   eax,2
734E7AAC: 89 45 E8              mov   dword ptr [ebp-18h],eax
734E7AAF: 89 55 EC              mov   dword ptr [ebp-14h],edx
734E7AB2: 89 5D F0              mov   dword ptr [ebp-10h],ebx
734E7AB5: 89 5D F4              mov   dword ptr [ebp-0Ch],ebx
734E7AB8: 89 5D F8              mov   dword ptr [ebp-8],ebx
734E7ABB: 89 5D FC              mov   dword ptr [ebp-4],ebx
734E7ABE: 76 4C                 jbe   734E7B0C
734E7AC0: 68 57 00 04 60        push  60040057h
734E7AC5: FF 15 C8 11 42 73     call  dword ptr ds:[734211C8h]
734E7ACB: 33 DB                 xor   ebx,ebx
734E7ACD: 39 5D F8              cmp   dword ptr [ebp-8],ebx
734E7AD0: 74 2D                 je    734E7AFF
734E7AD2: 33 FF                 xor   edi,edi
734E7AD4: 39 5D 1C              cmp   dword ptr [ebp+1Ch],ebx
734E7AD7: 76 16                 jbe   734E7AEF
734E7AD9: 8B 45 F8              mov   eax,dword ptr [ebp-8]
734E7ADC: 8B 04 B8              mov   eax,dword ptr [eax+edi*4]
734E7ADF: 3B C3                 cmp   eax,ebx
734E7AE1: 74 06                 je    734E7AE9
734E7AE3: 8B 08                 mov   ecx,dword ptr [eax]
734E7AE5: 50                    push  eax
734E7AE6: FF 51 08              call  dword ptr [ecx+8]
734E7AE9: 47                    inc   edi
734E7AEA: 3B 7D 1C              cmp   edi,dword ptr [ebp+1Ch]
734E7AED: 72 EA                 jb    734E7AD9
734E7AEF: FF 75 F8              push  dword ptr [ebp-8]
734E7AF2: 53                    push  ebx
734E7AF3: FF 35 CC E7 52 73     push  dword ptr ds:[7352E7CCh]
734E7AF9: FF 15 94 12 42 73     call  dword ptr ds:[73421294h]
734E7AFF: 8B 06                 mov   eax,dword ptr [esi]
734E7B01: 56                    push  esi
734E7B02: FF 50 08              call  dword ptr [eax+8]
734E7B05: 5F                    pop   edi
734E7B06: 5E                    pop   esi
734E7B07: 5B                    pop   ebx
734E7B08: C9                    leave
734E7B09: C2 24 00              ret   24h

Unfortunately, between the compiler optimizations and the fact that disassembly cannot preserve the original symbols, the original meaning is hard to determine. There are several jumps that skip large sections; additionally, it calls the OLE functions for string comparison, which is not surprising. The fact that StrComp() works AND is faster than a less robust "capitalization" scenario tells us that SOMETHING is different between the two methods.
It's fairly apparent that it simply doesn't waste time copying two string variables to make them uppercase and only then analyzing the two strings (possibly finding a difference between them early on, which makes the time spent uppercasing (or whatever) the strings a complete waste). The main reason for using StrComp() over some silly UCase or LCase kludge would be that UCase and LCase aren't fully robust and fail in several relatively benign scenarios, generally involving the fact that UCase$() acts on the ANSI value, which can cause compare errors with common Unicode symbols that are otherwise compared properly with StrComp(). An additional point in its favour is the ligature comparison ability I explained previously. So really, aside from the fact that it's both faster and does a better job, and implements locale-specific functionality, I guess there is no reason to use StrComp over capitalizing (or lowercasing) both of them. Or, we could use the string comparison class found on VBSpeed:

Code: [Select]
' By Chris Lucas, 20011204/20020607
' Thanks to Olaf for the class implementation concept
Option Explicit

Private Declare Function ArrPtr& Lib "msvbvm60.dll" Alias "VarPtr" (ptr() As Any)
Private Declare Sub RtlMoveMemory Lib "kernel32" (dst As Any, src As Any, ByVal nBytes&)

Private Header1(5) As Long
Private Header2(5) As Long
Private SafeArray1() As Long
Private SafeArray2() As Long

Private Sub Class_Initialize()
    ' Set up our template for looking at strings
    Header1(0) = 1              ' Number of dimensions
    Header1(1) = 4              ' Bytes per element (long = 4)
    Header1(4) = &H7FFFFFFF     ' Array size
    ' Force SafeArray1 to use Header1 as its own header
    RtlMoveMemory ByVal ArrPtr(SafeArray1), VarPtr(Header1(0)), 4

    ' Set up our template for looking at search text
    Header2(0) = 1              ' Number of dimensions
    Header2(1) = 4              ' Bytes per element (long = 4)
    Header2(4) = &H7FFFFFFF     ' Array size
    ' Force SafeArray2 to use Header2 as its own header
    RtlMoveMemory ByVal ArrPtr(SafeArray2), VarPtr(Header2(0)), 4
End Sub

Private Sub Class_Terminate()
    ' Make SafeArray1 and SafeArray2 once again use their own headers
    ' If this code doesn't run the IDE will crash
    RtlMoveMemory ByVal ArrPtr(SafeArray1), 0&, 4
    RtlMoveMemory ByVal ArrPtr(SafeArray2), 0&, 4
End Sub

Friend Function IsSameText03(String1 As String, String2 As String, Compare As VbCompareMethod) As Boolean
    ' By Chris Lucas, 20011204
    Dim i&, SLen&, tmp1&, tmp2&, tmp3&, tmp4&, alt&
    SLen = LenB(String1)
    If SLen <> LenB(String2) Then Exit Function
    Header1(3) = StrPtr(String1): Header2(3) = StrPtr(String2)
    If Compare = vbTextCompare Then
        ' Each Long holds two 16-bit characters: test the low word, then the high word
        For i = 0 To SLen \ 4 - 1
            tmp1 = SafeArray1(i)
            tmp2 = (tmp1 And &HFFFF&)
            tmp3 = SafeArray2(i)
            tmp4 = (tmp3 And &HFFFF&)
            ' alt is the other-case counterpart of the low-word character
            Select Case tmp2
                Case 97& To 122&: alt = tmp2 - 32
                Case 65& To 90&: alt = tmp2 + 32
                Case 49&: alt = 185
                Case 50&: alt = 178
                Case 51&: alt = 179
                Case 138&: alt = 154
                Case 140&: alt = 156
                Case 142&: alt = 158
                Case 154&: alt = 138
                Case 156&: alt = 140
                Case 158&: alt = 142
                Case 159&: alt = 255
                Case 178&: alt = 50
                Case 179&: alt = 51
                Case 185&: alt = 49
                Case 192& To 214&: alt = tmp2 + 32
                Case 216& To 222&: alt = tmp2 + 32
                Case 224& To 246&: alt = tmp2 - 32
                Case 248& To 254&: alt = tmp2 - 32
                Case 255&: alt = 376
                Case 338&: alt = 339
                Case 339&: alt = 338
                Case 352&: alt = 353
                Case 353&: alt = 352
                Case 376&: alt = 255
                Case 381&: alt = 382
                Case 382&: alt = 381
            End Select
            If alt <> tmp4 Then
                If tmp2 <> tmp4 Then Exit Function
            End If
            tmp2 = (tmp1 And &HFFFF0000)
            tmp4 = (tmp3 And &HFFFF0000)
            ' Same table as above, shifted into the high word
            Select Case tmp2
                Case &H610000 To &H7A0000: alt = tmp2 - &H200000
                Case &H410000 To &H5A0000: alt = tmp2 + &H200000
                Case &H310000: alt = &HB90000
                Case &H320000: alt = &HB20000
                Case &H330000: alt = &HB30000
                Case &H8A0000: alt = &H9A0000
                Case &H8C0000: alt = &H9C0000
                Case &H8E0000: alt = &H9E0000
                Case &H9A0000: alt = &H8A0000
                Case &H9C0000: alt = &H8C0000
                Case &H9E0000: alt = &H8E0000
                Case &H9F0000: alt = &HFF0000
                Case &HB20000: alt = &H320000
                Case &HB30000: alt = &H330000
                Case &HB90000: alt = &H310000
                Case &HC00000 To &HD60000: alt = tmp2 + &H200000
                Case &HD80000 To &HDE0000: alt = tmp2 + &H200000
                Case &HE00000 To &HF60000: alt = tmp2 - &H200000
                Case &HF80000 To &HFE0000: alt = tmp2 - &H200000
                Case &HFF0000: alt = &H1780000
                Case &H1520000: alt = &H1530000
                Case &H1530000: alt = &H1520000
                Case &H1600000: alt = &H1610000
                Case &H1610000: alt = &H1600000
                Case &H1780000: alt = &HFF0000
                Case &H17D0000: alt = &H17E0000
                Case &H17E0000: alt = &H17D0000
            End Select
            If alt <> tmp4 Then
                If tmp2 <> tmp4 Then Exit Function
            End If
        Next i
        ' An odd character count leaves one trailing character to test
        If (LenB(String1) \ 2 And 1) Then
            tmp2 = (SafeArray1(i) And &HFFFF&)
            tmp4 = (SafeArray2(i) And &HFFFF&)
            Select Case tmp2
                Case 97& To 122&: alt = tmp2 - 32
                Case 65& To 90&: alt = tmp2 + 32
                Case 49&: alt = 185
                Case 50&: alt = 178
                Case 51&: alt = 179
                Case 138&: alt = 154
                Case 140&: alt = 156
                Case 142&: alt = 158
                Case 154&: alt = 138
                Case 156&: alt = 140
                Case 158&: alt = 142
                Case 159&: alt = 255
                Case 178&: alt = 50
                Case 179&: alt = 51
                Case 185&: alt = 49
                Case 192& To 214&: alt = tmp2 + 32
                Case 216& To 222&: alt = tmp2 + 32
                Case 224& To 246&: alt = tmp2 - 32
                Case 248& To 254&: alt = tmp2 - 32
                Case 255&: alt = 376
                Case 338&: alt = 339
                Case 339&: alt = 338
                Case 352&: alt = 353
                Case 353&: alt = 352
                Case 376&: alt = 255
                Case 381&: alt = 382
                Case 382&: alt = 381
            End Select
            If tmp2 <> tmp4 Then
                If alt <> tmp4 Then Exit Function
            End If
        End If
        IsSameText03 = True
    Else
        ' Binary compare: test two characters (one Long) at a time
        For i = 0 To SLen \ 4 - 1
            If SafeArray1(i) <> SafeArray2(i) Then Exit Function
        Next i
        If (LenB(String1) \ 2 And 1) Then
            If (SafeArray1(i) And &HFFFF&) <> (SafeArray2(i) And &HFFFF&) Then Exit Function
        End If
        IsSameText03 = True
    End If
End Function

Friend Function IsSameString02(String1 As String, String2 As String) As Boolean
    ' By Chris Lucas, 20020607
    Dim i&, Len1&, Len2&, tmp&
    ' Grab the string lengths
    Len1 = LenB(String1) \ 2: Len2 = LenB(String2) \ 2
    ' Make an informed decision as to whether we should continue
    If Len1 <> Len2 Then GoTo BailOut
    ' Compare the strings
    Header1(3) = StrPtr(String1): Header2(3) = StrPtr(String2)
    tmp = Len1 \ 2
    ' The first two characters come cheap
    If SafeArray1(i) <> SafeArray2(i) Then GoTo BailOut Else i = i + 1
DoLoop:
    ' Manually unrolled comparison, one Long (two characters) per line
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If SafeArray1(i) <> SafeArray2(i) Then GoTo NotEqual Else i = i + 1
    If i <= tmp Then GoTo DoLoop
NotEqual:
    ' some characters don't match, but we need to check to
    ' see if it happened after the end of the string, a
    ' nasty side-effect of cascading ifs
    If i > tmp Then IsSameString02 = True
BailOut:
    ' Lengths don't match, let's do absolutely nothing
End Function

This method is twice as fast as StrComp(), which is three times faster than any comparison operator.
Since the StrCompA() routine I provided earlier, which implements the whole uppercase-then-test approach, uses at LEAST one comparison operator, it will be slower than either of the other methods even in the best case. Of course, it would be silly to use this class except in extremely string-manipulation-heavy programs or routines, since it has a tendency to crash during debugging (it works 100% when compiled).
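The early-exit idea argued for throughout this thread (stop at the first differing character, and never build uppercased copies of the strings) can be sketched in VB.NET as follows. This is a hypothetical helper written for illustration, not a framework API and not how StrComp() or String.Compare are actually implemented; it folds case one character at a time, so it does not handle the ligature or locale-specific cases mentioned earlier.

```vbnet
Imports System

Module EarlyExitCompareSketch
    ' Hypothetical helper: compares two strings character by character,
    ' ignoring case, and returns as soon as a mismatch is found.
    Function EqualsIgnoreCaseEarlyExit(a As String, b As String) As Boolean
        ' Two Nothing references are treated as equal; one Nothing is not.
        If a Is Nothing OrElse b Is Nothing Then Return a Is b
        ' Under this simple per-character rule, different lengths never match.
        If a.Length <> b.Length Then Return False

        For i As Integer = 0 To a.Length - 1
            ' No uppercased copy of either string is allocated; each
            ' character is folded individually and the loop exits early.
            If Char.ToUpperInvariant(a(i)) <> Char.ToUpperInvariant(b(i)) Then
                Return False
            End If
        Next
        Return True
    End Function

    Sub Main()
        Console.WriteLine(EqualsIgnoreCaseEarlyExit("StrComp", "STRCOMP"))
        Console.WriteLine(EqualsIgnoreCaseEarlyExit("StrComp", "StrCmp"))
    End Sub
End Module
```

In real VB.NET code the built-in String.Equals(a, b, StringComparison.OrdinalIgnoreCase) is the more sensible choice; the sketch only makes the early-exit reasoning concrete.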