# findstr, exact match problem



## cmdnoob

Hi

I have a problem with findstr.

I'm trying to do an exact string match in several files.

For example when I do


Code:


findstr "\<STRING\>" myfile.txt

It doesn't match STRINGGGG, STRING1 etc, which is perfect.

But when myfile.txt contains STRING; or STRING?, for example, findstr
sees it as a match, which I don't want.

Any ideas or hints appreciated.


----------



## Ninjaboi

http://ss64.com/nt/findstr.html

Examples of findstr in the above link.

For what you're doing, though, you want it to locate not only the string STRING, but also any other word in myfile.txt that starts with STRING (STRINGGGGGG is a good example). To do this with the code you provided, try changing it to this:




Code:


findstr "\<STRING.*" myfile.txt


That should work for you. If it doesn't, or that wasn't the answer you were looking for, just reply and I'll try to help further.


----------



## cmdnoob

Thanks for the reply, much appreciated

I may not have explained my problem clearly.

If myfile.txt contains the word STRING
and I execute 


Code:


findstr "\<STRING\>" myfile.txt

then I want findstr to produce a match (which works).

If myfile.txt contains the word STRRINNGG and I execute 


Code:


findstr "\<STRING\>" myfile.txt

I don't want a match (which works).

These two cases work as I want.

BUT if myfile.txt contains the word STRING? or STRING;, for example,
and I execute


Code:


findstr "\<STRING\>" myfile.txt

then I want findstr to report no match (but findstr sees it as a match).

Hope I made myself a bit clearer.


----------



## Ninjaboi

Ah, so are you saying that you're trying to make it reject the string if a symbol comes after it? Ex:

STRING!
STRING?
STRING.
STRING$
STRING#
STRING/

Again, that link I provided is very good. However, I'm not sure how you'd be able to do that. Possibly by adding a space after STRING in the search pattern? If that wasn't what you meant, please tell me. Also, if that was what you needed and it didn't work, tell me so I can try to figure it out.


----------



## TheOutcaste

Punctuation is a word boundary, so it should find those.

Try this:
*"\<STRING\>[^;?]"*
That will find *STRING.*, *STRING,*, *STRING:*, etc., but not *STRING;* or *STRING?*. You'd have to list every non-text character that you don't want it to match on.

This means that STRING must be followed by some character though, or it won't match. Not sure if it will count a CR/LF, didn't test it.


----------



## cmdnoob

Thanks for the replies :smile:


I tried


Code:


findstr "\<STRING\>[^;?]" myfile.txt

Which works as you said, except that it doesn't match just the word STRING on its own.
The reason for this is that findstr won't recognize whitespace or a CR/LF, I guess.

1. The string I'm searching for is always 4 characters long.
2. If followed by CR/LF or whitespace or "nothing", then it is a match.
3. Followed by anything else, it's a no match.

I can't change the files I'm searching.

Any ideas ?


----------



## TheOutcaste

Create a text file named Searchstring.txt with these two lines:


Code:


\<STRING[^[email protected]^<^>\[-`{-~\-=+'0-9]
\<STRING$

Search using this:
findstr /G:Searchstring.txt myfile.txt

This will find the word *STRING* only if at the end of a line, or not followed by punctuation or numbers.
It will find *STRING* if followed by space, tab, or any unicode character, like *STRINGé*.
If that won't work, you'd be better off using VBScript, which has much better RegEx support.

What exactly are you wanting to find?
*STRING* followed _only_ by space or CR? Or can it be followed by any white space character, like tab, vertical tab, form feed, line feed?
When a match is found, what do you want done then? Do you need the line that contained a match, or just need to know that a match exists someplace in the file?
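
As a rough sketch of the same two-pattern idea without a separate search file, findstr also accepts several `/C:` strings on the command line. The version below assumes only a space or the end of the line should count as a valid follower. `STRING` and `myfile.txt` are the placeholder names used throughout this thread, and the patterns are illustrative, not tested against your files.

```batch
@echo off
rem Sketch only: STRING and myfile.txt are the thread's placeholder names.
rem /r treats each /c: string as a regular expression;
rem a line is printed if either pattern matches.
rem   pattern 1: STRING at a word start, followed by a space
rem   pattern 2: STRING at a word start, at the end of the line
findstr /r /c:"\<STRING[ ]" /c:"\<STRING$" myfile.txt
```

To accept a tab as well, a literal tab character would have to be typed inside the brackets next to the space. Also note that `$` appears to anchor just before the carriage return, so a final line with no line ending may not be matched by the second pattern.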


----------



## cmdnoob

Thanks again



> What exactly are you wanting to find?
> *STRING* followed _only_ by space or CR?


Yes, or tab. If it is followed by anything else, it should be a no match.

This "almost" does it


Code:


\<STRING[^;0-9a-zA-z-~\-=+']

It works with space, tab but not with CR/LF......

I'm searching a directory structure recursively, looking at all the files for the word STRING. If the string is not found in a file, I print that filename to the screen.


----------



## cmdnoob

Just an update. I don't know if it's the best way to do it, but it seems to solve my problem.


Code:


FOR /R %mydir% %%i in (*.txt) do (findstr /i "\<%searchstr%[^;0-9a-zA-z-~\-=+']" "%%i" 1>nul 
            if errorlevel 1 findstr /i "\<%searchstr%$" "%%i" 1>nul
            if errorlevel 1 echo Missing %searchstr% in file %%i)

Since there are not that many files to search, performance is not an issue.
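
For comparison, here is a minimal sketch of the same fall-through logic written with the `||` operator, which runs the next command only when the previous one fails. The two patterns are copied unchanged from the loop above. The values given to `mydir` and `searchstr` are assumptions for illustration, since the post does not show how they are set.

```batch
@echo off
setlocal
rem Assumed values for illustration only.
set "mydir=C:\Test"
set "searchstr=STRING"

for /r "%mydir%" %%i in (*.txt) do (
    rem Try the "followed by an allowed character" pattern first, then the
    rem end-of-line pattern. The echo runs only if both searches fail.
    findstr /i "\<%searchstr%[^;0-9a-zA-z-~\-=+']" "%%i" >nul || findstr /i "\<%searchstr%$" "%%i" >nul || echo Missing %searchstr% in file %%i
)
```

If the first findstr succeeds, the second is skipped and the errorlevel stays 0, so the echo is skipped too. That matches the behavior of the two `if errorlevel 1` lines.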


----------



## TheOutcaste

Using a file to store the search strings and specifying it with */G:Searchstring.txt* lets you search on more than one pattern, so you can also do the end-of-line search *\<STRING$*.
Guess I did leave the A-Z and a-z out, didn't even think to test with that.

If it works, that's all that matters. This VBScript will find the string only if it's followed by a space, tab, or CR:


Code:


Const ForReading = 1
StrFileName = Wscript.Arguments(0)
StrSearch = Wscript.Arguments(1)
Set objFSO = CreateObject("Scripting.FileSystemObject")

' Set search words
Set objRegEx = CreateObject("VBScript.RegExp")
objRegEx.IgnoreCase = True
objRegEx.Global = True
objRegEx.Pattern = "\b" & StrSearch & "[ \t\r]+"
' Open a file
Set objFile = objFSO.OpenTextFile(StrFileName,ForReading)
strContents = objFile.ReadAll
objFile.Close
' Search for the words
Set colMatches = objRegEx.Execute(strContents)  
Wscript.Echo colMatches.Count

Save it as find.vbs, then you can run it with this batch file


Code:


@Echo Off
Set mydir=C:\Test
Set Searchstring=STRING
For /F "Tokens=* Delims=" %%I In ('Dir /A-D /B /S "%mydir%\*.txt"') Do (
Echo.Checking %%I
For /F "Tokens=* Delims=" %%A In ('cscript /nologo find.vbs "%%~I" %Searchstring%') Do (
If %%A==0 Echo Missing %Searchstring% in file %%I))


----------



## cmdnoob

Nice indeed

Didn't know that you could write a vb script "out of the box" and execute it with a bat file. 

Appreciate all your replies

cheers :grin:


----------



## cmdnoob

Thanks for the previous replies. I have encountered another problem.

I want to count the matches from each findstr and assign the count to a variable.

If I have this code


Code:


FOR /R %mydir% %%i in (*.txt) do (findstr /i "\<%searchstr%[^;0-9a-zA-z-~\-=+']" "%%i" 1>nul 
            if errorlevel 1 findstr /i "\<%searchstr%$" "%%i" 1>nul
            if errorlevel 1 echo Missing %searchstr% in file %%i)

Is there any way to capture stdout into a variable from the above structure, with a separate count for each findstr?

I tried


Code:


findstr /i /n "\<%searchstr2%[^;0-9a-zA-z-~\-=+']" "%%i" | find /c ":"

Which displays the correct count of occurrences of STRING, but I can't assign that to a variable.


Code:


1>%myvar% or 1>nul>%myvar%

The only way I found is to do it in a FOR loop


Code:


FOR %%a in (findstr..) do (set %%a = mycount)

But with two different findstr calls and looping over a file structure, I find that quite hard.

I thought that I could redirect 1> (stdout) to a variable, but it seems to work only with files.

Again, I'm probably better off with VB, but capturing the output of each findstr in my current solution would be enough.

Cheers


----------



## cmdnoob

Got it sorted



Code:


findstr /i /n "\<%searchstr2%[^;0-9a-zA-z-~\-=+']" "%%i" | find /c ":">__dumpx.tmp
set /p lines=<__dumpx.tmp
set /a count=lines
del __dumpx.tmp
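
An alternative sketch that avoids the temporary file: `for /f` can run a command and hand its output to a loop variable, which can then be stored with `set`. The file name `myfile.txt`, the simplified end-of-line pattern, and the variable name `count` are assumptions for illustration. Swap in your own pattern and `"%%i"` as needed.

```batch
@echo off
setlocal
rem Assumed value for illustration only.
set "searchstr=STRING"
set "count=0"

rem Run findstr piped into find /c inside for /f. The pipe must be escaped
rem as ^| so it is passed to the child command instead of being applied here.
rem find /c prints one number, which ends up in %%c.
for /f %%c in ('findstr /i /n /r /c:"\<%searchstr%$" "myfile.txt" ^| find /c ":"') do set "count=%%c"

echo %count%
```

If this is placed inside the parenthesized `FOR /R` block, `%count%` would be expanded once when the block is parsed. In that case the value has to be read as `!count!` after `setlocal EnableDelayedExpansion`, or used directly as `%%c` inside the inner loop.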


----------

