FINDSTR workaround needed

19 replies to this topic

#1
bauxite (Newbie, Member, 22 posts)

Hi, I'm using FINDSTR from a batch file to compare two text files. However, the files contain rather long lines, around 200 characters or more.

When run, FINDSTR simply complains that the search string is "too long"; the limit seems to be somewhere around 127 characters.

Is there any workaround, or a different way to compare two text files? As far as I know the FIND command has a larger limit, but I'm not sure how to make it work to my liking, if that is possible at all.

Any help is appreciated.



#2
uid0 (Advanced Member, Member, 356 posts)

You could try a grep port:
http://unxutils.sourceforge.net/
http://gnuwin32.sour...ckages/grep.htm
Or just use fc if you're comparing whole files, or WinMerge to do it with a GUI.

#3
bauxite

Ah, thanks. So I guess it's not possible with the default command-line tools, then?

Shame, I would prefer to make the batch file without using third-party tools.

#4
CoffeeFiend (Coffee Aficionado, Super Moderator, 5,399 posts, OS: Windows 7 x64)

Quote: "without using 3rd party tools."

There's nothing forcing you to use any "tools" for this (included or third-party). You could move from the 1980s' best (batch files) to 1990s technology: VBScript, using InStr for example, which is meant precisely for this kind of job (finding one text string inside another). You can even use regular expressions if you want. And that has worked out of the box on any Windows machine for a little over a decade.
Coffee: \ˈkȯ-fē, ˈkä-\. noun. Heaven in a cup. Life's only treasure. The meaning of life. Kaffee ist wunderbar. C8H10N4O2 FTW.

#5
bauxite

Heh, thanks. I don't actually know any VB; I just wanted a fast way to compare the files with something I knew a little about, which was basically the batch tools.

Can you confirm that FINDSTR will not work? Does it have limitations like this, and could I somehow work around them?

EDIT: I'll give VBScript a try and see how it goes :)

Edited by bauxite, 03 February 2010 - 07:15 PM.


#6
gunsmokingman (MSFN Master, Super Moderator, 2,418 posts)

If you post what you need to compare, I think I could help by writing a VBScript for you.


GunSmokingMan



#7
Yzöwl (Wise Owl, Super Moderator, Donator, 4,530 posts, OS: Windows 7 x64)

If you are comparing the content of files on a line-by-line basis then, as already indicated, there's a built-in command-line tool for that: fc.
FC /?

#8
bauxite

Sorry, I should have been more specific.

It's just two text files that contain unique lines of characters (including numbers and symbols like : , \).

1.txt contains these lines:

abcdefg
hds8745
fjhjgsjnn4
C:\im random\ini.txt

and 2.txt is almost the same:

Xbcdefg
hds8745
fjhjgsjnn4
C:\im random\ini.txt
im new.txt

So 2.txt may contain new unique lines and/or lines that are modified from 1.txt.
I used FINDSTR like this:

findstr /l /v /g:1.txt 2.txt >> changes.txt

As I understand it, this tries to match each line from 2.txt against the lines in 1.txt (I'm not sure in what order) and, if no match is found, writes the line to changes.txt. So I would get a new text file with all the new or modified lines (although not deleted lines, as far as I know).

This works as long as each line is less than about 125 characters long; that was the problem.
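
For anyone unsure what that command actually tests, here is a rough model of it in plain JavaScript (old-style syntax, close to the JScript that ships with Windows). It is only a sketch: findstrInverse is a made-up name, the sample lines are the ones from this post, and real FINDSTR has quirks this ignores. The point it illustrates is that, without the /x switch, each line of 1.txt is used as a literal substring to search for, and /v keeps the lines of 2.txt that contain none of them.

```javascript
// Rough model of: findstr /l /v /g:1.txt 2.txt
// Keeps each line of `lines` that contains none of `patterns` as a literal substring.
function findstrInverse(patterns, lines) {
  var kept = [];
  for (var i = 0; i < lines.length; i++) {
    var matched = false;
    for (var j = 0; j < patterns.length; j++) {
      // Empty patterns are skipped (a simplification made for this sketch).
      if (patterns[j] !== "" && lines[i].indexOf(patterns[j]) !== -1) {
        matched = true;
        break;
      }
    }
    if (!matched) kept.push(lines[i]);
  }
  return kept;
}

var file1 = ["abcdefg", "hds8745", "fjhjgsjnn4", "C:\\im random\\ini.txt"];
var file2 = ["Xbcdefg", "hds8745", "fjhjgsjnn4", "C:\\im random\\ini.txt", "im new.txt"];
var changes = findstrInverse(file1, file2); // ["Xbcdefg", "im new.txt"]
```

Because the test is "contains" rather than "equals", a short line in 1.txt can also hide a longer, different line in 2.txt.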

Edited by bauxite, 04 February 2010 - 07:10 PM.


#9
gunsmokingman

I have not tested this, but this VBScript might do what you want.
Fill in the paths:

PATH_TO_FILE\SomeFile1.txt

PATH_TO_FILE\SomeFile2.txt

PATH_TO_FILE\Differences.txt

Save as Sort.vbs

Dim Fso :Set Fso = Createect("Scripting.FileSystemect")
 Dim StrTxt1, StrTxt2, StrTxt3, Ts
'-> First Text File
  Set Ts = Fso.OpenTextFile("PATH_TO_FILE\SomeFile1.txt", 1)
  StrTxt1 = Ts.ReadAll
  Ts.Close
'-> Second Text File
  Set Ts = Fso.OpenTextFile("PATH_TO_FILE\SomeFile2.txt", 1)
	Do Until Ts.AtEndOfStream
	 StrTxt2 = Ts.ReadLine
 '-> If Not There Add It	 
	 If Not InStr(StrTxt2, StrTxt1) Then
		StrTxt3 = StrTxt3 & StrTxt2 & vbCrLf
	 End If
	Loop
   Ts.Close
'-> Third Text File
   Set Ts = Fso.CreateTextFile("PATH_TO_FILE\Differences.txt")
   Ts.WriteLine StrTxt3
   Ts.Close







#10
bauxite

On running it I get a Windows Script Host error:

Line: 1
Char: 10
Error: Type mismatch: 'Createect'
Code: 800A00D
Source: Microsoft VBScript runtime error

#11
gunsmokingman

Sorry, my mistake on the spelling. It should be

 Dim Fso :Set Fso = CreateObject("Scripting.FileSystemObject")

not this:

Dim Fso :Set Fso = Createect("Scripting.FileSystemect")







#12
CoffeeFiend

Quote: Dim Fso :Set Fso = CreateObject("SScripting.FileSystemObject")

Or perhaps:
Dim Fso :Set Fso = CreateObject("Scripting.FileSystemObject")
(no double S) ;)
Edit: looks like you fixed it already...

Either way, I'm still not completely sure how it should work in the first place. For example, how should it react if extra lines were inserted in one file: report just the extra line, or keep reporting mismatches from that point on? Should it be a "dumb" line-by-line comparison against the other file, or should it just check whether the line exists anywhere in the other file (line number and order unimportant)? We need more details.

Edited by CoffeeFiend, 04 February 2010 - 09:07 PM.


#13
bauxite

Thank you all. The script now runs; however, it just gives me a copy of "SomeFile2.txt" in "Differences.txt".

I'm not sure what you mean; the files are not being changed while the script is run. I'll try to explain it better:

Every line in a text file is unique within that file. Both files contain some lines that match exactly and others that do not, and either file may have more or fewer lines than the other.

Simply put, I want to discard all the lines that are the same and be left with the rest, or with only the lines in "2.txt" that are not in "1.txt" (if that makes the process easier).

I hope I'm making sense. I think I may even be interested in learning VBScript, although I barely have any time and it seems rather difficult. Anyway, thanks :)

After quite a bit of reading on the MSDN site, specifically about the InStr function, and with lots of luck, I seem to have found a solution.
 

Dim Fso :Set Fso = CreateObject("Scripting.FileSystemObject")
Dim StrTxt1, StrTxt2, StrTxt3, Ts
'-> First Text File
Set Ts = Fso.OpenTextFile("PATH_TO_FILE\SomeFile1.txt", 1)
StrTxt1 = Ts.ReadAll
Ts.Close
'-> Second Text File
Set Ts = Fso.OpenTextFile("PATH_TO_FILE\SomeFile2.txt", 1)
Do Until Ts.AtEndOfStream
  StrTxt2 = Ts.ReadLine
  '-> If Not There Add It
  If InStr(StrTxt1, StrTxt2) = 0 Then
    StrTxt3 = StrTxt3 & StrTxt2 & vbCrLf
  End If
Loop
Ts.Close
'-> Third Text File
Set Ts = Fso.CreateTextFile("PATH_TO_FILE\Differences.txt")
Ts.WriteLine StrTxt3
Ts.Close

All I did was swap the StrTxt variables around and change the test to "If ... = 0". I barely know what I'm talking about here, so don't laugh :D
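
A note on why those two changes matter. InStr(string1, string2) searches string1 for string2 and returns a 1-based position, or 0 when nothing is found, so the whole first file has to be the first argument and the single line the second. Applied to a number, VBScript's Not is a bitwise negation: Not 0 is -1 and Not 5 is -6, and both are non-zero, which If treats as true. The small sketch below mimics this in plain JavaScript. inStr is a hypothetical helper written only for this illustration (JavaScript's own indexOf is 0-based and returns -1 when nothing is found), it mirrors only the common cases, and ~ stands in for the bitwise Not.

```javascript
// Hypothetical stand-in for VBScript's InStr(haystack, needle):
// 1-based position of the first match, 0 when there is none.
function inStr(haystack, needle) {
  return haystack.indexOf(needle) + 1;
}

var file1Text = "abcdefg\r\nhds8745";

var found = inStr(file1Text, "hds8745");      // 10
var missing = inStr(file1Text, "im new.txt"); // 0

// Bitwise negation never yields 0 for these results, so a test written as
// "If Not InStr(...)" is true whether or not the line was found.
var notFound = ~found;     // -11
var notMissing = ~missing; // -1

// Comparing with 0 separates the two cases.
var foundIsMissing = (found === 0);     // false
var missingIsMissing = (missing === 0); // true
```

That matches the symptom reported above: with "If Not InStr(...)" every line of SomeFile2.txt ends up in Differences.txt.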

Edited by bauxite, 04 February 2010 - 09:45 PM.


#14
gunsmokingman

Glad we could help.





#15
bauxite

Sorry again, I spoke too soon. Further testing shows it in fact only works for some pairs of text files, such as the one I posted above (which I was using to test the scripts).

I was unable to complete the script by myself, and I have no idea why it doesn't work for any two text files. (I think it's because of how InStr works when used in that script: it doesn't seem to care about "lines".)

But I was thinking of a different approach, something like comparing each line from "2.txt" to every single line in "1.txt", one at a time. Could that be a better way, although maybe slower?

After more experimentation I figured the line-by-line approach could be done in a batch script. I'm no pro, but it resulted in this amazingly bad (but working, or so I think) batch file. I created two small text files specially to test it, as it takes a long time on large files but is quick for a few lines.

Here is the batch code; feel free to laugh this time. :)

setlocal enabledelayedexpansion
set var=1
for /f "delims=" %%S in (2.txt) do ((for /f "delims=" %%G in (1.txt) do (if !var!==1 (If "%%S"=="%%G" (set var=0) ELSE (If "%%G"=="END" echo %%S>>diff.txt)))) && set var=1)

It's a bit of a hack, since the string "END" must be the last line of each text file for the batch file to know it has reached the end, lol.

Help translating it to VBScript or JScript would be appreciated (if it's possible in JScript; I don't mind, anything that's faster, lol).
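
A rough JScript-flavoured translation of the same line-by-line idea, offered as a sketch only. It works on arrays of lines (reading the files, for example through Scripting.FileSystemObject, is left out), and it uses a found flag that is checked after the inner loop finishes, so no "END" marker is needed. The function and variable names are made up for the example.

```javascript
// Returns the lines of `newLines` that equal no line of `oldLines`.
function linesOnlyInSecond(oldLines, newLines) {
  var result = [];
  for (var s = 0; s < newLines.length; s++) {
    var found = false;
    for (var g = 0; g < oldLines.length; g++) {
      if (newLines[s] === oldLines[g]) { // whole-line, case-sensitive comparison
        found = true;
        break;                           // no need to look any further
      }
    }
    if (!found) result.push(newLines[s]);
  }
  return result;
}

var oldLines = ["abcdefg", "hds8745", "fjhjgsjnn4"];
var newLines = ["Xbcdefg", "hds8745", "fjhjgsjnn4", "im new.txt"];
var diff = linesOnlyInSecond(oldLines, newLines); // ["Xbcdefg", "im new.txt"]
```

Like the batch file, this still compares every line against every other line, so the number of comparisons grows with the product of the two file lengths.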

The files attached in the zip archive include the bat file, the VBS script, both text files, and the expected results to compare against.



Edited by bauxite, 05 February 2010 - 07:27 PM.


#16
gunsmokingman

I tested the script.
Test1.txt

Operating System	   » Microsoft Windows 7 Ultimate 64-bit
  Os Version			 » 6.1.7600 Ultimate Edition 
  Build Type			 » Multiprocessor Free
  Serial Number		  » 00426-292-0106894-85791
  Processor Name		 » Intel(R) Core(TM)2 Quad  CPU   Q9300  @ 2.50GHz (x64)
  Video Card Name	    » NVIDIA GeForce 9800 GT
  Sound Device Name	  » C-Media PCI Audio Device
  Network Adapter	    » 802.11n Wireless PCI Express Card LAN Adapter
  Network Adapter	    » Realtek RTL8168C(P)/8111C(P) Family PCI-E Gigabit Ethernet NIC (NDIS 6.20)


Test2.txt

Computer Name		  » HOME-BETA-2008
  Operating System	   » Microsoft Windows 7 Ultimate 64-bit
  Os Version			 » 6.1.7600 Ultimate Edition 
  Build Type			 » Multiprocessor Free
  Serial Number		  » 00426-292-0106894-85791
  Physical Memory	    » 8.00 GB
  Processor Name		 » Intel(R) Core(TM)2 Quad  CPU   Q9300  @ 2.50GHz (x64)
  Video Card Name	    » NVIDIA GeForce 9800 GT
  Sound Device Name	  » C-Media PCI Audio Device
  Network Adapter	    » 802.11n Wireless PCI Express Card LAN Adapter
  Network Adapter	    » Realtek RTL8168C(P)/8111C(P) Family PCI-E Gigabit Ethernet NIC (NDIS 6.20)
  Computer Type	  	» KQ499AA-A2L m9360f


Differences.txt

Computer Name		  » HOME-BETA-2008
  Physical Memory	    » 8.00 GB
  Computer Type	  	» KQ499AA-A2L m9360f


So the script looks for what is in Test2 but missing from Test1, and reports it in Differences.txt.

Dim Fso :Set Fso = CreateObject("Scripting.FileSystemObject")
 Dim StrTxt1, StrTxt2, StrTxt3, Ts
'-> First Text File
  Set Ts = Fso.OpenTextFile("Test1.txt", 1)
  StrTxt1 = Ts.ReadAll
  Ts.Close
'-> Second Text File
  Set Ts = Fso.OpenTextFile("Test2.txt", 1)
	Do Until Ts.AtEndOfStream
	 StrTxt2 = Ts.ReadLine
	If InStr(StrTxt1, StrTxt2) = 0 Then
		StrTxt3 = StrTxt3 & StrTxt2 & vbCrLf
	End If
	Loop 
   Ts.Close
'-> Third Text File
   Set Ts = Fso.CreateTextFile("Differences.txt")
   Ts.WriteLine StrTxt3
   Ts.Close







#17
bauxite

Yes, you are right. In my attachment I provided two files that don't work properly with the script. I think this is because of common "fragment" issues, something like that. However, the bat file works properly since it does a line-by-line comparison, although it takes a year.
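
The "fragment" problem can be shown in a few lines. InStr only asks whether the text occurs anywhere in the first file, so a short line such as "uncommon" is hidden by a longer line such as "notreallyuncommon". One possible workaround, sketched here in plain JavaScript as an assumption rather than something tested against the attached files, is to wrap both the file text and the line in line-break markers so that only whole lines can match. The function names are illustrative, and the sketch assumes the file uses CR+LF line endings throughout.

```javascript
var NL = "\r\n";
var file1Text = ["XYZ", "ABC", "notreallyuncommon"].join(NL);

// Substring test: true whenever `line` occurs anywhere in the text.
function containsFragment(text, line) {
  return text.indexOf(line) !== -1;
}

// Whole-line test: pad with line breaks so a match must span a full line.
function containsLine(text, line) {
  return (NL + text + NL).indexOf(NL + line + NL) !== -1;
}

var fragmentHit = containsFragment(file1Text, "uncommon"); // true (a false positive)
var lineHit = containsLine(file1Text, "uncommon");         // false
var realLineHit = containsLine(file1Text, "ABC");          // true
```

In VBScript the same idea would look something like InStr(vbCrLf & StrTxt1 & vbCrLf, vbCrLf & StrTxt2 & vbCrLf) = 0, again assuming CR+LF line endings.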

Anyway, it's not a huge deal.

Thanks again.

Edited by bauxite, 05 February 2010 - 07:50 PM.


#18
gunsmokingman

The script only looks for what is missing from one file; what you want is a script that looks for what is missing from either file.
I wrote a script that reports the lines that appear in only one of the two files.

1.txt

XYZ
ABC
notreallyuncommon
onlyin1.txtsodontreallymatter1
onlyin1.txtsodontreallymatter2
x
y
astring1
astring2
astring3
D
E
p
o
i
GV
END


2.txt

uncommon
XYZ
uncommonstring2
astring1
astring2
astring3
A
B
uncommonstring3
C
D
E
i
o
p
V
Z
notreallyuncommon
uncommonstring1
END


The lines that appear in only one of the files are stored in Differences.txt:

ABC
onlyin1.txtsodontreallymatter1
onlyin1.txtsodontreallymatter2
x
y
GV
uncommon
uncommonstring2
A
B
uncommonstring3
C
V
Z
uncommonstring1

Save as Sort2.vbs

Dim Dic :Set Dic = CreateObject("Scripting.Dictionary")
Dim Fso :Set Fso = CreateObject("Scripting.FileSystemObject")
Dim Obj, Ts, Txt1
'-> Load every line of 1.txt as a key
Set Ts = Fso.OpenTextFile("1.txt", 1)
Do Until Ts.AtEndOfStream
  Txt1 = Ts.ReadLine
  Dic.Add Txt1, Txt1
Loop
Ts.Close
'-> Add lines only in 2.txt, remove lines found in both
Set Ts = Fso.OpenTextFile("2.txt", 1)
Do Until Ts.AtEndOfStream
  Txt1 = Ts.ReadLine
  If Not Dic.Exists(Txt1) Then
    Dic.Add Txt1, Txt1
  Else
    Dic.Remove Txt1
  End If
Loop
Ts.Close
'-> Whatever is left appears in only one file
Set Ts = Fso.CreateTextFile("Differences.txt")
For Each Obj In Dic.Keys
  Ts.WriteLine Obj
Next
Ts.Close
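
Since JScript was asked about earlier in the thread, here is the same add-or-remove idea as a plain JavaScript sketch on arrays of lines. It is an illustration under assumptions, not a drop-in replacement: toggleDiff is a made-up name, file reading is left out, a plain object stands in for Scripting.Dictionary, and the "$" prefix on keys is only there to keep them clear of built-in property names.

```javascript
// Same add/remove toggle as the Dictionary loops, on arrays of lines.
function toggleDiff(lines1, lines2) {
  var dic = {};
  var i, key;
  for (i = 0; i < lines1.length; i++) {
    dic["$" + lines1[i]] = lines1[i]; // a repeated line simply overwrites itself
  }
  for (i = 0; i < lines2.length; i++) {
    key = "$" + lines2[i];
    if (dic.hasOwnProperty(key)) {
      delete dic[key];                // present in both files: drop it
    } else {
      dic[key] = lines2[i];           // only in the second file so far: keep it
    }
  }
  var out = [];
  for (key in dic) {
    if (dic.hasOwnProperty(key)) out.push(dic[key]);
  }
  return out;
}

var only = toggleDiff(["XYZ", "ABC", "D", "GV"], ["uncommon", "XYZ", "D", "V"]);
// only holds ABC, GV, uncommon, V
```

Each line costs one lookup instead of a scan of the other file, which is why this approach should be much quicker than the nested loops on large files.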







#19
bauxite

Awesome.

That is quite brilliant, thank you again :)

#20
gunsmokingman

You're welcome. The only remaining problem might be in the first loop. Change this:

Do Until Ts.AtEndOfStream
	Txt1 = Ts.ReadLine 
	Dic.Add Txt1 ,Txt1
	Loop


Dictionary objects cannot have duplicate keys, so in case 1.txt has, say,

astring3
astring3

the script would error out because the key already exists. This code change prevents that error:

Do Until Ts.AtEndOfStream
	Txt1 = Ts.ReadLine 
	 If Not Dic.Exists(Txt1) Then
	  Dic.Add Txt1 ,Txt1
	 End If 
	Loop
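
The guard above covers repeats in 1.txt. Repeats in 2.txt look like a separate case: with the add-or-remove toggle in the second loop, a line that occurs twice in 2.txt would be removed and then added back (or added and then removed). bauxite's files have unique lines, so this may never come up, but one way to sidestep it is to build a lookup for each file first and then report the lines missing from the other lookup. The sketch below shows that idea in plain JavaScript; the function names are invented for the example and plain objects stand in for Scripting.Dictionary.

```javascript
// Builds a lookup of lines; the "$" prefix keeps keys clear of built-in property names.
function toSet(lines) {
  var set = {};
  for (var i = 0; i < lines.length; i++) set["$" + lines[i]] = true;
  return set;
}

// Lines that appear in exactly one of the two files, each reported once.
function symmetricDiff(lines1, lines2) {
  var in1 = toSet(lines1), in2 = toSet(lines2);
  var reported = {}, out = [];
  function collect(lines, other) {
    for (var i = 0; i < lines.length; i++) {
      var key = "$" + lines[i];
      if (other[key] !== true && reported[key] !== true) {
        reported[key] = true;
        out.push(lines[i]);
      }
    }
  }
  collect(lines1, in2); // lines only in the first file
  collect(lines2, in1); // lines only in the second file
  return out;
}

var result = symmetricDiff(
  ["astring3", "astring3", "GV"],
  ["astring3", "astring3", "V", "V"]
); // ["GV", "V"]
```

With unique lines in both files this should give the same list as the toggle version, in the same order: leftovers from the first file, then leftovers from the second.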









