MSFN Forum: How to merge two text files? - MSFN Forum


How to merge two text files? and sort them...

#1 tomasz86

  • http://windows2000.tk
  • Group: Members
  • Posts: 2,241
  • Joined: 27-November 10
  • OS:XP Pro x86

Posted 08 June 2011 - 12:53 AM

There are two files:

1.txt
[SourceFileInfo]
clusapi.dll=B95AC82B54FE4359C3453264F848509A,0005000008931AA8,55568


2.txt
[SourceFileInfo]
clusnet.sys=A0610690266ED57A2D04EA5D7EC8084C,0005000008931AA8,67760


If I do "copy 1.txt+2.txt 3.txt" I get this:
[SourceFileInfo]
clusapi.dll=B95AC82B54FE4359C3453264F848509A,0005000008931AA8,55568
[SourceFileInfo]
clusnet.sys=A0610690266ED57A2D04EA5D7EC8084C,0005000008931AA8,67760


but I would like to get something similar to this:
[SourceFileInfo]
clusapi.dll=B95AC82B54FE4359C3453264F848509A,0005000008931AA8,55568
clusnet.sys=A0610690266ED57A2D04EA5D7EC8084C,0005000008931AA8,67760


Is it possible?


#2 GrofLuigi

  • GroupPolicy Tattoo Artist
  • Group: Members
  • Posts: 1,277
  • Joined: 21-April 05
  • OS:none specified

Posted 08 June 2011 - 01:02 AM

sort 3.txt /o 4.txt

One-liner (but needs fixing by someone more knowledgeable):

type 1.txt 2.txt 2>nul | sort /o 3.txt

GL

#3 dencorso

  • Adiuvat plus qui nihil obstat
  • Group: Super Moderator
  • Posts: 4,983
  • Joined: 07-April 07
  • OS:98SE

Posted 08 June 2011 - 01:30 AM

Beyond Compare is the way to go... it's not free, but it's worth the cost.

#4 allen2

  • Not really Newbie
  • Group: Members
  • Posts: 1,750
  • Joined: 13-January 06

Posted 08 June 2011 - 02:23 AM

Or using Unix tools:
grep -vi "\[SourceFileInfo\]" 1.txt  >>2.txt


The "\" is needed to escape the "[", which grep would otherwise treat as the start of a regular-expression character class.

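To see why the backslashes matter, here is a minimal Python sketch. Python is used only for illustration and is not part of the grep or batch tooling discussed in this thread. It applies the same pattern as a case-insensitive inverted match, roughly what grep -vi does, and then shows what the unescaped pattern would match instead. The names drop_header and UNESCAPED are made up for this example.

```python
import re

# Same pattern as in the grep command: the backslashes make "[" and "]" literal.
HEADER = re.compile(r"\[SourceFileInfo\]", re.IGNORECASE)

# Without the backslashes the brackets form a character class, which matches
# any line containing at least one of the letters S, o, u, r, c, e, ...
UNESCAPED = re.compile("[SourceFileInfo]", re.IGNORECASE)


def drop_header(lines):
    """Keep only the lines that do NOT match HEADER (like grep -v)."""
    return [line for line in lines if not HEADER.search(line)]


sample = [
    "[SourceFileInfo]",
    "clusapi.dll=B95AC82B54FE4359C3453264F848509A,0005000008931AA8,55568",
]

print(drop_header(sample))                  # only the clusapi.dll line is kept
print(bool(UNESCAPED.search(sample[1])))    # the unescaped form also hits data lines
```

With the unescaped pattern an inverted match would throw away the data lines as well, which is why the escaping is not optional.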
#5 tomasz86

  • http://windows2000.tk
  • Group: Members
  • Posts: 2,241
  • Joined: 27-November 10
  • OS:XP Pro x86

Posted 08 June 2011 - 03:50 AM

Thank you very much :) I'll check them for sure. I'm especially interested in this unix one as it could be used in a batch script.

By the way, could anyone help me with this script?

MD HFMER TEMP2
DIR/B/A-D/OGN/ON HF\*.EXE>HF.TXT
SET HF=
FOR /F %%I IN (HF.TXT) DO (SET HF=%%I&IF DEFINED HF CALL :HFEXTRACT)
DEL/Q/S HF.TXT

REM ======================TYPE 1 HOTFIXES=================================================
:HFEXTRACT
TITLE %T1% - Processing %HF%
ECHO %HF%
MD TEMP&START/WAIT HF\%HF% /Q /X:TEMP
XCOPY/DEHRY TEMP HFMER
MOVE TEMP\UPDATE\update.inf HFMER\UPDATE\%HF%.inf
MOVE TEMP\UPDATE\update_w2k.inf HFMER\UPDATE\%HF%.inf
MOVE TEMP\UPDATE\update.ver TEMP2\%HF%.ver
DEL/Q/S HFMER\UPDATE\update*.inf HFMER\UPDATE\update.ver
RD/Q/S TEMP
CLS
REM =====================================================================================

COPY/B TEMP2\*.ver HFMER\UPDATE\update.ver
SORT HFMER\UPDATE\update.ver /O HFMER\UPDATE\update.ver
RD/Q/S TEMP2


This is a mix of strings taken from HFSLIP and my own, but the problem is that I need the last three lines to run after the rest of the script has finished. It works correctly if I remove the last line, but then the folder TEMP2 is not deleted.

This post has been edited by tomasz86: 08 June 2011 - 04:23 AM


#6 tomasz86

  • http://windows2000.tk
  • Group: Members
  • Posts: 2,241
  • Joined: 27-November 10
  • OS:XP Pro x86

Posted 08 June 2011 - 06:51 AM

I've managed to overcome it:
IF EXIST HF\*.EXE (
	MD HFMER TEMP2
	DIR/B/A-D/OGN/ON HF\*.EXE>HF.TXT
	SET HF=
	FOR /F %%I IN (HF.TXT) DO (SET HF=%%I&IF DEFINED HF CALL :HFEXTRACT)
	DEL/Q/S HF.TXT
	CALL :HFCS
)
REM ======================TYPE 1 HOTFIXES=================================================
:HFEXTRACT
TITLE %T1% - Processing %HF%
ECHO %HF%
MD TEMP&START/WAIT HF\%HF% /Q /X:TEMP
XCOPY/DEHRY TEMP HFMER
MOVE TEMP\UPDATE\update.inf HFMER\UPDATE\%HF%.inf
MOVE TEMP\UPDATE\update_w2k.inf HFMER\UPDATE\%HF%.inf
MOVE TEMP\UPDATE\update.ver TEMP2\%HF%.ver
DEL/Q/S HFMER\UPDATE\update*.inf HFMER\UPDATE\update.ver
RD/Q/S TEMP
IF NOT EXIST HF.TXT (
	COPY/B TEMP2\*.ver HFMER\UPDATE\update.ver
	SORT HFMER\UPDATE\update.ver /O HFMER\UPDATE\update.ver
	RD/Q/S TEMP2
)
REM ======================================================================================


I have a question about this Unix tool called grep.

When I have only two files:

1.ver
[SourceFileInfo]
basesrv.dll=7F87C84D34813197A2360CEA800A7464,0005000008931B27,46352
cmd.exe=7705AED861C7FDBD919E771A1B42B5AA,0005000008931AA8,236304

2.ver
[SourceFileInfo]
clusapi.dll=B95AC82B54FE4359C3453264F848509A,0005000008931AA8,55568
clusnet.sys=A0610690266ED57A2D04EA5D7EC8084C,0005000008931AA8,67760


After running grep -vi "\[SourceFileInfo\]" 1.ver >>2.ver, the file 2.ver contains:
[SourceFileInfo]
clusapi.dll=B95AC82B54FE4359C3453264F848509A,0005000008931AA8,55568
clusnet.sys=A0610690266ED57A2D04EA5D7EC8084C,0005000008931AA8,67760
basesrv.dll=7F87C84D34813197A2360CEA800A7464,0005000008931B27,46352
cmd.exe=7705AED861C7FDBD919E771A1B42B5AA,0005000008931AA8,236304


Everything is OK. What about more than two files? Is it possible to use grep with multiple files?

Also another question (2 questions ;)): Can the list be sorted alphabetically? Is there a way to remove duplicates?

This post has been edited by tomasz86: 08 June 2011 - 08:09 AM


#7 allen2

  • Not really Newbie
  • Group: Members
  • Posts: 1,750
  • Joined: 13-January 06

Posted 08 June 2011 - 01:29 PM

Yes, grep can work on multiple files, and no, it can't sort or remove duplicates unless you do the scripting for it.
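As a rough idea of what "the scripting for it" could look like, here is a small Python sketch (an illustration only, not something posted in the thread). It takes the text of any number of files, drops every [SourceFileInfo] header, removes exact duplicate lines and sorts the rest. The function name merge_sections and the case-insensitive sort order are assumptions made for this example.

```python
def merge_sections(texts, header="[SourceFileInfo]"):
    """Merge the data lines of several files under a single header.

    texts is a list of strings, one per input file (already read from disk).
    Exact duplicate lines are dropped; the rest is sorted case-insensitively.
    """
    entries = set()
    for text in texts:
        for line in text.splitlines():
            line = line.strip()
            # Skip blank lines and every copy of the section header.
            if line and line.lower() != header.lower():
                entries.add(line)
    return [header] + sorted(entries, key=str.lower)


one = "[SourceFileInfo]\nclusnet.sys=A0610690266ED57A2D04EA5D7EC8084C,0005000008931AA8,67760\n"
two = "[SourceFileInfo]\nclusapi.dll=B95AC82B54FE4359C3453264F848509A,0005000008931AA8,55568\n"

for line in merge_sections([one, two, two]):
    print(line)
```

Only lines that are identical character for character count as duplicates here; two entries for the same file with different versions are both kept.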

#8 tomasz86

  • http://windows2000.tk
  • Group: Members
  • Posts: 2,241
  • Joined: 27-November 10
  • OS:XP Pro x86

Posted 09 June 2011 - 12:09 AM

How should I use this command
grep -vi "\[SourceFileInfo\]" 1.txt  >>2.txt

when there are multiple files?

I tried something like this:
grep -vi "\[SourceFileInfo\]" *.txt  >>3.txt

but it doesn't work. Also tried doing
copy 1.txt+2.txt 3.txt
grep -vi "\[SourceFileInfo\]" 3.txt  >>3.txt

but after doing this there is no longer [SourceFileInfo] in the final file.

This post has been edited by tomasz86: 09 June 2011 - 12:09 AM


#9 allen2

  • Not really Newbie
  • Group: Members
  • Posts: 1,750
  • Joined: 13-January 06

Posted 09 June 2011 - 12:32 AM

grep -vih "\[SourceFileInfo\]" 1.txt 2.txt >>3.txt

But then if you need sorting and finding duplicates to choose the most recent version of each file, you might want to use VBScript or AutoIt, as it would be a lot easier to script.
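A sketch of the "most recent version of each file" idea, written in Python purely for illustration (the thread itself suggests VBScript or AutoIt). It assumes each line has the form name=hash,version,size seen in the examples above, and that the second comma-separated field is a hexadecimal version number where a larger value means a newer file. That interpretation is an assumption, as are the function name newest_entries and the shortened hash values in the sample data.

```python
def newest_entries(lines):
    """Keep one line per file name: the one with the highest version field."""
    best = {}  # lower-cased file name -> (version as int, original line)
    for line in lines:
        name, _, rest = line.partition("=")
        fields = rest.split(",")
        version = int(fields[1], 16)  # assumed: hex number, bigger is newer
        key = name.strip().lower()
        if key not in best or version > best[key][0]:
            best[key] = (version, line)
    return sorted((line for _, line in best.values()), key=str.lower)


# Hypothetical sample data; the hashes are placeholders, not real values.
sample = [
    "cmd.exe=AAAA,0005000008931AA8,236304",
    "cmd.exe=BBBB,0005000008931B27,236816",
    "basesrv.dll=CCCC,0005000008931B27,46352",
]

for line in newest_entries(sample):
    print(line)
```

Lines that do not follow the assumed format (for example a section header) would make this sketch fail, so they need to be filtered out first.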

This post has been edited by allen2: 09 June 2011 - 12:36 AM


#10 tomasz86

  • http://windows2000.tk
  • Group: Members
  • Posts: 2,241
  • Joined: 27-November 10
  • OS:XP Pro x86

Posted 09 June 2011 - 12:37 AM

I see. Any way to do it for a larger number of files automatically, e.g. 50 text files in one directory?

#11 allen2

  • Not really Newbie
  • Group: Members
  • Posts: 1,750
  • Joined: 13-January 06

Posted 09 June 2011 - 12:48 AM

Yes, something like this should work (I added the sort part with the Unix tool "sort.exe"):
setlocal enabledelayedexpansion
set filenames=
rem Delayed expansion (!filenames!) is needed to append to the variable inside the loop.
for /f "delims= usebackq" %%i in (`dir /b %target%\*.inf`) do (set filenames=!filenames! %target%\%%i)
echo [SourceFileInfo]>result.txt
grep -vih "\[SourceFileInfo\]" %filenames% |sort -d >>result.txt

Due to batch variable limitations, %filenames% can't have more than 2047 characters, so it might not work with long filenames.

This post has been edited by allen2: 09 June 2011 - 12:58 AM


#12 tomasz86

  • http://windows2000.tk
  • Group: Members
  • Posts: 2,241
  • Joined: 27-November 10
  • OS:XP Pro x86

Posted 09 June 2011 - 01:05 AM

It's getting a little bit complicated for me as I'm just a beginner when it comes to scripting ;)

Anyway, I'll try to get the above script to work.

#13 jaclaz

  • The Finder
  • Group: Developers
  • Posts: 11,574
  • Joined: 23-July 04
  • OS:none specified

Posted 09 June 2011 - 01:15 AM

@allen2
You don't actually *need* grep or any other "third party" program for such a simple task.

You can either use FIND /V "[" to exclude the line(s) containing square brackets, or use a FOR /F with SKIP=1 (if that line is always the first one), or use FIND or FINDSTR to get only the lines containing commas.

This could do:
@ECHO OFF
IF EXIST result.txt DEL result.txt
FOR /F %%A IN ('dir /b *.txt') DO (
FOR /F "skip=1 tokens=*" %%B IN (%%A) DO (
ECHO %%A -- %%B
ECHO %%B >>lines.txt
)
)
ECHO [SourceFileInfo] >result.txt
SORT lines.txt >>result.txt
DEL lines.txt
ECHO.
MORE result.txt
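
The three filtering ideas named above (exclude lines with a square bracket, skip the first line, keep only lines with a comma) can be compared side by side. The following is a Python sketch for illustration only, not a replacement for the batch; the function names are made up for this example.

```python
lines = [
    "[SourceFileInfo]",
    "clusapi.dll=B95AC82B54FE4359C3453264F848509A,0005000008931AA8,55568",
    "clusnet.sys=A0610690266ED57A2D04EA5D7EC8084C,0005000008931AA8,67760",
]


def without_brackets(lines):
    """Like FIND /V "[": drop every line that contains a square bracket."""
    return [line for line in lines if "[" not in line]


def skip_first(lines):
    """Like FOR /F "skip=1": drop the first line, whatever it contains."""
    return lines[1:]


def only_with_commas(lines):
    """Like FIND ",": keep only lines that contain a comma."""
    return [line for line in lines if "," in line]


for keep in (without_brackets, skip_first, only_with_commas):
    print(keep.__name__, len(keep(lines)))
```

On the sample data all three give the same result. They differ once the header is not on the first line, or once a data line contains a bracket or lacks a comma, so the choice depends on how regular the input files are.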


jaclaz

#14 allen2

  • Not really Newbie
  • Group: Members
  • Posts: 1,750
  • Joined: 13-January 06

Posted 09 June 2011 - 01:43 AM

@jaclaz
I hate the DOS find because I always had a hard time getting it to do what I wanted; also, as I began scripting on Unix, I always think of an algorithm in Unix shell terms first and then adapt it to batch.
Anyway, I never said my script coding was the best or even good.
That said, I would do what tomasz86 is trying to do (extracting hotfixes, gathering the files with the highest version under one .inf to create a service pack) with AutoIt (for the duplicate file part), but a batch script might be easy for you.

#15 tomasz86

  • http://windows2000.tk
  • Group: Members
  • Posts: 2,241
  • Joined: 27-November 10
  • OS:XP Pro x86

Posted 09 June 2011 - 01:59 AM

Thank you very much :thumbup

I changed it like this:
@ECHO OFF
FOR /F %%A IN ('dir /b *.ver') DO (
FOR /F "skip=1 tokens=*" %%B IN (%%A) DO (
ECHO %%A -- %%B
ECHO %%B >>TEMP2\update.txt
)
)
ECHO [SourceFileInfo] >HFMER\UPDATE\update.ver
SORT TEMP2\update.txt >>HFMER\UPDATE\update.ver
DEL TEMP2\update.txt
ECHO.
MORE HFMER\UPDATE\update.ver

but I can't set the folder for '*.ver' files to be 'TEMP2\*.ver'. How should I edit this line?
FOR /F %%A IN ('dir /b *.ver') DO (

This post has been edited by tomasz86: 09 June 2011 - 01:59 AM


#16 allen2

  • Not really Newbie
  • Group: Members
  • Posts: 1,750
  • Joined: 13-January 06

Posted 09 June 2011 - 02:15 AM

@jaclaz
Also, I am not sure the "echo %%B >>lines.txt" line will work for every kind of character the line might contain: I am thinking about "%", ">" or "&" for example, but perhaps the .inf files won't ever contain any of them.

This post has been edited by allen2: 09 June 2011 - 02:16 AM


#17 jaclaz

  • The Finder
  • Group: Developers
  • Posts: 11,574
  • Joined: 23-July 04
  • OS:none specified

Posted 09 June 2011 - 05:56 AM

@allen2
I don't know either, I just provided what IMHO is the simplest solution that fulfills the OP requirements and works with the examples posted.
It is not meant in any way as a competition of the type "my batch is better than yours", it's simply a way to exchange ideas.

@tomasz86
Try with:
@ECHO OFF 
FOR /F %%A IN ('dir /b .\TEMP2\*.ver') DO ( 
FOR /F "skip=1 tokens=*" %%B IN (.\TEMP2\%%A) DO ( 
ECHO .\TEMP2\%%A -- %%B 
ECHO %%B >>.\TEMP2\update.txt 
) 
) 
ECHO [SourceFileInfo] >.\HFMER\UPDATE\update.ver 
SORT .\TEMP2\update.txt >>.\HFMER\UPDATE\update.ver 
DEL .\TEMP2\update.txt 
ECHO. 
MORE .\HFMER\UPDATE\update.ver


Or use %~dp0:
@ECHO OFF
FOR /F %%A IN ('dir /b %~dp0TEMP2\*.ver') DO (
FOR /F "skip=1 tokens=*" %%B IN (%~dp0TEMP2\%%A) DO (
ECHO %~dp0TEMP2\%%A -- %%B
ECHO %%B >>%~dp0TEMP2\update.txt
)
)
ECHO [SourceFileInfo] >%~dp0HFMER\UPDATE\update.ver
SORT %~dp0TEMP2\update.txt >>%~dp0HFMER\UPDATE\update.ver
DEL %~dp0TEMP2\update.txt
ECHO.
MORE %~dp0HFMER\UPDATE\update.ver


That is, if I have understood correctly what you need/want.

%0 is the actual command line invoked i.e. the name of the batch or parameter 0 (zero).
%~dp0 is the same variable expanded to only drive and path (with a trailing backslash)

jaclaz

#18 tomasz86

  • http://windows2000.tk
  • Group: Members
  • Posts: 2,241
  • Joined: 27-November 10
  • OS:XP Pro x86

Posted 09 June 2011 - 07:08 AM

It works :) Thank you.

FOR /F %%A IN ('dir /b .\TEMP2\*.ver') DO (
	FOR /F "skip=1 tokens=*" %%B IN (.\TEMP2\%%A) DO (
		ECHO .\TEMP2\%%A -- %%B
		ECHO %%B >>.\TEMP2\update.txt
	)
)
ECHO [SourceFileInfo] >.\HFMER\UPDATE\update.ver
SORT .\TEMP2\update.txt >>.\HFMER\UPDATE\update.ver
RD /Q/S TEMP2
ECHO.
MORE .\HFMER\UPDATE\update.ver


Generally (as allen2 already pointed out), what I want to do here is to make a script for merging hotfixes (updates) for Windows 2000/XP/2003. Its code name is HFMER (Hotfix Merger).

I asked about copying files and checking their file versions in another thread, but it seems to be too complicated to do now. Until I learn how to write such a script I'll stick to xcopy, as it's the same method HFSLIP uses when slipstreaming hotfixes. It may not be ideal, but cases where the older file has a newer version are quite rare, so it's not such a big problem.

Anyway, by using this script I'm able to do almost everything to prepare the folders, files and the update.ver file containing the information from all of the merged hotfixes. Of course there will be duplicates in it, but not that many, so I can just remove them manually. Update.ver is not used by HFSLIP anyway.

The main problem lies in combining update.inf files. What I'm thinking about now is this. It's just an example. Let's say I have two update.inf files. After doing

copy update1.inf+update2.inf update3.inf

I get

[Version]

    Signature                 = "$Windows NT$"
	
[SourceDisksFiles]

    ipsecmon.exe=1
	
[Version]

    Signature                 = "$Windows NT$"
	
[SourceDisksFiles]

    remotesp.tsp=1

It would really help me if I could get from it this:

[Version]

    Signature                 = "$Windows NT$"
    Signature                 = "$Windows NT$"
	
[SourceDisksFiles]

    ipsecmon.exe=1
    remotesp.tsp=1


I don't care about duplicates for now, but is there any program that would do such a sorting automatically?
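For what it's worth, the grouping described above (collect the lines of each [Section] from every file under one header) is short to express in a script. Below is a Python sketch, for illustration only and not an existing tool. The function name merge_inf is made up, and the sketch assumes that section names are spelled identically in every file, that sections should appear in the order they are first seen, and that duplicates are kept, as in the example above.

```python
def merge_inf(texts):
    """Group the lines of several INF-style texts by section header."""
    sections = {}  # header -> list of entries; dicts keep insertion order
    for text in texts:
        current = None  # lines before the first header of a file are ignored
        for raw in text.splitlines():
            line = raw.strip()
            if not line:
                continue
            if line.startswith("[") and line.endswith("]"):
                current = line
                sections.setdefault(current, [])
            elif current is not None:
                sections[current].append(line)
    out = []
    for header, entries in sections.items():
        out.append(header)
        out.extend("    " + entry for entry in entries)
        out.append("")  # blank line between sections
    return "\n".join(out)


update1 = '[Version]\n\n    Signature = "$Windows NT$"\n\n[SourceDisksFiles]\n\n    ipsecmon.exe=1\n'
update2 = '[Version]\n\n    Signature = "$Windows NT$"\n\n[SourceDisksFiles]\n\n    remotesp.tsp=1\n'

print(merge_inf([update1, update2]))
```

Comment lines are treated like any other entry here, and nothing checks whether the merged result is still a valid INF file, so the output would need reviewing by hand.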

This post has been edited by tomasz86: 09 June 2011 - 07:09 AM


#19 tomasz86

  • http://windows2000.tk
  • Group: Members
  • Posts: 2,241
  • Joined: 27-November 10
  • OS:XP Pro x86

Posted 09 June 2011 - 08:43 AM

By the way, I checked the Beyond Compare program recommended by dencorso but I found something else which, while quite similar to the above, is simpler and I find it more suitable for the task of merging the update.inf files (unless there is a tool that can do the task I mentioned in the previous post...).

http://kdiff3.sourceforge.net/


It's GPL. The other tool I already knew and have been using is WinMerge, but WinMerge can compare only 2 files at once while KDiff and Beyond Compare can compare 3 files. What I like the most about KDiff is its interface, which is sooooo simple that comparing and merging files is very easy and can be done quicker than with Beyond Compare, for example (and without the need to memorise tens of keyboard shortcuts ;)).

This post has been edited by tomasz86: 09 June 2011 - 08:48 AM


#20 Yzöwl

  • Wise Owl
  • Group: Super Moderator
  • Posts: 4,369
  • Joined: 13-October 04
  • OS:Windows 7 x64

Posted 09 June 2011 - 09:23 AM

What you require can be done in pure batch, and obviously with the use of third party utilities.

What happens if you have multiple files in which the content of one potentially overwrites the content of another? How do you decide which line takes precedence when your script orders them? Are you sure that alphabetical ordering will not mean overwriting something which should take precedence? If the file, for instance, is processed linearly, meaning line 20 will always be processed after line 19, what happens if line 21 only runs when line 19 is set to 0, but line 20, moved there by the alphabetical sort, changed that to 1 because line 22 onwards only worked if the data was set to 1? I know it's hard to explain, but that's why it is generally easier to allow each individual file to run in sequence.
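One way to keep the original sequence instead of sorting is to process the files in order and let a later file replace an earlier entry with the same name, without moving it. The Python sketch below is only an illustration of that idea. The "last file wins" rule and the function name merge_in_order are assumptions, since the thread does not settle which precedence rule is the right one.

```python
def merge_in_order(files):
    """Merge name=value lines from several files without sorting.

    files is a list of line lists, in the order the files should be applied.
    A later line with the same name replaces the earlier one but keeps the
    position where that name first appeared.
    """
    merged = {}  # lower-cased name -> line; dicts keep insertion order
    for lines in files:
        for line in lines:
            name, sep, _ = line.partition("=")
            if not sep:
                continue  # not a name=value line, e.g. a section header
            merged[name.strip().lower()] = line
    return list(merged.values())


first = ["[Section]", "alpha=0", "beta=1"]
second = ["[Section]", "beta=2", "gamma=1"]

print(merge_in_order([first, second]))
```

Swapping the order of the input files changes which value survives, which is exactly the decision described above that sorting alphabetically would hide.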

I could of course just provide a routine for you (but then I could simply post it as my own project and take all the credit for it myself). It is, after all, close to what HFSLIP should have been, had it not become bloated with pointless additions and spoiled by poor/unmanageable scripting.


All trademarks mentioned on this page are the property of their respective owners
Copyright © 2001 - 2013 msfn.org
Privacy Policy