How to merge two text files?

237 replies to this topic

#201
jaclaz

jaclaz

    The Finder

  • Developer
  • 14,411 posts
  • Joined 23-July 04
  • OS:none specified
  • Country: Country Flag

I was wrong. It still doesn't ECHO lines starting with ";" when EOL isn't defined :(

Yes, the semicolon is the default EOL of the FOR, you can do something like:
FOR /F "eol=§ tokens=1,2 delims=, " %%a in ('TYPE 1.inf') do echo %%a,%%b
The § character is in "Windows", in command line it will "look" different, it's ALT+0167, but you can use any other "extended ASCII" character that won't be normally used in a .inf.

jaclaz
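
The default eol behaviour described above can be reproduced with a two-line test file. This is only a minimal sketch: the scratch file name eol_demo.txt is made up for the demonstration, and the pipe is used as the replacement eol character on the assumption that no real data line starts with it.

```batch
@ECHO OFF
REM Hypothetical scratch file used only for this demonstration.
SET "Demo=%TEMP%\eol_demo.txt"
> "%Demo%" ECHO abc,1
>> "%Demo%" ECHO ;def,2

ECHO With the default eol the line starting with a semicolon is skipped:
FOR /F "usebackq tokens=1,2 delims=," %%a IN ("%Demo%") DO ECHO %%a,%%b

ECHO With eol set to a pipe the same line is kept:
FOR /F "usebackq eol=| tokens=1,2 delims=," %%a IN ("%Demo%") DO ECHO %%a,%%b

DEL "%Demo%"
```

A plain ASCII character is used here instead of an extended one so that the result should not depend on the system locale; any character that never starts a line in the input would do.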



#202
tomasz86

tomasz86

    www.windows2000.tk

  • Member
  • PipPipPipPipPipPipPipPip
  • 2,520 posts
  • Joined 27-November 10
  • OS:XP Pro x86
  • Country: Country Flag
You seem to like "§" a lot :D but do you remember that it was also present in your original SPLITINF script... and the problem was that it didn't work when system locale was set to Korean :( and in the end I had to replace it with something else which was "}#}" so I'd like to avoid using such characters.
Unofficial Service Pack 5.2 for MS Windows 2000 <- use this topic if you need help with UURollup, Update Rollup 2 and other unofficial packages

#203
jaclaz

You seem to like "§" a lot :D but do you remember that it was also present in your original SPLITINF script... and the problem was that it didn't work when system locale was set to Korean :( and in the end I had to replace it with something else which was "}#}" so I'd like to avoid using such characters.

Naah, it's only because it is easily available on my Italian keyboard; as said, it is up to you to find a character that is not used in "Korean" (or "other rare language" files).

BTW it is possible (but you will have to experiment) that you can use one of the "lower" ASCII Characters, in the range 0 to 31 decimal, really cannot say. :unsure:

jaclaz

#204
tomasz86

I wanted to use both double quotes and spaces as delimiters but didn't know how to do it, so I searched the Internet for a solution but couldn't find anything useful. Finally I managed to figure it out by myself, and I'm posting this in case someone else encounters a similar problem:

FOR /F delims^=^"^  %%A IN ("abc = "1"") DO ECHO "%%A"
It will display just

"abc"
Notice that there are two spaces before %%A.

#205
Yzöwl

Yzöwl

    Wise Owl

  • Super Moderator
  • 4,534 posts
  • Joined 13-October 04
  • OS:Windows 7 x64
  • Country: Country Flag

Donator

The example you have provided shows us only that the space has been delimited because it has split the string before it has seen your double quote and output only the first token.

See how these work:
@ECHO OFF & SETLOCAL ENABLEEXTENSIONS
ECHO/DELIMITED DOUBLE QUOTES ONLY
FOR /F TOKENS^=1-3^ DELIMS^=^" %%A IN ("THIS IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A] [%%B] [%%C]
FOR /F USEBACKQ^ TOKENS^=1-3^ DELIMS^=^" %%A IN ('THIS IS"ENCLOSED"IN DOUBLE QUOTES') DO ECHO/[%%A] [%%B] [%%C]
ECHO/&ECHO/DELIMITED DOUBLE QUOTES AND SPACES
FOR /F TOKENS^=1-6^ DELIMS^=^"^  %%A IN ("THIS IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A] [%%B] [%%C] [%%D] [%%E] [%%F]
FOR /F USEBACKQ^ TOKENS^=1-6^ DELIMS^=^"^  %%A IN ('THIS IS"ENCLOSED"IN DOUBLE QUOTES') DO ECHO/[%%A] [%%B] [%%C] [%%D] [%%E] [%%F]
PAUSE & GOTO :EOF


#206
bphlpt

bphlpt

    MSFN Addict

  • Member
  • PipPipPipPipPipPipPip
  • 1,798 posts
  • Joined 12-May 07
  • OS:none specified
  • Country: Country Flag
Actually, if all tomasz86 is interested in is the first token in the string, then his code does seem to work, as shown by running this (NOTE two spaces before "%%A"):

@ECHO OFF & SETLOCAL ENABLEEXTENSIONS
ECHO/TOMASZ86 METHOD FOR FIRST TOKEN ONLY
FOR /F delims^=^"^  %%A IN ("THIS IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A]
FOR /F delims^=^"^  %%A IN ("THIS"TEST" IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A]
ECHO/&ECHO/DELIMITED DOUBLE QUOTES ONLY
FOR /F TOKENS^=1-3^ DELIMS^=^" %%A IN ("THIS IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A] [%%B] [%%C]
FOR /F USEBACKQ^ TOKENS^=1-3^ DELIMS^=^" %%A IN ('THIS IS"ENCLOSED"IN DOUBLE QUOTES') DO ECHO/[%%A] [%%B] [%%C]
ECHO/&ECHO/DELIMITED DOUBLE QUOTES AND SPACES
FOR /F TOKENS^=1-6^ DELIMS^=^"^  %%A IN ("THIS IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A] [%%B] [%%C] [%%D] [%%E] [%%F]
FOR /F USEBACKQ^ TOKENS^=1-6^ DELIMS^=^"^  %%A IN ('THIS IS"ENCLOSED"IN DOUBLE QUOTES') DO ECHO/[%%A] [%%B] [%%C] [%%D] [%%E] [%%F]
PAUSE & GOTO :EOF

The output for both lines of the code added to show his method is -- [THIS]

If you want more flexibility over getting more than just the first token, then your code is much more complete and accurate, Yzöwl, as always.

EDIT: If, however, he was trying to get the same result for VAR=VAL vs VAR = VAL vs VAR="VAL" vs VAR = "VAL", then this code would work:

@ECHO OFF & SETLOCAL ENABLEEXTENSIONS
ECHO GET SAME RESULT FOR VAR=VAL vs VAR = VAL vs VAR="VAL" vs VAR = "VAL"
FOR /F TOKENS^=1-2^ DELIMS^=^=^"^  %%A IN ("abc=1") DO ECHO/[%%A] [%%B]
FOR /F TOKENS^=1-2^ DELIMS^=^=^"^  %%A IN ("abc = 1") DO ECHO/[%%A] [%%B]
FOR /F TOKENS^=1-2^ DELIMS^=^=^"^  %%A IN ("abc="1"") DO ECHO/[%%A] [%%B]
FOR /F TOKENS^=1-2^ DELIMS^=^=^"^  %%A IN ("abc = "1"") DO ECHO/[%%A] [%%B]
PAUSE & GOTO :EOF

All four lines will echo

[abc] [1]

So it all depends on what he really wants to do.

Cheers and Regards

Edited by bphlpt, 11 April 2013 - 05:54 PM.


#207
tomasz86

Hmm, actually the strings go like this:

1.txt
abc = 1
abc= 1
abc =1
abc = "1 2"
abc= "1 2"
abc ="1 2"
etc.
They're not consistent, so sometimes there are quotes and in other cases there aren't. The same goes for spaces.

I'm using this script:

FOR /F tokens^=1*^ delims^=^"^=^  %%A IN ('TYPE "1.txt"') DO (
	FOR /F delims^=^" %%C IN ("%%B") DO ECHO %%A="%%C"
)
and the output is:

abc="1"
abc="1"
abc="1"
abc="1 2"
abc="1 2"
abc="1 2"


#208
jaclaz

Well, the output would be the same with this:
FOR /F "tokens=1,2 delims==" %%A IN ('TYPE "1.txt"') DO CALL :strip_readd_quotes %%A %%B
GOTO :EOF

:strip_readd_quotes
ECHO %1="%~2"
GOTO :EOF

jaclaz

Edited by jaclaz, 12 April 2013 - 05:19 AM.
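
The quote handling in the snippet above rests on the %~2 argument modifier, which removes one pair of surrounding double quotes from the second argument if they are present. A minimal sketch, where the label :show and the sample arguments are made up for illustration:

```batch
@ECHO OFF
REM The same subroutine is called once with a quoted and once with an unquoted value.
CALL :show abc "1 2"
CALL :show abc 1
GOTO :EOF

:show
REM %1 is passed through as is; %~2 is the second argument without its surrounding quotes.
ECHO %1="%~2"
GOTO :EOF
```

Both calls end up with the value wrapped in exactly one pair of quotes, which is why the quoted and unquoted input lines produce the same style of output.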


#209
tomasz86

Thanks :thumbup

Yours is actually better because it works even for something like this:

MainCancelIntroString   = "Thank you for reporting the Request. When you click ""Send Report"" button, data concerning why install failed will be sent to Microsoft"
while mine doesn't.

#210
jaclaz

Thanks :thumbup

Yours is actually better because it works even for something like this:

Yes and no.
If you have more than two tokens NOT quoted "on the right side" of the equal sign my code snippet won't work.
This may:
SETLOCAL ENABLEDELAYEDEXPANSION
FOR /F "tokens=1,* delims==" %%A IN ('TYPE "1.txt"') DO (
CALL :strip_spaces %%A
CALL :strip_readd_quotes %%B
ECHO !var!=!val!
)
GOTO :EOF

:strip_spaces
set var=%*
GOTO :EOF

:strip_readd_quotes
set val="%*"
set val=%val:""="%
IF "[]"=="[%*]" set val=
GOTO :EOF

but it won't work with the example you just posted with double-double quotes inside double quotes. :ph34r:

jaclaz

#211
tomasz86

Does anyone know what the problem may be with:

MORE /T0 1.txt>2.txt
? This removes all TABs from a file and seems to work fine when processing small files. However, when I try to do this with larger files then the command suddenly stops at 1.65MB (the size differs but the point is that it just won't go further). The last line of 2.txt is something like:

-- More (86%) --
What's the case here? :blink:

Edited by tomasz86, 13 April 2013 - 03:34 PM.


#212
bphlpt

I've never run into this, but could it be the amount of memory available? It's just a guess, but it sounds like it is trying to load and process the entire thing in memory.

Cheers and Regards


#213
allen2

allen2

    Not really Newbie

  • Member
  • PipPipPipPipPipPipPip
  • 1,812 posts
  • Joined 13-January 06
"more" is used to show only portion of file to screen at the size of dos box so it is normal that it works like this. Perhaps you could trick it by changing the size of the dos box before running "more".
If you need to remove tabs, a tool like the Unix sed would be useful.

#214
tomasz86

@allen2 MORE seems to work like TYPE when you redirect its output to a file.

MORE 1.txt>2.txt
TYPE 1.txt>2.txt
I don't see any difference in the output except for the case mentioned above when larger files are processed and it gets stuck at some moment...

Edited by tomasz86, 14 April 2013 - 03:59 AM.
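
One way to check byte for byte whether the two commands really produce the same output is a binary comparison with FC /B. A minimal sketch, assuming 1.txt exists in the current directory; 3.txt is a hypothetical second output name added only for the comparison:

```batch
@ECHO OFF
MORE 1.txt > 2.txt
TYPE 1.txt > 3.txt

REM FC /B compares the two files byte by byte and lists the offsets that differ.
FC /B 2.txt 3.txt

REM FC returns 0 when the files are identical and a higher value otherwise.
IF ERRORLEVEL 1 (ECHO The outputs differ or a file is missing.) ELSE (ECHO The outputs are identical.)
```

Any difference reported here, for example in how TABs or the end of the file are written, would show that MORE is not a drop-in replacement for TYPE on that input.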


#215
jaclaz

Usual reality check :w00t: :
  • WHY do you want to use MORE to remove TABs (and not gsar, that you have available or another third party tool) ?




jaclaz

#216
tomasz86

I find gsar quite problematic. There are characters like ":" or "\" which have to be replaced with something else in order to use them with gsar's "-r" option. It's also limited to 256 characters. At the moment I've changed the script so that everything which was done by gsar can be done with pure batch using the SET command, and there's even no significant difference in speed (it has actually become faster by a few seconds). I can remove TABs using SET too. But that's not the point. I discovered the "MORE /T" switch by accident and played with it for a while until I encountered the problem described above. I was just wondering why MORE suddenly stops when processing larger files.

By the way, I think I've managed to get the Strings sorted:

FOR /F "tokens=1* delims== " %%A IN (1.txt) DO (
	IF "%%B"=="" (
		SET Line=%%A=""
	) ELSE (
		FOR /F tokens^=1*^ delims^=^" %%C IN ("%%A="%%B"") DO (
			SET Line=%%C"%%D
			IF !Line:~-2!==^"^" SET Line=!Line:~0,-1!
		)
	)
	ECHO !Line!
)
This seems to work for all kinds of strings, including these:

TZROOT=SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones
HelpLink = "http://support.microsoft.com{##}kbid=2829069"
MainCancelIntroString   = "Thank you for reporting the Request. When you click ""Send Report"" button, data concerning why install failed will be sent to Microsoft"
PowerShell_ReleaseNotesDir=

the result being:

TZROOT="SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones"
HelpLink="http://support.microsoft.com{##}kbid=2829069"
MainCancelIntroString="Thank you for reporting the Request. When you click ""Send Report"" button, data concerning why install failed will be sent to Microsoft"
PowerShell_ReleaseNotesDir=""

Edited by tomasz86, 14 April 2013 - 08:18 AM.


#217
jaclaz

I find gsar quite problematic. There are characters like ":" or "\" which have to be replaced with something else in order to use them with gsar's "-r" option. It's also limited to 256 characters.

I don't get it. :unsure:
Use a dec or hex character number, instead of the textual representation of it.

I initially suggested gsar only because it's one of the tools I use commonly (and it works on binary files, something I do a lot); you may want to find an alternative to it dedicated only to "text" files.

Besides the name :ph34r: this one doesn't seem bad:
http://sourceforge.n...ojects/fart-it/
http://fart-it.sourceforge.net/
or this one:
http://findandreplace.codeplex.com/

Most probably the behaviour of MORE is a glitch in the matrix, I don't think that many people ever used MORE for anything bigger than a few Kbytes. :unsure:

jaclaz

#218
tomasz86

You're probably right about MORE. I'll leave it for now.

As for gsar, I just wanted to say that with gsar you need to take into account more special characters than with pure batch. In your SPLITINF script gsar is used to replace characters like "?" or "&", etc. Is it really a problem to just use batch like this instead of gsar?

FOR /F delims^=^ eol^= %%A IN (1.txt) DO (
  SET Line=%%A
  SET Line=!Line:	=!
  SET Line=!Line:%%=%%%%!
  SET Line=!Line:^&={#}!
  SET Line=!Line:^?={##}!
  SET Line=!Line:^<={###}!
  SET Line=!Line:^>={####}!
  SET Line=!Line:^^!={#####}!
  SET Line=!Line:^|={######}!
  ECHO !Line!
)

Edited by tomasz86, 15 April 2013 - 02:41 AM.
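
The placeholder replacement used in the loop above can be tried on a single string first. This is a minimal sketch with a made-up sample value; it encodes "&" and "?" with the same placeholders as the script and then decodes them again, which is the step that would be needed to restore the original text later:

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
REM Hypothetical sample value containing two of the problematic characters.
SET "Line=a&b?c"

REM Encode: replace each special character with its placeholder.
SET "Line=!Line:&={#}!"
SET "Line=!Line:?={##}!"
ECHO Encoded: !Line!

REM Decode in the reverse order, longest placeholder first.
SET "Line=!Line:{##}=?!"
SET "Line=!Line:{#}=&!"
ECHO Decoded: !Line!
```

Quoting the whole SET assignment keeps the special characters from being interpreted by the parser, and delayed expansion lets ECHO print a value that contains "&" without splitting the command.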


#219
dencorso

dencorso

    Adiuvat plus qui nihil obstat

  • Supervisor
  • 5,869 posts
  • Joined 07-April 07
  • OS:98SE
  • Country: Country Flag

Donator

Why tiptoe around sed for so long? :unsure:
Granted, it converts DOS ASCII to unix ASCII. Then one pipes it through unix2dos -D, and lo!, it's DOS ASCII all right again.
Both exist in cygwin, requiring just the inevitable cygwin1.dll (and, perhaps, one or two more .dlls), since way back.
And I bet there's a good Mingw standalone implementation too...


#220
jaclaz

Is it really a problem to just use batch like this instead of gsar?

Not at all :).

As a matter of fact it makes sense since you are parsing the files line by line.

@dencorso
See the above, it is just a matter of "philosophy", either processing the file(s) as a whole or parsing them line by line.

And anything needing cygwin1.dll is philosophically "wrong". :ph34r:

And anything provided through their installer is a crazy, senseless, mass of bloat :( , compare:
http://reboot.pro/to...uest-for-ddexe/

jaclaz

#221
tomasz86

I'm now trying to solve a different problem...

I want to retrieve the original filename from a cabbed file. Let's say that the file is called "abc.dl_" and the real filename is "a b c.dll".

If you run

cabarc l abc.dl_
you get

Microsoft (R) Cabinet Tool - Version 5.2.3790.0
Copyright (c) Microsoft Corporation. All rights reserved..

Listing of cabinet file 'abc.dl_' (size 18989):
   1 file(s), 1 folder(s), set ID 0, cabinet #0

File name                      File size     Date      Time   Attrs
-----------------------------  ---------- ---------- -------- -----
   a b c.dll                 36352 2013/03/30 12:13:14  -a--

I've come up with this script:

@ECHO OFF

SETLOCAL ENABLEDELAYEDEXPANSION

SET tokens1=1
:loop1
FOR /F "skip=9 tokens=%tokens1%" %%A IN ('cabarc l abc.dl_') DO (
	SET/A tokens1+=1
	GOTO :loop1
)
SET/A tokens1-=5

SET tokens2=1
:loop2
FOR /F "skip=9 tokens=%tokens2%-%tokens1%" %%A IN ('cabarc l abc.dl_') DO (
	IF DEFINED File (
		SET File=!File! %%A
	) ELSE (
		SET File=%%A
	)
	SET/A tokens2+=1
	GOTO :loop2
)

ECHO "!File!"

PAUSE
which does work, the result being

"a b c.dll"
but I'm just wondering if there's any simpler way to do it instead of using two such loops. My method is also far from perfect because it won't work if the real filename has more than one consecutive space, e.g.

"a     b c.dll"

Edited by tomasz86, 15 April 2013 - 09:35 AM.


#222
jaclaz

but I'm just wondering if there's any simpler way to do it instead of using such two loops. My method is also far from perfect because it won't work if the real filename has more than one space in between, ex.

"a     b c.dll"

Do files with space in names exist in CAB files? :unsure:
Anyway, see if this fits:
@ECHO off
SETLOCAL ENABLEDELAYEDEXPANSION
FOR /F "tokens=*" %%A IN ('cabarc L test.cab ^| FIND "/"') do (
SET Line=%%A
SET Line=!Line:~0,28!
CALL :rem_trail_spaces !Line!
ECHO [!Line!]
)

GOTO :EOF

:rem_trail_spaces
SET Line=%*
GOTO :EOF


jaclaz
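
The ~0,28 part of the script above is ordinary substring expansion: it keeps a fixed number of characters from the start of the value. A minimal sketch of the offset syntax, using a made-up sample value so that the positions are easy to read:

```batch
@ECHO OFF
SETLOCAL ENABLEDELAYEDEXPANSION
REM Hypothetical sample value, used only to make the offsets visible.
SET "Line=0123456789ABCDEF"

REM ~0,4 keeps 4 characters starting at offset 0.
ECHO [!Line:~0,4!]

REM ~-3 keeps the last 3 characters.
ECHO [!Line:~-3!]

REM ~0,-3 keeps everything except the last 3 characters.
ECHO [!Line:~0,-3!]
```

A cut that is counted from the start, such as ~0,28, truncates any name longer than that width. A cut counted from the end, such as ~0,-N, might cope better with long names, but only if the trailing size, date, time and attribute columns have a fixed width, which is an assumption about the cabarc listing that would need checking.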

#223
dencorso

See the above, it is just a matter of "philosophy", either processing the file(s) as a whole or parsing them line by line.

And anything needing cygwin1.dll is philosophically "wrong". :ph34r:

And anything provided through their installer is a crazy, senseless, mass of bloat :( , compare:
http://reboot.pro/to...uest-for-ddexe/


Yes, I agree with your points there.
But I also need sed, so I did compromise. :blushing:

Now, our friend submix8c (thanks, man... you do rock! :thumbup ) found a pearl he pointed me to on another thread... the link he gave works no more, but good old Wayback Machine is always there for the rescue (at least for the time being...): UnxUtils 2001 version (real standalone)!!! :thumbup
Grab the sed from it and do give it a try (it's a 45 kiB PE file!)... you'll fall in love. :wub:

What's *really* limiting with sed is that it's ASCII, period... so if you need UNICODE, then that's not an option. But life is like that, anyway... :}

#224
Yzöwl

Have you tried just using 'expand'?
@ECHO OFF & SETLOCAL ENABLEEXTENSIONS
(SET TESTFILE=D:\My Files\abc.dl_)
FOR /F "TOKENS=*" %%# IN ('EXPAND -D "%TESTFILE%"^|FIND /I "%TESTFILE%"') DO (
	SET _=%%#)
ECHO/%_:*: =%
PING -n 4 127.0.0.1 1>NUL
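
The last step of the script above, %_:*: =%, is a substitution whose search string starts with an asterisk, so it removes everything up to and including the first colon followed by a space. A minimal sketch on a fixed string; the sample line only imitates what the listing might look like and is an assumption, not captured EXPAND output:

```batch
@ECHO OFF
REM Hypothetical sample line standing in for one line of the listing.
SET "_=d:\my files\abc.dl_: a b c.dll"

REM The leading * matches everything up to and including the first colon plus space.
ECHO/%_:*: =%
```

The colon after the drive letter is not matched because it is followed by a backslash rather than a space, so only the part after the archive name is left.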


#225
tomasz86

@Yzöwl Expand.exe works but is extremely slow compared to "cabarc L".


@jaclaz Why "~0,28"? It won't work for longer filenames, will it?

Do files with space in names exist in CAB files? :unsure:

This is a good question :P Probably not but I just want to go safe.

Edited by tomasz86, 17 April 2013 - 02:34 AM.




