tomasz86

How to merge two text files?

238 posts in this topic

I was wrong. It still doesn't ECHO lines starting with ";" when EOL isn't defined :(

Yes, the semicolon is the default EOL character of FOR /F. You can do something like:

FOR /F "eol=§ tokens=1,2 delims=, " %%a in ('TYPE 1.inf') do echo %%a,%%b

The § character is typed as ALT+0167 in Windows; in the command-line window it will "look" different, because the console uses a different code page. You can use any other "extended ASCII" character that would not normally appear in a .inf.

jaclaz


You seem to like "§" a lot :D but do you remember that it was also present in your original SPLITINF script... and the problem was that it didn't work when system locale was set to Korean :( and in the end I had to replace it with something else which was "}#}" so I'd like to avoid using such characters.


Naah, it's only because it is easily available on my Italian keyboard. As I said, it is up to you to find a character that is not used in "Korean" (or "other rare language") files.

BTW it is possible (but you will have to experiment) that you can use one of the "lower" ASCII Characters, in the range 0 to 31 decimal, really cannot say. :unsure:

jaclaz
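
One way to sidestep the search for a locale-safe character, offered as a sketch to verify on your own system rather than a guarantee: set eol to a character that is also listed in delims. As far as I know, FOR /F strips leading delimiters before it checks for the eol character, so a line can then never begin with it. The sample file name below is made up for the demonstration.

```batch
@ECHO OFF
SETLOCAL ENABLEEXTENSIONS
REM Hypothetical sample file, created only for this demonstration.
SET "DEMO=%TEMP%\eol_demo.inf"
> "%DEMO%" ECHO ;first,second
>> "%DEMO%" ECHO third,fourth
REM eol is set to the comma, which is also one of the delimiters.
FOR /F "usebackq eol=, tokens=1,2 delims=, " %%a IN ("%DEMO%") DO ECHO %%a,%%b
DEL "%DEMO%"
```

If the trick behaves as described, both lines are echoed, including the one that starts with ";", and no extended character is needed.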


I wanted to use both double quotes and spaces as delimiters but didn't know how. I searched the Internet for a solution but couldn't find anything useful. In the end I figured it out myself, and I'm posting it here in case someone else runs into a similar problem:

FOR /F delims^=^"^  %%A IN ("abc = "1"") DO ECHO "%%A"

It will display just

"abc"

Notice that there are two spaces before %%A.
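
A short sketch of why the carets and the two spaces matter, using only the FOR /F syntax already shown in this thread. Without enclosing quotes, every space, comma, equals sign and double quote inside the option string has to be escaped with ^ so that cmd.exe sees the options as a single argument. The first of the two spaces is therefore an escaped delimiter, and the second one ends the option string.

```batch
@ECHO OFF
REM Quoted option string: "=" is the only delimiter.
FOR /F "tokens=1,2 delims==" %%A IN ("abc=1") DO ECHO [%%A] [%%B]
REM Same options without quotes: each special character is escaped with ^.
FOR /F tokens^=1^,2^ delims^=^= %%A IN ("abc=1") DO ECHO [%%A] [%%B]
```

Both lines should echo [abc] [1]. The unquoted form is only needed when a double quote has to be one of the delimiters.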


Your example only shows that the space is acting as a delimiter: the string is split at the first space, before the parser ever reaches your double quote, and only the first token is output.

See how these work:


@ECHO OFF & SETLOCAL ENABLEEXTENSIONS
ECHO/DELIMITED DOUBLE QUOTES ONLY
FOR /F TOKENS^=1-3^ DELIMS^=^" %%A IN ("THIS IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A] [%%B] [%%C]
FOR /F USEBACKQ^ TOKENS^=1-3^ DELIMS^=^" %%A IN ('THIS IS"ENCLOSED"IN DOUBLE QUOTES') DO ECHO/[%%A] [%%B] [%%C]
ECHO/&ECHO/DELIMITED DOUBLE QUOTES AND SPACES
FOR /F TOKENS^=1-6^ DELIMS^=^"^  %%A IN ("THIS IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A] [%%B] [%%C] [%%D] [%%E] [%%F]
FOR /F USEBACKQ^ TOKENS^=1-6^ DELIMS^=^"^  %%A IN ('THIS IS"ENCLOSED"IN DOUBLE QUOTES') DO ECHO/[%%A] [%%B] [%%C] [%%D] [%%E] [%%F]
PAUSE & GOTO :EOF


Actually, if all tomasz86 is interested in is the first token in the string, then his code does seem to work, as shown by running this (NOTE two spaces before "%%A"):

@ECHO OFF & SETLOCAL ENABLEEXTENSIONS
ECHO/TOMASZ86 METHOD FOR FIRST TOKEN ONLY
FOR /F delims^=^"^  %%A IN ("THIS IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A]
FOR /F delims^=^"^  %%A IN ("THIS"TEST" IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A]
ECHO/&ECHO/DELIMITED DOUBLE QUOTES ONLY
FOR /F TOKENS^=1-3^ DELIMS^=^" %%A IN ("THIS IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A] [%%B] [%%C]
FOR /F USEBACKQ^ TOKENS^=1-3^ DELIMS^=^" %%A IN ('THIS IS"ENCLOSED"IN DOUBLE QUOTES') DO ECHO/[%%A] [%%B] [%%C]
ECHO/&ECHO/DELIMITED DOUBLE QUOTES AND SPACES
FOR /F TOKENS^=1-6^ DELIMS^=^"^  %%A IN ("THIS IS"ENCLOSED"IN DOUBLE QUOTES") DO ECHO/[%%A] [%%B] [%%C] [%%D] [%%E] [%%F]
FOR /F USEBACKQ^ TOKENS^=1-6^ DELIMS^=^"^  %%A IN ('THIS IS"ENCLOSED"IN DOUBLE QUOTES') DO ECHO/[%%A] [%%B] [%%C] [%%D] [%%E] [%%F]
PAUSE & GOTO :EOF

The output for both lines of the code added to show his method is -- [THIS]

If you want more flexibility over getting more than just the first token, then your code is much more complete and accurate, Yzöwl, as always.

EDIT: If, however, he was trying to get the same result for VAR=VAL vs VAR = VAL vs VAR="VAL" vs VAR = "VAL", then this code would work:

@ECHO OFF & SETLOCAL ENABLEEXTENSIONS
ECHO GET SAME RESULT FOR VAR=VAL vs VAR = VAL vs VAR="VAL" vs VAR = "VAL"
FOR /F TOKENS^=1-2^ DELIMS^=^=^"^  %%A IN ("abc=1") DO ECHO/[%%A] [%%B]
FOR /F TOKENS^=1-2^ DELIMS^=^=^"^  %%A IN ("abc = 1") DO ECHO/[%%A] [%%B]
FOR /F TOKENS^=1-2^ DELIMS^=^=^"^  %%A IN ("abc="1"") DO ECHO/[%%A] [%%B]
FOR /F TOKENS^=1-2^ DELIMS^=^=^"^  %%A IN ("abc = "1"") DO ECHO/[%%A] [%%B]
PAUSE & GOTO :EOF

All four lines will echo

[abc] [1]

So it all depends on what he really wants to do.

Cheers and Regards

Edited by bphlpt

Hmm, actually the strings go like this:

1.txt


abc = 1
abc= 1
abc =1
abc = "1 2"
abc= "1 2"
abc ="1 2"
etc.

They're not consistent: sometimes there are quotes, in other cases there aren't. The same goes for spaces.

I'm using this script:

FOR /F tokens^=1*^ delims^=^"^=^  %%A IN ('TYPE "1.txt"') DO (
FOR /F delims^=^" %%C IN ("%%B") DO ECHO %%A="%%C"
)

and the output is:


abc="1"
abc="1"
abc="1"
abc="1 2"
abc="1 2"
abc="1 2"


Well, the output would be the same with this:

FOR /F "tokens=1,2 delims==" %%A IN ('TYPE "1.txt"') DO CALL :strip_readd_quotes %%A %%B
GOTO :EOF

:strip_readd_quotes
ECHO %1="%~2"
GOTO :EOF

jaclaz

Edited by jaclaz
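
The snippet above relies on two things, shown here in isolation as a minimal sketch (the :show label is a made-up name). CALL re-splits its arguments on spaces and treats a quoted string as one argument, and %~2 expands the second argument with any surrounding quotes removed.

```batch
@ECHO OFF
REM The same label is called with a quoted and an unquoted value.
CALL :show abc "1 2"
CALL :show abc 1
GOTO :EOF

:show
REM %~2 drops any surrounding quotes, so they can be re-added uniformly.
ECHO %1="%~2"
GOTO :EOF
```

Both calls should print the value wrapped in exactly one pair of quotes, abc="1 2" and abc="1".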

Thanks :thumbup

Yours is actually better because it works even for something like this:

MainCancelIntroString   = "Thank you for reporting the Request. When you click ""Send Report"" button, data concerning why install failed will be sent to Microsoft"

while mine doesn't.


Yes and no.

If you have more than two tokens NOT quoted "on the right side" of the equal sign my code snippet won't work.

This may:

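REM !var! and !val! below are only expanded when delayed expansion is enabled.
SETLOCAL ENABLEDELAYEDEXPANSION
FOR /F "tokens=1,* delims==" %%A IN ('TYPE "1.txt"') DO (
CALL :strip_spaces %%A
CALL :strip_readd_quotes %%B
ECHO !var!=!val!
)
REM Stop here so execution does not fall through into the subroutines.
GOTO :EOF
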
:strip_spaces
set var=%*
GOTO :EOF

:strip_readd_quotes
set val="%*"
set val=%val:""="%
IF "[]"=="[%*]" set val=
GOTO :EOF

but it won't work with the example you just posted with double-double quotes inside double quotes. :ph34r:

jaclaz


Does anyone know what the problem may be with:

MORE /T0 1.txt>2.txt

? This removes all TABs from a file and seems to work fine when processing small files. However, when I try to do this with larger files then the command suddenly stops at 1.65MB (the size differs but the point is that it just won't go further). The last line of 2.txt is something like:

-- More (86%) -- 

What's the case here? :blink:

Edited by tomasz86

I've never run into this, but could it be the amount of memory available? It's just a guess, but it sounds like it is trying to load and process the entire thing in memory.

Cheers and Regards


"more" is meant to show a file on screen one portion at a time, sized to the DOS box, so it is normal that it works like this. Perhaps you could trick it by changing the size of the DOS box before running "more".

If you need to remove tabs, a tool like the Unix sed would be useful.


@allen2 MORE seems to work like TYPE when you redirect its output to a file.

MORE 1.txt>2.txt

TYPE 1.txt>2.txt

I don't see any difference in the output, except for the case mentioned above where larger files are processed and it gets stuck at some point...

Edited by tomasz86

Usual reality check :w00t: :

  • WHY do you want to use MORE to remove TABs (and not gsar, that you have available or another third party tool) ?

jaclaz


I find gsar quite problematic. There are characters like ":" or "\" which have to be replaced with something else in order to use them with gsar's "-r" option. It's also limited to 256 characters. For now I've changed the script so that everything gsar did is done in pure batch with the SET command, and there's no significant difference in speed (it has actually got faster by a few seconds). I can remove TABs using SET too.

But that's not the point. I discovered the "MORE /T" switch by accident and played with it for a while until I encountered the problem described above. I was just wondering why MORE suddenly stops when processing larger files.

By the way, I think I've managed to get the Strings sorted:


FOR /F "tokens=1* delims== " %%A IN (1.txt) DO (
IF "%%B"=="" (
SET Line=%%A=""
) ELSE (
FOR /F tokens^=1*^ delims^=^" %%C IN ("%%A="%%B"") DO (
SET Line=%%C"%%D
IF !Line:~-2!==^"^" SET Line=!Line:~0,-1!
)
)
ECHO !Line!
)

This seems to work for all kinds of strings, including these:

TZROOT=SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones
HelpLink = "http://support.microsoft.com{##}kbid=2829069"
MainCancelIntroString = "Thank you for reporting the Request. When you click ""Send Report"" button, data concerning why install failed will be sent to Microsoft"
PowerShell_ReleaseNotesDir=

the result being:

TZROOT="SOFTWARE\Microsoft\Windows NT\CurrentVersion\Time Zones"
HelpLink="http://support.microsoft.com{##}kbid=2829069"
MainCancelIntroString="Thank you for reporting the Request. When you click ""Send Report"" button, data concerning why install failed will be sent to Microsoft"
PowerShell_ReleaseNotesDir=""

Edited by tomasz86

I find gsar quite problematic. There are characters like ":" or "\" which have to be replaced with something else in order to use them with gsar's "-r" option. It's also limited to 256 characters.

I don't get it. :unsure:

Use a dec or hex character number, instead of the textual representation of it.

I initially suggested gsar only because it's one of the tools I use commonly (and it works on binary files, something I do a lot); you may want to find an alternative to it dedicated only to "text" files.

Besides the name :ph34r: this one doesn't seem bad:

http://sourceforge.net/projects/fart-it/

http://fart-it.sourceforge.net/

or this one:

http://findandreplace.codeplex.com/

Most probably the behaviour of MORE is a glitch in the matrix, I don't think that many people ever used MORE for anything bigger than a few Kbytes. :unsure:

jaclaz


You're probably right about MORE. I'll leave it for now.

As for gsar, I just wanted to say that in case of gsar you need to take into account more special characters than in case of a pure batch. In your SPLITINF script gsar is used to replace characters like "?" or "&", etc. Is it really a problem to just use batch like this instead of gsar?

FOR /F delims^=^ eol^= %%A IN (1.txt) DO (
SET Line=%%A
SET Line=!Line: =!
SET Line=!Line:%%=%%%%!
SET Line=!Line:^&={#}!
SET Line=!Line:^?={##}!
SET Line=!Line:^<={###}!
SET Line=!Line:^>={####}!
SET Line=!Line:^^!={#####}!
SET Line=!Line:^|={######}!
ECHO !Line!
)

Edited by tomasz86

Why tiptoe around sed for so long? :unsure:

Granted, it converts DOS ASCII to unix ASCII. Then one pipes it through unix2dos -D, and lo!, it's DOS ASCII all right again.

Both exist in cygwin, requiring just the inevitable cygwin1.dll (and, perhaps, one or two more .dlls), since way back.

And I bet there's a good Mingw standalone implementation too...

sed1line.7z


Is it really a problem to just use batch like this instead of gsar?

Not at all :).

As a matter of fact it makes sense since you are parsing the files line by line.

@dencorso

See the above, it is just a matter of "philosophy", either processing the file(s) as a whole or parsing them line by line.

And anything needing cygwin1.dll is philosophically "wrong". :ph34r:

And anything provided through their installer is a crazy, senseless, mass of bloat :( , compare:

http://reboot.pro/topic/15207-why-everything-is-so-dmn-diificult-a-web-quest-for-ddexe/

jaclaz


I'm now trying to solve a different problem...

I want to retrieve the filename from a cabbed file. Let's say that the file is called "abc.dl_" and the real filename is "a b c.dll".

If you run

cabarc l abc.dl_

you get

Microsoft (R) Cabinet Tool - Version 5.2.3790.0
Copyright (c) Microsoft Corporation. All rights reserved..

Listing of cabinet file 'abc.dl_' (size 18989):
1 file(s), 1 folder(s), set ID 0, cabinet #0

File name File size Date Time Attrs
----------------------------- ---------- ---------- -------- -----
a b c.dll 36352 2013/03/30 12:13:14 -a--

I've come up with this script:

@ECHO OFF

SETLOCAL ENABLEDELAYEDEXPANSION

SET tokens1=1
:loop1
FOR /F "skip=9 tokens=%tokens1%" %%A IN ('cabarc l abc.dl_') DO (
SET/A tokens1+=1
GOTO :loop1
)
SET/A tokens1-=5

SET tokens2=1
:loop2
FOR /F "skip=9 tokens=%tokens2%-%tokens1%" %%A IN ('cabarc l abc.dl_') DO (
IF DEFINED File (
SET File=!File! %%A
) ELSE (
SET File=%%A
)
SET/A tokens2+=1
GOTO :loop2
)

ECHO "!File!"

PAUSE

which does work, the result being

"a b c.dll"

but I'm just wondering if there's any simpler way to do it instead of using such two loops. My method is also far from perfect because it won't work if the real filename has more than one space in between, ex.

"a     b c.dll"

Edited by tomasz86

Do files with space in names exist in CAB files? :unsure:

Anyway, see if this fits:

@ECHO off
SETLOCAL ENABLEDELAYEDEXPANSION
FOR /F "tokens=*" %%A IN ('cabarc L test.cab ^| FIND "/"') do (
SET Line=%%A
SET Line=!Line:~0,28!
CALL :rem_trail_spaces !Line!
ECHO [!Line!]
)

GOTO :EOF

:rem_trail_spaces
SET Line=%*
GOTO :EOF

jaclaz

See the above, it is just a matter of "philosophy", either processing the file(s) as a whole or parsing them line by line.

And anything needing cygwin1.dll is philosophically "wrong". :ph34r:

And anything provided through their installer is a crazy, senseless, mass of bloat :( , compare:

http://reboot.pro/topic/15207-why-everything-is-so-dmn-diificult-a-web-quest-for-ddexe/

Yes, I agree with your points there.

But I also need sed, so I did compromise. :blushing:

Now, our friend submix8c (thanks, man... you do rock! :thumbup ) found a pearl he pointed me to on another thread... the link he gave works no more, but good old Wayback Machine is always there for the rescue (at least for the time being...): UnxUtils 2001 version (real standalone)!!! :thumbup

Grab the sed from it and do give it a try (it's a 45 kiB PE file!)... you'll fall in love. :wub:

What's *really* limiting with sed is that it's ASCII, period... so if you need UNICODE, then that's not an option. But life is like that, anyway... :}


Have you tried just using 'expand'?

@ECHO OFF & SETLOCAL ENABLEEXTENSIONS
(SET TESTFILE=D:\My Files\abc.dl_)
FOR /F "TOKENS=*" %%# IN ('EXPAND -D "%TESTFILE%"^|FIND /I "%TESTFILE%"') DO (
SET _=%%#)
ECHO/%_:*: =%
PING -n 4 127.0.0.1 1>NUL
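
The %_:*: =% expansion in the ECHO line may look cryptic. It is a string substitution whose search text starts with *, which matches everything up to and including the first occurrence of what follows (here a colon and a space), and the replacement is empty. A minimal sketch, assuming EXPAND -D prints lines of the form "cabinet: filename" (the sample string is invented):

```batch
@ECHO OFF
SETLOCAL ENABLEEXTENSIONS
REM Invented sample line in the assumed "cabinet: filename" layout.
SET "_=D:\My Files\abc.dl_: a b c.dll"
REM Remove everything up to and including the first colon-plus-space.
ECHO/%_:*: =%
```

This should print a b c.dll. The drive-letter colon is skipped because it is followed by a backslash, not a space.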


@Yzöwl Expand.exe works but is extremely slow compared to "cabarc L".

@jaclaz Why "!Line:~0,28!"? It won't work for longer filenames, will it?

Do files with space in names exist in CAB files? :unsure:

This is a good question :P Probably not but I just want to go safe.

Edited by tomasz86
