
Hi,

 

I want to delete empty spaces in many files by script...

But I don't get the result I want...

Here is an example (there are some tabs and doubled spaces at the end of the lines):

; Generated by ERROR
[Data]	
AutomaticUpdates= " No"			    
Autopartition	=0  			 	
MsDosInitiated  = 0   				 
UnattendedInstall   	=" 2 + 4 "   									    		
[Unattended]  	 	 		
UnattendMode=				FullUnattended
ProgramFilesDir="\Program Files (x86)"
NoWaitAfterGUIMode=1 

These are exactly the steps I want:

01) Convert all to UTF8 or 32

02) Convert all Tabs to Spaces

03) Replace *" * with *"*  (Optional)

04) Replace * "* with *"*  (Optional)

05) Replace * " * with *"*  (Optional)

06) Replace *=* with * = * (Optional)

07) Replace * =* with * = * (Optional)

08) Replace *= * with * = * (Optional)

09) Remove/Replace doubled white space everywhere

10) Remove white/empty space at the beginning of a line

11) Remove white/empty space at the end of a line

12) Add an empty line before [

13) Replace doubled empty lines with one line
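The steps above can be sketched in a few lines. This is a minimal Python illustration, not the AutoIt solution given later in the thread. The function name `clean_ini_text` is hypothetical, step 01 (encoding conversion) is left out because it assumes the text is already decoded, and step 09 is applied literally, so doubled spaces inside quoted values are collapsed too.

```python
import re

def clean_ini_text(text: str) -> str:
    """Apply steps 02-13 to already-decoded text (hypothetical helper)."""
    out = []
    for line in text.splitlines():
        line = line.replace("\t", " ")              # step 02: tabs to spaces
        line = re.sub(r" {2,}", " ", line).strip()  # steps 09-11
        if line and line[0] not in ";#[" and "=" in line:
            key, value = (part.strip() for part in line.split("=", 1))
            # steps 03-05: trim blanks just inside a quoted value
            quoted = re.fullmatch(r'"\s*(.*?)\s*"', value)
            if quoted:
                value = '"' + quoted.group(1) + '"'
            line = f"{key} = {value}"               # steps 06-08
        if line.startswith("[") and out and out[-1] != "":
            out.append("")                          # step 12: blank line before [
        if line == "" and (not out or out[-1] == ""):
            continue                                # step 13: no doubled empty lines
        out.append(line)
    return "\n".join(out).strip() + "\n"
```

Calling `clean_ini_text` on the example above should give the "after" layout shown further down in this post, with a blank line before each section name.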

 

I tried the fart tool, the sed tool, a batch file, and Notepad++.

My batch truncated the output, a Perl script didn't delete the empty space at the line end, and something else didn't do it at the beginning... (but I really want one solution for everything)

 

Afterwards it should look like this:

; Generated by ERROR

[Data]
AutomaticUpdates = "No"
Autopartition = 0
MsDosInitiated = 0
UnattendedInstall = "2 + 4"

[Unattended]
UnattendMode = FullUnattended
ProgramFilesDir = "\Program Files (x86)"
NoWaitAfterGUIMode = 1

It would be nice if someone could tell me how to get there... (maybe with AutoIt?)

 

Thx

R4

Edited by R4D3


It would probably be faster to rewrite the file using batch and FOR /F "tokens= delims=", though it depends on the characters used and on whether all the lines are like that (your example essentially looks like a .ini or .inf file).

I am not sure I understand the need to convert (from what?) to "UTF 8 or 32".

If it is plain ASCII, I would probably use gsar:

http://home.online.no/~tjaberg/

 

See if this seemingly unrelated thread:

http://www.msfn.org/board/topic/151785-how-to-merge-two-text-files/

 

gives you some inspiration, the object there was to split and selectively merge, nicely formatted, a whole lot of .inf, but the basic approach should be the same or similar.

 

jaclaz


Certainly something with Regular Expression support is suitable. Surprised that Perlscript cannot do it. But hey, AutoIt has Perl Compatible Regular Expression support ;).

 

This is what I came up with. Many here seem to like using CMD scripts so I made this AutoIt script to be compiled as a CUI program. I am using AutoIt 3.3.10.2 at present which should be compatible with the latest version.

; Name of compiled file.
#pragma compile(Out, 'reformatini.exe')
; CUI program. Set to False for a GUI program.
#pragma compile(Console, True)
; Bit x86|x64. Set to true for 64 bit program.
#pragma compile(x64, False)
; AutoIt version. Tested on this version.
#pragma compile(FileVersion, 3.3.10.2)
; What this file is meant for.
#pragma compile(FileDescription, 'Reformat ini file content. Use /?, -? or -h for help.')
; A name for the program.
#pragma compile(ProductName, 'Ini File Reformat Tool')
; Version for this program.
#pragma compile(ProductVersion, 1.0.0.0)
#NoTrayIcon
If $CMDLINE[0] > 2 Then
	; More than 2 parameters are not supported.
	ConsoleWriteError('Only maximum of 2 parameters is allowed.' & @CRLF)
	Exit 1
ElseIf $CMDLINE[0] = 2 Then
	; Read direct from the ini file.
	$sContent = FileRead($CMDLINE[1])
	If @error Then
		ConsoleWriteError('Failed to read the file.' & @CRLF)
		Exit 2
	EndIf
ElseIf $CMDLINE[0] = 1 Then
	Switch $CMDLINE[1]
		Case '/?', '-?', '-h'
			; Help
			ConsoleWrite( _
			 'Ini File Reformat Tool' & @CRLF & _
			 'Outputs the reformatted content to the console or to a file.' & @CRLF & @CRLF & _
			 'Pass 2 parameters as 1st being path to the source file and 2nd' & @CRLF & _
			 'to the destination file. The file encoding of the destination file' & @CRLF & _
			 'will be based on the source file encoding.' & @CRLF & _
			 ' i.e. "' & @ScriptName & '" source.ini destination.ini' & @CRLF & @CRLF & _
			 'Or, pass 1 parameter as being the path to the source file.' & @CRLF & _
			 ' i.e. "' & @ScriptName & '" source.ini' & @CRLF & @CRLF & _
			 'Or, pipe to this file.' & @CRLF & _
			 ' i.e. Type source.ini | "' & @ScriptName & '"' & @CRLF & @CRLF & _
			 'Exitcode:' & @CRLF & _
			 ' 1 Only maximum of 2 parameters is allowed.' & @CRLF & _
			 ' 2 Failed to read the file.' & @CRLF & _
			 ' 3 No parameters and no input provided.' & @CRLF & _
			 ' 4 Failed to open the file for write.' & @CRLF _
			)
			Exit
		Case Else
			; Read direct from the ini file.
			$sContent = FileRead($CMDLINE[1])
			If @error Then
				ConsoleWriteError('Failed to read the file.' & @CRLF)
				Exit 2
			EndIf
	EndSwitch
Else
	; Read from stdin.
	$sContent = ''
	Do
		$sContent &= ConsoleRead()
	Until @error
	If $sContent == '' Then
		ConsoleWriteError('No parameters and no input provided.' & @CRLF)
		Exit 3
	EndIf
EndIf
; Clean the content from the ini file.
$sNewContent = _CleanIniFileContent($sContent)
; Output the cleaned content.
If $CMDLINE[0] = 2 Then
	; Open the output file for erase and then write in the same encoding as the source file.
	$hWrite = FileOpen($CMDLINE[2], FileGetEncoding($CMDLINE[1]) + 0x2)
	If $hWrite = -1 Then
		ConsoleWriteError('Failed to open the file for write.' & @CRLF)
		Exit 4
	Else
		; Write the new content to the output file.
		FileWrite($hWrite, $sNewContent & @CRLF)
		FileClose($hWrite)
	EndIf
Else
	; Just output the new content to console.
	ConsoleWrite($sNewContent & @CRLF)
EndIf
Exit
Func _CleanIniFileContent($sContent)
	; Trim whitespace from both ends of each non-empty line.
	$sContent = StringRegExpReplace($sContent, '(?m)^\h*(.+?)\h*$', '\1')
	; Remove horizonal whitespace on lines that have no other content.
	$sContent = StringRegExpReplace($sContent, '(?m)^\h+$', '\1')
	; Remove empty lines.
	$sContent = StringRegExpReplace($sContent, '(\r\n|\n){2,}', '\1')
	; Fix the spacing between the key values and the data values.
	$sContent = StringRegExpReplace($sContent, '(?m)^([^;#[])(.*?)\h*=\h*(.*)$', '\1\2 = \3')
	; Trim the spacing from the quoted data values. i.e. " string " to "string".
	$sContent = StringRegExpReplace($sContent, '(?m)^([^;#[])(.+?) = "\h*(.+?)\h*"$', '\1\2 = "\3"')
	; Add empty lines before section names.
	$sContent = StringRegExpReplace($sContent, '(?m)^(\[.+\])$', @CRLF & '\1')
	; Trim both ends of the content.
	$sContent = StringStripWS($sContent, 0x3)
	Return $sContent
EndFunc

Here is a cmd script used to test it and the output.

CMD script to test the compiled AutoIt script named reformatini.exe.

@echo off
echo ### Test 1: reformatini.exe -h.
reformatini.exe -h
echo ### Test 2: type source.ini.
type source.ini
echo ### Test 3: type source.ini ^| reformatini.exe.
type source.ini | reformatini.exe
echo ### Test 4: reformatini.exe source.ini.
reformatini.exe source.ini
echo ### Test 5: reformatini.exe source.ini destination.ini.
reformatini.exe source.ini destination.ini
echo ### Test 6: reformatini.exe source.ini ^> destination_echoed.ini.
reformatini.exe source.ini > destination_echoed.ini
pause
goto :eof

Output in the CMD window.

### Test 1: reformatini.exe -h.
Ini File Reformat Tool
Outputs the reformatted content to the console or to a file.

Pass 2 parameters as 1st being path to the source file and 2nd
to the destination file. The file encoding of the destination file
will be based on the source file encoding.
 i.e. "reformatini.exe" source.ini destination.ini

Or, pass 1 parameter as being the path to the source file.
 i.e. "reformatini.exe" source.ini

Or, pipe to this file.
 i.e. Type source.ini | "reformatini.exe"

Exitcode:
 1 Only maximum of 2 parameters is allowed.
 2 Failed to read the file.
 3 No parameters and no input provided.
 4 Failed to open the file for write.
### Test 2: type source.ini.
; Generated by ERROR
[Data]        
AutomaticUpdates= " No"  
Autopartition =0        
MsDosInitiated  = 0         
UnattendedInstall      =" 2 + 4 "
[Unattended]                                
UnattendMode=                           FullUnattended
ProgramFilesDir="\Program Files (x86)"
NoWaitAfterGUIMode=1 
### Test 3: type source.ini | reformatini.exe.
; Generated by ERROR

[Data]
AutomaticUpdates = "No"
Autopartition = 0
MsDosInitiated = 0
UnattendedInstall = "2 + 4"

[Unattended]
UnattendMode = FullUnattended
ProgramFilesDir = "\Program Files (x86)"
NoWaitAfterGUIMode = 1
### Test 4: reformatini.exe source.ini.
; Generated by ERROR

[Data]
AutomaticUpdates = "No"
Autopartition = 0
MsDosInitiated = 0
UnattendedInstall = "2 + 4"

[Unattended]
UnattendMode = FullUnattended
ProgramFilesDir = "\Program Files (x86)"
NoWaitAfterGUIMode = 1
### Test 5: reformatini.exe source.ini destination.ini.
### Test 6: reformatini.exe source.ini > destination_echoed.ini.
Press any key to continue . . .

Test 5 and 6 output to file so you may see no output in the CMD window.

 

Output from one of the destination files. Winmerge shows that both output files are identical.

; Generated by ERROR

[Data]
AutomaticUpdates = "No"
Autopartition = 0
MsDosInitiated = 0
UnattendedInstall = "2 + 4"

[Unattended]
UnattendMode = FullUnattended
ProgramFilesDir = "\Program Files (x86)"
NoWaitAfterGUIMode = 1

The tests seem OK to me. I have not tested a UTF-8, UTF-16... file though it should be good.

 

Use the executable like any one of these commands:

 

Syntax: reformatini.exe /? | -? | -h
Syntax: reformatini.exe source.ini [destination.ini]
Syntax: type source.ini | reformatini.exe

 

| separates alternatives, except in the last command where it is a pipe. [ ] marks an optional parameter.

 

 

Let me know how it works for you.

 

Edit: Updated about the | not being an alternate in the last command which may have been confusing otherwise.

Edited by MHz

First: thanks a lot, MHz. Your script works perfectly. Now I just need to modify it a bit, or wrap it in a batch with something like %1, to change a whole batch of files in a folder...

Second: I am sure that a Perl script can do it easily, but I am not a programmer (I use the trial and error method instead...) and wasn't able to combine those expressions for the end and the beginning of a line...

@jaclaz: I would agree if I wanted to change one single file, but my plan is to run the script over all extracted .mof .css .inf .ini .sif files (all text-based files) from my Windows XP ISO... There are more than 12 million empty spaces in those files... (only 3 files carry a warning message "Do not edit", but I am not sure whether MS means blanks with that too...)

The input (from the files) is sometimes different. Normally Windows uses UTF-16LE (but some files do not), and there is also the problem that a white space isn't always a white space, because some language components/layouts use different ones... (I read something about it a few days ago...)

The reason I'd like to choose UTF-8 or UTF-32 as output is: I have read that UTF-8 (the most common standard) is the smallest one (in file size) and UTF-32 the fastest one. Some of those files are used by the OS all the time, and I wonder what happens when my system runs those files (12 million blanks "lighter") in UTF-8 or UTF-32 instead of UTF-16LE. Maybe I get a speed change, or maybe less memory usage, and because I don't know, I want to test it... (I know that the digital signing of some files will be lost by that. If I get errors, I will be expecting them...)

 

;)

 

If I succeed, I will do the same thing on resource files extracted with Reshack (UI code parts from .exe .ocx .dll). And yes, I will try to slim down the palette colors of all internal resource files by a script too, because in the past I was able to slim down my explorer.exe to 330KB !!! by using a 3 color palette (black/white/transparent) for all used resources and by deleting the unused ones... (there is no need for a 48px 3 color icon to use a 16.7 million color palette, it just blows up the file size and the memory usage...) So it's an attempt to slim down the size of Windows without deleting any files, or before deleting them ;)

Edit: Your link to this "unrelated" thread is interesting ;)

Edited by R4D3

 

First: Thanks a Lot: MHZ - your Script work perfectly, now I just need do modify it a bit, - or the batch with something like %1, to change an amount of Files in a Folder...
 
- Second:  I am Sure that a Perl Script can do it easily, - but I am not a programmer (I use try and error method instead...) - and wasn't able to combine those Expressions for the end and begin of a line...
...

You're welcome.

I made it as a CUI i.e. console program so it can be used in a CMD script something like:

if not exist subfolder md subfolder
for %%A in (*.ini) do reformatini.exe "%%~A" "subfolder\%%~nxA"

Actually, you could do the above with it compiled as a GUI program though you would get no output in the CMD window.

 

Look, I do not expect you to be a (professional, serious, whatever) programmer. Learning programming is like climbing a ladder. You go step by step. The rate of the climb is up to you. Do it as steep as you can handle. What I have learnt came not by magic but by determination. You can probably be there one day. About regular expressions: I considered doing it in 1 regular expression and came close, but kept failing. So, bleep it, I did it over several expressions. Maybe not the best, but it got done, and if a bug is present then I can usually track it down to one of the expressions or to needing another expression. PCRE may need unicode i.e. UCP turned on, though I do not handle unicode characters in the patterns, so hopefully it should not be needed. If I sound a little too advanced for you then say, hey, can you break this down. I will try, though I may need to be reminded of what it was like when I knew little about programming. :)

Nah, it's OK, you're right... ;) I just often get confused by all those "special" characters in scripts. I am able to do some little things, but I really have problems when I need to come up with a "concept" and how to use it... I understand 80% of what you have done there, but wouldn't be able to get there myself from "zero"... Mostly I modify script examples and copy them together to reach my target...

I already have something like a similar GUI that I can maybe use for it (someone else and I wrote an AutoIt script for batchmod (itself a script for Reshacker) to change the resource files from an extracted XP ISO in order to slipstream resource packs like FlyAKite, but we never finished it... I think I already posted that here "somewhere", under an older username like R2-D2 or D5D4 or something similar, where I have forgotten the password and the email used :D)

I will try your "MasterCode" on all those files in the next days... ;) (I have tried to open all 1200 files with UltraEdit by hand to modify them, and sometimes I got a warning that the file is not DOS coded, but I can fix those files beforehand, I just have to make a list...)

Edited by R4D3

Reshacker, hmm, not 64 bit compatible AFAIK. A shame really as it is renowned as being such a great program in its prime time. Even good programmers and their programs may need to retire.

 

I will try your "MasterCode" for all that files next days... ;) (I have tryed to open all 1200 Files with Ultraedit by hand to modify them, and sometimes i got a warning, that this file is not dos coded - but i can fix that files before - I just have to make a list...)

Cool. 1200 files, major BLEEP! By the time you have edited a couple of dozen or so, you could have a script to do the rest in seconds. Yeah, yeah, yeah, it takes some knowledge, though it is something to strive for. If UltraEdit is complaining about a file not being DOS coded then perhaps it thinks it is a binary file. A plain text file does not have a header, let alone a DOS header. Something strange may be going on there.

Well, nothing can beat plain ASCII/ANSI (non-unicode), it is the first time I hear that UTF-8 is common (I mean among TEXT used in Windows .iso's in files like  .mof .css .inf . ini .sif, while it is very common in the web), open one of those text files in a hex editor/viewer.

If it's plain ASCII/ANSI, you will see exactly what you can see in (say) Notepad.

If it's unicode (UTF-16) its first two bytes will be "ÿþ" or FF FE and you will see all letters separated by a dot.

If it's UTF-8 with a BOM its first three bytes will be "ï»¿" or EF BB BF.
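The hex-viewer check described above can also be scripted. This is a minimal Python sketch, not something any tool in this thread ships. The name `sniff_bom` is hypothetical, and a file without a BOM (plain ASCII/ANSI, or UTF-8 without BOM) simply yields `None`, so it cannot tell those apart.

```python
import codecs

# Byte-order marks, longest first so UTF-32 LE is not mistaken for UTF-16 LE.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),        # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]

def sniff_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None
```

Reading only the first four bytes of each file is enough for this check, for example `sniff_bom(open(path, "rb").read(4))`.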

 

You CANNOT change the text encoding of files like .inf, .ini or .sif as simply the Windows Setup would not be able to read them, you have to keep the SAME text encoding as the original file.

 

ASCII/ANSI is an 8 bit FIXED text encoding, each character will take exactly one byte.

UTF-8 and UTF-16 are VARIABLE length text encoding, each character will take AT LEAST respectively 8 bits (1 byte) or 16 bits (2 bytes), and extended/regional symbols will take more.

UTF-32 is a FIXED format (just like ASCII/ANSI), but each character will need 32 bits (or 4 bytes) instead of 8 bits (or 1 byte).

 

So, if your scope (even if it were possible, i.e. if all the tools/programs involved could actually read any of those encodings indifferently, which is NOT the case :no:) is to reduce the size of the files, you are doing it wrong :w00t::ph34r: plain ASCII/ANSI and UTF-8 would be roughly the same size, UTF-16 will be AT LEAST double that size and UTF-32 will be 4 times that size.
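The size comparison is easy to check for a line of plain ASCII text. A small Python sketch follows; the helper name `encoded_sizes` and the sample line are only illustrative, and the BOM-less codec names are used so that only the characters themselves are counted.

```python
def encoded_sizes(text: str) -> dict:
    """Byte count of the same text in several encodings (no BOM included)."""
    return {name: len(text.encode(name))
            for name in ("ascii", "utf-8", "utf-16-le", "utf-32-le")}

# A typical .ini line made of ASCII characters only.
sizes = encoded_sizes('AutomaticUpdates = "No"\r\n')
for name, size in sizes.items():
    print(f"{name:10} {size} bytes")
```

For ASCII-only content the ascii and utf-8 counts are equal, utf-16-le is twice that and utf-32-le four times.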

 

jaclaz

Edited by jaclaz

Hey, common ;) the amount of UTF-8 files at all is probably bigger than the amount of other Pages
 
Some files on the XP ISO are already in different code pages, so I thought it a good idea to unify them before converting the whitespace (if I don't do this first, MHz's script will maybe not work right on tabs or spaces from some files...)
Now I tested a batch file with all kinds of code pages... and you're right, just 3 of them worked 100% in this single test (ANSI, UTF-8 (without BOM), and Unicode).
 
But slimming down the size of XP without deleting files is not impossible!
 
Here you can prove it yourself:
A) Extract all files from your NT 5.x ISO (*.*_; *.cab; *.zip)
1) Delete more than 12 million empty spaces from all text-based files (maybe except those 3 "do not edit" files, and files with a .cat file)
2) Remove all ; comments
3) Replace all www.********* with blank, router address, or blackhole
4) Replace the word Microsoft with MS a billion times (yes, I know that XP is from them, so keep it in the version string)
5) Extract the resource files from all (.exe; .dll; .ocx) and do steps 1-4 for their menu files
6) Delete unused resources (.wav; .ani; .cur; .jpg; .avi; .wmv; .bmp; .gif; .png; .ico)
7) Delete every second picture in .avi and .gif animations and reduce or speed up the play speed (example: Explorer copy animation)
8) Convert the color palettes of all "inside" images/animations that are not bigger than 96x96 to 256 + alpha, or 4 colors (black, white, grey, alpha), or whatever you like
9) Convert the bigger pictures to a 256 color + alpha palette (except the startup picture/animation)
10) Remove multiple doubled icon files (just keep 16x16 and 32x32. I do not like BIG icons at all... If you enlarge a 32px icon to 96px and the 96px icon is gone, the system will show the rescaled 32px icon. It doesn't look so fine, but it doesn't break the OS. The icons inside carry resources to be displayed from 8x8 on a 16 color monitor up to 512x512)
B) Compile all files back to their originals and try your new XP ISO in a VM
 
The best thing would be to move all of that "lovely stuff" to shell32.dll (and set a link to it instead), because there are thousands and thousands of similar or doubled resources in these files... You would never believe how many...
As I said before, you can slim down explorer.exe with Reshacker to 330KB! And if you use a good set of icons it looks way nicer...
 
Edit: No, I am not able to write a script for all that... (I don't even know a good command line palette reducer). I hope I am close to finishing step 1 ;)
 
Update: After some testing of your code I found an entry that seems to clean some other code page tabs/spaces. I just don't know why it begins with ( and has the & @ in a slightly different order:
	$sContent = (StringRegExpReplace($sContent, "\h+", " ") & @CRLF)

 

Edited by R4D3

<snip>

Update: After some testing of your code i found a entry that seems to clean some other codepage tabs/spaces) - just dont know why it begins with (  and get the & @ in a little different order

 

	$sContent = (StringRegExpReplace($sContent, "\h+", " ") & @CRLF)

The extra parentheses force everything within them to be evaluated first. In that example, they serve no meaningful purpose. Take care with a pattern like that: paths in your text files, for example, can contain multiple spaces or tabs, and collapsing them could break those paths.

 

Adding (*UCP) at the beginning of each Regular Expression pattern may change the behavior of how certain characters are matched. I.e. \h may match unicode horizontal spaces as well as the ASCII + Extended ASCII horizontal spaces.
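To illustrate what "unicode horizontal spaces" means in practice, here is a hedged Python sketch. Python's `re` module has no `\h` and no `(*UCP)`, so the character class below spells out the code points that PCRE documents for `\h`. The names `H_SPACE` and `rstrip_lines` are hypothetical. Unlike a bare `\h+` replacement, it only touches whitespace at the end of a line, so spaces inside paths are left alone.

```python
import re

# Horizontal whitespace as listed for PCRE's \h: tab, space, no-break space
# and the Unicode space separators.
H_SPACE = "\t \u00a0\u1680\u180e\u2000-\u200a\u202f\u205f\u3000"

# Trailing run of horizontal whitespace, just before an optional CR and the line end.
TRAILING = re.compile(f"[{H_SPACE}]+(?=\r?$)", re.MULTILINE)

def rstrip_lines(text: str) -> str:
    """Remove horizontal whitespace at the end of every line, keeping CRLF/LF."""
    return TRAILING.sub("", text)
```

A line ending in a no-break space (U+00A0) or an ideographic space (U+3000) is trimmed here, which an ASCII-only pattern for space and tab would miss.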

 

So you can change this function to:

Func _CleanIniFileContent($sContent)
	; Trim whitespace from both ends of each non-empty line.
	$sContent = StringRegExpReplace($sContent, '(*UCP)(?m)^\h*(.+?)\h*$', '\1')
	; Remove horizonal whitespace on lines that have no other content.
	$sContent = StringRegExpReplace($sContent, '(*UCP)(?m)^\h+$', '\1')
	; Remove empty lines.
	$sContent = StringRegExpReplace($sContent, '(*UCP)(\r\n|\n){2,}', '\1')
	; Fix the spacing between the key values and the data values.
	$sContent = StringRegExpReplace($sContent, '(*UCP)(?m)^([^;#[])(.*?)\h*=\h*(.*)$', '\1\2 = \3')
	; Trim the spacing from the quoted data values. i.e. " string " to "string".
	$sContent = StringRegExpReplace($sContent, '(*UCP)(?m)^([^;#[])(.+?) = "\h*(.+?)\h*"$', '\1\2 = "\3"')
	; Add empty lines before section names.
	$sContent = StringRegExpReplace($sContent, '(*UCP)(?m)^(\[.+\])$', @CRLF & '\1')
	; Trim both ends of the content.
	$sContent = StringStripWS($sContent, 0x3)
	Return $sContent
EndFunc

This is untested so it may work OK or it may need to be updated to handle the different behavior.

 

Edit1: Added extra info about using just \h+ in a pattern.

 

Edit2: Created another version. v1.1. See below.

 

Given the approximately 1200 files you have, running a process 1200 times in a loop is rather harsh IMO.

A test is done in this version on the 2nd parameter and if it is an existing directory path then you can use a file pattern on the 1st parameter. Thus, one process can do multiple files.
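The idea of one process handling many files can be sketched like this in Python. It is only an illustration of the approach, not a port of the AutoIt code. `reformat_folder` and its `clean` callback are hypothetical names, and the encoding is a plain parameter here (assumed cp1252), whereas the script asks FileGetEncoding for each source file.

```python
from pathlib import Path

def reformat_folder(source_dir, pattern, dest_dir, clean, encoding="cp1252"):
    """Run clean(text) over every file matching pattern and write the
    results, under the same file names, into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(Path(source_dir).glob(pattern)):
        if not src.is_file():
            continue  # skip folders
        # newline="" keeps the original CRLF/LF line endings untouched.
        with open(src, encoding=encoding, newline="") as f:
            text = f.read()
        target = dest / src.name
        with open(target, "w", encoding=encoding, newline="") as f:
            f.write(clean(text))
        written.append(target.name)
    return written
```

For example, `reformat_folder(".", "*.ini", "dest", str.strip)` would process every .ini file in the current folder in a single run, with `str.strip` standing in for a real clean-up function.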

 

A 7th test was added to the test CMD file to show this:

echo ### Test 7: reformatini.exe *.ini dest.
if not exist dest md dest
reformatini.exe *.ini dest

The AutoIt script is here:

; Name of compiled file.
#pragma compile(Out, 'reformatini.exe')
; CUI program. Set to False for a GUI program.
#pragma compile(Console, True)
; Bit x86|x64. Set to true for 64 bit program.
#pragma compile(x64, False)
; AutoIt version. Tested on this version.
#pragma compile(FileVersion, 3.3.10.2)
; What this file is meant for.
#pragma compile(FileDescription, 'Reformat ini file content. Use /?, -? or -h for help.')
; A name for the program.
#pragma compile(ProductName, 'Ini File Reformat Tool')
; Version for this program.
#pragma compile(ProductVersion, 1.1.0.0)
#NoTrayIcon
If $CMDLINE[0] > 2 Then
	; More than 2 parameters are not supported.
	ConsoleWriteError('Only maximum of 2 parameters is allowed.' & @CRLF)
	Exit 1
ElseIf $CMDLINE[0] = 2 Then
	If StringInStr(FileGetAttrib($CMDLINE[2]), 'D') Then
		; Destination is a directory path.
		; Get a handle to the 1st file.
		$hFind = FileFindFirstFile($CMDLINE[1])
		If $hFind = -1 Then
			ConsoleWriteError('Failed to find the source file.' & @CRLF)
			Exit 5
		EndIf
		; Get the source directory as we need to read the source files from there.
		$sSourceDir = StringRegExpReplace($CMDLINE[1], '^(.+)\\.*?$', '\1')
		If Not @extended Then $sSourceDir = '.'
		While 1
			; Find a file that matches the pattern.
			$sFound = FileFindNextFile($hFind)
			If @error Then ExitLoop
			; Skip folders.
			If @extended Then ContinueLoop
			; Give some feedback of file processing.
			ConsoleWrite('Processing "' & $sSourceDir & '\' & $sFound & '"' & @CRLF)
			; Read direct from the ini file.
			$sContent = _FileRead($sSourceDir & '\' & $sFound, True)
			If @error Then
				FileClose($hFind)
				Exit 2
			EndIf
			; Clean the content from the ini file.
			$sNewContent = _CleanIniFileContent($sContent)
			; Write to the output file.
			_FileWrite($sSourceDir & '\' & $sFound, $CMDLINE[2] & '\' & $sFound, $sNewContent, True)
			If @error Then
				FileClose($hFind)
				Exit 4
			EndIf
			Sleep(10)
		WEnd
		; Close the find handle.
		FileClose($hFind)
	Else
		; Give some feedback of file processing.
		ConsoleWrite('Processing "' & $CMDLINE[1] & '"' & @CRLF)
		; Read direct from the ini file.
		$sContent = _FileRead($CMDLINE[1])
		; Clean the content from the ini file.
		$sNewContent = _CleanIniFileContent($sContent)
		; Write to the output file.
		_FileWrite($CMDLINE[1], $CMDLINE[2], $sNewContent)
	EndIf
ElseIf $CMDLINE[0] = 1 Then
	Switch $CMDLINE[1]
		Case '/?', '-?', '-h'
			; Help
			ConsoleWrite( _
			 'Ini File Reformat Tool' & @CRLF & _
			 'Outputs the reformatted content to the console or to a file.' & @CRLF & @CRLF & _
			 'Pass 2 parameters as 1st being path to the source file and 2nd' & @CRLF & _
			 'to the destination file. The file encoding of the destination file' & @CRLF & _
			 'will be based on the source file encoding. If the 2nd is a directory' & @CRLF & _
			 'path, then the 1st can be a pattern to find and process multiple files.' & @CRLF & _
			 ' i.e. "' & @ScriptName & '" source.ini destination.ini' & @CRLF & _
			 ' i.e. "' & @ScriptName & '" *.ini "destination folder"' & @CRLF & @CRLF & _
			 'Or, pass 1 parameter as being the path to the source file.' & @CRLF & _
			 ' i.e. "' & @ScriptName & '" source.ini' & @CRLF & @CRLF & _
			 'Or, pipe to this file.' & @CRLF & _
			 ' i.e. Type source.ini | "' & @ScriptName & '"' & @CRLF & @CRLF & _
			 'Exitcode:' & @CRLF & _
			 ' 1 Only maximum of 2 parameters is allowed.' & @CRLF & _
			 ' 2 Failed to read the file.' & @CRLF & _
			 ' 3 No parameters and no input provided.' & @CRLF & _
			 ' 4 Failed to open the file for write.' & @CRLF & _
			 ' 5 Failed to find the source file.' & @CRLF _
			)
		Case Else
			; Read direct from the ini file.
			$sContent = _FileRead($CMDLINE[1])
			; Clean the content from the ini file.
			$sNewContent = _CleanIniFileContent($sContent)
			; Just output the new content to console.
			ConsoleWrite($sNewContent & @CRLF)
	EndSwitch
Else
	; Read from stdin.
	$sContent = ''
	Do
		Sleep(10)
		$sContent &= ConsoleRead()
	Until @error
	If $sContent == '' Then
		ConsoleWriteError('No parameters and no input provided.' & @CRLF)
		Exit 3
	EndIf
	; Clean the content from the ini file.
	$sNewContent = _CleanIniFileContent($sContent)
	; Just output the new content to console.
	ConsoleWrite($sNewContent & @CRLF)
EndIf
Exit
Func _CleanIniFileContent($sContent)
	Local $sPrefix
	; Add a PCRE prefix here i.e. '(*UCP)' for full unicode support.
	$sPrefix = '(*UCP)'
	; Trim whitespace from both ends of each non-empty line.
	$sContent = StringRegExpReplace($sContent, $sPrefix & '(?m)^\h*(.+?)\h*$', '\1')
	; Remove horizonal whitespace on lines that have no other content.
	$sContent = StringRegExpReplace($sContent, $sPrefix & '(?m)^\h+$', '\1')
	; Remove empty lines.
	$sContent = StringRegExpReplace($sContent, $sPrefix & '(\r\n|\n){2,}', '\1')
	; Fix the spacing between the key values and the data values.
	$sContent = StringRegExpReplace($sContent, $sPrefix & '(?m)^([^;#[])(.*?)\h*=\h*(.*)$', '\1\2 = \3')
	; Trim the spacing from the quoted data values. i.e. " string " to "string".
	$sContent = StringRegExpReplace($sContent, $sPrefix & '(?m)^([^;#[])(.+?) = "\h*(.+?)\h*"$', '\1\2 = "\3"')
	; Add empty lines before section names.
	$sContent = StringRegExpReplace($sContent, $sPrefix & '(?m)^(\[.+\])$', @CRLF & '\1')
	; Trim both ends of the content.
	$sContent = StringStripWS($sContent, 0x3)
	Return $sContent
EndFunc
Func _FileRead($sSourceFile, $bReturnOnError = False)
	; Read direct from the ini file.
	Local $sContent
	$sContent = FileRead($sSourceFile)
	If @error Then
		ConsoleWriteError('Failed to read the file "' & $sSourceFile & '".' & @CRLF)
		If $bReturnOnError Then Return SetError(1, 0, 2)
		Exit 2
	EndIf
	Return $sContent
EndFunc
Func _FileWrite($sSourceFile, $sDestinationFile, $sContent, $bReturnOnError = False)
	; Open the output file for erase and then write in the same encoding as the source file.
	Local $hWrite
	$hWrite = FileOpen($sDestinationFile, FileGetEncoding($sSourceFile) + 0x2)
	If $hWrite = -1 Then
		ConsoleWriteError('Failed to open the file for write.' & @CRLF)
		If $bReturnOnError Then Return SetError(1, 0, 4)
		Exit 4
	Else
		; Write the new content to the output file.
		FileWrite($hWrite, $sContent & @CRLF)
		FileClose($hWrite)
	EndIf
EndFunc

 It already has the (*UCP) prefix on the Regular Expressions if you look at the _CleanIniFileContent() function.

Edited by MHz

Thx...

 

Now, after a first test (over all files from the ISO that could be opened in Notepad), your script deleted 19MB of whitespace in total... I guessed it would be more :sneaky: Now maybe I will just look for the worst ones by size, because it will be hard to write a repack script for all those files (sometimes there is a cabbed *._ file inside a .cab that's inside another .cab, and some files get overwritten when I expand them, because their expand targets end up with the same file name...)

 


 

In this test I just used your new script with this batch and a *.* pattern, and haven't tested yet whether the new one can handle different file types like the old one...

@ECHO OFF & COLOR 3F & ECHO Script by msfn User: MHZ
if not exist subfolder md subfolder
for %%A in (*.inf; *.ini; *.sif) do reformatini.exe "%%~A" "subfolder\%%~nxA"
Pause
Edited by R4D3

You should be able to use *.* as first parameter and a folder as 2nd parameter. This is so long as you have only text based files in the source directory as no file type filtering is done by the script.
 
Another way could be to rename the files adding a temporary extension i.e. file1.ini to file1.ini.text, file2.inf to file2.inf.text etc. Do a reformatini.exe *.text destfolder and then once done, rename the files removing the temporary extension. CMD For loop and using Rename should be able to do the mass file renaming.

 

As for whitespace that could still be removed: the ini file format is usually not so spacious when written by the default API.

 

i.e. Try this test. Requires reformatini.exe for the comparison.

; Create a default ini layout.
IniWrite('test1.ini', 'section 1', 'key1', 'value1')
IniWrite('test1.ini', 'section 1', 'key2', 'value2')
IniWrite('test1.ini', 'section 2', 'key1', 'value1')
IniWrite('test1.ini', 'section 2', 'key2', 'value2')
; Clean the ini.
RunWait('reformatini.exe test1.ini test2.ini')
; Read the characters of the files into a variable.
$test1 = FileRead('test1.ini')
$test2 = FileRead('test2.ini')
; Show some results.
MsgBox(0x40000, @ScriptName, _
 'Size of test1 = ' & StringLen($test1) & @CRLF & @CRLF & _
 $test1 & @CRLF & @CRLF & _
 'Size of test2 = ' & StringLen($test2) & @CRLF & @CRLF & _
 $test2 & @CRLF _
)

I get this output.

Size of test1 = 78

[section 1]
key1=value1
key2=value2
[section 2]
key1=value1
key2=value2

Size of test2 = 88

[section 1]
key1 = value1
key2 = value2

[section 2]
key1 = value1
key2 = value2

10 characters are excess whitespace in the cleanup file as it adds spaces around the = character and adds spacing before section name lines. Something to consider. Minor changes to the Regular Expressions can change that result.
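The count of 10 can be checked by hand or with a few lines of Python. The sketch below assumes both files use CRLF line endings and end with a single line break, which is consistent with the reported sizes of 78 and 88. The variable names are only illustrative.

```python
# Layout as written by IniWrite: no spaces around "=", no blank lines.
default = ("[section 1]\r\nkey1=value1\r\nkey2=value2\r\n"
           "[section 2]\r\nkey1=value1\r\nkey2=value2\r\n")

# Layout after the clean-up: " = " and a blank line before the 2nd section.
cleaned = ("[section 1]\r\nkey1 = value1\r\nkey2 = value2\r\n"
           "\r\n[section 2]\r\nkey1 = value1\r\nkey2 = value2\r\n")

extra_spaces = 4 * 2   # two spaces added on each of the four key lines
extra_line = 2         # one CRLF for the blank line before [section 2]
print(len(default), len(cleaned), extra_spaces + extra_line)
```

So 8 of the 10 extra characters come from the spaces around = and 2 from the blank line before the second section.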


;) Thx again,
 
I know the script:

- adds whitespace around the = itself
- fixes possible errors like " this" and "this "
- adds a line before a [ at the beginning of a line? Or in general? I didn't think about that before. I will test it soon ;)
 
In the 7050 files, " = " occurred 551,584 times. If a " " space needs 1 byte, those 1,103,168 spaces come to about 1 MB, so in total I would save 20 MB instead of 19 MB. (I am still thinking about killing all tabs and doubled whitespace between "things", and also ; comments at line end and lines starting with a ;.)
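The estimate works out as follows, as a quick Python check. The occurrence count is the one from this post; one byte per space is an assumption that holds for ANSI or UTF-8 files, while a UTF-16 file would use 2 bytes per space.

```python
occurrences = 551_584          # times " = " was found, from the post
spaces_per_occurrence = 2      # one space on each side of "="
bytes_per_space = 1            # assumes a single-byte encoding (ANSI or UTF-8)

extra_bytes = occurrences * spaces_per_occurrence * bytes_per_space
print(extra_bytes)                       # 1103168
print(round(extra_bytes / 1024**2, 2))   # 1.05 (MiB)
```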
 
The problem is: I now have to decide how many of these files I really want to change.
 
I could use:
A) files that are easy to recab back into their position in the ISO
B) just the big ones
C) just the files with the most whitespace (by counting it beforehand)
D) chosen files from a list
E) files by type
F) all text-based files (but I am not sure I can handle expanding and recabbing all the archived files: some files have different names and end up in the same file, others have their real name after expanding and need to keep it, and some do not and will be renamed by Windows setup). I would also need something like a "codepage detector" that first checks whether the file can be cleaned, maybe by looking at the first byte somehow?
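For option C, the counting could be sketched roughly like this in Python. The definition of "removable" (leading and trailing spaces/tabs plus runs collapsed to one space) is my own simplification and not what reformatini.exe does. Reading with `latin-1` is an assumption so that any single-byte file decodes without errors; UTF-16 files would need their real encoding.

```python
import os
import re


def removable_whitespace(text):
    """Estimate characters saved by trimming each line and collapsing space/tab runs."""
    saved = 0
    for line in text.replace("\r\n", "\n").split("\n"):
        cleaned = re.sub(r"[ \t]+", " ", line.strip(" \t"))
        saved += len(line) - len(cleaned)
    return saved


def rank_files(folder, encoding="latin-1"):
    """Return (saved_chars, filename) pairs for a folder, largest saving first."""
    results = []
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            with open(path, encoding=encoding, newline="") as handle:
                results.append((removable_whitespace(handle.read()), name))
    return sorted(results, reverse=True)
```

Calling `rank_files` on the folder of extracted files gives a list to pick the worst offenders from. It is only an estimate, since it also collapses whitespace inside quoted values.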
 
At first I thought I would finally do it like this, because if I use *.* I need to move all files with these extensions to another place. (*.* was just for a test, where I copied all readable files into one folder to find out about the 19 MB of whitespace inside them...)
@ECHO OFF & COLOR 3F & ECHO Script by msfn User: MHZ
if not exist subfolder md subfolder
for %%A in (*.adm; *.adr; *.asa; *.asp; *.aspx; *.bat; *.cer; *.cf; *.chs; *.cht; *.cmd; *.cnt; *.config; *.cpx; *.crl; *.css; *.csv; *.default; *.df; *.dns; *.dtd; *.dun; *.dxt; *.ecf; *.eng; *.gpd; *.h; *.h2; *.hex; *.hht; *.hkf; *.hpj; *.hta; *.htm; *.htt; *.htx; *.hxx; *.icw; *.inc; *.inf; *.ini; *.ins; *.isp; *.jpn; *.js; *.key; *.kor; *.man; *.manifest; *.mfl; *.mib; *.mof; *.msc; *.nt; *.obe; *.osc; *.p7b; *.pmc; *.ppd; *.ppt; *.pro; *.prx; *.rat; *.reg; *.rsp; *.sam; *.sed; *.sep; *.set; *.sif; *.smc; *.sp2; *.spd; *.sql; *.srg; *.sym; *.tha; *.the; *.txt; *.uninstall; *.url; *.vbs; *.vcf; *.ver; *.wpl; *.wsc; *.wsx; *.xdr; *.xml; *.xsd; *.xsl; *.xslt) do reformatini.exe "%%~A" "subfolder\%%~nxA"
Pause

- Maybe I need to open a new thread in the Windows XP subforum for this idea of preparing an ISO.

Edited by R4D3

The problem is: I now have to decide how many of these files I really want to change.

Depends on how much you can do by script. Manually doing it would be a PITA.

 

... I would also need something like a "codepage detector" that first checks whether the file can be cleaned, maybe by looking at the first byte somehow?

The codepage affects the extended ASCII range and depends on the default system language that is set. A BOM does not define the codepage; it indicates the file encoding. If you look at the _FileWrite() function, you may have noticed that I used FileGetEncoding(), which gets the encoding that the file uses.
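To show what "looking at the first bytes" can and cannot tell you, here is a minimal Python sketch of BOM detection. It is not how FileGetEncoding() works internally; the function name `detect_bom` is invented for the example.

```python
import codecs

# Byte order marks, longest first so UTF-32 LE is not mistaken for UTF-16 LE.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(path):
    """Return the encoding named by the file's BOM, or None if there is no BOM."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return None
```

A result of `None` only means there is no BOM. The file may then be ANSI in some codepage or UTF-8 without a BOM, and the first bytes alone cannot tell which.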

 

@ECHO OFF & COLOR 3F & ECHO Script by msfn User: MHZ
if not exist subfolder md subfolder
for %%A in (*.adm; *.adr; *.asa; *.asp; *.aspx; *.bat; *.cer; *.cf; *.chs; *.cht; *.cmd; *.cnt; *.config; *.cpx; *.crl; *.css; *.csv; *.default; *.df; *.dns; *.dtd; *.dun; *.dxt; *.ecf; *.eng; *.gpd; *.h; *.h2; *.hex; *.hht; *.hkf; *.hpj; *.hta; *.htm; *.htt; *.htx; *.hxx; *.icw; *.inc; *.inf; *.ini; *.ins; *.isp; *.jpn; *.js; *.key; *.kor; *.man; *.manifest; *.mfl; *.mib; *.mof; *.msc; *.nt; *.obe; *.osc; *.p7b; *.pmc; *.ppd; *.ppt; *.pro; *.prx; *.rat; *.reg; *.rsp; *.sam; *.sed; *.sep; *.set; *.sif; *.smc; *.sp2; *.spd; *.sql; *.srg; *.sym; *.tha; *.the; *.txt; *.uninstall; *.url; *.vbs; *.vcf; *.ver; *.wpl; *.wsc; *.wsx; *.xdr; *.xml; *.xsd; *.xsl; *.xslt) do reformatini.exe "%%~A" "subfolder\%%~nxA"
Pause
I probably do not know half of those extensions, nor whether it is safe to use the Regular Expressions on them, as they are designed for an ini-type file structure. You may need to put Regular Expressions into different functions that are called according to the detected file type.
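A rough idea of what "different functions called by the detected file type" could look like, sketched in Python. The cleaner functions and the extension table are hypothetical and much simpler than the Regular Expressions in the actual script. Unknown types are deliberately left untouched.

```python
import os
import re


def clean_ini(text):
    """Hypothetical ini-style cleanup: tabs to spaces, trim lines, ' = ' around the first '='."""
    out = []
    for line in text.splitlines():
        line = line.replace("\t", " ").strip()
        if line and not line.startswith((";", "[")):
            line = re.sub(r"\s*=\s*", " = ", line, count=1)
        out.append(line)
    return "\n".join(out)


def clean_generic(text):
    """Conservative cleanup for other text files: only strip trailing whitespace."""
    return "\n".join(line.rstrip() for line in text.splitlines())


# Assumed mapping for illustration; extend it only for types known to be safe.
CLEANERS = {
    ".ini": clean_ini,
    ".inf": clean_ini,
    ".sif": clean_ini,
    ".txt": clean_generic,
}


def clean_by_type(filename, text):
    """Pick a cleaner by extension; return the text unchanged for unknown types."""
    ext = os.path.splitext(filename)[1].lower()
    cleaner = CLEANERS.get(ext)
    return cleaner(text) if cleaner else text
```

The point of the table is that something like a .js or .reg file never gets the ini rules applied by accident. Each new type needs its own function once you know what whitespace is safe to remove there.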