
How can I Read and write (overwrite, not append) to the same file in p

8 replies to this topic

#1
poly4life

Hello,

I'm having difficulty reading from and writing to the same file. I'm not sure what I'm doing wrong. It works fine if I'm in read/write append mode, but that doesn't accomplish what I want.

I'm trying to read the file in, do some processing on it, then overwrite the file. I honestly don't understand the purpose of the "+>" operator. "+<" works fine, but it appends to the file. "+>" deletes the contents of the file immediately after I open it. Is the "+>" operator simply not the right tool for reading from and (over)writing to the same file? I saw a Perl module called Tie::File, but I am having difficulty using it and would appreciate some help with it. Otherwise, if I can't get the "+>" operator working for this problem, I'll just read from the file, close it, and then write to it.

Thank you.

Here's my code:

#!/usr/local/bin/perl -w 
 
use Fcntl qw(:DEFAULT :flock :seek); # import LOCK_* constants 
 
local $/=undef;

#Read/Write with overwrite
open(FILE, "+>", $file) || die("Cannot open file"); 
flock(FILE, LOCK_EX); 
seek(FILE, 0, SEEK_SET); 

$file_data=<FILE>; 
 
#Do some processing on $file_data here 
 
# (Over)Write to same file
print FILE $file_data; 
close(FILE);

Thank you.



#2
allen2

Of course "+>" is not right: if you open a file while deleting its contents, you won't read much data from it. The original purpose of this mode is to first write to the file after erasing its contents, then read. For what you're trying to do, "+<" should work.

#3
CoffeeFiend


I honestly don't understand the purpose of "+> operator". "+<" works fine, but it appends to the file. "+>" deletes the contents of the file immediately after I open it.


+> truncates the file (makes it a 0-byte file), so it's hard to read from that for sure. It will also create the file if it doesn't exist already (either way, you're getting that 0-byte file).

+< is read/write. However, after you're done reading the file, your "position" is at the end of the file, so if you start writing then that's where you'll be writing from -- essentially appending (very much like it would using any other language in this specific scenario). If you want to write from the beginning, you have to seek to the beginning first.
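To make the file-position point concrete, here is a minimal sketch. The helper name positions_after_read and the scratch file are made up for illustration; the open, tell and seek calls are standard Perl. It slurps a file opened with "+<", reports where the handle sits afterwards, then seeks back to the start.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:seek);              # SEEK_SET
use File::Temp qw(tempfile);

# Hypothetical helper: returns the handle position right after slurping
# the file, and again after seeking back to the beginning.
sub positions_after_read {
    my ($path) = @_;
    open(my $fh, '+<', $path) or die "Cannot open $path: $!";
    local $/ = undef;             # slurp mode: one read returns the whole file
    my $data = <$fh>;
    my $after_read = tell($fh);   # the handle now sits at the end of the file
    seek($fh, 0, SEEK_SET) or die "Cannot seek in $path: $!";
    my $after_seek = tell($fh);   # back at offset 0
    close($fh);
    return ($after_read, $after_seek);
}

# Demo on a scratch file (assumption: any small text file would do).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello\n";
close($tmp);

my ($end, $start) = positions_after_read($path);
print "position after read: $end, after seek: $start\n";
```

A print issued right after the read would land at the first position (the end of the file), which is why the write looks like an append.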

Not that I would do it this way, unless you're 100% certain that the content you'll be writing will never be smaller by *any* amount (even a single byte), because otherwise you'll have garbage left over at the end of your new file. Your best bet (again, for any language -- so long as the files aren't huge) is to first open the file, read its contents into some sort of variable, then close it. Then you do whatever processing it is you wanted to do. Finally, you reopen it, this time for writing, *truncating* the old file, write the new stuff to it and close it one last time. Or you can rename the old file as a backup (if you want one) and create the new file. That's much more foolproof in most cases.
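The open/read/close, process, open/write/close sequence described above could look roughly like the sketch below. The helper name rewrite_file and the sample substitution are assumptions for illustration, not code from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: $transform is a code ref that receives the old
# contents and returns the new contents.
sub rewrite_file {
    my ($path, $transform) = @_;

    # 1. Read the whole file into a variable, then close it.
    open(my $in, '<', $path) or die "Cannot open $path for reading: $!";
    my $data = do { local $/; <$in> };
    close($in);
    $data = '' unless defined $data;

    # 2. Do the processing in memory.
    my $new = $transform->($data);

    # 3. Reopen for writing; '>' truncates the file to zero length first,
    #    so nothing from the old contents can be left over.
    open(my $out, '>', $path) or die "Cannot open $path for writing: $!";
    print {$out} $new;
    close($out) or die "Cannot close $path: $!";
    return length $new;
}

# Demo on a scratch file: replace every "foo" with "x".
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "foo bar foo\n";
close($tmp);
rewrite_file($path, sub { my ($text) = @_; $text =~ s/foo/x/g; return $text });
```

Because the second open truncates, the result is correct whether the new content is longer or shorter than the old.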
Coffee: \ˈkȯ-fē, ˈkä-\. noun. Heaven in a cup. Life's only treasure. The meaning of life. Kaffee ist wunderbar. C8H10N4O2 FTW.

#4
poly4life

Yes, thank you, I see that now, "+>" is not the right tool for the job. However, I was looking at overwrite, as opposed to append, because I didn't wish to append data to the end of the file.

FYI -- I should've mentioned this in the original post -- I am running Strawberry Perl 5.12.2.0 on Windows XP SP3 32-bit, the file I'm opening is a text file, and some of the processing involves regex.

After processing, I rewrite the entire file, instead of just a portion of it. The problem is that much of the rewrite starts at the beginning of the file. Again, my logic was: instead of picking and choosing which parts of the file to rewrite, why not just read the entire contents of the text file (it's not a huge file) into Perl, do the processing on it, then rewrite the entire file?

I looked at the seek doc and did some more research on append and seek, and, at least on Unix/Linux, it is not possible to sysseek or seek with append. I had tried it, too, and it would only append to the end of the file.
SOURCE: http://www.justlinux...ad.php?t=131467

MY CODE for read/append ("+>>")
#!/usr/local/bin/perl -w

use Fcntl qw(:DEFAULT :flock :seek); # import LOCK_* constants

open(FILE, "+>>", "test.txt") || die("Cannot open file");
flock(FILE, LOCK_EX);
seek(FILE, 0, SEEK_SET);
$file_data=<FILE>;
print $file_data;
print FILE "xxx";

close(FILE);

print "\n\n-------\n\n";

Beforehand, I opened the file in read-mode, copied its contents, closed the file, opened the file again in write-overwrite-mode, wrote to the file, and, finally, closed the file. But I thought why open and close the same file twice, when I may be able to do it all in one shot? It'll be more efficient, less code, making it potentially easier to maintain and debug, and less of a performance hit. With one file, it's no big deal. But if I'm processing many, many files (i.e. reading in a directory), I could see a performance hit. So, this is why I posted in the first place, to learn if there's a better way.

I like the suggestion for the backup, thank you. I suppose I'll just open it twice, unless I can get the Tie::File module to work correctly.
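For reference, a minimal Tie::File sketch follows. The helper name replace_in_lines and the sample data are assumptions for illustration. Tie::File presents the file as an array of lines, so it suits per-line substitutions; a regex that must match across line boundaries would still need the slurp approach.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical helper: applies s/$pattern/$replacement/g to every line.
sub replace_in_lines {
    my ($path, $pattern, $replacement) = @_;
    tie my @lines, 'Tie::File', $path or die "Cannot tie $path: $!";
    # Each element is one line without its line ending; assigning to an
    # element rewrites that line in the file.
    s/$pattern/$replacement/g for @lines;
    untie @lines;                 # flush and release the file
    return;
}

# Demo on a scratch file.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "foo one\nbar\nfoo two\n";
close($tmp);
replace_in_lines($path, qr/foo/, 'baz');
```

Tie::File takes care of growing or shrinking the file when a line changes length, so no explicit seek or truncate is needed.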

Thank you both again.

#5
gunsmokingman

I do not know if this will help or not, but here is a VBS script that:
1\ Opens the text file and reads its contents into one variable called V1
2\ Then uses the V1 variable to rewrite the text file, adding "Add 1" at line 4 and "Add 2" at line 7,
then closes the text file with the changes saved.

Const ForReading = 1, ForWriting = 2, ForAppending = 8
'-> Object For Script
 Dim Fso :Set Fso = CreateObject("Scripting.FileSystemObject")
'-> Variables For Use
 Dim C1, File, Ts, V1, V2
'-> Check To Make Sure File Is Present
 File = "Test_Text.txt"
   If Fso.FileExists(File) Then 
'-> Read All The Text File Into Variable V1 (Loop Is Skipped If File Is Empty)
    Set Ts = Fso.OpenTextFile(File,ForReading,True)
    Do Until Ts.AtEndOfStream
     V1 =  Ts.ReadAll 
    Loop
     Ts.Close
'-> Loop To Add The New Add 1, Add 2 At Lines 4 And 7
    Set Ts = Fso.OpenTextFile(File,ForWriting,true)
     For Each V2 In Split(V1, vbCrLf)
      C1 = C1 + 1
'-> Add To Line 4 And Line 7
     If C1 = 4 Then 
      Ts.WriteLine "Add 1 " & V2
     ElseIf C1 = 7 Then 
      Ts.WriteLine "Add 2 " & V2
     ElseIf V2 = "" Then
'-> Do Nothing, It Is A Blank Line
     Else
'-> Add The Unchanged Line Back To File
      Ts.WriteLine V2
     End If
    Next 
     Ts.Close   
   Else
    MsgBox "Missing This Text : " & File
   End If 




GunSmokingMan



#6
Yzöwl


Quick question, gsm: would I need to add to line four and line six to add to lines four and seven respectively?

If I add to line four it would mean that old line four became new line five, old line five became new line six and old line six became new line seven! I'd suggest the term append to line…

#7
gunsmokingman


If I add to line four it would mean that old line four became new line five,


No, the script only adds to the front of the line, and line 4 remains line 4 after the change.
V2 would be line 4 from the variable V1 after it had been Split with vbCrLf

     If C1 = 4 Then       
       Ts.WriteLine "Add 1 " & V2


Contents Of Test_Text.txt before script runs

Line 01
Line 02
Line 03
Line 04
Line 05
Line 06
Line 07
Line 08
Line 09
Line 10

After script ran once

Line 01
Line 02
Line 03
Add 1 Line 04
Line 05
Line 06
Add 2 Line 07
Line 08
Line 09
Line 10

If you ran the script 2 times, then the file would contain

Line 01
Line 02
Line 03
Add 1 Add 1 Line 04
Line 05
Line 06
Add 2 Add 2 Line 07
Line 08
Line 09
Line 10







#8
CoffeeFiend


would I need to add to line four and line six to add to lines four and line seven respectively.

Doesn't really matter. The OP is doing something completely different in the first place like he explained in post #4 (using regular expressions). And then again, he already had a working solution that did the file open/read/close, processing, then file open/write/close separately.

His only problem was opening the file just once with R/W access, reading from it, seeking back (which he wasn't doing, so it was effectively appending) and then writing again -- which as-is was a bad idea in the first place: if your new content is shorter than the old one, then you end up with junk (contents from the old file) tacked on at the end of it. (I tried to explain before that he had to seek back, and that this often wouldn't be sufficient to solve the problem either.)

So changing language without any real technical merits or benefits, not using regular expressions, or adding specific logic that's completely irrelevant to the problem and such? Ok, whatever... But this doesn't actually address his actual problem in any way: reading from & writing to the same file by opening it just once. Not mentioned (because he hasn't discovered the next problem that would arise once he gets this working), but it must be able to "shrink" the file too if necessary.

There is a way to do exactly what he's asking for (and in perl, still using regexp's and all), not that it really offers any actual benefits vs opening it twice IMO:
  • open the file with RW access (using "+<")
  • read its content into some variable
  • store the size/length of the said "old" contents in a variable
  • do the processing on it just like before
  • seek to the beginning of the file: seek(FILE, 0, SEEK_SET); (what he wasn't doing after reading from the file, thus making it append)
  • write your new content
  • if the size of the "new" content is less than the size of the old contents previously stored, then call truncate(FILE, newSizeHere); on it (discarding the extraneous bytes)
  • close the file
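For illustration only, the steps above might be sketched as follows. The helper name rewrite_in_place and the sample substitution are assumptions, not code from this thread. One deliberate deviation: instead of comparing the old and new sizes, the sketch always truncates at tell() after the write. That is harmless when the file grew, and it avoids counting bytes by hand, which matters on Windows, where text mode turns "\n" into "\r\n".

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);       # LOCK_EX, SEEK_SET
use File::Temp qw(tempfile);

# Hypothetical helper: $transform receives the old contents and returns
# the new contents; the file is opened exactly once.
sub rewrite_in_place {
    my ($path, $transform) = @_;

    open(my $fh, '+<', $path) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX)       or die "Cannot lock $path: $!";

    # Read the whole file (the handle ends up at end of file).
    my $old = do { local $/; <$fh> };
    $old = '' unless defined $old;

    my $new = $transform->($old);

    # Go back to the beginning before writing, otherwise we would append.
    seek($fh, 0, SEEK_SET)    or die "Cannot seek in $path: $!";
    print {$fh} $new          or die "Cannot write to $path: $!";

    # Cut the file off at the current position, discarding any leftover
    # bytes from the old, longer contents.
    truncate($fh, tell($fh))  or die "Cannot truncate $path: $!";
    close($fh)                or die "Cannot close $path: $!";
    return;
}

# Demo on a scratch file: the new content is shorter than the old one.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "aaaa bbbb aaaa\n";
close($tmp);
rewrite_in_place($path, sub { my ($text) = @_; $text =~ s/aaaa/x/g; return $text });
```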
Not that it's any better than his current/old solution IMO. He's seemingly trying to do that for performance, but saving a file open/close operation (just getting a file handle) vs the added seek & truncate operations... There's basically going to be no measurable difference between the two (way less than 1ms difference*). I'll much sooner use the code that's more solid (proper error handling for starters), better written, better documented, easier to understand, more versatile/reusable, is better tested (e.g. has good unit tests), is easier to use, will be better supported in the future, etc.

Either way, I think this is completely pointless in the first place (and this is why I have not/will not bother spending the 5 minutes to write code that does exactly this). This particular problem (replacing text using regular expressions) was already solved 35 years ago by AWK (using sub or gsub). He's just reinventing the wheel, and poorly at that.

* Edit: allen2 sure has a good point there too (see post below). I mean, if this executes once in a while, it's pointless trying to spend hours of coding to shave off a few microseconds of execution time. But if you're going to use this in a situation where it actually matters (like running it a billion times in a loop) then a scripting language probably isn't the best tool for the job in the first place (you'd want something compiled for sure -- and probably make the tool iterate through the files instead of running it a bazillion times). Then again, sometimes regular expressions are also overkill (or not the best pick) for the job and something like a Boyer-Moore search might be faster to find the parts that need replacing. I don't personally bother much with optimization (assuming the code is already half-decently written) until it actually becomes a problem (then you profile and see what needs to be optimized -- the file I/O, the time spent on string ops, the time spent spawning the same process repeatedly, etc -- and then address that particular problem).

#9
allen2

I'll just add that if you want fast execution speed, you most likely shouldn't use a scripting language.



