Doesn't really matter. The OP is doing something completely different in the first place, as he explained in post #4 (using regular expressions). And then again, he already had a working solution that did the file open/read/close, processing, then file open/write/close separately (roughly like the sketch below).
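For reference, a minimal sketch of that working two-pass approach (the filename and the s/foo/bar/ substitution are placeholders of mine, not his actual code):

#!/usr/bin/perl
use strict;
use warnings;

my $file = 'data.txt';    # placeholder filename

# Pass 1: open, slurp the whole file, close.
open(my $in, '<', $file) or die "Can't read $file: $!";
my $content = do { local $/; <$in> };
close($in);

# Processing (he used regular expressions; this pattern is made up).
$content =~ s/foo/bar/g;

# Pass 2: reopen for writing; '>' truncates, so no leftover-junk problem.
open(my $out, '>', $file) or die "Can't write $file: $!";
print {$out} $content;
close($out) or die "Can't close $file: $!";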
His problem was opening the file just once with R/W access, reading from it, seeking back (which he wasn't doing, so his writes were effectively appended) and then writing again -- which as-is was a bad idea in the first place: if your new content is shorter than the old one, you end up with junk (contents from the old file) tacked on at the end of it. (I tried to explain before that he had to seek back, and that even that often wouldn't be sufficient to solve the problem.)
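To make that failure mode concrete, a tiny demo (assumes a hypothetical demo.txt already containing the 10 bytes "HelloWorld"):

#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

my $file = 'demo.txt';    # hypothetical file containing "HelloWorld"

open(my $fh, '+<', $file) or die "Can't open $file: $!";
my $data = do { local $/; <$fh> };   # read it all ("HelloWorld")
seek($fh, 0, SEEK_SET);              # rewind (he wasn't even doing this)
print {$fh} "Bye";                   # write shorter content, no truncate
close($fh);
# The file now contains "ByeloWorld": the 3 new bytes plus 7 bytes of
# junk left over from the old contents.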
So changing languages without any real technical merit or benefit, not using regular expressions, and adding specific logic that's completely irrelevant to the problem? Ok, whatever... But this doesn't address his actual problem in any way: reading from and writing to the same file by opening it just once. And there's an unstated requirement (unstated because he hasn't discovered the next problem that would arise once he gets this working): it must be able to "shrink" the file too if necessary.
There is a way to do exactly what he's asking for (in Perl, still using regexps and all), not that it really offers any actual benefit over opening the file twice IMO (a sketch follows the list):
- open the file with R/W access (using "+<")
- read its contents into some variable
- store the size/length of the said "old" contents in a variable
- do the processing on it just like before
- seek back to the beginning of the file: seek(FILE, 0, SEEK_SET); (this is what he wasn't doing after reading from the file, thus making it append)
- write your new content
- if the size of the "new" content is less than the stored size of the old contents, call truncate(FILE, newSizeHere); to discard the extraneous bytes
- close the file
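Putting that together, a minimal sketch (again with a placeholder filename and substitution; this is my sketch, not his code):

#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use IO::Handle;    # for $fh->flush

my $file = 'data.txt';    # placeholder filename

open(my $fh, '+<', $file) or die "Can't open $file: $!";

my $old     = do { local $/; <$fh> };   # slurp the whole file
my $old_len = length $old;

(my $new = $old) =~ s/foo/bar/g;        # placeholder substitution

# Rewind; without this, the write lands after what we just read
# and effectively appends.
seek($fh, 0, SEEK_SET) or die "Can't seek: $!";
print {$fh} $new;
$fh->flush;    # make sure the buffered data hits the file first

# If the new content is shorter, chop off the leftover old bytes.
if (length($new) < $old_len) {
    truncate($fh, length $new) or die "Can't truncate: $!";
}

close($fh) or die "Can't close $file: $!";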
Not that it's any better than his current/old solution IMO. He's seemingly trying to do this for performance, but it trades one file open/close operation (just getting a file handle) for the added seek & truncate operations... There's basically going to be no measurable difference between the two (way less than 1 ms of difference*). I'll much sooner use the code that's more solid (proper error handling for starters), better written and documented, easier to understand, more versatile/reusable, better tested (e.g. has good unit tests), easier to use, will be better supported in the future, etc.
Either way, I think this is completely pointless in the first place (and this is why I haven't and won't bother spending the 5 minutes to write code that does exactly this). This particular problem (replacing text using regular expressions) was already solved 35 years ago by AWK (using sub or gsub). He's just reinventing the wheel, and poorly at that; the one-liners below already do the whole job.
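For the record, the prior art (the foo/bar patterns and filenames are placeholders):

# AWK: gsub does the replacement on each input line
awk '{ gsub(/foo/, "bar"); print }' in.txt > out.txt

# And Perl has the same wheel built in: -p loops and prints,
# -i edits in place (keeping a .bak backup here)
perl -pi.bak -e 's/foo/bar/g' in.txt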
* Edit: allen2 sure has a good point there too (see post below). I mean, if this executes once in a while, it's pointless to spend hours of coding to shave off a few microseconds of execution time. But if you're going to use this in a situation where it actually matters (like running it a billion times in a loop), then a scripting language probably isn't the best tool for the job in the first place (you'd want something compiled for sure -- and you'd probably make the tool iterate through the files instead of spawning it a bazillion times). Then again, sometimes regular expressions are also overkill (or not the best pick) for the job, and something like a Boyer-Moore search might be faster at finding the parts that need replacing. I don't personally bother much with optimization (assuming the code is already half-decently written) until it actually becomes a problem (then you profile and see what needs to be optimized -- the file I/O, the time spent on string ops, the time spent spawning the same process repeatedly, etc. -- and then address that particular problem).