Perl Question

GIDustin

I have been running my site for quite a while now, and until yesterday I had never had a problem with my script writing to the same file twice at the same time (two users downloaded a file at exactly the same time). The Air Transport and the CF100 were completely wiped out due to this.

Anyway, I suppose I should start using flock() on my files, but I am not quite sure on the syntax. Can anyone be of assistance?

Thanks

GIDustin
 
GIDustin: I would seriously suggest using a relational database management system to run a website on. They make life *much much* easier. There are free ones like MySQL and PostgreSQL. An RDBMS will take care of these things for you automagically.

But, I'm assuming that you'd like a solution that'll work easily for you, rather than being told to reimplement your whole system. Sooooo how to use flock....well, it's really quite disgusting actually. Firstly, you have to check that your system actually supports the flock(2) system call. If you're on a Unix-based system, you should be right. Then umm....well then it gets a little more icky.

You see, flock takes these flags, with names like LOCK_SH, LOCK_EX, LOCK_NB, and LOCK_UN in C. The actual values of these can vary from system to system, and Perl will take any value you give it and pass it on to the C system call. Perl doesn't have these names built in, though. The standard Fcntl module will export them with the right values if you say use Fcntl ':flock'; otherwise, defining them is up to you.

So, if you define them yourself, what you want is somewhere at the top of your file something that looks like

my $LOCK_SH = xxx;
my $LOCK_EX = yyy;
my $LOCK_NB = zzz;
my $LOCK_UN = nnn;

The only problem is, you need to find out the value of xxx, yyy, zzz, and nnn. How do you do that? Good question! Well, somewhere inside your standard C header files there should be a header which defines them. It's /usr/include/sys/file.h on my system.

You could try going into /usr/include/sys and grepping for LOCK_SH. Anyway, on my system, GNU/Linux, you end up with

my $LOCK_SH = 1;
my $LOCK_EX = 2;
my $LOCK_NB = 4;
my $LOCK_UN = 8;

and once we've done that painful bit, we can get on to actually locking files!
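As a cross-check on hand-copied values, the standard Fcntl module can supply the same four constants. This is only a minimal sketch, assuming Fcntl is available (it ships with Perl); it prints whatever values the local system uses, so they can be compared with the numbers above.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);   # exports LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN

# Print the values this particular system uses for each flag.
printf "LOCK_SH=%d LOCK_EX=%d LOCK_NB=%d LOCK_UN=%d\n",
    LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN;
```

With the import in place, the constants can be passed to flock directly (for example LOCK_EX instead of $LOCK_EX), which avoids guessing values on a server where the header files are not readable.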

If you have a filehandle called FILE, you can take an exclusive lock on the file, so that any other process that also asks for a lock on it has to wait, using

flock FILE, $LOCK_EX;

and unlock it using

flock FILE, $LOCK_UN;

Note that these calls will block: i.e. if you try to lock the file and another process already holds a lock on it, your process will wait there until the other process unlocks the file. You can make the call non-blocking by ORing in the $LOCK_NB flag, and then the function will return immediately, returning false and setting $! if it can't gain the lock on the file.
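The non-blocking variant can be sketched roughly as follows. The helper name try_lock and the temporary file are hypothetical stand-ins for illustration, and the constants are taken from the standard Fcntl module rather than defined by hand.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Try to take an exclusive lock without waiting.
# Returns the locked filehandle, or nothing if the lock is already held.
sub try_lock {
    my ($path) = @_;
    open(my $fh, '>>', $path) or die "Cannot open $path: $!";
    # LOCK_EX | LOCK_NB: exclusive lock, but fail at once instead of blocking.
    return $fh if flock($fh, LOCK_EX | LOCK_NB);
    close($fh);
    return;
}

# Stand-in data file so the sketch runs anywhere.
my (undef, $path) = tempfile(UNLINK => 1);

my $first  = try_lock($path);   # nobody holds the lock yet
my $second = try_lock($path);   # normally refused while $first is still locked

print "first:  ", ($first  ? "locked" : "busy"), "\n";
print "second: ", ($second ? "locked" : "busy"), "\n";

close($first);   # closing the handle releases the lock
```

A script that gets "busy" back can then decide for itself whether to retry after a short sleep or skip the update.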

Remember that you have to make sure your script always releases the lock on the file (closing the filehandle releases it too)! Otherwise umm...bad things could happen.
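For a download counter like the one described at the top of the thread, the whole read-update-write cycle needs to happen under one exclusive lock. Here is a minimal sketch, assuming the counter lives in a small text file holding a single number. The name bump_counter and the temporary file are hypothetical, not taken from the original scripts, and the constants come from the standard Fcntl module.

```perl
use strict;
use warnings;
use Fcntl qw(:flock :seek);
use File::Temp qw(tempfile);

# Add one to the number stored in $path and return the new value.
sub bump_counter {
    my ($path) = @_;

    # '+<' opens for reading and writing without truncating the file.
    open(my $fh, '+<', $path) or die "Cannot open $path: $!";
    flock($fh, LOCK_EX)       or die "Cannot lock $path: $!";

    # Read the current value; an empty or garbled file counts as 0.
    my $line  = <$fh>;
    my $count = (defined $line && $line =~ /^(\d+)/) ? $1 : 0;
    $count++;

    # Rewind and empty the file before writing the new value.
    seek($fh, 0, SEEK_SET) or die "Cannot seek in $path: $!";
    truncate($fh, 0)       or die "Cannot truncate $path: $!";
    print {$fh} "$count\n";

    # close flushes the buffered output and then releases the lock.
    close($fh) or die "Cannot close $path: $!";
    return $count;
}

# Stand-in counter file so the sketch runs anywhere.
my (undef, $path) = tempfile(UNLINK => 1);
print bump_counter($path), "\n" for 1 .. 3;
```

The lock is taken before the file is read and the file is only truncated after the lock is held, so a second copy of the script waiting on the lock never sees an empty file.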

-Sirp.
 
Thanks man

I want to learn and use MySQL, but a database on the server costs extra, so I avoided that route. As for the 4 values, I do not have access to that file on the server. I remember it saying that the server was linux, so I am guessing that your values will work as well (They seem to be binary numbers).

Until I learn it completely, I have re-written my scripts to not open the files as much, and no longer keep track of downloads (The files are now open to be read, and not written to anymore). After doing so, I have noticed a slight increase in speed, and no more problems with deleted files, but I lost one of my better features.

GIDustin
 
GIDustin: yeah lots of players do charge you extra for a SQL database unfortunately. If the server is Linux, then it should be the same as the numbers I mentioned. Yes, the numbers are powers of 2, because they can be ORed together to make a bitmask; it's just that which value corresponds to which flag can vary on different systems.

If you only read the files and never write to them, then you won't need to do any flocking. Also, as an alternative to flock, you can use sysread and syswrite to read/write from the files. Each of these is a single system call with no Perl-level buffering, and on Unix a single small write is generally treated as atomic: i.e. it shouldn't get interleaved with another write that is happening at the same time, although the exact guarantee depends on the OS and filesystem. You still have to be a little careful with what you do, but it is an alternative. If you use sysread/syswrite though, you have to be careful not to mix it with 'normal' Perl IO on the same filehandle or Very Bad Things will happen.

-Sirp.
 
You ended both of your replies with "Bad things may happen". This cannot be good. :p

I had never read about sysread and syswrite, so I will have to go look those up. What do you mean by "don't mix it with normal Perl IO"?

:D :D

GIDustin
 
GIDustin: true, that can't be good :) course, bad things happen all the time in programming!

In C on Unix there are two system calls for doing reading and writing, funnily enough named read and write. They are very simple functions: they just read/write a buffer from/to disk. They don't, for example, allow you to read just one line; you have to read n bytes.

Perl tries to make I/O a lot easier by letting you read, say, a line at a time using <FILE>. This, along with using the read and write and print functions in Perl, is the 'normal' way to do I/O in Perl.

However, Perl does try to give access to all the C Unix system calls, and so it provides syswrite and sysread for the write and read system calls. The problem is that syswrite and sysread will read/write 'raw' blocks of data to disk, while normal Perl I/O is buffered and has to keep track of things (like what is sitting in its buffer and where it is in the file) that won't be kept track of if you use sysread/syswrite. For that reason, you should choose one method or the other to access a file: use syswrite/sysread and no other Perl file access function, or use the normal Perl file access functions and avoid syswrite/sysread.

The big advantage of syswrite/sysread is that each call is 'atomic', at least for small amounts of data on a local Unix filesystem. That is, if you do a syswrite, and someone else does a sysread of the same file at the same time, either your syswrite will complete and then their sysread will be done, or their sysread will complete and then your syswrite will be done. It shouldn't get half way through doing your syswrite, then do their sysread and then the rest of your syswrite. This is exactly what can happen with normal Perl I/O, since its buffering means one print can turn into several separate writes.
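To make the sysread/syswrite style concrete, here is a minimal sketch that appends one record with a single syswrite and reads it back with sysread. The record text and the temporary file are hypothetical, and opening with O_APPEND is an added assumption rather than something stated above: it asks the kernel to place each write at the current end of the file, which is what keeps separate appenders from overwriting each other.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_RDONLY O_APPEND O_CREAT);
use File::Temp qw(tempfile);

# Stand-in log file so the sketch runs anywhere.
my (undef, $path) = tempfile(UNLINK => 1);

# Append one record with a single syswrite call (no Perl buffering involved).
sysopen(my $out, $path, O_WRONLY | O_APPEND | O_CREAT)
    or die "Cannot open $path: $!";
my $record  = "file42 downloaded\n";   # hypothetical record
my $written = syswrite($out, $record);
die "Write failed: $!" unless defined $written;
die "Short write"      unless $written == length $record;
close($out);

# Read it back with sysread: raw bytes, no line handling.
sysopen(my $in, $path, O_RDONLY) or die "Cannot open $path: $!";
my $got = sysread($in, my $buffer, 4096);
die "Read failed: $!" unless defined $got;
close($in);

print "read $got bytes: $buffer";
```

Note that neither filehandle is ever touched with <$fh> or print, in keeping with the rule about not mixing the two styles. This only covers appending; a read-modify-write update such as a counter still needs a lock.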

And yes, I am probably a lot better at doing this stuff than explaining it :)

-Sirp.
 
I understand what you are saying completely. Basically if I convert a few of my <FILE>'s to sysread/write, they may conflict with my other <FILE>'s, so I should either change none, or change them all. It would probably take me a good week or so to change them all (my scripts are rather large), but I will do some testing and see what happens. The only problem is that my test server is Win98SE running apache, while the real one is Linux, so "results may vary"....

GIDustin
 
GIDustin: Win98 running apache? Ack! Far from an ideal configuration :)

and yes, results may vary, watch out for differences between binary and text files in Windows in particular. In Unix there is no distinction. Also, sysread/syswrite probably aren't atomic operations on Windows.

Also, you don't have to change them *all* over. Just ones that access the same file. E.g. it's fine to access file 'foo.txt' using <FILE> and 'bar.txt' using sysread/syswrite, but it's not ok to access 'foo.txt' partly using <FILE> and partly using sysread/syswrite. Well actually, it is ok if you close the file and reopen it :)

I'm not sure I'd recommend this approach though, probably better to just use flock...

-Sirp.
 
Originally posted by Sirp
GIDustin: Win98 running apache? Ack! Far from an ideal configuration :)

I tried installing Linux on my other comp to use as a server, and I got Linux installed just fine, but getting anything else installed was a pain. Obviously "Linux" hasn't been told that "exe" stands for executable . . :p That is why I run Win98 and Apache.

Thanks for all the help. I really appreciate it.

GIDustin
 
Of course 'exe' doesn't stand for executable. In Unix, all files are treated the same, no matter what their name is. This extension thing is a HUGE kludge added by later operating systems.

In Unix, an executable simply has its 'execute' permission bit set using chmod...
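Since the thread is about Perl, here is a small sketch of the same idea using Perl's built-in chmod and the -x file test. The temporary file is a hypothetical stand-in for a script, and the result assumes a Unix-like system.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for a script file; its name has no special extension.
my (undef, $path) = tempfile(UNLINK => 1);

# Mode 0644: readable and writable, but no execute bit set.
chmod(0644, $path) or die "chmod failed: $!";
print "before: ", (-x $path ? "executable" : "not executable"), "\n";

# Mode 0755: the same file, now with the execute bits set.
chmod(0755, $path) or die "chmod failed: $!";
print "after:  ", (-x $path ? "executable" : "not executable"), "\n";
```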

-Sirp.
 
and that is why I don't run Linux. . . . :p

GIDustin
 
umm....what? You don't run Linux because other operating systems have huge kludges in them? Hmm....fair enough, whatever works for you :)

-Sirp.
 
I think that the others are more user friendly, like double-clicking an exe runs it, double-clicking a .doc opens it, etc. Linux may be easier, I don't know, but the one day that I had it installed, the only thing I could really do was play the pre-installed games.

GIDustin
 
'user friendly' is far more what one is familiar with than any genuine objective measure. Personally I find Linux far more 'user friendly' than Windows. I can easily configure all the settings on my computer just by editing a few configuration files, while on Windows I'm lost.

But sure, Linux is a programmer's operating system, I wouldn't recommend it for non-technical users. It just seemed to me that since you write Perl scripts.....you're a programmer :)

-Sirp.
 