ASP, CGI and PHP Scripts and Record-Locking: What Every Webmaster Needs To Know
by Sunil Tanna
Many of us install server-side (ASP, CGI or PHP) scripts on our web sites, and
many of these scripts store data on the server. However, poorly designed scripts
can experience performance problems and sometimes even data corruption on busy
(and not so busy) web sites.
If you're not a programmer, why should this matter to you?
Answer: Even if you're just installing and using server-side scripts, you'll
want to make sure that the scripts that you choose don't randomly break or
corrupt your data.
First, some examples of the types of scripts which store data on web servers
are listed below. (Of course, many scripts in each of these, and other,
categories are well-designed, and run perfectly well even on very busy web
sites.)
1. Follow-up autoresponders typically store the list of subscribers to the
autoresponder, as well as where each subscriber is in the sequence of messages.
Examples of autoresponder scripts:
http://www.scriptcavern.com/scr_email_auto.php
2. Classified ad scripts store (at least) a list of all the classified ads
placed by visitors. Examples of this type of script:
http://www.scriptcavern.com/scr_classified.php
3. Free for all links scripts store a list of all links posted by visitors. See
some example scripts listed at: http://www.scriptcavern.com/scr_ffa.php
4. Top site scripts usually store a list of the members of the top site as well
as information about the number of "votes" that each has received. For examples
of this type of script, see http://www.scriptcavern.com/scr_topsite.php
So what kind of scripts have problems? And what sort of problems am I talking
about?
Well, the principal problems all relate to what happens when bits of data from
multiple users need to be stored or updated at the same time. Some scripts
handle these situations well, but others don't...
DATA CORRUPTION
Here's a common data corruption problem that can occur with many scripts:
1. When some bit of data needs to be updated, a copy of the server-side script
starts running, and then starts updating it.
2. If another user comes along and does an update before the first copy of the
script has finished, a second copy of the script starts running at the same
time.
3. There are a number of ways things can now go wrong, for example:
(a) What if the first copy of the script reads in the data, then the second
copy reads the same data, then the first copy updates the data, then the second
copy updates the data? Answer: any changes made by the first copy of the script
can get lost.
(b) What if the first and second copies of the script are both adding multiple bits
of new data to the store at the same time? For example, imagine each needs to
store the headline, description and the name of the person posting a classified
ad. Well, what can happen (with some scripts) is the two classified ads can get
intermingled, so you might get (for example) HEADLINE-1, DESCRIPTION-1,
HEADLINE-2, PERSON-1, DESCRIPTION-2, PERSON-2. Or worse yet, you might get bits
of each part of each classified ad, mixed with the bits of the other. This type
of thing is usually really bad news, as your data may consequently become
unusable from that point on.
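To make case (a) concrete, here is a minimal sketch in Python of the "lost update" sequence. It is an illustration only: the file name, the vote count and the helper functions are hypothetical, and the two "copies" of the script are simulated one step at a time in a single program so that the unlucky ordering happens every time.

```python
import os
import tempfile

# Hypothetical data store: a single file holding one vote count.
path = os.path.join(tempfile.mkdtemp(), "votes.txt")
with open(path, "w") as f:
    f.write("10")

def read_votes():
    """Read the current vote count from the file."""
    with open(path) as f:
        return int(f.read())

def write_votes(value):
    """Overwrite the file with a new vote count."""
    with open(path, "w") as f:
        f.write(str(value))

# Two copies of the script run at the same time, with no locking.
copy1_sees = read_votes()      # first copy reads 10
copy2_sees = read_votes()      # second copy also reads 10
write_votes(copy1_sees + 1)    # first copy stores 11
write_votes(copy2_sees + 1)    # second copy also stores 11, not 12

final_votes = read_votes()
print(final_votes)             # prints 11: one of the two votes is lost
```

Two visitors voted, but the count only went up by one. On a real server the same ordering happens by chance rather than by design, which is why the problem can go unnoticed for a long time.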
Does this sound too unlikely a problem to worry about? Don't bank on it... even
if it happens only 1 time in 1,000, or 1 in 10,000, eventually it will happen:
You need a solution.
So the real question is: is it possible for programmers to create scripts
without these kinds of problems? Fortunately the answer is yes, and there are a
number of ways that programmers can address it:
1. They can store each bit of data in a separate file. This isn't necessarily a
total solution by itself (in particular, a script which just does this could
still have problems if multiple copies of a script update the same file at the
same time), but it does make data corruption less likely, and if corruption
does occur, at least it won't corrupt the entire data store in one go.
2. They can use file-locking. This means that if one copy of a script is
working with a file, another copy of the script is prevented from working on
that file, until the first copy has finished. File-locking works if done
correctly, but programming it into a script needs to be done very carefully and
precisely, for every single possible case... even a tiny bug or omission can
allow the possibility of data-corruption in through the backdoor!
3. They can use a database (such as MySQL) to store the data. Provided the data
is properly structured in the database, the database handles the locking
automatically. And, as the programmer doesn't have to write their own special
locking routines, the possibility of bugs and omissions is much reduced.
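As a rough sketch of option 3, the Python example below lets the database do the read-and-update as one step. Several things here are assumptions made for illustration: SQLite (which ships with Python) stands in for a server database such as MySQL, and the table name, column names and site name are all hypothetical.

```python
import sqlite3

# SQLite stands in here for a server database such as MySQL.
# The table and its contents are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sites (name TEXT PRIMARY KEY, votes INTEGER NOT NULL)"
)
conn.execute("INSERT INTO sites (name, votes) VALUES (?, ?)",
             ("example-site", 10))
conn.commit()

def add_vote(connection, site_name):
    """Add one vote using a single UPDATE statement.

    The script never reads the old value itself, so there is no gap
    between reading and writing for another copy to slip into.
    """
    with connection:  # commits on success, rolls back on error
        connection.execute(
            "UPDATE sites SET votes = votes + 1 WHERE name = ?",
            (site_name,),
        )

add_vote(conn, "example-site")
add_vote(conn, "example-site")

votes = conn.execute(
    "SELECT votes FROM sites WHERE name = ?", ("example-site",)
).fetchone()[0]
print(votes)  # prints 12
```

The point of the design is that the script contains no locking code at all. A script that instead reads the value with a SELECT, adds one itself, and writes it back would reintroduce the lost-update problem unless it used the database's transaction features properly.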
PERFORMANCE PROBLEMS
Of course, avoiding having your data corrupted should be the paramount
consideration in choosing a script, but is there anything else we need to be
concerned about?
Answer: Performance
Of course, all webmasters are aiming to build busy, high-traffic web sites...
but will your scripts be able to handle the load?
Go back and re-read the paragraph on file-locking. Now think about what would
happen if all the classified ads on your classified page were stored in a
single file (or all the links on your top site, or all the subscribers to your
autoresponder, etc.).
What would happen?
Answer: Because each update can only be performed after the previous update has
been completely finished, your site may be slow, or even unable to handle all
your users' requests.
So what's the solution?
There are two options that programmers can use:
1. They can use lots of small files and file-lock each individually (for
example, one per classified, one per top site listing, etc.). Of course, this
needs to be handled very carefully...
2. They can use a database (like MySQL), as databases allow any one individual
record ("row") to be updated, even when another is also being updated.
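Option 1 above might look something like the following Python sketch, which keeps one small file per top site listing and locks each file individually. It is a sketch under assumptions, not a finished implementation: the directory, file names and function names are hypothetical, it uses fcntl.flock (available on Unix-like servers only), and threads stand in for separate copies of a script.

```python
import fcntl
import os
import tempfile
import threading

# Hypothetical data directory: one small file per top site listing.
data_dir = tempfile.mkdtemp()

def record_path(site_id):
    return os.path.join(data_dir, f"site_{site_id}.txt")

def add_vote(site_id):
    """Add one vote to a single listing, locking only that listing's file."""
    # Open for reading and writing, creating the file if it is missing.
    fd = os.open(record_path(site_id), os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)      # wait until no one else holds it
        try:
            text = f.read()
            votes = int(text) if text else 0
            f.seek(0)
            f.truncate()
            f.write(str(votes + 1))
            f.flush()                      # write out before unlocking
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

def read_votes(site_id):
    """Read one listing's count under a shared (read) lock."""
    with open(record_path(site_id)) as f:
        fcntl.flock(f, fcntl.LOCK_SH)
        try:
            return int(f.read())
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

def many_votes(site_id, count):
    for _ in range(count):
        add_vote(site_id)

# Four simulated visitors each vote 25 times for the same listing.
workers = [threading.Thread(target=many_votes, args=("alpha", 25))
           for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

add_vote("beta")  # a different listing uses a different file and lock

alpha_votes = read_votes("alpha")
beta_votes = read_votes("beta")
print(alpha_votes, beta_votes)  # prints 100 1
```

Because each listing has its own file, a visitor voting for one listing never has to wait for a visitor voting for another. Only updates to the same listing queue up behind each other.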
IN CONCLUSION
Now, let's summarise:
1. Scripts that store data in files need to use file-locking to avoid
data-corruption, and they also need to break the data into separately
updateable chunks to avoid performance problems on busy web sites.
2. Scripts that store data in databases (like MySQL), provided of course that
they have been properly coded, are usually less likely to suffer from
data-corruption or performance problems.
And one additional point:
3. Even the best script is not immune to hard-disk hardware failures, your web
host being struck by lightning, and all the other snafus that can happen. So,
do take regular back-ups of any data that you can't afford to lose!
In short, even if you're not a script programmer, you need to be aware of data
storage issues. In future, when considering a script for your web site, don't
be afraid to ask some hard questions about how it stores data and how well it
handles multiple users.
This article is Copyright (C) 2005, Answers 2000 Limited.
About the Author: This article was written by Sunil Tanna of Answers 2000. For
a directory of ASP, CGI, PHP and Remotely hosted scripts, please visit
http://www.scriptcavern.com - and for scripts written by Answers 2000
please visit http://www.scriptrocket.com
-----------------------------------------------------------------
Publication Terms And Conditions:
Answers 2000 Limited grants you a free non-exclusive permission (license) to
publish a copy of this article on your web site or opt-in ezine, subject to you
complying with ALL of the following:
1. You must publish the article in full and unedited (except that you may omit
this Terms and Conditions section, you may omit the word count, and you may
correct any typos that you might find).
2. If you publish on a web site: (i) you must make ALL links clickable, (ii)
you may format the article to fit within your web site's design, (iii) you must
include the copyright notice and "About the Author" section at the end.
3. If you publish in an ezine: (i) your ezine must be opt-in with your users
having specifically elected to subscribe to your ezine and with the ability to
unsubscribe at any time, (ii) you must include all link URLs unedited and in
full, (iii) you may format the article to your ezine's layout, (iv) you must
include the copyright notice and "About the Author" section at the end.
4. To the maximum extent permissible under law, this article is provided "AS
IS" without warranties of any kind whether express or implied.
5. These terms and conditions shall be governed by and construed in accordance
with the laws of England and Wales. Any disputes arising from matters relating
to this article shall be exclusively subject to the jurisdiction of the courts
of England and Wales. You agree that any legal action against Answers 2000
Limited (or its directors, officers, or employees) relating to this article or
this agreement will be brought in the courts of London, England, however
Answers 2000 Limited reserves right to pursue breach of these terms in any
jurisdiction.