I'm trying to design an HTML page that can edit itself. In essence, it's
just like a Wiki page, but my own very simple version. It's a page full
of plain old HTML content, and then at the bottom, there's an "Edit"
link. So the page itself looks something like this:
<HTML><HEAD><TITLE>blah</TITLE></HEAD><BODY>
<!-- TEXT STARTS HERE -->
<H1>Hello World!</H1>
<P>More stuff here...</P>
<!-- TEXT ENDS HERE -->
<A HREF="/cgi-bin/editpage.cgi?page=thisfile.html">Edit</A>
</BODY></HTML>
So if you click the "Edit" link, the CGI script goes out and reads the
body of the "thisfile.html" file. It uses those special "TEXT STARTS
HERE" tags to identify where the editable content starts and stops. It
then just dumps whatever is between those tags into a page that looks
like this:
<HTML><HEAD><TITLE>blah</TITLE></HEAD><BODY>
<FORM ACTION="/cgi-bin/savepage.cgi" METHOD="POST">
<TEXTAREA NAME="pagetext" COLS="120" ROWS="35" WRAP="OFF">
<H1>Hello World!</H1>
<P>More stuff here...</P>
</TEXTAREA>
<INPUT TYPE=SUBMIT VALUE="Save">
</FORM>
</BODY></HTML>
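
For illustration, the extraction step could look roughly like the sketch below. It is written in Python only as an assumption, since the language of the actual CGI script isn't stated, and the names `START`, `END`, and `extract_editable` are made up for this example.

```python
import re

# Marker comments that delimit the editable region of the page.
START = "<!-- TEXT STARTS HERE -->"
END = "<!-- TEXT ENDS HERE -->"


def extract_editable(page_html: str) -> str:
    """Return the text between the two marker comments."""
    pattern = re.escape(START) + r"(.*?)" + re.escape(END)
    match = re.search(pattern, page_html, flags=re.DOTALL)
    if match is None:
        raise ValueError("marker comments not found")
    # Drop the newlines that sit right next to the markers.
    return match.group(1).strip("\n")


page = """<HTML><HEAD><TITLE>blah</TITLE></HEAD><BODY>
<!-- TEXT STARTS HERE -->
<H1>Hello World!</H1>
<P>More stuff here...</P>
<!-- TEXT ENDS HERE -->
<A HREF="/cgi-bin/editpage.cgi?page=thisfile.html">Edit</A>
</BODY></HTML>"""

editable = extract_editable(page)
print(editable)
```

Note that this sketch returns the text exactly as stored in the file, which matches the "just dumps whatever is between those tags" behavior described above.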
So, on that page, you can edit whatever you want inside the TEXTAREA and
then click the "Save" button and the "savepage.cgi" script will write the
new data right into "thisfile.html". And so far, this works great.
But I ran into trouble as soon as I started trying to enter special
characters in the TEXTAREA. For example, I open the page and click the
"Edit" link and then start typing in fancy stuff like this:
<H1>Hello World!</H1>
<P>More stuff here...</P>
&lt; &nbsp; Some text &gt;
And it *seems* to work at first. When I save the page and then view it,
sure enough, I see what I'd expect, something like:
Hello World!
More stuff here...
< Some text >
But the next time I try to edit it, I can see that something has gone
horribly wrong underneath the surface. The "&nbsp;" characters are gone,
and have been replaced by some weird Unicode whitespace characters (or
something?). The "&lt;" and "&gt;" get replaced by actual "<" and ">"
symbols, which is no good at all. The web browser now thinks that
"<Some text>" is some sort of tag and doesn't display it at all. Yuck.
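
The symptom can be reproduced outside the browser. When entity text is written unescaped into a TEXTAREA, the browser's HTML parser decodes the character references before the form is submitted. The sketch below imitates that decoding step with Python's standard `html.unescape`; the sample string is just an illustration, not the exact text from the page.

```python
import html

# Entity text as it sits in the saved file and is dumped into the TEXTAREA.
typed = "&lt;&nbsp;Some text&nbsp;&gt;"

# Decoding character references, as an HTML parser does for TEXTAREA content.
decoded = html.unescape(typed)

print(repr(decoded))  # '<\xa0Some text\xa0>'
```

The `\xa0` in the output is U+00A0 (no-break space), which would account for the "weird Unicode whitespace", and the angle brackets come back as literal characters.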
So I searched around a little, and I see that it's actually standard
behavior for the <TEXTAREA> field to automatically perform conversions
like this. Ok, fine, so how do I turn this "feature" off so that
*exactly* what I type gets saved? At first, I thought that I could fix
it in my CGI script by just always expanding "<" and ">" and other
special characters back into their HTML equivalents.
But that obviously won't work, because then it will mangle every tag in
the whole file. In fact, the text in the example above would become:
&lt;H1&gt;Hello World!&lt;/H1&gt;
&lt;P&gt;More stuff here...&lt;/P&gt;
...
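
A quick way to see that mangling, again as a Python sketch under the assumption that the save script escapes everything it receives (the variable names are hypothetical):

```python
import html

# Text as it would arrive from the form: real tags, no entities left.
submitted = "<H1>Hello World!</H1>\n<P>More stuff here...</P>"

# Blanket escaping cannot tell markup from text, so every tag is escaped too.
mangled = html.escape(submitted)

print(mangled)
```

Every "<" and ">" is treated the same way, so the tags that should stay markup end up as visible text on the saved page.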
Ugh. So I can't really solve this in the script. I need to turn this
dumb behavior of the TEXTAREA off, or this just can't work. I need for
"&lt;" to stay "&lt;" and I need for "<" to stay "<" and that's it. Any
ideas?
Thanks for reading, and thanks for any help.
Pat