G'day
I have some pages written by a bot, and much of the markup has nothing
to do with the visible content on the site. I'd like to strip all the
tags that do not affect or influence the visible stuff (although I'd
like to keep the nested tables, if possible). Some of this can be
stripped using Search/Replace, but some of it contains values which
differ from page to page.
How many pages? About 750, totalling 80 megabytes of data, which I'm
hoping to reduce when I "clean" the code.
Do you know of any tool that can do this? A tool that can be set to
strip all codes except HTML 2.0 would, for example, also be useful
except I'll lose the nested tables (which is not a *gigantic*
loss...).
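To make it concrete, what I'm after is roughly a whitelist filter. Here is a minimal sketch in Python, using only the standard library's html.parser. The tag and attribute lists (KEEP_TAGS, KEEP_ATTRS) and the names TagWhitelistFilter and clean_html are just my own illustrative assumptions, not an existing tool, and I haven't tried it on the real pages:

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist: a handful of basic tags plus the table tags.
KEEP_TAGS = {"html", "head", "title", "body", "h1", "h2", "h3", "p", "br",
             "a", "b", "i", "ul", "ol", "li", "pre", "img",
             "table", "tr", "td", "th"}
# Assumed set of attributes worth keeping; everything else is dropped.
KEEP_ATTRS = {"href", "src", "alt", "colspan", "rowspan"}
# These tags are discarded together with everything inside them.
DROP_CONTENT = {"script", "style"}


class TagWhitelistFilter(HTMLParser):
    """Re-emit only whitelisted tags, visible text, and entities."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def _emit_open(self, tag, attrs):
        if self.skip_depth == 0 and tag in KEEP_TAGS:
            kept = "".join(
                f' {name}="{escape(value or "", quote=True)}"'
                for name, value in attrs if name in KEEP_ATTRS
            )
            self.out.append(f"<{tag}{kept}>")

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        else:
            self._emit_open(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit the opening tag only.
        if tag not in DROP_CONTENT:
            self._emit_open(tag, attrs)

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in KEEP_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(data)

    def handle_entityref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if self.skip_depth == 0:
            self.out.append(f"&#{name};")


def clean_html(source):
    """Return source with non-whitelisted tags, comments, and scripts removed."""
    parser = TagWhitelistFilter()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)
```

Something like this would be run over each of the 750 files in a loop. Comments and doctype declarations fall away because the parser's default handlers ignore them, and nested tables survive because the table tags are on the whitelist.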
I tried converting everything to TXT, but most HTML2TXT programs
deliver very poor results. I did find some code strippers that
attempt to maintain the table layout (but that is even less
preferred). If the stuff is gonna be in plaintext, then there should
be an intelligent way of dealing with nested tables.
Any advice, people? What tool can you recommend? Preferably for W95x
(but Linux would be fine too as long as it is newbie-friendly),
preferably freeware (shareware is OK too, but I don't intend to buy anything).