Words for Thought!

  • daemon

    Words for Thought!

    Hello,

    I am seeing a lot of programmers/scripters forgetting, or not knowing,
    what absolute and relative locations are. This is just a short,
    introductory definition to teach users how to link files and URLs in
    their projects.

    First off, I want to explain the Domain Name System (DNS). This is a
    perfect place to start, since just a few years ago I didn't understand
    how users can own a domain and have it point to an IP address. Put
    simply, DNS is a big distributed database, rooted in a small set of
    root servers, that maps domains to their IP addresses. Of course, you
    can run your own DNS server that hooks into that hierarchy and answers
    for your own part of it, thus controlling each further label in front
    of the period.

    Anyway, you can read up on DNS servers somewhere else...

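    This is a short list of URLs:

    http://www.google.ca
    http://www.msn.com
    http://www.irc2k.com

    This is a short list of URIs:

    http://www.google.ca/index.html
    http://www.irc2k.com/irc2k/index3.php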

    Before I continue, I noticed that msn.com does not contain a page,
    which simply means that the site is completely controlled by their
    server-side language. I expected some Apache response headers but did
    not get them, hehe.

    Trying 207.68.172.246...
    Connected to msn.com.
    Escape character is '^]'.
    GET / HTTP/1.1
    Host: www.msn.com

    HTTP/1.1 200 OK
    Server: Microsoft-IIS/5.0
    Date: Thu, 04 Aug 2005 06:41:59 GMT
    P3P:CP="BUS CUR CONo FIN IVDo ONL OUR PHY SAMo TELo"
    S: TK2MSNSHRA04
    Connection: close
    Content-Type: text/html
    Cache-control: private
    Content-Length: 651

    <html>

    <head>
    <meta name="postinfo" content="/scripts/postinfo.asp">
    <meta http-equiv="Content-Language" content="en-us">
    <meta name="GENERATOR" content="Microsoft FrontPage 5.0">
    <meta name="ProgId" content="FrontPage.Editor.Document">
    <meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
    <title>http</title>
    </head>

    <body bgcolor="#000066">

    <p><a href="http://www.msn.com">
    <img border="0" src="http://msimg.com/m/r/logo/msft/logo.gif"
    width="140" height="60"></a></p>
    <p><font color="#FFFFFF"><a href="http://www.msn.com"><font color="#FFFFFF">
    http://www.msn.com</font></a></font></p>

    </body>

    </html>Connection closed by foreign host.
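
    To make the response headers in the transcript above easier to read,
    here is a minimal sketch in Python (not the language used elsewhere in
    this post) that splits a raw response head into its status line and
    headers. The header text is copied from the transcript; the names
    RAW_RESPONSE_HEAD and parse_response_head are made up for this
    illustration.

```python
from email.parser import Parser

# Status line and a few headers copied from the telnet transcript above.
RAW_RESPONSE_HEAD = (
    "HTTP/1.1 200 OK\r\n"
    "Server: Microsoft-IIS/5.0\r\n"
    "Date: Thu, 04 Aug 2005 06:41:59 GMT\r\n"
    "Connection: close\r\n"
    "Content-Type: text/html\r\n"
    "Cache-control: private\r\n"
    "Content-Length: 651\r\n"
    "\r\n"
)


def parse_response_head(head):
    """Split a raw HTTP response head into status parts and a header mapping."""
    # The first line is the status line; everything after it is header lines.
    status_line, _, header_block = head.partition("\r\n")
    version, status_code, reason = status_line.split(" ", 2)
    # HTTP headers share the "Name: value" layout of mail headers, so the
    # standard library's email parser can read them (lookups ignore case).
    headers = Parser().parsestr(header_block, headersonly=True)
    return version, int(status_code), reason, headers


version, status_code, reason, headers = parse_response_head(RAW_RESPONSE_HEAD)
print(version, status_code, reason)
print("Server:", headers["Server"])
print("Content-Type:", headers["Content-Type"])
```

    The Server header is what tells you the daemon is IIS rather than
    Apache.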

    Anyway, as you can see, a URI contains the URL reference, which simply
    means the URL is the relative location and the URI is the absolute
    location. Now, we all know that if a page takes an array of $_GET
    query strings, the page may and will differ, depending on the script...
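
    To see which parts an address is made of, here is a small sketch in
    Python using the standard urllib.parse module. The first two addresses
    come from this thread; the third one, with its index.php script and
    page/id parameters, is a hypothetical example of a query string.

```python
from urllib.parse import parse_qs, urlsplit


def describe(address):
    """Break an address into scheme, host, path and query parameters."""
    parts = urlsplit(address)
    return {
        "scheme": parts.scheme,
        "host": parts.netloc,
        "path": parts.path,
        # parse_qs maps each parameter name to a list of its values.
        "query": parse_qs(parts.query),
    }


for address in (
    "http://www.google.ca",
    "http://www.irc2k.com/irc2k/index3.php",
    "http://www.mywebsite.com/index.php?page=news&id=3",  # hypothetical
):
    print(describe(address))
```

    The query part is what a PHP script would see in $_GET, which is why
    the same path can produce different pages.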

    Now, time for some concepts!!

    Always think two-dimensionally!

    You have a client side and you have a server side. The client side can
    only reach the server through protocols, and in my example I am talking
    about the Hypertext Transfer Protocol, also known as http://. Now, HTTP
    is a standard protocol, but its handling does differ slightly between
    different daemons like Apache and IIS. With Apache's default settings
    and configuration, you have a web directory, which your users make
    relative/absolute URL/URI requests against. If an absolute URI is
    requested, the Apache server looks for the file and attempts to
    transfer the response headers and the content of that file to the
    user. Of course, how the client handles the content depends on the
    file extension and the Content-Type.

    Anyway, let's look at the logical part of it.

    With Linux, Apache (Server) and Internet Explorer (Client/User-Agent)

    Say all your content is stored in the directory below on the server.

    /var/www/htdocs

    If the file index.html exists in that directory, and of course the
    configuration has not been messed around with, then as soon as the
    client requests http://www.mywebsite.com that file will be sent in
    place of the URL, since a URI was not requested.
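
    The mapping described above can be sketched in a few lines of Python.
    This is a simplification under stated assumptions, not how Apache is
    actually implemented: the function name map_request_to_file is made
    up, and the index file name is hard-coded to index.html.

```python
import posixpath

DOCUMENT_ROOT = "/var/www/htdocs"
DIRECTORY_INDEX = "index.html"  # assumed default index file name


def map_request_to_file(url_path):
    """Translate the path part of a request into a file under the web directory."""
    # A request for the bare site has an empty path, which means "/".
    if url_path == "":
        url_path = "/"
    # A request for a directory is answered with that directory's index file.
    if url_path.endswith("/"):
        url_path += DIRECTORY_INDEX
    return posixpath.join(DOCUMENT_ROOT, url_path.lstrip("/"))


print(map_request_to_file(""))             # request for http://www.mywebsite.com
print(map_request_to_file("/index.html"))  # the same file, requested explicitly
```

    Both requests end up at the same file on disk, which is the point of
    the index file.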

    Now, say that directory was empty except for one file, and I'll call it
    passwords because personally I am a hacker!

    As soon as the URL http://www.mywebsite.com is requested, the server
    will not be able to replace the URL with any specific content. Instead
    it'll use its index system, which simply creates a dynamic page that
    lists all the files in htdocs. Of course, that passwords file was
    uploaded by the user with full permissions for the HTTP daemon to read
    it and send it upon request.

    The client then asks for http://www.mywebsite.com/passwords and the
    file is then sent as the next response and content. The hacker may want
    to have some fun, access the online services with those
    usernames/passwords, and mess around. Of course, using a proxy so
    he/she does not get exposed.
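
    Here is a rough Python sketch of that index behaviour: serve the index
    file if there is one, otherwise build a page listing whatever is in the
    directory. The function name respond_to_directory_request is made up,
    and the HTML it produces is only an approximation of what a real server
    generates.

```python
import html
import os
import tempfile


def respond_to_directory_request(directory, index_name="index.html"):
    """Return the index file's content, or a generated listing if it is missing."""
    index_path = os.path.join(directory, index_name)
    if os.path.isfile(index_path):
        with open(index_path, encoding="utf-8") as handle:
            return handle.read()
    # No index file: list every entry, which exposes all the file names.
    items = "\n".join(
        '<li><a href="{0}">{0}</a></li>'.format(html.escape(name))
        for name in sorted(os.listdir(directory))
    )
    return "<html><body><ul>\n" + items + "\n</ul></body></html>"


# Demonstration in a throwaway directory that holds only a "passwords" file.
with tempfile.TemporaryDirectory() as web_root:
    with open(os.path.join(web_root, "passwords"), "w") as handle:
        handle.write("not a real secret\n")
    listing = respond_to_directory_request(web_root)
    print(listing)
```

    In Apache itself, generated listings are controlled by the Indexes
    option, so turning that option off is the usual way to stop file names
    from leaking like this.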

    Back to the server now. When you use a server-side language like PHP,
    you're able to use the system resources and then hand the result back
    to Apache, replacing its headers with your own headers and content,
    thus making your site dynamic.

    The examples below are absolute/relative locations for both Windows and
    Linux operating systems.

    C:\Windows\System32

    Is an absolute location, spelling out the drive letter and its
    subdirectories. Don't get confused: URL/URI only applies to web
    addresses.

    /var/www/htdocs

    Is an absolute location, spelling out the root directory and its
    subdirectories. Linux/Unix do not use drive letters; instead each drive
    is mounted at the root directory or at whichever subdirectory you wish.
    It is possible to have a whole Linux distro on a single
    partition/drive, though, unlike some Unix setups, which traditionally
    use multiple partitions and possibly drives.

    ./directory/file.ext

    Is a relative location. It is usually used in the Linux terminal, but
    it is also accepted by PHP on Windows platforms. It simply means: look
    for the file starting from the directory the script is located in,
    treating that directory as the root of the path.

    Again, to explain this the best I can, I'll spell it out...

    /var/www/htdocs + ./directory/file.ext

    /var/www/htdocs/directory/file.ext

    That is to say, the directory and file must exist, otherwise the script
    will fail.
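
    The same checks can be tried out in Python, as a small sketch. The
    ntpath and posixpath modules apply Windows and Unix path rules
    respectively on any operating system; the helper name resolve is made
    up for this example.

```python
import ntpath
import posixpath

# Absolute locations name their starting point: a drive letter or the root.
print(ntpath.isabs(r"C:\Windows\System32"))
print(posixpath.isabs("/var/www/htdocs"))
# A relative location only makes sense together with a base directory.
print(posixpath.isabs("./directory/file.ext"))


def resolve(base_directory, relative_location):
    """Combine a base directory with a relative location and tidy the result."""
    return posixpath.normpath(posixpath.join(base_directory, relative_location))


resolved = resolve("/var/www/htdocs", "./directory/file.ext")
print(resolved)  # /var/www/htdocs/directory/file.ext
```

    Note that resolving a path says nothing about whether the file exists;
    that still has to be checked separately.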

    I want to point something out now: all programmers are lazy!

    I have personally seen users use this code...

    <?php

    include($_GET['page']);

    ?>

    That is to say, if the page parameter in the URI is set to
    http://google.ca/index.html, then (provided PHP is configured to allow
    remote includes) PHP will fetch that file and parse it for any PHP
    code, and will also include the unparsed content such as HTML. In other
    words, Google gets dynamically included in your script.

    And what if that page contained raw PHP code, say in file.ext? That
    code will still get executed, since PHP does not look at the file
    extension to decide whether to parse an included file. Suppose file.ext
    contains the following code:

    <?php

    foreach (glob('*') as $file) {
        unlink($file);
    }

    ?>

    That code builds an array of every entry in the current directory and
    then unlinks each file in turn. Because glob('*') builds the array
    once, before the loop starts, the loop always ends; files that get
    added to the directory afterwards are simply not deleted.
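
    A common way to avoid the include($_GET['page']) problem is to never
    use the request value as a file name, and instead look it up in a fixed
    list of pages. Below is a minimal sketch of that idea in Python; the
    page names and file paths in ALLOWED_PAGES are hypothetical, and the
    same approach carries over to a PHP array.

```python
# Hypothetical mapping from public page names to files the site really owns.
ALLOWED_PAGES = {
    "home": "pages/home.html",
    "news": "pages/news.html",
}
DEFAULT_PAGE = "home"


def select_page_file(requested_page):
    """Return the file for a known page name, or the default page otherwise."""
    # The request value is only ever used as a dictionary key, so a URL or
    # a path such as ../../etc/passwd can never reach the file system.
    return ALLOWED_PAGES.get(requested_page, ALLOWED_PAGES[DEFAULT_PAGE])


print(select_page_file("news"))
print(select_page_file("http://google.ca/index.html"))
```

    Anything that is not in the list falls back to the default page, so a
    remote file like file.ext is never fetched or executed.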


    Now for the client side. Users can only request files that have the
    proper permissions and are within the web directory, or within a user
    directory when Apache's UserDir module is in use.

    PHP, on the other hand, can do almost anything with any file that the
    user/group it runs as is permitted to access, whether or not that file
    is inside the web directory.
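
    The "within the web directory" rule can be sketched as a path check in
    Python. This is an illustration only: the name is_inside_web_root is
    made up, and the check works purely on the text of the path, so it
    ignores symbolic links and file permissions.

```python
import posixpath

WEB_ROOT = "/var/www/htdocs"


def is_inside_web_root(url_path, web_root=WEB_ROOT):
    """Report whether a requested path still points inside the web directory."""
    # Join the request onto the web directory, then collapse any "." and ".."
    # segments so that attempts to climb out of the directory become visible.
    candidate = posixpath.normpath(posixpath.join(web_root, url_path.lstrip("/")))
    return candidate == web_root or candidate.startswith(web_root + "/")


print(is_inside_web_root("/passwords"))
print(is_inside_web_root("/../../../etc/passwd"))
```

    A script that opens files on behalf of a client can apply the same kind
    of check itself, since PHP is not limited to the web directory.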

    Anyway, post your comments. It's 1am now and I'm going to force myself
    to play games for the rest of the morning...
  • John Dunlop

    #2
    URI vs. URL

    daemon wrote:
    > This is a short list of URL's
    >
    > http://www.google.ca
    > http://www.msn.com
    > http://www.irc2k.com
    >
    > This is a short list of URI's
    >
    > http://www.google.ca/index.html
    > http://www.irc2k.com/irc2k/index3.php

    Both are lists of URIs and URLs, because by definition a URL
    is a URI. The term 'URI', as it is widely employed in
    Internet specifications today, comprises the two terms 'URL'
    and 'URN' (Locator and Name); that is, 'URI' is the term given
    to the superset of URLs and URNs. All three are discussed
    formally in RFC3986: 1.1.3 <http://www.ietf.org/rfc/rfc3986>
    and in greater detail in <http://www.ietf.org/rfc/rfc3305>.

    The presence or not of an explicit path doesn't determine
    whether a URI is a URL. The determining factor is whether the
    URI specifies the 'location' of the resource. (Some URIs
    don't, and aren't URLs; e.g., <URN:ISBN:0-521-53033-4>*.) By
    that token, all HTTP URIs are URLs, since they specify the
    'primary access mechanism', namely HTTP.


    * This URN names /The Cambridge Encyclopedia of the English
    Language/ by David Crystal. The URN namespace ISBN is
    documented in RFC3187 <http://www.ietf.org/rfc/rfc3187>.
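
    As a rough illustration of the point above, the Python sketch below
    labels a URI by its scheme. It is a heuristic under stated assumptions,
    not a rule taken from the RFCs: the set LOCATOR_SCHEMES is a small
    hand-picked list and the function name classify is made up.

```python
from urllib.parse import urlsplit

# Assumption: a few well-known schemes that name an access mechanism.
LOCATOR_SCHEMES = {"http", "https", "ftp"}


def classify(uri):
    """Label a URI as a URL or a URN where its scheme makes that clear."""
    scheme = urlsplit(uri).scheme  # urlsplit lower-cases the scheme
    if scheme == "urn":
        return "URI and URN"
    if scheme in LOCATOR_SCHEMES:
        return "URI and URL"
    return "URI"


for uri in (
    "http://www.google.ca",
    "http://www.google.ca/index.html",
    "URN:ISBN:0-521-53033-4",
):
    print(uri, "->", classify(uri))
```

    Both HTTP addresses get the same label, with or without an explicit
    path.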

    --
    Jock
