How can I grab my tables and display them on another page using regex?

  • luke
    New Member
    • May 2012
    • 14


    For some reason

    page1.html // gets its contents from page2.html

    [code=php]

    function get_tag($htmlelement, $attr, $value, $html)
    {
        $attr = preg_quote($attr);
        $value = preg_quote($value);

        if ($attr != '' && $value != '')
        {
            $tag_regex = '/<'.$htmlelement.'[^>]*'.$attr.'="'.$value.'" width="100%">(.*?)<\\/'.$htmlelement.'>/si';

            $matchCount = preg_match($tag_regex, $html, $matches);

            if ($matchCount > 0)
            {
                echo("$matchCount matches found.\n");
            }
            else
            {
                echo("no records");
            }
        }
    }
    $htmlcontent = file_get_contents("http://page2.html/");
    $extract = get_tag("table", "class", "tablemenu", $htmlcontent);

    echo $extract;



    [/code]

    page2.html

    I would like to grab all of the tables from the page and display them on another one.

    [code=html]

    // more code

    <table width="100%" class="tablemenu">
    <tbody>
    <tr>
    <td>

    </td>
    </tr>
    </tbody>
    </table>

    <table width="100%" class="tablemenu">
    <tbody>
    <tr>
    <td>
    // some data
    </td>
    </tr>
    </tbody>
    </table>

    <table width="100%" class="tablemenu">
    <tbody>
    <tr>
    <td>
    // some data
    </td>
    </tr>
    </tbody>
    </table>

    <table width="100%" class="tablemenu">
    <tbody>
    <tr>
    <td>
    // some data
    </td>
    </tr>
    </tbody>
    </table>

    // more code

    [/code]

    I have kept trying but had no luck :(
  • Atli
    Recognized Expert Expert
    • Nov 2006
    • 5062

    #2
    Regular expressions are not a good way to find things in an HTML document. HTML syntax is too irregular for them to be reliable. You need an actual parser if you want this to work properly.

    PHP has built-in libraries that can be used to parse HTML documents. Most of them are focused on XML, but many can also handle HTML. The DOM extension, for example, has a DOMDocument class with a loadHTMLFile method. With that, you can traverse the HTML structure in much the same way you would in JavaScript.

    For example:
    Code:
    <?php
    
    $filePath = "/path/to/HTML/file.html";
    
    // Load the HTML file.
    $dom = new DOMDocument();
    $dom->loadHTMLFile($filePath);
    
    // Find all the tables.
    $tables = $dom->getElementsByTagName("table");
    
    // Go through the table list and look for tables
    // with the class: "tablemenu"
    if ($tables && $tables->length > 0) {
        for ($i = 0; $i < $tables->length; ++$i) {
            $item = $tables->item($i);
            $class = $item->attributes->getNamedItem("class")->nodeValue;
            if ($class == "tablemenu") {
                // The $item here would be one of the
                // tables we are looking for!
            }
        }
    }
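
    To actually display the matched tables on another page, each node has to be turned back into markup. The following is only a sketch under a few assumptions: the helper name extract_tables_by_class is made up for illustration, the HTML comes from an inline sample string rather than a real page, and $class is assumed to be a plain class name without quotes. It uses DOMXPath to select the tables and DOMDocument::saveHTML() with a node argument to serialize each one.

```php
<?php
// Hypothetical helper (not from the thread): returns the markup of every
// <table> whose class list contains $class.
// Assumes $class is a plain class name with no quote characters.
function extract_tables_by_class($html, $class)
{
    $dom = new DOMDocument();

    // Collect parser warnings internally instead of printing them;
    // real-world HTML is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Match "tablemenu" as a whole word inside the class attribute.
    $xpath = new DOMXPath($dom);
    $query = sprintf(
        "//table[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );

    $tables = array();
    foreach ($xpath->query($query) as $table) {
        // saveHTML($node) serializes just that node, tags included.
        $tables[] = $dom->saveHTML($table);
    }
    return $tables;
}

// Sample input standing in for page2.html (an assumption for illustration).
$html = '<html><body>'
      . '<table><tr><td>no class</td></tr></table>'
      . '<table width="100%" class="tablemenu"><tr><td>first</td></tr></table>'
      . '<table width="100%" class="tablemenu"><tr><td>second</td></tr></table>'
      . '</body></html>';

foreach (extract_tables_by_class($html, "tablemenu") as $tableHtml) {
    echo $tableHtml, "\n";
}
```

    With a real page, the sample string could be replaced by the result of file_get_contents() on the page's URL. The XPath route avoids touching the attributes of each table by hand, so tables without a class attribute are simply skipped.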


    • luke
      New Member
      • May 2012
      • 14

      #3
      I'm getting an error: "Trying to get property of non-object in C:\wamp\www"

      It refers to this line:

      $class = $item->attributes->getNamedItem("class")->nodeValue;

      but my table does print out.

      [code=php]
      $filePath = "http://msn.net/";

      // Load the HTML file.
      $dom = new DOMDocument();
      @$dom->loadHTML(file_get_contents($filePath));

      // Find all the tables.
      $tables = $dom->getElementsByTagName("table");

      if ($tables && $tables->length > 0) {
          for ($i = 0; $i < $tables->length; ++$i) {
              $item = $tables->item($i);
              $class = $item->attributes->getNamedItem("class")->nodeValue;

              if ($class == "tablemenu") {
                  echo $item->nodeValue;
                  echo "<br />";
              }
          }
      }
      [/code]
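
      One plausible cause, assuming the fetched page also contains tables that have no class attribute: for those tables getNamedItem("class") returns null, and reading ->nodeValue from null raises the "non-object" notice, while the tables that do have the class still print. A minimal sketch of a guard, using an inline sample string in place of the real page, is to call getAttribute() on the element, which returns an empty string when the attribute is missing:

```php
<?php
// Sample markup (an assumption for illustration): one table without a
// class attribute and one with class="tablemenu".
$html = '<table><tr><td>layout</td></tr></table>'
      . '<table class="tablemenu"><tr><td>menu</td></tr></table>';

$dom = new DOMDocument();
@$dom->loadHTML($html);

$found = array();
foreach ($dom->getElementsByTagName("table") as $item) {
    // getAttribute() returns "" when the attribute is absent, so there is
    // no null object to dereference.
    $class = $item->getAttribute("class");

    if ($class == "tablemenu") {
        // nodeValue is the text content only, without the table markup.
        $found[] = trim($item->nodeValue);
    }
}

print_r($found);
```

      Note that nodeValue gives only the text inside the table; if the markup itself is wanted, $dom->saveHTML($item) returns the table with its tags.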
