Looking for a search engine that search a mysql database

  • Martien van Wanrooij

    Looking for a search engine that search a mysql database

    I have been using phpdig on some websites, but I have now stored a lot of
    larger texts in a MySQL database. In the phpdig search engine, when you
    entered a search word, the page where the word was found was displayed with
    about 2 lines before and 2 lines after the search word itself. Let us say
    you look for "peanut butter" and the word is found in a larger text about
    sandwiches: even when it is on the 40th line of the text you would get
    something like
    "www.mysite.com/sandwich.php
    ....In Holland peanut butter is a popular spread on sandwiches "
    A query like "SELECT title, maintext FROM MYTEXTS WHERE maintext LIKE
    '%$searchedword%'" will do most of the job, and I can create a query that
    displays only the first 200 characters of maintext, but then the result
    will be an introductory text about sandwiches and our peanut butter lover
    may skip this page :)

    But I am puzzled about a command (either in PHP or MySQL) that jumps to
    $searchedword in the maintext field and returns a couple of lines around it.
    Any ideas? If there is an open source PHP module that could do this I will
    be happy too, and maybe I am just overlooking a relatively easy function that
    will do the job. Googling for "mysql php search engines" did not give too
    many hints.

    Thanks for any help.

    Thanks


  • Drakazz

    #2
    Re: Looking for a search engine that search a mysql database

    Read http://dev.mysql.com/doc/refman/5.0/...xt-search.html
    Full-text search is what is mostly used. About the 200 characters I am not
    sure.
    As for the highlighting etc., you could look at MediaWiki's source
    ( http://www.mediawiki.org/ ).

    Thank You.


    • Martien van Wanrooij

      #3
      Re: Looking for a search engine that search a mysql database


      "Drakazz" <vykintas.narmontas@googlemail.com> wrote in message
      news:1147212057.295508.50950@g10g2000cwb.googlegroups.com...
      > Read in http://dev.mysql.com/doc/refman/5.0/...xt-search.html
      > Full text search is mostly used.
      Thank you Drakazz. As I said, I thought maybe I was just entering the wrong
      keywords in Google. I didn't think about keywords like "fulltext search",
      but on reflection it makes sense to me. I think this article will resolve
      a lot of the problem.

      Martien



      • Rik

        #4
        Re: Looking for a search engine that search a mysql database

        Drakazz wrote:
        > Full text search is mostly used. About the 200 characters I am not
        > sure.

        No idea, but two methods come to mind.
        Assuming $text is the text returned from the database and $string is the
        search word:

        Normal functions:

        $occurrence = stripos($text, $string); // false when $string is not found
        $start = ($occurrence - 100 < 0) ? 0 : $occurrence - 100;
        $display = substr($text, $start, 200 + strlen($string));

        The advantage is that it's quick; the disadvantages are that it only finds
        the first occurrence and that it will cut words in half.
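
A minimal sketch of the first method wrapped in a function, assuming you want the "not found" case handled explicitly. The function name and the $radius parameter are made up for illustration and do not come from the post.

```php
<?php
// Hypothetical helper (name and $radius are assumptions, not from the thread).
// Returns up to $radius characters on each side of the first
// case-insensitive occurrence of $string, or null when it does not occur.
function excerpt_first_hit($text, $string, $radius = 100)
{
    $pos = stripos($text, $string);
    if ($pos === false) {
        return null; // search word not present in the text
    }
    $start  = max(0, $pos - $radius);
    // characters before the hit + the hit itself + characters after it
    $length = ($pos - $start) + strlen($string) + $radius;
    return substr($text, $start, $length);
}

$text = str_repeat('bread ', 30)
      . 'In Holland peanut butter is a popular spread'
      . str_repeat(' jam', 30);
echo excerpt_first_hit($text, 'Peanut Butter', 20), "\n";
```

Like the three lines above, this still cuts words in half and only looks at the first occurrence.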

        A probably more versatile method is to use regular expressions:

        $chars = 100; // the desired number of characters before and after
        $allowword = 20; // extra characters allowed to find a word boundary

        $allow = $chars + $allowword;
        $else = $chars - 1; // pff, naming variables is a drag

        $search = preg_quote($string, '/'); // escape all characters that could have special meaning

        preg_match_all('/(^(?:.){0,'.$else.'}|\b(?:.){'.$chars.','.$allow.'})('.$search.')((?:.){'.$chars.','.$allow.'}\b|(?:.){0,'.$else.'}$)/si',
        $text, $matches, PREG_SET_ORDER);

        Now you have an array $matches that contains the search string and the
        surrounding $chars characters. The expression tries to keep words whole,
        with a maximum of extra characters given by $allowword. It's no problem
        when there aren't that many characters in front of or behind the search
        string: in that case the match simply runs from the beginning or until
        the end, respectively.

        $matches is now an array containing:
        $matches[index_of_match][0] = The entire matched text.
        $matches[index_of_match][1] = The preceding text.
        $matches[index_of_match][2] = The search string.
        $matches[index_of_match][3] = The following text.
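
A minimal sketch of the regular-expression method as a reusable function, assuming the same variables as in the post. The function name is hypothetical, and the small numbers in the usage lines are only there to keep the example short.

```php
<?php
// Hypothetical wrapper (the name is an assumption) around the expression
// from the post. Returns the PREG_SET_ORDER array described above.
function find_excerpts($text, $string, $chars = 100, $allowword = 20)
{
    $allow  = $chars + $allowword; // upper bound while looking for a word boundary
    $else   = $chars - 1;          // used when the text starts or ends nearby
    $search = preg_quote($string, '/');

    $pattern = '/(^.{0,' . $else . '}|\b.{' . $chars . ',' . $allow . '})'
             . '(' . $search . ')'
             . '(.{' . $chars . ',' . $allow . '}\b|.{0,' . $else . '}$)/si';

    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);
    return $matches;
}

$text = 'In Holland peanut butter is a popular spread on sandwiches';
foreach (find_excerpts($text, 'peanut butter', 10, 5) as $match) {
    // [1] = text before, [2] = search string, [3] = text after
    echo $match[1], '[', $match[2], ']', $match[3], "\n";
}
```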

        Matches can be displayed like:
        foreach ($matches as $match) {
            print $match[0];
        }

        But maybe you want to highlight your search string, no problem:

        foreach ($matches as $match) {
            print $match[1].'<span class="highlight">'.$match[2].'</span>'.$match[3];
        }
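
If the stored texts can contain characters such as < or &, it is probably wise to escape them before printing. A minimal sketch, assuming $match has the layout described above; the function name is made up for illustration.

```php
<?php
// Hypothetical helper (name is an assumption). $match is one entry of
// $matches: [1] = text before, [2] = search string, [3] = text after.
function render_excerpt(array $match)
{
    // Escape each part separately so the <span> itself stays real markup.
    return htmlspecialchars($match[1])
        . '<span class="highlight">' . htmlspecialchars($match[2]) . '</span>'
        . htmlspecialchars($match[3]);
}

print render_excerpt(array('x<y z', 'x<', 'y', ' z')) . "\n";
```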

        When looking for several words, you could even build the search string like
        this:

        $searcharray = array('searchstring', 'some other word', 'yet another');
        // quote each word, passing '/' so the regex delimiter is escaped too
        $search = implode('|', array_map(function ($w) { return preg_quote($w, '/'); }, $searcharray));

        And just apply the same regex. Note that this will give back a match for
        each word separately. How to prevent those "double" matches is a whole other
        ballgame. Having come this far, I realize that even searching for one term
        could give you doubles.

        Highlighting the other search terms can't be done using just the matches
        array. While keeping the double entries, every search term can be
        highlighted like:

        foreach ($matches as $match) {
            print preg_replace('/('.$search.')/si', '<span class="highlight">\1</span>', $match[0]);
        }

        Doubles could be prevented by using PREG_OFFSET_CAPTURE in the following
        regex:

        $searcharray = array('searchstring', 'some other string', 'yet another');
        $search = implode('|', array_map(function ($w) { return preg_quote($w, '/'); }, $searcharray));
        preg_match_all('/'.$search.'/si', $text, $matches, PREG_OFFSET_CAPTURE);

        Then loop through $matches[0], gathering the surrounding text with
        preg_match on substrings (which makes it a lot quicker), and check whether
        or not the offset of the following match is "within reach".

        Create a substring of the text around search terms that are close to each
        other, with the maximum allowed characters + 1 on either side.
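
A minimal sketch of the "within reach" check, assuming the hits come from preg_match_all with PREG_OFFSET_CAPTURE. The function name and the $reach parameter are hypothetical; what counts as "within reach" is up to you (for instance twice the context length).

```php
<?php
// Hypothetical helper (name and $reach are assumptions). $hits is
// $matches[0] from preg_match_all(..., PREG_OFFSET_CAPTURE): a list of
// array(matched_text, byte_offset) pairs in ascending offset order.
// Hits separated by at most $reach characters end up in the same group.
function group_hits(array $hits, $reach)
{
    $groups  = array();
    $lastEnd = null;
    foreach ($hits as $hit) {
        list($word, $offset) = $hit;
        if ($lastEnd !== null && $offset - $lastEnd <= $reach) {
            $groups[count($groups) - 1][] = $hit; // close enough: same excerpt
        } else {
            $groups[] = array($hit);              // too far away: new excerpt
        }
        $lastEnd = $offset + strlen($word);
    }
    return $groups;
}

$text = 'peanut butter and jam ' . str_repeat('x ', 200) . 'more jam';
preg_match_all('/peanut butter|jam/si', $text, $matches, PREG_OFFSET_CAPTURE);
$groups = group_hits($matches[0], 200);
echo count($groups), " excerpt(s) needed\n";
```

Each group then yields one substring, running from the first hit's offset to the end of the last hit, plus the context on either side.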

        // Placeholders, to be filled in from the offsets gathered above:
        // $before = exact number of preceding chars, $after = exact number of
        // following chars, $span = exact length from the first offset to the
        // last offset plus the string length of the last search term.
        preg_match_all('/(\b.{'.$chars.','.$allow.'}|^.{'.$before.'}).{'.$span.'}(.{'.$chars.','.$allow.'}\b|.{'.$after.'}$)/si',
        $substring, $combinations, PREG_SET_ORDER);

        foreach ($combinations as $final) {
            print preg_replace('/('.$search.')/si', '<span class="highlight">\1</span>', $final[0]);
        }


        Grtz,

        --
        Rik Wasmus



        • Martien van Wanrooij

          #5
          Re: Looking for a search engine that search a mysql database


          "Rik" <luiheidsgoeroe@hotmail.com> wrote in message
          news:e3rm7c$kvp$1@netlx020.civ.utwente.nl...
          [detailed explanation]
          Thank you Rik, for a "luiheidsgoeroe" you did a lot of work to resolve my
          problem :) This is the solution I was looking for.

          Martien



          • Rik

            #6
            Re: Looking for a search engine that search a mysql database

            Martien van Wanrooij wrote:
            > Thank you Rik, for a "luiheidsgoeroe" you did a lot of work to
            > resolve my problem :) This is the solution I was looking for.


            No problem.
            Not living up to the name indeed... it's about time to rectify that...

            Grtz,
            --
            Rik Wasmus

