Parse variable names from string

  • Victor Lagerkvist

    Parse variable names from string

    Hello, I need to parse variable names from a string and store them for
    later use. Here's my first attempt (I don't have any rules for valid names
    yet), but I have a feeling that it's unnecessarily complex. Any input would
    be greatly appreciated.

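    #include <stdio.h>
    #include <stdlib.h>

    int get_ops(const char *sen, char ***atom, char limit);

    int main(void)
    {
        const char *test = "a, b, c, d"; /* The real input is stripped of spaces */
        char **atom;
        get_ops(test, &atom, ',');
        return 0;
    }

    /* Split sen at each limit character and store the pieces in *atom.
       Returns 0 on success, -2 if an allocation fails. */
    int get_ops(const char *sen, char ***atom, char limit)
    {
        int i, j, k;
        char **tmp1, *tmp2;
        *atom = malloc(sizeof (char *));
        if (*atom == NULL)
            return -2;
        **atom = malloc(1);
        if (**atom == NULL)
            return -2;
        for (i = j = k = 0; sen[i] != '\0'; ++i) {
            if (sen[i] != limit) {
                /* Grow the current name by one character, plus room for '\0'. */
                tmp2 = realloc((*atom)[k], j + 2);
                if (tmp2 == NULL)
                    return -2;
                (*atom)[k] = tmp2;
                (*atom)[k][j++] = sen[i];
            }
            else {
                /* Terminate the current name and start a new, empty one. */
                (*atom)[k++][j] = '\0';
                tmp1 = realloc(*atom, (k + 1) * sizeof (char *));
                if (tmp1 == NULL)
                    return -2;
                *atom = tmp1;
                (*atom)[k] = malloc(1);
                if ((*atom)[k] == NULL)
                    return -2;
                j = 0;
            }
        }
        (*atom)[k][j] = '\0'; /* The last name was never terminated before. */
        return 0;
    }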

  • user923005

    #2
    Re: Parse variable names from string

    On Jul 3, 12:53 pm, Victor Lagerkvist <plumsa...@gmail.com> wrote:
    > Hello, I need to parse variable names from a string and store them for
    > later use. Here's my first attempt (I don't have any rules for valid names
    > yet), but I have a feeling that it's unnecessarily complex. Any input would
    > be greatly appreciated.
    > <snip>

    I guess that it is not nearly complex enough.
    If you are gathering variable names from {presumably} C source code,
    it will have to be fully grammar aware.
    Normally, parsers put variable names into a hash table.
    I suggest that you get an existing C parser, and just read the
    variable list it creates when it scans a source file.
    Here is a place to find a C grammar:
    http://www.devincook.com/goldparser/grammars/index.htm
    It works with the Gold Parser.
    There are C grammars all over the place, so I am sure you can find one
    for YACC or Antlr or whatever.



    • Victor Lagerkvist

      #3
      Re: Parse variable names from string

      user923005 wrote:
      > I guess that it is not nearly complex enough.
      > If you are gathering variable names from {presumably} C source code,
      > it will have to be fully grammar aware.
      > Normally, parsers put variable names into a hash table.
      > I suggest that you get an existing C parser, and just read the
      > variable list it creates when it scans a source file.
      > Here is a place to find a C grammar:
      > http://www.devincook.com/goldparser/grammars/index.htm
      > It works with the Gold Parser.
      > There are C grammars all over the place, so I am sure you can find one
      > for YACC or Antlr or whatever.

      Actually, the only functionality I truly need is the names of the variables
      (there's only one "type") and the number of them - everything else is a
      bonus! They are given by the user from standard input, line by line, such
      as:
      << build a, b, c

      And that's all there is (more or less any kind of name should be allowed),
      and for some reason I usually become a sad panda when the code "runs away".
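
For input of this shape, one possible approach is to split the line in place with the standard strtok function. The following is only a sketch under stated assumptions: the helper name split_names, the delimiter set (commas and whitespace), and the idea that the first word ("build") is a command rather than a variable name are all hypothetical and not taken from the post above.

```c
#include <string.h>

/* Hypothetical helper: splits line in place at commas and whitespace.
   Pointers to the pieces are stored in names[]; at most max are kept.
   Returns the number of pieces found. The caller's buffer is modified,
   so it must be writable (not a string literal). */
int split_names(char *line, char *names[], int max)
{
    const char *delims = ", \t\n";
    char *tok;
    int n = 0;

    for (tok = strtok(line, delims); tok != NULL && n < max;
         tok = strtok(NULL, delims))
        names[n++] = tok;

    return n;
}
```

Called on a writable copy of "build a, b, c", this should give four pieces ("build", "a", "b", "c"), so the number of variables is the return value minus one. Because strtok overwrites the delimiters, the stored pointers refer into the caller's buffer and stay valid only as long as that buffer does.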


      • user923005

        #4
        Re: Parse variable names from string

        On Jul 3, 2:47 pm, Victor Lagerkvist <plumsa...@gmail.com> wrote:
        > Actually, the only functionality I truly need is the names of the variables
        > (there's only one "type") and the number of them - everything else is a
        > bonus! They are given by the user from standard input, line by line, such
        > as:
        > << build a, b, c
        >
        > And that's all there is (more or less any kind of name should be allowed),
        > and for some reason I usually become a sad panda when the code "runs away".

        In that case, why not just store them in a hash table to ensure you do
        not have duplicates?
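
A minimal sketch of that idea, assuming a small chained hash table with a fixed number of buckets. The names name_set, name_set_add, name_set_free and the bucket count are hypothetical, chosen only for illustration; the hash is the common "times 33" string hash, and any reasonable string hash would do.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64 /* hypothetical table size */

struct name_node {
    char *name;
    struct name_node *next;
};

struct name_set {
    struct name_node *bucket[NBUCKETS];
    size_t count; /* number of distinct names stored */
};

/* Simple "times 33" string hash, reduced to a bucket index. */
unsigned long name_hash(const char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Returns 1 if name was added, 0 if it was already present,
   -1 if an allocation failed. The set keeps its own copy of name. */
int name_set_add(struct name_set *set, const char *name)
{
    unsigned long i = name_hash(name);
    struct name_node *n;
    size_t len;

    for (n = set->bucket[i]; n != NULL; n = n->next)
        if (strcmp(n->name, name) == 0)
            return 0; /* duplicate */

    n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    len = strlen(name) + 1;
    n->name = malloc(len);
    if (n->name == NULL) {
        free(n);
        return -1;
    }
    memcpy(n->name, name, len);
    n->next = set->bucket[i];
    set->bucket[i] = n;
    set->count++;
    return 1;
}

/* Releases every stored name and leaves the set empty. */
void name_set_free(struct name_set *set)
{
    size_t i;
    for (i = 0; i < NBUCKETS; ++i) {
        struct name_node *n = set->bucket[i];
        while (n != NULL) {
            struct name_node *next = n->next;
            free(n->name);
            free(n);
            n = next;
        }
        set->bucket[i] = NULL;
    }
    set->count = 0;
}
```

A set declared as struct name_set set = {0}; starts out empty. Adding each parsed name with name_set_add then rejects repeats, and set.count holds the number of distinct variables.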


        • Victor Lagerkvist

          #5
          Re: Parse variable names from string

          user923005 wrote:
          >>> Here is a place to find a C grammar:
          >>> http://www.devincook.com/goldparser/grammars/index.htm
          >>> It works with the Gold Parser.
          <snip>
          > In that case, why not just store them in a hash table to ensure you do
          > not have duplicates?

          Ah, that would no doubt be sleek in this case. I thank thee for thine input.
