Splitting text with regular expressions

  • David Jackson

    Splitting text with regular expressions

    Hello,

    The company I'm working for has taken over a smaller company with a fairly
    large customer base. We want to send an email to that customer base
    informing them of the takeover but the mailing list is not held in a
    database. In fact we've been given it as a Word document.

    The individual email addresses are in the format: "Name <address>" e.g.

    Bill Gates <billg@microsoft.com>;

    and I've been tasked with the job of splitting the data into its constituent
    parts so that we can store them separately in our database.

    I wondered if regular expressions might be the most efficient way of doing
    this?

    Can anyone help me with some guidance on how I might do this?

    Thanks,

    DJ

  • bob clegg

    #2
    Re: Splitting text with regular expressions

    Hi David,
    If you stay in the Word format you will be using the Word interop DLL to
    move around and capture chunks of text.
    Once you have captured a string it doesn't matter much whether you use
    regex or string functions to break the string into email address and
    name. IMHO.
    I prefer regex but am not an expert and rely heavily on Regex Buddy to
    construct the expressions.
    Given this job is a one-off, and I dare say you are under a bit of
    pressure to get it finished, my gut reaction is to get the data into
    a CSV file if possible (Word table -> Excel -> CSV). Then you can
    read it line by line and use string functions to break it up prior to
    writing to your database.
    hth
    Bob

    On Fri, 26 Sep 2008 10:24:09 +0100, "David Jackson"
    <someone@somewhere.com> wrote:
    >Hello,
    >
    >The company I'm working for has taken over a smaller company with a fairly
    >large customer base. We want to send an email to that customer base
    >informing them of the takeover but the mailing list is not held in a
    >database. In fact we've been given it as a Word document.
    >
    >The individual email addresses are in the format: "Name <address>" e.g.
    >
    >Bill Gates <billg@microsoft.com>;
    >
    >and I've been tasked with the job of splitting the data into its constituent
    >parts so that we can store them separately in our database.
    >
    >I wondered if regular expressions might be the most efficient way of doing
    >this?
    >
    >Can anyone help me with some guidance on how I might do this?
    >
    >Thanks,
    >
    >DJ
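
Bob's point that a captured string can be broken into name and address with a regex could be sketched roughly as below. This is only an illustration, not something posted in the thread: the class name `EntrySplitter`, the method `TrySplit`, and the pattern itself are hypothetical, and the pattern assumes each entry looks like "Name <address>" with an optional trailing semicolon. It makes no attempt to validate addresses properly.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EntrySplitter
{
    // Hypothetical pattern: optional display name, then <address>,
    // then an optional trailing ';'. Named groups keep the parts readable.
    private static readonly Regex EntryPattern = new Regex(
        @"^\s*(?<name>[^<>]*?)\s*<(?<address>[^<>\s]+@[^<>\s]+)>\s*;?\s*$",
        RegexOptions.Compiled);

    // Returns true and fills name/address when the entry matches
    // the "Name <address>" shape; otherwise returns false.
    public static bool TrySplit(string entry, out string name, out string address)
    {
        Match match = EntryPattern.Match(entry ?? string.Empty);
        name = match.Success ? match.Groups["name"].Value : null;
        address = match.Success ? match.Groups["address"].Value : null;
        return match.Success;
    }
}

public static class Program
{
    public static void Main()
    {
        string name, address;
        if (EntrySplitter.TrySplit("Bill Gates <billg@microsoft.com>;", out name, out address))
        {
            Console.WriteLine(name);     // Bill Gates
            Console.WriteLine(address);  // billg@microsoft.com
        }
    }
}
```

Entries that do not fit the assumed shape simply fail to match, so they can be set aside for a manual check instead of being written to the database.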


    • David Jackson

      #3
      Re: Splitting text with regular expressions

      "bob clegg" <cutbob_clegg@remooove.xtra.co.nz> wrote in message
      news:q5cpd4laen5biopl71ua4bd7gio04tf4ud@4ax.com...

      Hi Bob,

      Thanks for the reply.
      >>Can anyone help me with some guidance on how I might do this?

      >If you stay in the Word format you will be using the Word interop DLL to
      >move around and capture chunks of text.

      No intention to stay in the Word format.

      >I prefer regex but am not an expert and rely heavily on Regex Buddy to
      >construct the expressions.

      OK, I'll have a look at that for the future.

      >Given this job is a once only and I dare say you are under a bit of
      >pressure to get this finished my gut reaction is to get the data into
      >a csv file if possible. (Word Table -> Excel -> CSV) then you can
      >read it line by line and use string functions to break it up prior to
      >writing to your database.

      In fact, a colleague suggested a much better alternative for this:

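      // Requires: using System.Net.Mail;
      string strRawEmail = "Bill Gates <billg@microsoft.com>";
      // MailAddress parses the "Name <address>" form itself
      MailAddress objMailAddress = new MailAddress(strRawEmail);
      string strEmailAddress = objMailAddress.Address;      // "billg@microsoft.com"
      string strDisplayName = objMailAddress.DisplayName;   // "Bill Gates"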

      DJ
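
For the whole mailing list, the MailAddress approach above might be extended along these lines. This is a sketch under assumptions the thread does not confirm: that the entries are separated by semicolons once the text is out of Word, and that no display name itself contains a semicolon. The names `MailingListParser` and `Parse` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Mail;

public static class MailingListParser
{
    // Splits the raw text on ';' and parses each non-empty entry with MailAddress.
    // Entries that MailAddress rejects are added to 'rejected' for manual review.
    public static List<MailAddress> Parse(string rawList, List<string> rejected)
    {
        var parsed = new List<MailAddress>();
        string[] pieces = rawList.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string piece in pieces)
        {
            string entry = piece.Trim();
            if (entry.Length == 0)
            {
                continue; // whitespace or line breaks between entries
            }

            try
            {
                parsed.Add(new MailAddress(entry));
            }
            catch (FormatException)
            {
                rejected.Add(entry);
            }
        }

        return parsed;
    }
}

public static class Program
{
    public static void Main()
    {
        var rejected = new List<string>();
        List<MailAddress> customers = MailingListParser.Parse(
            "Bill Gates <billg@microsoft.com>;\nnot-an-address;", rejected);

        foreach (MailAddress customer in customers)
        {
            Console.WriteLine(customer.DisplayName + " | " + customer.Address);
        }
        Console.WriteLine("Rejected entries: " + rejected.Count);
    }
}
```

Collecting the rejected entries rather than stopping at the first bad one seems a reasonable choice for a one-off import, since a hand-maintained Word list is likely to contain a few malformed lines.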
