Re: best design for parse
Thank you.
You do have a point, but the application I have in mind is meant to take most
of the easy but boring and repetitive tasks off the users quickly, so as to get
their buy-in for the next phase. The application is not going to be perfect at
version 0, but it must be flexible enough to adapt as needs change.
Furthermore, I chose to normalize the date format to yyyy-mm-dd because that is
the standard date string format accepted by almost all the standard Windows
applications used by the users I deal with, regardless of locale or default
display format.
As a side note, right now this application at version zero is not meant to
automate everything, but to help users do their jobs and to help us gain an
understanding of what they do, while at the same time validating the transform
process that will be used later for automation. Version 1 will automate a lot
more and may actually drive some Excel and Word processes.
You could say version zero is closer to a Mickey Mouse utility, if you wish.
"Stephany Young" <noone@localhost> wrote in message
news:ukTMMqsMHHA.4712@TK2MSFTNGP04.phx.gbl...
Again you're missing the point.

I think the best thing you can do is post a relatively small sample of the
text you are attempting to parse.

While you're doing that, execute the following and observe the results. It
demonstrates what I am talking about:

' Needs: Imports System.Globalization and Imports System.Text.RegularExpressions
Dim _source As String = _
    "On 07/01/2007 the quick brown fox jumps over the lazy dog." & Environment.NewLine & _
    "On 08/01/2007 the quick brown fox again jumps over the lazy dog." & Environment.NewLine & _
    "On Jan/09/2007 the quick brown fox again jumps over the lazy dog." & Environment.NewLine & _
    "On 10/Jan/2007 the quick brown fox again jumps over the lazy dog." & Environment.NewLine & _
    "On 11/01/07 the quick brown fox again jumps over the lazy dog." & Environment.NewLine & _
    "On 01/12/07 the quick brown fox again jumps over the lazy dog." & Environment.NewLine & _
    "On Jan/13/07 the quick brown fox again jumps over the lazy dog." & Environment.NewLine & _
    "On 14/Jan/07 the quick brown fox again jumps over the lazy dog." & Environment.NewLine & _
    "On 15/01 the quick brown fox again jumps over the lazy dog." & Environment.NewLine & _
    "The part number XYZ/72/84 is now discontinued."

' One alternative per date shape; this only finds candidates, it does not validate them.
Dim _regex As New Regex( _
    "\d{2}/\d{2}/\d{4}|[A-Za-z]{3}/\d{2}/\d{4}|\d{2}/[A-Za-z]{3}/\d{4}|" & _
    "\d{2}/\d{2}/\d{2}|[A-Za-z]{3}/\d{2}/\d{2}|\d{2}/[A-Za-z]{3}/\d{2}|\d{2}/\d{2}")

Dim _candidates As Integer = 0
Dim _matches As Integer = 0

Dim _match As Match = _regex.Match(_source)

While _match.Success
    _candidates += 1
    Console.WriteLine("{0} found at index {1}", _match.Value, _match.Index)
    Try
        ' ParseExact throws if the candidate fits none of the listed formats.
        Console.WriteLine("Converted value = {0:yyyy-MM-dd}", _
            DateTime.ParseExact(_match.Value, New String() {"dd/MM/yyyy", _
            "MM/dd/yyyy", "MMM/dd/yyyy", "dd/MMM/yyyy", "dd/MM/yy", "MM/dd/yy", _
            "dd/MMM/yy", "MMM/dd/yy", "dd/MM"}, Nothing, DateTimeStyles.None))
        _matches += 1
    Catch _ex As Exception
        Console.WriteLine(_ex.Message)
    End Try
    _match = _match.NextMatch()
End While

Console.WriteLine("{0} candidates found", _candidates)

Console.WriteLine("{0} matches found", _matches)
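
For what it's worth, the same idea can be written without exceptions. The following is only a sketch, not part of the original post: it reuses the regex and format list from the code above, but swaps DateTime.ParseExact for DateTime.TryParseExact and rewrites the text with Regex.Replace. The module and function names are made up for illustration, and passing CultureInfo.InvariantCulture is my assumption (the original passes Nothing, which means the current culture, where "/" in a format string stands for that culture's date separator).

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module TryParseSketch

    ' Same candidate pattern as in the post above.
    Private ReadOnly CandidateRegex As New Regex( _
        "\d{2}/\d{2}/\d{4}|[A-Za-z]{3}/\d{2}/\d{4}|\d{2}/[A-Za-z]{3}/\d{4}|" & _
        "\d{2}/\d{2}/\d{2}|[A-Za-z]{3}/\d{2}/\d{2}|\d{2}/[A-Za-z]{3}/\d{2}|\d{2}/\d{2}")

    ' Same format list as in the post above; the order decides ambiguous cases.
    Private ReadOnly Formats As String() = New String() { _
        "dd/MM/yyyy", "MM/dd/yyyy", "MMM/dd/yyyy", "dd/MMM/yyyy", _
        "dd/MM/yy", "MM/dd/yy", "dd/MMM/yy", "MMM/dd/yy", "dd/MM"}

    ' Replaces every candidate that parses with its yyyy-MM-dd form and
    ' leaves candidates that do not parse untouched.
    Public Function NormalizeDates(ByVal source As String) As String
        Return CandidateRegex.Replace(source, AddressOf NormalizeOne)
    End Function

    Private Function NormalizeOne(ByVal m As Match) As String
        Dim parsed As DateTime
        If DateTime.TryParseExact(m.Value, Formats, CultureInfo.InvariantCulture, _
                                  DateTimeStyles.None, parsed) Then
            Return parsed.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)
        End If
        ' Not a date after all (for example a part number): keep the text as it was.
        Return m.Value
    End Function

End Module
```

Note that 07/01/2007 comes out as 2007-01-07 only because dd/MM/yyyy is listed before MM/dd/yyyy. The text alone cannot say which of the two was meant, which is the point being made above.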
>
>
"GS" <gsmsnews.microsoft.comGS@msnews.Nomail.com> wrote in message
news:eFm$y5rMHHA.4376@TK2MSFTNGP03.phx.gbl...
You are sort of on the same track as I am.
I must first apologize that I did not tell you the complete story.
Although the application does not know beforehand exactly what format the
data may come in, part of the application lets the user define a record
favourite for a website, covering:
- whether to extract by text or HTML
- header content and format
- record format and date format (that is where the date format mask
comes in)
- optionally, an ordinal number for each column, or a re-ordering
- trailer content and format
For a given batch, at least for the body, the date format is uniform.
Furthermore, given the need to make the extract process generic and adaptable
to the front end that takes the user definitions, I believe it would be
easier to "normalize" the date string to "yyyy-mm-dd".
Also, the end target may not necessarily be a SQL database; it may be text
pasted by the user into a Word report or into Excel.
Therefore, I can transform the date format mask into a regex in the
appropriate format, with identifiers, and then use Regex.Replace to normalize
the date. As a matter of fact, the date separator does not have to be /; it
can be a space, as long as there are identifiable delimiters around the date
string.
I already have code for dealing with regexes for dates from a prior project.
All I have to do is adapt it to the present need.
Who knows, maybe I have gone off on a totally offbeat track.
"GS" <gsmsnews.microsoft.comGS@msnews.Nomail.com> wrote in message
news:%23vnOBJiMHHA.1280@TK2MSFTNGP04.phx.gbl...
Thanks to all who have pitched in so far.

Let me give it another shot.

It looks like an easier way out would be:
1. Copy the date format string into a regex string holder, and then derive
the relevant regex expression, to be used for date normalization later in
step 2:
- replace yyyy in the regex string with the regex year expression, tagged
with the year identifier
- look for yy, replace it with 20yy, and repeat the step above
- replace mmm with the month-name regex expression associated with the month
identifier
- replace mm with the 2-digit month regex expression associated with the
month identifier
- replace dd with the 2-digit day regex expression associated with the day
identifier

2. Use the resulting regex in Regex.Replace to normalize to yyyy-mm-dd.

Any problem with the above approach?
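
The two steps above might look roughly like the sketch below. It is one possible reading of the plan, not the poster's actual code. Everything named here is hypothetical: BuildDateRegex, NormalizeDates, the group names d, m and y, and the defaultYear parameter (needed for masks such as dd/mm that carry no year). The sketch also adds two things the plan does not mention: a "no word character on either side" check around the date, and month and day patterns limited to 01-12 and 01-31.

```vbnet
Imports System
Imports System.Globalization
Imports System.Text.RegularExpressions

Module MaskRegexSketch

    ' Step 1: turn a mask such as "dd/mmm/yyyy" into a regex whose named
    ' groups d, m and y capture the parts of the date.
    Public Function BuildDateRegex(ByVal mask As String) As Regex
        ' Escape the literal separators first, then swap in the tokens.
        ' Longer tokens come first so that yyyy is not read as yy + yy.
        Dim pattern As String = Regex.Replace(Regex.Escape(mask), _
            "yyyy|yy|mmm|mm|dd", AddressOf TokenToPattern, RegexOptions.IgnoreCase)
        Return New Regex("(?<!\w)" & pattern & "(?!\w)")
    End Function

    Private Function TokenToPattern(ByVal token As Match) As String
        Select Case token.Value.ToLowerInvariant()
            Case "yyyy"
                Return "(?<y>\d{4})"
            Case "yy"
                Return "(?<y>\d{2})"
            Case "mmm"
                Return "(?<m>[A-Za-z]{3})"
            Case "mm"
                Return "(?<m>0[1-9]|1[0-2])"
            Case Else
                Return "(?<d>0[1-9]|[12]\d|3[01])"
        End Select
    End Function

    ' Step 2: rewrite every date in the text that fits the mask as yyyy-mm-dd.
    Public Function NormalizeDates(ByVal text As String, ByVal mask As String, _
                                   ByVal defaultYear As Integer) As String
        Dim evaluator As MatchEvaluator = Function(m) ToIso(m, defaultYear)
        Return BuildDateRegex(mask).Replace(text, evaluator)
    End Function

    Private Function ToIso(ByVal m As Match, ByVal defaultYear As Integer) As String
        Dim yearPart As String = m.Groups("y").Value
        If yearPart.Length = 0 Then
            yearPart = defaultYear.ToString("0000", CultureInfo.InvariantCulture)
        End If
        ' The "replace yy with 20yy" rule from step 1.
        If yearPart.Length = 2 Then yearPart = "20" & yearPart

        Dim monthPart As String = m.Groups("m").Value
        If monthPart.Length = 3 Then
            ' Look the abbreviation up in the invariant (English) month names.
            Dim names As String() = _
                CultureInfo.InvariantCulture.DateTimeFormat.AbbreviatedMonthNames
            Dim index As Integer = Array.FindIndex(names, _
                Function(n) String.Equals(n, monthPart, StringComparison.OrdinalIgnoreCase))
            ' Three letters that are not a month name: leave the text alone.
            If index < 0 Then Return m.Value
            monthPart = (index + 1).ToString("00", CultureInfo.InvariantCulture)
        End If

        Return yearPart & "-" & monthPart & "-" & m.Groups("d").Value
    End Function

End Module
```

A call would look like NormalizeDates(batchText, "dd/mmm/yyyy", 2007), where batchText is whatever was extracted for the batch. The sketch does not check that the day exists in the month (31/02 still passes), so a DateTime.TryParseExact check on the result may still be worth having.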
>
"Cor Ligthert [MVP]" <notmyfirstname@planet.nl> wrote in message
news:%23Qj7TbWMHHA.3944@TK2MSFTNGP06.phx.gbl...
GS,
Maybe in 2007 you can avoid this, and all things like DateTime.ParseExact as
well; have a look at the globalization support that Microsoft has nicely
built in, and then at the ToString option related to it.
Cor
"gs" <gs@dontMail.telus> wrote in message
news:OtrnsPTMHHA.4720@TK2MSFTNGP03.phx.gbl...
Let's say I have to deal with various date formats, and I am given a
string in one of the following:
dd/mm/yyyy
mm/dd/yyyy
dd/mmm/yyyy
mmm/dd/yyyy
dd/mm/yy
mm/dd/yy
dd/mmm/yy
mmm/dd/yy
dd/mm
What is the best way to come up with a relevant regex for the incoming
string?
a) use two arrays and match statically
b) use a regex to find the order
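
As an illustration of option (a), here is a minimal sketch with two parallel arrays, one holding the masks listed in the question and one holding a regex for the shape of each. The module and function names are made up, and the patterns are my own reading of the masks rather than anything given in the thread.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module MaskLookupSketch

    ' Two parallel arrays: Masks(i) is described by Patterns(i).
    Private ReadOnly Masks As String() = New String() { _
        "dd/mm/yyyy", "mm/dd/yyyy", "dd/mmm/yyyy", "mmm/dd/yyyy", _
        "dd/mm/yy", "mm/dd/yy", "dd/mmm/yy", "mmm/dd/yy", "dd/mm"}

    Private ReadOnly Patterns As String() = New String() { _
        "^\d{2}/\d{2}/\d{4}$", "^\d{2}/\d{2}/\d{4}$", _
        "^\d{2}/[A-Za-z]{3}/\d{4}$", "^[A-Za-z]{3}/\d{2}/\d{4}$", _
        "^\d{2}/\d{2}/\d{2}$", "^\d{2}/\d{2}/\d{2}$", _
        "^\d{2}/[A-Za-z]{3}/\d{2}$", "^[A-Za-z]{3}/\d{2}/\d{2}$", _
        "^\d{2}/\d{2}$"}

    ' Returns every mask whose pattern fits the incoming string.
    Public Function CandidateMasks(ByVal input As String) As List(Of String)
        Dim result As New List(Of String)
        For i As Integer = 0 To Masks.Length - 1
            If Regex.IsMatch(input, Patterns(i)) Then result.Add(Masks(i))
        Next
        Return result
    End Function

End Module
```

The lookup settles the month-name formats, but an all-numeric string such as 07/01/2007 fits both dd/mm/yyyy and mm/dd/yyyy. The shape of the string alone cannot decide between them, so the mask has to come from somewhere else, such as a per-batch user definition.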