Interpret date if IsDate is False

  • OldBirdman
    Contributor
    • Mar 2007
    • 675

    Interpret date if IsDate is False

    I am attempting to scrape as much as I can from certain web pages and/or emails. Date is giving me the most trouble.

    I can code this, and have started, but I wondered whether anyone has a solution already written that they would share. I would like a function that, when passed 'Input', returns a Date variable. At a minimum, it should handle the following:

    Code:
    Input             Interpreted as     Date American
    ------------      -------------      -----------------------
    "4/7/05"          7 April, 2005      #4/7/05#
    "8/6"             8 June, 2009       #6/8/09#
    "8/2002"          1 August, 2002     #8/1/02#
    "August 03"       1 August, 2003     #8/1/03#
    "4.6.06"          4 June, 2006       #6/4/06#
    "8th of March"    8 March, 2009      #3/8/09#
    "8 th of Mar"     8 March, 2009      #3/8/09#
    "December 17 th"  17 December, 2009  #12/17/09#
    "Dec 17th"        17 December, 2009  #12/17/09#
    "August 2009"     1 August, 2009     #8/1/09#
    "Aug09"           1 August, 2009     #8/1/09#
    "September"       1 September, 2009  #9/1/09#
  • FishVal
    Recognized Expert Specialist
    • Jun 2007
    • 2656

    #2
    Just subscribing.


    • JustJim
      Recognized Expert Contributor
      • May 2007
      • 407

      #3
      I'll watch this with interest too.


      • NeoPa
        Recognized Expert Moderator MVP
        • Oct 2006
        • 32661

        #4
        The problem with this is that humans are too non-algorithmic generally! They don't like to follow decent rules as computers do.

        It's possible, but very laborious, to produce a routine for this, but breaking it down into something neat is pretty well impossible, as there's so little common to the various ways of portraying the dates.

        You can be clever, but you still end up with a large chunk of code, most of which isn't very general.

        BTW I'd be happy if someone proved me wrong. Impressed too of course.


        • FishVal
          Recognized Expert Specialist
          • Jun 2007
          • 2656

          #5
          Frankly speaking, it is not that hard a task to write code to recognize dates in the formats provided in post #1. Sure, it is not extremely easy, but it is not hard either.


          • OldBirdman
            Contributor
            • Mar 2007
            • 675

            #6
            I have, over the years, collected a large number of photos of birds (30K+). I am currently attempting to standardize the names, then I will generate an Access database to cross-reference by location, season, plumage, sex, and age. The photographers or web designers are not always good at fully identifying all the data.

            Date is important in that it gives clues to plumage. Birds molt twice a year, and knowing the date identifies plumage. But I don't have to have an exact date. I can work without the year, as the plumage in Feb 1988 would be the same as Feb 2008. Season (winter, spring, ...) would suffice if all birds molted at the same time.

            I'm going to incorporate a date into the file name, with the form "99 XXX00", where 99 is the day, XXX is the month or season, and 00 is the year. Just about any piece of that pattern can be missing, except both month and year. If I only have year, I use 4 digit year.
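As a rough illustration of the "99 XXX00" rule described above, here is a minimal Python sketch (the application in this thread is written in Access Basic, so this only shows the logic). The function name `format_partial_date` and its parameters are hypothetical, and dropping a day that arrives without a month is an assumption, since the post does not cover that case.

```python
from typing import Optional


def format_partial_date(day: Optional[int] = None,
                        month: Optional[str] = None,
                        year: Optional[int] = None) -> str:
    """Build the '99 XXX00' file-name date from whichever parts are known.

    `month` is the month or season text exactly as it should appear.
    """
    if month is None:
        # No month: a lone year is written with four digits.
        # Assumption: a day without a month carries no information and is dropped.
        return f"{year:04d}" if year is not None else ""
    text = month
    if day is not None:
        text = f"{day} {text}"
    if year is not None:
        # With a month present, the year is shortened to two digits.
        text += f"{year % 100:02d}"
    return text
```

Called as `format_partial_date(24, "Mar", 2009)`, this yields "24 Mar09"; with only `year=2003` it yields "2003".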

            Processing this date into either a date field or separate day, month, and year fields is easy.

            So, from various sources, I need to paste into a text field some not-quite-random-garbage and produce "3 August09" or "2003" or "April" or "???" or just an empty field. I have to also convert the bird's name to my standards, as well as the location, and any comments by the photographer or website.

            Most of the code here is Basic, not using the capabilities of Access. Access does help with the bird names, and a photographers table so I know where "My neighbor's yard" is in terms of City (or park or area), Country.

            Back to the date. It can be done, to some degree of accuracy, with a "brute force" method: 200 lines of code, each executed once. Nothing clever, like "Format(Join(Split(txtInput)), ....)".

            Each picture must be done individually. There is no automatic process for this, as a picture may be identified as "Same as previous post". So I don't have to cover all possibilities. It's just that the more I can do without typing, the faster the task. I can handle about 2 pictures/minute now if they are by the same photographer and in chronological order (lots of luck here).


            • FishVal
              Recognized Expert Specialist
              • Jun 2007
              • 2656

              #7
              Ok.

              Brute force, a fortiori, could take advantage of Access as an RDBMS.
              Let us say the different possible formats could be stored in a table and applied to the input in a loop. This gives, BTW, an opportunity to create a self-teaching application.
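The idea of a format table applied in a loop might look roughly like the Python sketch below. It is not code from anyone in this thread: the `PATTERNS` list stands in for the table, and the names `interpret_date` and `default_year` are hypothetical. It assumes two-digit years mean 20xx, that a missing year is 2009 (as in the examples in post #1), and that a missing day is the 1st. Supporting a new format means adding a row, not writing new code.

```python
import re
from datetime import date
from typing import Optional

MONTHS = {name: i + 1 for i, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"])}

MON = r"(?P<mon>[A-Za-z]{3,9})"   # month name, full or abbreviated
ORD = r"\s*(?:st|nd|rd|th)"       # ordinal suffix, possibly detached

# Stand-in for the table of formats; rows are tried in order.
PATTERNS = [
    r"(?P<m>\d{1,2})/(?P<d>\d{1,2})/(?P<y>\d{2,4})",    # 4/7/05 (month first)
    r"(?P<d>\d{1,2})\.(?P<m>\d{1,2})\.(?P<y>\d{2,4})",  # 4.6.06 (day first)
    r"(?P<m>\d{1,2})/(?P<y>\d{4})",                     # 8/2002
    r"(?P<d>\d{1,2})/(?P<m>\d{1,2})",                   # 8/6
    r"(?P<d>\d{1,2})" + ORD + r"\s+of\s+" + MON,        # 8 th of Mar
    MON + r"\s+(?P<d>\d{1,2})" + ORD,                   # Dec 17th
    MON + r"\s*(?P<y>\d{2}|\d{4})",                     # Aug09, August 2009
    MON,                                                # September
]


def interpret_date(text: str, default_year: int = 2009) -> Optional[date]:
    """Return the first date produced by a matching row, or None."""
    for pattern in PATTERNS:
        match = re.fullmatch(pattern, text.strip(), re.IGNORECASE)
        if not match:
            continue
        parts = match.groupdict()
        if "mon" in parts:
            month = MONTHS.get(parts["mon"][:3].lower())
        else:
            month = int(parts["m"])
        if month is None:
            continue  # the word was not a month name
        day = int(parts.get("d") or 1)
        year = int(parts.get("y") or default_year)
        if year < 100:
            year += 2000  # assumption: two-digit years are 20xx
        try:
            return date(year, month, day)
        except ValueError:
            continue  # e.g. month 13 or day 31 in a 30-day month
    return None
```

The row order carries the interpretation: "August 03" reaches the month-plus-year row only because the month-plus-day row insists on an ordinal suffix.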


              • OldBirdman
                Contributor
                • Mar 2007
                • 675

                #8
                My last email included:
                Another shot of the pretty Blue Tit in the Apple tree. March 24 th in my garden, in France. <name>
                I replaced the name for privacy, but I'm already taking advantage of Access. Access has a table of photographers and a related table of locations. I have every species of bird on the planet in a set of tables running from Kingdom, Phylum, Class, Order, Family, Genus to Species, by English common name.

                1) Message displayed in Outlook Express
                2) Double-click the attachment to see if I even want the picture. If not, delete the email. Otherwise
                3) Highlight the entire message and press Ctrl+C to copy it to the clipboard
                4) Toggle to my picture processing program BirdPix using Ctrl+Tab or Ctrl+Shift+Tab
                5) Click cmdNewPaste -
                a: Clears all fields on form
                b: Pastes into an unbound textbox, txtWorkspace, from clipboard
                6) Highlight "Blue Tit" in txtWorkspace, using mouse
                7) Press cmdSpecies
                a: Copies "Blue Tit" to species lookup textbox
                b: Fills Name with "Blue Tit (Cyanistes caeruleus)", the name I want
                c: Fills PictureName textbox with "Blue Tit (Cyanistes caeruleus) ~ ~"
                8) Select from listbox first letter of photographer's name from email
                9) Select correct name from dropdown combobox, already dropped from 8)
                10) Select "City, Country" from displayed listbox containing this photographer's Home Country, Home City/Country, Frequently Visited places, last countries traveled to, etc.
                a: Fills PictureName textbox with "Blue Tit (Cyanistes caeruleus) ~ ~ Paris, France"
                11) Highlight "March 24 th" in txtWorkspace
                12) Press cmdDate
                13) Fills txtDate
                a: Fills PictureName textbox with "Blue Tit (Cyanistes caeruleus) ~ ~ Paris, France ~ 24 Mar09"
                14) Press cmdFinalPicName
                a: Copies "562983760 ~ Blue Tit (Cyanistes caeruleus) ~ ~ Paris, France ~ 24 Mar09.jpg" to clipboard
                15) Use Ctrl+Shift+Tab to get back to picture
                16) Press s
                a: Save dialog displayed
                b: Ctrl+V to paste name into save dialog
                c: Press Enter to save picture and close dialog
                17) If there is a second picture of this bird, or the next email contains another picture of this bird, then
                18) Use Ctrl+Tab to return to BirdPix
                19) Press cmdFinalPicName
                a: Generates same name with a different 9 digit number
                b: Returns to picture for save
                20) Loop as needed
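The name assembled in steps 7 to 14 and regenerated in step 19 could be built along these lines. This is a Python sketch of the pattern only, not the BirdPix code. `picture_name` and its parameters are hypothetical, and the post does not say what belongs in the empty slot between the two tildes, so it is called `detail` here purely as a placeholder.

```python
import random
from typing import Optional


def picture_name(species: str, location: str, date_text: str,
                 detail: str = "", number: Optional[int] = None,
                 extension: str = ".jpg") -> str:
    """Join the name fields with '~', keeping a lone '~' for empty fields."""
    if number is None:
        # A fresh 9-digit number makes each saved file name different.
        number = random.randint(100_000_000, 999_999_999)
    fields = [str(number), species, detail, location, date_text]
    tokens = []
    for index, field in enumerate(fields):
        if index:
            tokens.append("~")      # separator before every field but the first
        if field:
            tokens.append(field)    # empty fields leave only their separator
    return " ".join(tokens) + extension
```

Passing `number=562983760` with the Blue Tit values from the steps above reproduces the file name shown in step 14; leaving `number` out gives the same name with a different 9-digit prefix, as in step 19.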

                The above seems lengthy, but each kept picture can, on average, be handled in 30 seconds. I'm not sure what I would learn, or what I would save, by adding a table of different date patterns.

                The species name is bad enough. It will take species by common (English only) name, by Latin (scientific) name, a combination, or a partial name. Just the name "Tit" produces a dropdown list of 55 species ending in the word "Tit".


                • JustJim
                  Recognized Expert Contributor
                  • May 2007
                  • 407

                  #9
                  Hi OldBirdman,

                  It looks like you've developed a very capable application there. My thinking on the date problem is that if you have a text box bound to a field of data type Date, Access is fairly good at interpreting some user input. For the others, would it be more convenient just to have a calendar control near the date text box?

                  I agree that code could be written to manage the examples you gave in the OP; it would be complicated, but not complex. I think the major downfall is that it would be fairly high-maintenance for a while, until you had covered all possible formats.

                  Since you get input from all over the world, ambiguous dates (dd/mm or mm/dd where the day number is <= 12) would still need some input from you, I think.
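One way to decide when to ask for that input is to flag only the numeric dates that read validly both ways. A minimal Python sketch, assuming the separators are "/", "." or "-" and using a hypothetical function name:

```python
import re


def needs_manual_check(text: str) -> bool:
    """True when a numeric date is valid as both dd/mm and mm/dd."""
    match = re.fullmatch(r"(\d{1,2})[/.-](\d{1,2})(?:[/.-]\d{2,4})?",
                         text.strip())
    if not match:
        return False  # not a day/month pair, so this check does not apply
    first, second = int(match.group(1)), int(match.group(2))
    # Both numbers could be a month, and swapping them gives a different date.
    return 1 <= first <= 12 and 1 <= second <= 12 and first != second
```

Under these assumptions "4/7/05" and "8/6" are flagged, while "24/3/09" (only one valid reading) and "5/5" (same date either way) are not.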

                  All the best

                  Jim


                  • NeoPa
                    Recognized Expert Moderator MVP
                    • Oct 2006
                    • 32661

                    #10
                    I went through a large number of algorithmic approaches on Saturday trying to find a consistent way through this, but there were exceptions in the data for each one I tried.

                    As Jim said, it's not complex. Nothing mathematically convoluted, just laborious to work through. Very untidy.
