Parsing XML file

  • PhilOfWalton
    Recognized Expert Top Contributor
    • Mar 2016
    • 1430

    #16
    Which universe? I thought we were now in the Multi-Universe situation.

    Come on, zmbd, widen your horizons a bit!!

    Phil

    • zmbd
      Recognized Expert Moderator Expert
      • Mar 2012
      • 5501

      #17
      I did...
      [t_Universe]
      [PK_U][UUuid][UName][UDateLastTouched]

      [t_Galaxy]
      [PK_G][FKUniverse][GUuid][GName][GDateLastTouched]

      etc...

      :)
      String theory rules... String Theory for Kids (and Clever Adults)

      • zmbd
        Recognized Expert Moderator Expert
        • Mar 2012
        • 5501

        #18
        Pulling stuff into tables now...
        Lots of repeated code so I'm pulling these into their own procedures - in the old days I would have used a lot of GoSub routines for these sections.

        As I pull the sections into their own procedures, I'm altering the code to use arrays to pass table field and XML node names between the calling recursive procedure and the dependencies. This way I don't have to hardcode for each table.

        I think I also have the Group level issue fixed... next stage once I have the universe test version running.

        • Hagran
          New Member
          • Aug 2018
          • 2

          #19
          Can you please show us the latest version of your XML?
          I ask because I had a very similar problem myself, and it is a real pain to manage all this data.

          • zmbd
            Recognized Expert Moderator Expert
            • Mar 2012
            • 5501

            #20
            So it is WAY too late
            and
            I am WAY too tired to explain this file in its entirety.

            I've made two XML files; they should be fairly easy to read.

            The code is fairly well commented.

            This is only the POC for my final XML data file...
            My issue with my final file is that the same tag name "<Group>" is used like a psychotic nested If-Then, so I still have some work to do on passing the recursion levels up and down the tree.

            But that is for another day.
            Attached Files

            • PhilOfWalton
              Recognized Expert Top Contributor
              • Mar 2016
              • 1430

              #21
              Brilliant piece of work, zmbd.

              I am a timorous wee beastie, and wonder if we can combine your method and my method, where I use the table “Codes” to load the “Clues” into the zSpiderTableFields array.

              Then you could go on to study the taxonomy of anything.

              Phil

              • zmbd
                Recognized Expert Moderator Expert
                • Mar 2012
                • 5501

                #22
                Thank You PhilOfWalton.

                I'll have to look at your database again; however, I should be able to open a recordset against the "Codes" table and maybe feed the "Clues" directly, thus bypassing the array...

                The next challenge is parsing the Big-XML,
                where some psychotic nested the <Group> and other tags like nested If-Then from hell (literally hundreds of these nestings), reusing the same tag names over and over again with each child node!
                Code:
                <Root>
                    <Group>
                        <UUID></UUID>
                        <Times>
                            <LastUpDate></LastUpDate>
                        </Times>
                        <Item></Item>
                        -some entries-
                        <Group>
                            <UUID></UUID>
                            <Times>
                                <LastUpDate></LastUpDate>
                            </Times>
                            <Item>
                                -some entries-
                            </Item>
                            <Group>
                                <UUID></UUID>
                                <Times>
                                    <LastUpDate></LastUpDate>
                                </Times>
                                <Item>
                                    -some entries-
                                </Item>
                            </Group>
                            <Group>
                                <UUID></UUID>
                                <Times>
                                    <LastUpDate></LastUpDate>
                                </Times>
                                <Item>
                                    -some entries-
                                </Item>
                            </Group>
                        </Group>
                    </Group>
                </Root>
                Last edited by zmbd; Aug 9 '18, 11:09 PM.

                • Rabbit
                  Recognized Expert MVP
                  • Jan 2007
                  • 12517

                  #23
                  When you hit a nested group node, can't you recurse and pass in the parent's id?

                  Edit: Also, my preference would be to put the XML node -> table/field mappings into a metadata table so you can update the mappings more easily than if you had to do it through code.
                  Last edited by Rabbit; Aug 9 '18, 04:32 PM.
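A minimal sketch of the mapping-table idea, written in Python rather than the thread's VBA so it can be run as-is. The table and field names (t_Universe, UUuid, UName, and so on) come from post #17; the XML tag names, the sample data, and the NODE_MAP structure are assumptions made up for illustration. In Access the mapping rows would live in a table instead of a list.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping rows: (record tag, child tag, target table, target field).
# In a database these would be rows in a metadata table, editable without touching code.
NODE_MAP = [
    ("Universe", "UUID", "t_Universe", "UUuid"),
    ("Universe", "Name", "t_Universe", "UName"),
    ("Galaxy", "UUID", "t_Galaxy", "GUuid"),
    ("Galaxy", "Name", "t_Galaxy", "GName"),
]

def extract_rows(xml_text):
    """Return a list of (table, {field: value}) pairs driven only by NODE_MAP."""
    root = ET.fromstring(xml_text)
    rows = []
    for element in root.iter():
        table = None
        fields = {}
        for record_tag, child_tag, map_table, map_field in NODE_MAP:
            if element.tag != record_tag:
                continue
            child = element.find(child_tag)      # direct children only
            if child is not None:
                table = map_table
                fields[map_field] = (child.text or "").strip()
        if table is not None:
            rows.append((table, fields))
    return rows

SAMPLE = """
<Root>
  <Universe>
    <UUID>U-1</UUID><Name>Home</Name>
    <Galaxy><UUID>G-1</UUID><Name>Milky Way</Name></Galaxy>
  </Universe>
</Root>
"""

for table, fields in extract_rows(SAMPLE):
    print(table, fields)
```

Adding a new node type then means adding mapping rows, not another Case branch.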

                  • zmbd
                    Recognized Expert Moderator Expert
                    • Mar 2012
                    • 5501

                    #24
                    Rabbit,
                    Passing the parent was my initial thought too!
                    If the parent is Group and the current tag is Group, then we have a subgroup, so use that table!
                    Here's the rub: take line 8 above.
                    My thought was that the parent of <UUID> should be <Group>.
                    It doesn't parse that way; instead, it returns the node's own name.

                    AND it gets very funny: say <UUID> did return <Group>. Which <Group>? Is it the GreatGrandParent, GrandParent, or Parent? If it returns <Root>, then at least you know you're at the top level.

                    Code:
                    Print #zFreeFile, String(zRecursion, ".") & _
                        " (" & " ParentNode: " & zSpider.ParentNode.basename & _
                           ") (Spider: " & zSpider.basename & _
                               ")(text: " & zSpider.Text & ")"
                    In the text file:
                    
                    >Group level: 1
                    <<< There's another Print op when the select-case is Groups that places this flush left for visual location<<
                    Group::   <<Calling level
                    . UUID:: 
                    .. ( ParentNode: UUID) (Spider: )(text: Goup_100)
                    .. Name::
                    ... ( ParentNode: Name) (Spider: )(text: 100_Group)
                    ... ( ParentNode: Group) (Spider: Notes)(text: )
                    ... IconID::
                    .... ( ParentNode: IconID) (Spider: )(text: 49)
                    .... Times::
                    ..... LastModificationTime::
                    ...... ( ParentNode: LastModificationTime) (Spider: )(text: 2009-04-24T19:18:46Z)
                    ...... CreationTime::
                    ....... ( ParentNode: CreationTime) (Spider: )(text: 2007-04-18T09:28:58Z)
                    ....... LastAccessTime::
                    ........ ( ParentNode: LastAccessTime) (Spider: )(text: 2009-07-31T18:10:39Z)
                    ........ ExpiryTime::
                    ......... ( ParentNode: ExpiryTime) (Spider: )(text: 2007-04-18T09:28:58Z)
                    ......... Expires::
                    .......... ( ParentNode: Expires) (Spider: )(text: False)
                    .......... UsageCount::
                    ........... ( ParentNode: UsageCount) (Spider: )(text: 153)
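In the DOM, the text inside an element is itself a child node, so the parent of that text node is the enclosing element. The log above, where ParentNode reports UUID while the Spider name is empty, would be consistent with zSpider pointing at text nodes at those levels, although that is an inference from the output and not something the thread confirms. A minimal Python sketch (using xml.dom.minidom in place of MSXML; the sample XML and the helper group_depth are made up for illustration) shows both parent lookups and counts the enclosing <Group> elements to tell a parent from a grandparent:

```python
from xml.dom import minidom

SAMPLE = "<Root><Group><UUID>A</UUID><Group><UUID>B</UUID></Group></Group></Root>"

def group_depth(node):
    """Count how many <Group> elements enclose the node by walking up parentNode."""
    depth = 0
    parent = node.parentNode
    while parent is not None:
        if parent.nodeType == parent.ELEMENT_NODE and parent.tagName == "Group":
            depth += 1
        parent = parent.parentNode
    return depth

document = minidom.parseString(SAMPLE)
for uuid_element in document.getElementsByTagName("UUID"):
    text_node = uuid_element.firstChild          # the text inside <UUID>
    print(
        text_node.data,
        "| parent of text:", text_node.parentNode.tagName,
        "| parent of element:", uuid_element.parentNode.tagName,
        "| group depth:", group_depth(uuid_element),
    )
```

The depth answers the "which <Group>?" question without needing attributes on the tags.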
                    > What I have now is this
                    Code:
                    SpiderChildren(ByRef inChildNodeList As Object, _
                        ByVal RecursionLevel As Long)
                    Select Case spider.tag
                      Case <tag=Group>
                        'blah
                        Select Case Recursion
                          Case Is < 2
                            'blah
                            Recursion = Recursion + 1
                            Call SpiderChildren(SpiderChildren, Recursion)
                            Recursion = Recursion + 1
                          Case Is >= 2
                            'blah
                            Call SpiderChildren(SpiderChildren, Recursion)
                        End Select
                      Case
                      (...)
                    End Select
                    I've flattened the Category/Sub-category structure they are using, as the sub/sub(/sub...)-categories are more of a description, so these are being moved into a related comments table.
                    This appears to be working: it is parsing the main and sub categories into the proper tables, and the code is checking a M:M join table. (I did have the 1:Category>M:SubCategory relationship set up and then ran into a mess where the same subcategory was used with multiple categories - sigh - I tried to cheat on the normalization - bytes.me.ITRET !)
                    Truck - Green
                    Truck - Blue
                    Truck - Yellow
                    Car - Green
                    Car - Blue
                    Car - Yellow
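The Truck/Car list above is the classic case for a join table: each colour is stored once and paired with as many categories as needed. A minimal sketch using Python's sqlite3 instead of Access; the table and column names (t_Category, t_SubCategory, t_CatSub and their keys) are hypothetical and only follow the naming style used earlier in the thread.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE t_Category    (PK_Cat INTEGER PRIMARY KEY, CatName TEXT UNIQUE);
    CREATE TABLE t_SubCategory (PK_Sub INTEGER PRIMARY KEY, SubName TEXT UNIQUE);
    -- Join table: one row per (category, subcategory) pairing.
    CREATE TABLE t_CatSub (
        FK_Cat INTEGER REFERENCES t_Category(PK_Cat),
        FK_Sub INTEGER REFERENCES t_SubCategory(PK_Sub),
        PRIMARY KEY (FK_Cat, FK_Sub)
    );
""")

def get_or_create(table, key_column, name_column, name):
    """Return the key for name, inserting the row only if it is missing."""
    connection.execute(
        f"INSERT OR IGNORE INTO {table} ({name_column}) VALUES (?)", (name,)
    )
    row = connection.execute(
        f"SELECT {key_column} FROM {table} WHERE {name_column} = ?", (name,)
    ).fetchone()
    return row[0]

pairs = [("Truck", "Green"), ("Truck", "Blue"), ("Truck", "Yellow"),
         ("Car", "Green"), ("Car", "Blue"), ("Car", "Yellow")]

for category, subcategory in pairs:
    cat_key = get_or_create("t_Category", "PK_Cat", "CatName", category)
    sub_key = get_or_create("t_SubCategory", "PK_Sub", "SubName", subcategory)
    connection.execute(
        "INSERT OR IGNORE INTO t_CatSub (FK_Cat, FK_Sub) VALUES (?, ?)",
        (cat_key, sub_key),
    )

counts = [
    connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("t_Category", "t_SubCategory", "t_CatSub")
]
print(counts)
```

Six pairings end up as two categories, three subcategories, and six join rows, with no colour stored twice.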



                    As for the metadata table, I gave up on it.
                    For the example database I posted, the metadata table would be the way to go for ease of maintenance. I can see adding a record for, say, [>planet>Moons] and parsing to [table_moons].

                    The example XML in #22, while neutered, is essentially the format the file is in... how do you map the nested <Group> when there are no attributes such as <Group ID=000>?
                    So take line 2, line 9, line 17, so we have Parent, child, grandchild, great-grandchild.
                    Add to root another child - how do you even begin to map this?
                    Even with a pedigree table... I simply gave up.
                    Code:
                    <Root>
                    +
                    -----<Group>
                    +      +
                    +      ---<Group>
                    +           +
                    +           ---<Group>
                    +----<Group>
                    Missing, of course, are the <Item> entries under each group

                    I am very likely missing something with how to create the metadata table - if I could have figured out an XSLT for the import, it would have been so much nicer.
                    Last edited by zmbd; Aug 10 '18, 12:02 AM.

                    • Rabbit
                      Recognized Expert MVP
                      • Jan 2007
                      • 12517

                      #25
                      Sorry, when I said pass the parent, I wasn't referring to the parent node itself. I just meant passing something with which you can identify the parent. In this case, UUID.

                      From the way the XML is structured, it looks like some sort of hierarchical data. A company reporting structure or chain of command. Hence, why you need to know the parent group of the group you're processing.

                      By the time your code reaches the embedded group, it will have parsed the UUID of the parent. So you can pass that into the recursion.

                      Code:
                      call SpiderChildren(SpiderChildren,Recursion,parentUUID)
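A minimal sketch of that suggestion in Python rather than VBA, assuming the nested layout from post #22 with a UUID directly under each Group. The sample values and the function name spider_groups are made up for illustration; the point is only that each call hands its own UUID down as the parent for the next level.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<Root>
  <Group>
    <UUID>G1</UUID>
    <Group>
      <UUID>G2</UUID>
      <Group><UUID>G3</UUID></Group>
      <Group><UUID>G4</UUID></Group>
    </Group>
  </Group>
  <Group><UUID>G5</UUID></Group>
</Root>
"""

def spider_groups(element, level, parent_uuid, rows):
    """Record each direct <Group> child, then recurse with that group's UUID as parent."""
    for group in element.findall("Group"):       # direct children only
        uuid = group.findtext("UUID", default="").strip()
        rows.append((uuid, parent_uuid, level))
        spider_groups(group, level + 1, uuid, rows)
    return rows

rows = spider_groups(ET.fromstring(SAMPLE), 1, None, [])
for row in rows:
    print(row)
```

Each row is (UUID, parent UUID, level), which is enough for a self-referencing groups table; a second top-level Group under Root simply comes out with no parent.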

                      • zmbd
                        Recognized Expert Moderator Expert
                        • Mar 2012
                        • 5501

                        #26
                        DONE and DONE!

                        Finally,
                        I have data pulling into the database and it looks correct!
                        I've written some code to instance the Excel file and start parsing through it to make sure the records match; however, the first 50 or so agreed.

                        As for the interface, here is my take on a "split-form" (I had to redact a few things; however, I think you will get the general idea).
                        Attached Files
                        Last edited by NeoPa; Sep 14 '20, 06:52 PM. Reason: Updated PHP picture link to the JPEG one.

                        • strive4peace
                          Recognized Expert New Member
                          • Jun 2014
                          • 39

                          #27
                          Nice example, @zmbd. I tried using MSXML2.DOMDocument.6.0 to read the pages of my website, but the HTML is too messy, I guess. Things like <br> instead of <br /> apparently hang it up. My purpose is to get the pages of my site into Access so Access will eventually manage and generate the content. I ended up writing my own parser, and am still working on it. I don't know if how you did things will be helpful for my logic, but I plan to look again. Thanks!
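An unclosed <br> is not well-formed XML, so a strict XML parser is expected to reject it, while an HTML parser tolerates it. A small Python illustration of that difference (not MSXML; the one-line PAGE and the TextCollector class are made up for the example):

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

PAGE = "<p>line one<br>line two</p>"

# A strict XML parser rejects the unclosed <br>.
try:
    ET.fromstring(PAGE)
    xml_ok = True
except ET.ParseError:
    xml_ok = False

class TextCollector(HTMLParser):
    """Collect tag names and text; unclosed tags such as <br> are tolerated."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.text.append(data)

collector = TextCollector()
collector.feed(PAGE)
collector.close()
print(xml_ok, collector.tags, collector.text)
```

The same page that fails as XML still yields its tags and text through the HTML parser.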

                          • SwissProgrammer
                            New Member
                            • Jun 2020
                            • 220

                            #28
                            An expert asking questions and being guided to the answer by a team of other experts.

                            I humbly request that the final, working project, minus any sensitive data, be posted for future use.

                            Some day I might like to be able to do this and I do not want to have to struggle for years just to get to the level of the least of you.

                            Please.

                            Thanks.

                            • twinnyfo
                              Recognized Expert Moderator Specialist
                              • Nov 2011
                              • 3662

                              #29
                              @SwissProgrammer,

                              Believe it or not, I've probably learned much of what I know about MS Access from Bytes. I learned out of necessity because of the projects I've had to work with.

                              Just stick with it and you can do just about anything!

                              • SwissProgrammer
                                New Member
                                • Jun 2020
                                • 220

                                #30
                                twinnyfo,

                                How did you make that text red?
