How to speed up access to strings

  • NeoPa
    Recognized Expert Moderator MVP
    • Oct 2006
    • 32633

    #46
    Originally posted by Rabbit
    Rabbit:
    Both the string and byte array versions are extremely quick, the byte array is just slightly quicker. Does it matter if one takes a second and the other takes a quarter of a second? Probably not, but the byte array version is quicker and cleaner.
    Generally not, but if Ricardo is dealing with multi-megabytes then possibly. In my experience, the string version (the only one I even knew of until you chipped in with this very interesting concept) is already very quick, so it probably won't make much difference across just a couple of megabytes.

    I did notice, however, that your code doesn't include any hash/octothorp (#) characters, yet I was under the impression they are required in such situations. Am I wrong? Is this a different version from what I've been using?
    Originally posted by ADezii
    ADezii:
    I thought that I was old school, wait till NeoPa and isladogs see this, then they will have someone else to pick on! (LOL)
    (ROFL) - I really did guffaw. Nice one :-D

    NB. Note that Rabbit is a moderator himself, so he might pick himself up on his code should he choose to ;-)

    I suspect Rabbit is older than he seems and this part of the code is left over from when he first created the basic code many, many years ago.

    @Rabbit.
    I'd like to set this as Best Answer. In fact, I will anyway. However, it would be improved if you'd make updates to deal with the percent (%), which is great to know about but will simply confuse many users, and the hashes if necessary.
    Last edited by NeoPa; Feb 13 '21, 02:08 PM.


    • NeoPa
      Recognized Expert Moderator MVP
      • Oct 2006
      • 32633

      #47
      It may be interesting to know that whenever I do such file work using Open# etc., I always have to go back to some previous code to find out how to do it. (Strictly, the # goes immediately before the File Number specification rather than at the end of Open or any of the other hashed keywords, but that makes them hard to talk about, so the Help system refers to them with the hash suffix instead.) Once, long ago, I went through the Help system and worked out how to get it to work without the addition of all the metadata that it likes to add by default.

      I managed to work it out way back when, but it certainly wasn't for the faint-hearted. Since then I've avoided having to repeat that unpleasant experience by reviewing my existing code every time I've needed to do any work in that area. The Help system was always reasonably decent once you knew what you were looking for, but:
      1. Get# & Put# are not easily discoverable. These are the Statements to use for direct file access, as opposed to BASIC file access, which likes to add the metadata for you, thus making it useless for anything other than files created using those same Statements (Write# & Print#). Get# & Put# are listed in the Help system under Get & Put, yet both describe the hashed versions.
      2. The parameters required for direct access when using Open# are also pretty obscure. It takes a good deal of reading through the comments, as well as a knowledge of Get# & Put#, in order to work out how it should be done.
      3. As far as I'm aware, it's also unbuffered I/O. This means very little nowadays I suppose, with the hardware buffering for you anyway, but it doesn't help to have O/S calls firing off so frequently in running code.
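A minimal sketch of the kind of direct file access described in the list above, not the code from this thread. The helper names ReadAllBytes and WriteAllBytes are hypothetical; the sketch assumes the file fits in memory and leaves error handling to the caller.

```vba
' Hypothetical helpers illustrating direct (binary) file access with Get and Put.
Public Function ReadAllBytes(ByVal strPath As String) As Byte()
    Dim intFileNum As Integer
    Dim arrBytes() As Byte

    intFileNum = FreeFile
    ' Binary mode reads the raw bytes; no record or text metadata is expected.
    Open strPath For Binary Access Read As #intFileNum
    If LOF(intFileNum) > 0 Then
        ' Zero-based array, so the upper bound is one less than the file length.
        ReDim arrBytes(0 To LOF(intFileNum) - 1)
        Get #intFileNum, , arrBytes
    End If
    Close #intFileNum

    ReadAllBytes = arrBytes
End Function

Public Sub WriteAllBytes(ByVal strPath As String, ByRef arrBytes() As Byte)
    Dim intFileNum As Integer

    intFileNum = FreeFile
    ' Note: opening an existing file this way does not truncate it first.
    Open strPath For Binary Access Write As #intFileNum
    Put #intFileNum, , arrBytes
    Close #intFileNum
End Sub
```

Note that the # appears only in front of the file number, as described in the post above, and that each call transfers the whole array in a single Get or Put rather than one byte at a time.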

      I have a buffered I/O Class Module somewhere that deals with that for you. If I can dig it up I'll add it here - or maybe even a link to the article if I manage to write one.
      Last edited by NeoPa; Feb 13 '21, 02:36 PM.


      • NeoPa
        Recognized Expert Moderator MVP
        • Oct 2006
        • 32633

        #48
        Hopefully Direct File I/O in VBA will prove helpful to people who are looking for some help with such work.


        • ADezii
          Recognized Expert Expert
          • Apr 2006
          • 8834

          #49
          @Rabbit:
          1. Out of curiosity, I have a File named SmallFile.txt in the C:\Test\Folder which consists of a single Line of 26 characters (a-z).
          2. When I Open this File in BINARY READ Access using your Code, I get the Results depicted below (Before.jpg).
          3. I can obtain the correct results (After.jpg) by Redimensioning arrBytes to LOF(intFileNum) and not processing the last three Bytes of arrBytes(), but I am not really sure what is going on here, any ideas?


          Attached Files
          Last edited by NeoPa; Feb 13 '21, 07:25 PM.


          • Rabbit
            Recognized Expert MVP
            • Jan 2007
            • 12517

            #50
            Originally posted by NeoPa
            NeoPa:
            I suspect Rabbit is older than he seems and this part of the code is left over from when he first created the basic code many, many years ago.
            Or perhaps not. I googled it. I only knew that it would be odd if I wasn't able to read a file in binary mode. I was admittedly confused by the purpose of the % symbol. The code seems to run fine without any of the special symbols.
            Last edited by Rabbit; Feb 13 '21, 06:00 PM.


            • Rabbit
              Recognized Expert MVP
              • Jan 2007
              • 12517

              #51
              @ADezii you have a CRLF at the end of the file
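A file holding one line of 26 letters followed by a CRLF is 28 bytes long, so a binary read returns two bytes more than the visible text. As a hedged sketch (the function name PayloadLength is hypothetical, and the array is assumed to be already dimensioned), the trailing line-break bytes could be excluded like this:

```vba
' Hypothetical helper: number of bytes in arrBytes once any trailing
' CR (13) and LF (10) bytes are ignored. Assumes arrBytes is dimensioned.
Public Function PayloadLength(ByRef arrBytes() As Byte) As Long
    Dim lngLast As Long

    lngLast = UBound(arrBytes)
    Do While lngLast >= LBound(arrBytes)
        ' Stop at the first byte from the end that is not a line-break byte.
        If arrBytes(lngLast) <> 13 And arrBytes(lngLast) <> 10 Then Exit Do
        lngLast = lngLast - 1
    Loop

    PayloadLength = lngLast - LBound(arrBytes) + 1
End Function
```

The bounds check sits in the loop condition and the array access inside the loop because And in VBA does not short-circuit.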


              • NeoPa
                Recognized Expert Moderator MVP
                • Oct 2006
                • 32633

                #52
                Originally posted by Rabbit
                Rabbit:
                I was admittedly confused by the purpose of the % symbol.
                ROFL. Let me see if I can throw some light on that for you then :-)

                Type characters (Visual Basic) tells you what they all are. Be aware, they can also be used with literals to force a particular type, such as:
                Code:
                lngVar = 32&
                Historically, these were how you stipulated that variables were of a particular type. I'm not sure exactly when Dim was introduced, but it was certainly there in VBA, from the earliest version I believe.
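A short sketch of how type characters behave on both variables and literals. The procedure and variable names are made up for illustration; TypeName is the standard VBA function that reports a value's type.

```vba
' Illustrative only: type characters on declarations and on literals.
Public Sub TypeCharacterDemo()
    Dim intCount%    ' equivalent to: Dim intCount As Integer
    Dim lngTotal&    ' Long
    Dim sngRatio!    ' Single
    Dim dblValue#    ' Double
    Dim strName$     ' String

    Debug.Print TypeName(intCount), TypeName(lngTotal), TypeName(sngRatio)
    Debug.Print TypeName(dblValue), TypeName(strName)

    ' A suffix on a literal forces its type; a bare 32 is an Integer.
    Debug.Print TypeName(32), TypeName(32&), TypeName(32!), TypeName(32#)
End Sub
```

The suffix is part of the declaration only. Once declared, the variable can be referred to with or without it.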


                • ADezii
                  Recognized Expert Expert
                  • Apr 2006
                  • 8834

                  #53
                  Well, I may be a 'little' older than you guys, but if I remember correctly, the Variable Data Type Identifiers are:
                  Code:
                  % - INTEGER
                  & - LONG
                  ! - SINGLE
                  # - DOUBLE


                  • ADezii
                    Recognized Expert Expert
                    • Apr 2006
                    • 8834

                    #54
                    @Rabbit:
                    Thanks for the Reply. I didn't realize that you had posted a Link to the Identifiers.


                    • NeoPa
                      Recognized Expert Moderator MVP
                      • Oct 2006
                      • 32633

                      #55
                      @ADezii.
                      Not only do you have a CR & an LF at the end, you're also reporting incorrect values for each with one added to each value.

                      CR = 13 = &H0D
                      LF = 10 = &H0A
                      Originally posted by ADezii
                      ADezii:
                      Well, I may be a 'little' older than you guys
                      As I recall, last time we spoke you tripped over your beard it was that long :-D

                      And what many don't realise when the elders among us shout - is that far back in the pre-historical days, when ADezii was younger, mainframes generally stored data using EBCDIC which, while it did support lower case characters, hardly ever needed to as most text data back then was stored as capitals. So. It's not shouting as much as simply reverting to old habits :-D
                      Last edited by NeoPa; Feb 13 '21, 07:33 PM.


                      • Rabbit
                        Recognized Expert MVP
                        • Jan 2007
                        • 12517

                        #56
                        @NeoPa, the return values are fine. They're shifted by one because ADezii is applying the transformation in my code that Ricardo wants to do on the data.


                        • NeoPa
                          Recognized Expert Moderator MVP
                          • Oct 2006
                          • 32633

                          #57
                          Ah. Thank you. I didn't realise.


                          • SioSio
                            Contributor
                            • Dec 2019
                            • 272

                            #58
                            Rabbit,
                            I apologize if I'm mistaken, but isn't "Step 2" necessary because the data is converted to character units?

                            P.S.
                            I wrote a version of the code using the API.
                            Memory manipulation using the API was 300 times faster than using strings, but it was still far slower than processing the data as bytes.

                            Code:
                                For i = 0 To UBound(aryByteSx1) Step 2


                            • Rabbit
                              Recognized Expert MVP
                              • Jan 2007
                              • 12517

                              #59
                              That depends.

                              Firstly, I don't believe we were ever really dealing with strings. I believe the true requirement, whether or not the OP realized it, was working with binary files.

                              Secondly, even if we were dealing with string data, the number of bytes per character depends on the encoding. ASCII encoding, for example, only uses one byte per character.
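A small sketch of the difference, assuming plain ASCII text (the procedure name is hypothetical). Assigning a VBA string directly to a byte array exposes its internal 2-byte characters, which is where a "Step 2" loop would apply, while StrConv with vbFromUnicode converts to the system ANSI code page, giving one byte per character for this text.

```vba
' Illustrative only: the same three characters as 2-byte and 1-byte data.
Public Sub BytesPerCharacterDemo()
    Dim arrWide() As Byte
    Dim arrNarrow() As Byte

    ' Direct assignment copies the string's internal 2-byte characters.
    arrWide = "abc"
    ' Conversion to the ANSI code page gives one byte per ASCII character.
    arrNarrow = StrConv("abc", vbFromUnicode)

    Debug.Print UBound(arrWide) + 1     ' 6 bytes: step through with Step 2
    Debug.Print UBound(arrNarrow) + 1   ' 3 bytes: step through one at a time
End Sub
```

Bytes read straight from a file with Get are whatever the file contains, so the right step depends on how the file was encoded rather than on how VBA stores strings.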


                              • SioSio
                                Contributor
                                • Dec 2019
                                • 272

                                #60
                                Oh, I see. I deal with 2-byte characters on a daily basis, so I didn't notice.
