How: multiple program instances sharing same data

  • Zytan

    How: multiple program instances sharing same data

    I want multiple instances of the same .exe to run and share the same
    data. I know they all can access the same file at the same time, no
    problem, but I'd like to have this data in RAM, which they can all
    access. It seems like a needless waste of memory to make them all
    maintain their own copy of the same data in RAM at the same time.

    What's the best way to achieve this?

    I've heard of memory mapped files, so maybe that's the answer. I've
    heard that .NET doesn't allow shared memory, so maybe that's my only
    choice. I would imagine making a DLL wouldn't help, since that just
    allows shared code, not data (right?).

    thanks for any help

  • Peter Duniho

    #2
    Re: How: multiple program instances sharing same data

    On 2007-11-06 14:53:12 -0800, Zytan <zytanlithium@gmail.com> said:
    >I want multiple instances of the same .exe to run and share the same
    >data. I know they all can access the same file at the same time, no
    >problem, but I'd like to have this data in RAM, which they can all
    >access. It seems like a needless waste of memory to make them all
    >maintain their own copy of the same data in RAM at the same time.
    >
    >What's the best way to achieve this?
    >
    >I've heard of memory mapped files, so maybe that's the answer. I've
    >heard that .NET doesn't allow shared memory, so maybe that's my only
    >choice. I would imagine making a DLL wouldn't help, since that just
    >allows shared code, not data (right?).

    Memory mapped files are indeed a common way to address the issue, but
    you'd have to implement that using p/invoke and/or a managed C++ DLL.

    That said, unless you're talking about a huge amount of data (say,
    hundreds of megabytes), I don't see that having a separate copy in each
    application would necessarily be bad. If for some reason the data is
    very large, you could still share the same file, but only read small
    portions at a time.
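
A minimal sketch of the "read small portions at a time" idea, written in Python only for brevity (the thread itself is about C#/.NET). The helper names `read_portion` and `make_demo_file` and the sample data are hypothetical and do not come from the thread.

```python
import os
import tempfile


def read_portion(path, offset, length):
    """Return up to `length` bytes starting at `offset`, without loading the whole file."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def make_demo_file():
    """Create a throwaway 1024-byte file (a stand-in for the shared data file)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(bytes(range(256)) * 4)
    return path


if __name__ == "__main__":
    demo_path = make_demo_file()
    try:
        # Only 4 bytes are read into memory, regardless of the file size.
        print(read_portion(demo_path, 256, 4))
    finally:
        os.remove(demo_path)
```

Each process then holds only the slice it is currently working on, at the cost of a seek and a read per access.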

    It's hard to answer the question without more specifics, but in the
    most common scenarios I don't think that a memory mapped file or
    similar would be necessary.

    Pete

    • Norapinephrine

      #3
      Re: How: multiple program instances sharing same data

      On Nov 6, 5:53 pm, Zytan <zytanlith...@gmail.com> wrote:
      >I want multiple instances of the same .exe to run and share the same
      >data. I know they all can access the same file at the same time, no
      >problem, but I'd like to have this data in RAM, which they can all
      >access. It seems like a needless waste of memory to make them all
      >maintain their own copy of the same data in RAM at the same time.
      >
      >What's the best way to achieve this?
      >
      >I've heard of memory mapped files, so maybe that's the answer. I've
      >heard that .NET doesn't allow shared memory, so maybe that's my only
      >choice. I would imagine making a DLL wouldn't help, since that just
      >allows shared code, not data (right?).
      >
      >thanks for any help

      Here's a completely off the wall possibility: how about using
      remoting? I've written up a "service" that interfaces with several
      different running apps of the same .exe. All of the .exe run local to
      the service, so it's only "remote" in name.

      In your case, each app could call remoting accessor/modifier functions
      to get/set the value. There should be a performance hit, but I'm not
      sure if that's important to your problem.

      • Zytan

        #4
        Re: How: multiple program instances sharing same data

        >Memory mapped files are indeed a common way to address the issue, but
        >you'd have to implement that using p/invoke and/or a managed C++ DLL.

        ok, I've read about this already. I knew that was an option.

        >That said, unless you're talking about a huge amount of data (say,
        >hundreds of megabytes), I don't see that having a separate copy in each
        >application would necessarily be bad. If for some reason the data is
        >very large, you could still share the same file, but only read small
        >portions at a time.

        The data multiplied by the # of copies running makes the memory
        footprint significant.

        I considered reading only portions of the file at a time, but it
        requires processing to convert the file data into usable data in
        memory, so it's not trivial. But, possible.

        >It's hard to answer the question without more specifics, but in the
        >most common scenarios I don't think that a memory mapped file or
        >similar would be necessary.

        It's basically as if they all use the same database. It could be a SQL
        database they all access, in fact, but it's nothing heavy duty that
        requires something so complex. It's just an array of data that's
        sufficiently large, shared by a sufficient number of programs, that the
        memory usage is just completely wasteful.

        thanks for your reply
        Zytan

        • Zytan

          #5
          Re: How: multiple program instances sharing same data

          >Here's a completely off the wall possibility: how about using
          >remoting? I've written up a "service" that interfaces with several
          >different running apps of the same .exe. All of the .exe run local to
          >the service, so it's only "remote" in name.
          >
          >In your case, each app could call remoting accessor/modifier functions
          >to get/set the value. There should be a performance hit, but I'm not
          >sure if that's important to your problem.

          That's a good idea, thanks for that possibility...

          Hopefully I won't need anything so complex, though...

          Zytan

          • Peter Duniho

            #6
            Re: How: multiple program instances sharing same data

            On 2007-11-07 10:01:02 -0800, Zytan <zytanlithium@gmail.com> said:
            >The data multiplied by the # of copies running makes the memory
            >footprint significant.

            Why? How big is the data? Is it really hundreds of megabytes?

            If the data is relatively small but the number of copies is large, then
            you have much worse problems than the one you're asking about here.
            Running multiple processes can quickly degrade performance, if you're
            talking dozens or more.

            >I considered reading only portions of the file at a time, but it
            >requires processing to convert the file data into usable data in
            >memory, so it's not trivial. But, possible.

            Define "usable data". You would have similar issues with memory mapped
            files anyway, since a memory mapped file is simply a file mapped to an
            otherwise typeless byte array. Depending on the language, you can
            usually reference sections of that byte array through typed structures,
            but if that's possible then you can do that with a byte array you read
            from a file as well.
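
To make the "typed structures over a typeless byte array" point concrete, here is a hedged sketch in Python (chosen only for brevity; the thread concerns C#). The record layout `RECORD` (an int32 id plus an 8-byte name) and the helper names are invented for illustration. The same decoding function is applied to a plain byte array read from the file and to a memory-mapped view of it.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: little-endian int32 id followed by an 8-byte name.
RECORD = struct.Struct("<i8s")


def write_records(path, records):
    """Write (id, name_bytes) pairs as fixed-size binary records."""
    with open(path, "wb") as f:
        for rec_id, name in records:
            f.write(RECORD.pack(rec_id, name))


def read_record(buf, index):
    """Decode record `index` from any bytes-like buffer (bytes or mmap)."""
    rec_id, name = RECORD.unpack_from(buf, index * RECORD.size)
    return rec_id, name.rstrip(b"\0").decode("ascii")


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_records(path, [(1, b"alpha"), (2, b"beta")])
        with open(path, "rb") as f:
            whole = f.read()  # ordinary byte array read from the file
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
                # Same typed access works on both buffers.
                return read_record(whole, 1), read_record(view, 1)
    finally:
        os.remove(path)
```

Either way the bytes still have to be decoded into a usable type on access; the mapping only changes where the bytes live.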

            Memory mapped files have two benefits in this scenario, the first being
            convenience (you can easily reference a large piece of data as a single
            memory object without having to create a whole new swap-file-backed
            memory allocation), and the second being the ability to create a memory
            mapped file that's really just a shared memory section (i.e. no actual
            named file on the disk). Other than that, they behave a lot like just
            reading data from an actual file into an array and then accessing that
            array via conventional means.
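
The second benefit, a shared memory section with no named file on disk, can be sketched as follows. This uses Python's `multiprocessing.shared_memory` module (Python 3.8 or later) purely as an illustration; it is not the .NET or Win32 API discussed in the thread, and the helper names `create_block` and `read_block` are hypothetical.

```python
from multiprocessing import shared_memory


def create_block(payload):
    """Create a named shared memory block and copy `payload` into it."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    return shm


def read_block(name, length):
    """Attach to an existing block by name and copy out `length` bytes.

    The length is passed explicitly because the platform may round the
    block size up (for example to a whole page).
    """
    peer = shared_memory.SharedMemory(name=name)
    try:
        return bytes(peer.buf[:length])
    finally:
        peer.close()


if __name__ == "__main__":
    owner = create_block(b"shared data")
    try:
        # Another process would call read_block with the same name.
        print(read_block(owner.name, 11))
    finally:
        owner.close()
        owner.unlink()  # release the block once no one needs it
```

In a real multi-process setup one process would create the block and the others would attach by name; here both steps run in one process only to keep the sketch short.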

            >>It's hard to answer the question without more specifics, but in the
            >>most common scenarios I don't think that a memory mapped file or
            >>similar would be necessary.
            >
            >It's basically as if they all use the same database. It could be a SQL
            >database they all access, in fact, but it's nothing heavy duty that
            >requires something so complex. It's just an array of data that's
            >sufficiently large, shared by a sufficient number of programs, that the
            >memory usage is just completely wasteful.

            Well, I don't believe that .NET offers any "out of the box" support for
            this sort of thing. I don't think any of the options that do exist
            should be all that hard to implement, but it's not going to be a
            one-liner.

            Pete

            • Norapinephrine

              #7
              Re: How: multiple program instances sharing same data

              On 7 Nov, 13:04, Zytan <zytanlith...@gmail.com> wrote:
              >>Here's a completely off the wall possibility: how about using
              >>remoting? I've written up a "service" that interfaces with several
              >>different running apps of the same .exe. All of the .exe run local to
              >>the service, so it's only "remote" in name.
              >>
              >>In your case, each app could call remoting accessor/modifier functions
              >>to get/set the value. There should be a performance hit, but I'm not
              >>sure if that's important to your problem.
              >
              >That's a good idea, thanks for that possibility...
              >
              >Hopefully I won't need anything so complex, though...
              >
              >Zytan

              The hardest part of building it is getting an example for setting up a
              remoting singleton service.
              I think I spent a few days searching the net for some code without any
              luck. Thankfully, someone I spoke to knew what to do.
              After that, it was easy as pie to maintain and use.

              • Zytan

                #8
                Re: How: multiple program instances sharing same data

                >Why? How big is the data? Is it really hundreds of megabytes?

                The data size is about 10 MB, but the memory used by the process can
                be from 6 MB to 90 MB, but usually 25 MB. The program itself is
                small, under 1 MB. (And I understand that .NET has GC issues, which
                makes it hard to see what's being used, and I know there's overhead
                for Lists and arrays and strings and objects, and things like that.)

                >If the data is relatively small but the number of copies is large, then
                >you have much worse problems than the one you're asking about here.
                >Running multiple processes can quickly degrade performance, if you're
                >talking dozens or more.

                The number of copies is about 50, times 20 MB each = 1 GB. The
                performance is not the problem. It's the memory consumption. I don't
                want to use up all the memory for multiple copies of the same data.

                >Define "usable data". You would have similar issues with memory mapped
                >files anyway, since a memory mapped file is simply a file mapped to an
                >otherwise typeless byte array. Depending on the language, you can
                >usually reference sections of that byte array through typed structures,
                >but if that's possible then you can do that with a byte array you read
                >from a file as well.

                Right. It looks like if there's no way to share the data as it
                resides in memory (in my own C# type) then the only solution is to not
                load it all at once, by either explicitly avoiding this, or by using a
                memory mapped file, which in both cases means I have to read/convert
                the file data to get it into a 'usable' C# type that I can deal with
                easily, each time I want to access it. This is ok, I think.

                >Memory mapped files have two benefits in this scenario, the first being
                >convenience (you can easily reference a large piece of data as a single
                >memory object without having to create a whole new swap-file-backed
                >memory allocation), and the second being the ability to create a memory
                >mapped file that's really just a shared memory section (i.e. no actual
                >named file on the disk). Other than that, they behave a lot like just
                >reading data from an actual file into an array and then accessing that
                >array via conventional means.

                If the file is just an array of text data, and I can access it as a
                list of strings from a memory mapped file, and ignoring the little CPU
                power required to read and convert it each time, then that's not a bad
                idea.
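
Reading one string at a time out of a memory-mapped text file might look like the sketch below. It is written in Python with the standard `mmap` module only as an illustration of the idea (the thread targets C#), it assumes one UTF-8 string per line, and the helper names `line_at` and `make_text_file` are hypothetical.

```python
import mmap
import os
import tempfile


def line_at(path, wanted):
    """Return line `wanted` (0-based) from a memory-mapped text file, or None."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            # readline() returns b"" at end of file, which stops the iteration.
            for number, raw in enumerate(iter(view.readline, b"")):
                if number == wanted:
                    # Convert the raw bytes to a string only when it is needed.
                    return raw.rstrip(b"\r\n").decode("utf-8")
    return None


def make_text_file(lines):
    """Write `lines` to a temporary file, one per line, and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for line in lines:
            f.write(line.encode("utf-8") + b"\n")
    return path
```

A linear scan like this is the "little CPU power" trade-off mentioned above; an index of line offsets would avoid rescanning, at the cost of a little private memory per process.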

                >Well, I don't believe that .NET offers any "out of the box" support for
                >this sort of thing. I don't think any of the options that do exist
                >should be all that hard to implement, but it's not going to be a
                >one-liner.

                I think there are memory mapped file libraries available; people have
                already used P/Invoke to access the required Win32 functions to do it.

                thanks Pete!

                Zytan

                • Zytan

                  #9
                  Re: How: multiple program instances sharing same data

                  >The hardest part of building it is getting an example for setting up a
                  >remoting singleton service.
                  >I think I spent a few days searching the net for some code without any
                  >luck. Thankfully, someone I spoke to knew what to do.
                  >After that, it was easy as pie to maintain and use.

                  Thanks, but I think this is too much for what I need. I'll take a
                  look at memory mapped files.

                  Thanks for your suggestion, though; it helps me make the proper
                  decision.

                  Zytan


                  • Peter Duniho

                    #10
                    Re: How: multiple program instances sharing same data

                    On 2007-11-07 12:43:50 -0800, Zytan <zytanlithium@gmail.com> said:
                    >>If the data is relatively small but the number of copies is large, then
                    >>you have much worse problems than the one you're asking about here.
                    >>Running multiple processes can quickly degrade performance, if you're
                    >>talking dozens or more.
                    >
                    >The number of copies is about 50, times 20 MB each = 1 GB.

                    For what it's worth, 50 duplicate processes is a lot of processes.
                    Your effort would be better spent changing the design so that you
                    haven't created 50 different processes.

                    Running 64-bit Windows should alleviate the concern of the number of
                    processes somewhat, but then running 64-bit Windows should also
                    alleviate the concern of the memory usage of each process as well.
                    Under 32-bit Windows, I would never design a system that requires 50
                    identical processes. There's just too much per-process overhead in
                    Windows.

                    Pete

                    • Willy Denoyette [MVP]

                      #11
                      Re: How: multiple program instances sharing same data

                      "Zytan" <zytanlithium@gmail.com> wrote in message
                      news:1194468230.689513.64710@s15g2000prm.googlegroups.com...
                      >>Why? How big is the data? Is it really hundreds of megabytes?
                      >
                      >The data size is about 10 MB, but the memory used by the process can
                      >be from 6 MB to 90 MB, but usually 25 MB. The program itself is
                      >small, under 1 MB. (And I understand that .NET has GC issues, which
                      >makes it hard to see what's being used, and I know there's overhead
                      >for Lists and arrays and strings and objects, and things like that.)
                      >
                      >>If the data is relatively small but the number of copies is large, then
                      >>you have much worse problems than the one you're asking about here.
                      >>Running multiple processes can quickly degrade performance, if you're
                      >>talking dozens or more.
                      >
                      >The number of copies is about 50, times 20 MB each = 1 GB. The
                      >performance is not the problem. It's the memory consumption. I don't
                      >want to use up all the memory for multiple copies of the same data.
                      >
                      >>Define "usable data". You would have similar issues with memory mapped
                      >>files anyway, since a memory mapped file is simply a file mapped to an
                      >>otherwise typeless byte array. Depending on the language, you can
                      >>usually reference sections of that byte array through typed structures,
                      >>but if that's possible then you can do that with a byte array you read
                      >>from a file as well.
                      >
                      >Right. It looks like if there's no way to share the data as it
                      >resides in memory (in my own C# type) then the only solution is to not
                      >load it all at once, by either explicitly avoiding this, or by using a
                      >memory mapped file, which in both cases means I have to read/convert
                      >the file data to get it into a 'usable' C# type that I can deal with
                      >easily, each time I want to access it. This is ok, I think.
                      >
                      >>Memory mapped files have two benefits in this scenario, the first being
                      >>convenience (you can easily reference a large piece of data as a single
                      >>memory object without having to create a whole new swap-file-backed
                      >>memory allocation), and the second being the ability to create a memory
                      >>mapped file that's really just a shared memory section (i.e. no actual
                      >>named file on the disk). Other than that, they behave a lot like just
                      >>reading data from an actual file into an array and then accessing that
                      >>array via conventional means.
                      >
                      >If the file is just an array of text data, and I can access it as a
                      >list of strings from a memory mapped file, and ignoring the little CPU
                      >power required to read and convert it each time, then that's not a bad
                      >idea.
                      >
                      >>Well, I don't believe that .NET offers any "out of the box" support for
                      >>this sort of thing. I don't think any of the options that do exist
                      >>should be all that hard to implement, but it's not going to be a
                      >>one-liner.
                      >
                      >I think there are memory mapped file libraries available; people have
                      >already used P/Invoke to access the required Win32 functions to do it.
                      >
                      >thanks Pete!
                      >
                      >Zytan


                      I don't get it. If you really need to run 50 copies of the same application,
                      just as is done when running on Terminal Server or Citrix, then you need
                      to keep an eye on your resource usage; that is, you need to account for the
                      necessary CPU resources and the memory resources.

                      In your specific case you say 50 * 20 MB is ~1 GB. You are reading data from
                      a common file, but you don't read the file in chunks of 20 MB, do you? You
                      also said you need to convert the data, which means that this data is not
                      sharable anyway, so your problem is nothing other than the file data, right?
                      If this is true, all you need to do is restrict the amount of data read from
                      the file. There is no need to share this file data through any mechanism
                      at all; it's just a waste of time to get it right, and it will not
                      solve anything, because you still have to account for the converted data,
                      which is not sharable.

                      There are a couple of other things that aren't clear to me. You said the
                      program is small, "under 1 MB". Well, this isn't possible: the smallest managed
                      application consumes at least 8 MB of private, non-sharable memory, so the
                      question is what you mean by this 1 MB. Also, you said the program
                      uses from 6 MB to 90 MB. What memory are you talking about here? Please
                      note that memory comes as shared and non-shared; how much of this is shared?
                      And how come the memory goes up from 6 MB to 90 MB? What memory counter are
                      you talking about, and how did you measure it?

                      IMO you are looking for a solution for something which isn't a real problem.

                      Willy.


                      • Zytan

                        #12
                        Re: How: multiple program instances sharing same data

                        >Under 32-bit Windows, I would never design a system that requires 50
                        >identical processes. There's just too much per-process overhead in
                        >Windows.

                        Agreed, but it was an evolving design, so it wasn't expected. I
                        really need to look at a single process solution to this.

                        Zytan

                        • Zytan

                          #13
                          Re: How: multiple program instances sharing same data

                          >In your specific case you say 50 * 20 MB is ~1 GB. You are reading data from
                          >a common file, but you don't read the file in chunks of 20 MB, do you?

                          I've drastically reduced the amount of data read in, and it's made
                          only a small dent in the memory usage. Thus, I think the windows
                          themselves are using up most of the 20 MB of usage, not the data. I
                          now maintain less than 1 MB of data in memory at all times, and it's
                          still using 20 MB of memory each.

                          >There are a couple of other things that aren't clear to me. You said the
                          >program is small, "under 1 MB". Well, this isn't possible: the smallest managed
                          >application consumes at least 8 MB of private, non-sharable memory, so the
                          >question is what you mean by this 1 MB.

                          The .exe is < 1 MB, so I assumed that the code itself will use < 1 MB
                          of memory. I know it accesses the .NET framework DLLs, but those
                          are all only loaded into memory once, so I ignore them. Right?

                          >Also, you said the program
                          >uses from 6 MB to 90 MB. What memory are you talking about here? Please
                          >note that memory comes as shared and non-shared; how much of this is shared?
                          >And how come the memory goes up from 6 MB to 90 MB? What memory counter are
                          >you talking about, and how did you measure it?

                          I have no idea how to measure memory usage. I use Task Manager, and
                          it shows varying amounts. The same program will use different amounts
                          of memory, usually from 23 MB to 48 MB, loading the same data, doing
                          the same things. I can only assume .NET and garbage collection are to
                          blame.

                          thanks for your reply Willy

                          Zytan

                          • Zytan

                            #14
                            Re: How: multiple program instances sharing same data

                            >Thanks for your reply and all your help. Before I read it I want to
                            >give an update: I've heard that minimizing an app will give a better
                            >idea of the memory usage: http://forums.microsoft.com/MSDN/Sho...01165&SiteID=1
                            >I tried minimizing mine, and its usage went from 27 MB to 2 MB!!
                            >
                            >Then I tried to programmatically minimize it with WindowState and even
                            >using Win32's ShowWindow(), and neither makes a difference, but if I
                            >do it manually, it works!
                            >
                            >I think this is the solution, if I can just get it to work properly.

                            No, this is not a solution. The total memory usage stays the same.
                            Just the 'mem usage' in Task Manager drops the moment you minimize,
                            and then it starts going back up. So, this really doesn't mean a
                            thing.

                            Zytan

                            • Zytan

                              #15
                              Re: How: multiple program instances sharing same data

                              Willy, your posts are a great help.

                              >No it's not, please forget about this nonsense.

                              Indeed.

                              >>Yes, I convert the data read in from the file. Basically just text
                              >>data stored in an array of structs internally.
                              >
                              >Yes, but these arrays of structs are getting allocated on the GC heap, so they
                              >consume private memory.

                              Right, so this memory isn't shared. I thought it was this data that
                              was using all the system's RAM, so I reduced it 10 fold, and it
                              doesn't make that much of a difference. So, now I'm thinking the
                              Windows Forms themselves are what is using most of the memory, with
                              all the threads that they make by default. But, I've deleted controls
                              like crazy, and it seems to make no dent at all in the overall memory
                              usage :(

                              >Sure, a 1 MB exe (MSIL) can consume more than 1 MB of code pages: first the
                              >MSIL must be loaded and then it gets run-time compiled to machine code. Both
                              >MSIL and the compiled code are non-sharable, which means that each instance of
                              >the same application will load its own copy of MSIL and will have its
                              >private copy of JITted code.

                              Ok, because the .exe isn't native to my machine, it gets just-in-time
                              compiled for it, and all of this is repeated for each instance. If I
                              had my .exe NGEN'd, then this wouldn't be the case, but then the .exe
                              would run only on the machine I NGEN'd it for, unless I can NGEN it
                              for the lowest common denominator CPU that it'd be running on, like
                              Win2000, so it'll run on Win2K and WinXP, and WinVista. Right?

                              >The native DLLs and most of the framework (V2) libraries are NGEN'd, so
                              >they are shared amongst the instances.

                              I see, I was just assuming ALL of the framework (I use .NET 2.0) was
                              shared.

                              >Most of the framework libraries are compiled (NGEN'd) and as such they are
                              >shared, but their MSIL must be loaded anyway, as the CLR needs their metadata at
                              >run-time. Also, some security and versioning constraints may prevent NGEN'd
                              >methods from running, so these need to get JIT compiled.

                              Ah.

                              >Perfmon and Process Explorer all use the same perf. counters, so you may
                              >use Process Explorer; as long as we are talking about the same counters there
                              >is no problem.

                              Ok, sure.

                              >What counts here is the Private part, so you should watch the Private
                              >Process counters for your process. The GC heap is part of the private
                              >memory of a process. The amount of memory you consume from the GC heap
                              >depends on your allocation scheme: the more and the larger the objects you
                              >create, the larger the GC heap grows and consequently the larger your
                              >Private bytes.

                              Ok, my private bytes usage is still very high (20 MB). But it could
                              be high just because GC has not cleaned up, right?

                              >>I could probably reduce the stack space by half or more, I bet,
                              >>without causing problems, I forgot about that. I'll have to see how I
                              >>can change that.
                              >
                              >You can't do this for the 3 threads created by the run-time; you can only
                              >change the default size of the stack of the threads you create explicitly.

                              Right.
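
The distinction between runtime-created threads and threads you create yourself can be illustrated with a small sketch. It is written in Python, where `threading.stack_size` sets the stack size for threads created after the call; this is only an analogy for the .NET case discussed here, and the helper name `run_with_stack` and the 256 KiB figure are assumptions chosen for the example.

```python
import threading


def run_with_stack(size_bytes, work):
    """Run `work()` on a new thread that uses a stack of `size_bytes` bytes."""
    # Affects only threads created after this call; returns the old setting.
    previous = threading.stack_size(size_bytes)
    try:
        result = []
        worker = threading.Thread(target=lambda: result.append(work()))
        worker.start()
        worker.join()
        return result[0]
    finally:
        # Restore the earlier setting so other threads are unaffected.
        threading.stack_size(previous)


if __name__ == "__main__":
    # 256 KiB instead of the platform default; already-running threads keep theirs.
    print(run_with_stack(256 * 1024, lambda: sum(range(10))))
```

Threads that already exist, including any started by the runtime itself, keep whatever stack they were given, which matches the limitation described above.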

                              >>I've seen a simple do nothing .NET app take 10 MB
                              >
                              >Yep, it all depends on what you call a .NET app; a console application takes
                              >less than a Windows Forms app. It also depends on the version of the
                              >framework: newer versions have larger System and Mscorlib libraries.

                              I am talking about a Windows .NET app, for 8 to 10 MB. Console uses
                              about half, 5 MB, for a "do nothing". This is why I may remove the
                              GUI completely, just to save 5 MB each. I am running .NET 2.0, btw.

                              >>>So, I would suggest you start by:
                              >>>- measuring the Virtual bytes consumed by a single instance.
                              >>>- NGEN the exe and measure again; if the Virtual bytes are considerably
                              >>>  lower, keep using the NGEN'd image in your scenario, else forget it.
                              >>
                              >>Is NGEN "Native Image Generator"?
                              >
                              >Yep.
                              >Use it whenever you need to run several instances of the same application;
                              >it produces sharable code images (see above). You can't live without it when
                              >running on Terminal Server, for instance.

                              I still haven't tried this yet, and I will when I find some time, and
                              report back here.

                              >Well, try with NGEN first. And watch memory consumption in Process Explorer.

                              Thanks for all your help, Willy!!!

                              Zytan
