FileSystemWatcher question

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • =?Utf-8?B?Tmljaw==?=

    FileSystemWatcher question

    Hello,

    I've got a FileSystemWatch er object setup to check for new files in a
    folder. I've never done much with threading, but I think I mght need to here.
    When I find a new file there are times when I need to do some time consuming
    tasks. When this happens, as it is now, the FileSystemWatch er doesn't seem to
    see all of the files if I move a large batch of files in the watch folder at
    once. Could someone give me a push in the right direction? Thanks for any
    help, I really appreciate it.

    Thanks,
    Nick
  • Peter Morris

    #2
    Re: FileSystemWatch er question

    public class FileProcessor
    {
    private object SyncRoot = new object();
    private Queue<stringQue ue = new Queue<string>;
    private bool Terminated = false;

    public FileProcessor()
    {
    var threadStart = new ThreadStart(Exe cute);
    new Thread(threadSt art).Start();
    }

    public void QueueFile(strin g fileName)
    {
    lock (SyncRoot)
    {
    Queue.Enqueue(f ileName);
    Monitor.Pulse(S yncRoot);
    }
    }

    private void Execute()
    {
    while (!Terminated)
    {
    string fileName = null;
    lock(SyncRoot)
    {
    if (Queue.Count 0 && !Terminated)
    fileName = Queue.Dequeue() ;
    }

    if (fileName != null)
    ProcessFile(fil eName);
    else
    Monitor.WaitFor (SyncRoot);
    }
    }

    public void Terminate()
    {
    Terminated = true;
    lock (SyncRoot)
    {
    Monitor.Pulse(S yncRoot);
    }
    }



    }


    Something like that anyway. Probably wont even compile, but you never know
    your luck :-)




    --
    Pete
    ====



    Comment

    • Peter Duniho

      #3
      Re: FileSystemWatch er question

      On Wed, 19 Nov 2008 15:04:35 -0800, Peter Morris
      <mrpmorrisNO@sp amgmail.comwrot e:
      Something like that anyway. Probably wont even compile, but you never
      know your luck :-)
      Nope, it won't. And even if it did, it would throw an exception. But you
      were _really_ close. :)

      Here's how I might have done it, were I to implement a class like that
      (note that other than fixing the compile and run-time errors, I also
      changed the termination condition so as to simplify the class a bit, and
      added a possible implementation for the "ProcessFil e()" method) (also,
      more comments at the end):

      public class FileProcessor
      {
      private object SyncRoot = new object();
      private Queue<stringque ue = new Queue<string>;

      public FileProcessor()
      {
      new Thread(Execute) .Start();
      }

      public void QueueFile(strin g fileName)
      {
      if (fileName == null)
      {
      throw new ArgumentNullExc eption("fileNam e");
      }

      lock (SyncRoot)
      {
      queue.Enqueue(f ileName);
      Monitor.Pulse(S yncRoot);
      }
      }

      private void Execute()
      {
      lock (SyncRoot)
      {
      while (true)
      {
      if (queue.Count 0)
      {
      string fileName = queue.Dequeue() ;

      if (fileName == null)
      {
      break;
      }

      ProcessFile(fil eName);
      }
      else
      {
      Monitor.Wait(Sy ncRoot);
      }
      }

      }
      }

      public void Terminate()
      {
      lock (SyncRoot)
      {
      queue.Enqueue(n ull);
      Monitor.Pulse(S yncRoot);
      }
      }

      // Subscribe to the FileFound event to be able to process filenames
      // registered with this class
      public event Action<stringFi leFound;

      private void ProcessFile(str ing fileName)
      {
      Action<stringha ndler = FileFound;

      if (handler != null)
      {
      handler(fileNam e);
      }
      }
      }

      Of course, I make no guarantees that this will compile either. :) Just
      because I fixed some of the previous mistakes, that doesn't mean I didn't
      introduce my own.

      To the OP: I think the point of the above is that Pete M. is suggesting
      that you don't do lengthy processing in your event handler(s) for the
      FileSystemWatch er event(s), but rather just queue filenames to a separate
      consumer class and let it handle things as it can. The above would allow
      that.

      Keep in mind, however, that while doing this should improve things, by
      keeping the FileSystemWatch er class ready and available to monitor other
      changes, it simply cannot be relied upon to give 100% notification for
      every change that you are trying to look for. Part of your processing
      when the FileSystemWatch er class notifies you of a change should be to
      look for other changes in the same directory, in case changes came too
      quickly for the FileSystemWatch er class to notice them all (it relies on a
      fixed-sized buffer than if filled before additions can be processed will
      cause some notifications to be lost).

      So even with a producer/consumer setup as Pete M. suggests, you are not
      guaranteed 100% reliable behavior. But you should be able to get a lot
      closer than if you are doing lengthy processing in your FileSystemWatch er
      event handlers.

      Pete

      Comment

      • =?Utf-8?B?Tmljaw==?=

        #4
        Re: FileSystemWatch er question

        Many thanks to both of you for your help. I'll give this a try.

        "Peter Duniho" wrote:
        On Wed, 19 Nov 2008 15:04:35 -0800, Peter Morris
        <mrpmorrisNO@sp amgmail.comwrot e:
        >
        Something like that anyway. Probably wont even compile, but you never
        know your luck :-)
        >
        Nope, it won't. And even if it did, it would throw an exception. But you
        were _really_ close. :)
        >
        Here's how I might have done it, were I to implement a class like that
        (note that other than fixing the compile and run-time errors, I also
        changed the termination condition so as to simplify the class a bit, and
        added a possible implementation for the "ProcessFil e()" method) (also,
        more comments at the end):
        >
        public class FileProcessor
        {
        private object SyncRoot = new object();
        private Queue<stringque ue = new Queue<string>;
        >
        public FileProcessor()
        {
        new Thread(Execute) .Start();
        }
        >
        public void QueueFile(strin g fileName)
        {
        if (fileName == null)
        {
        throw new ArgumentNullExc eption("fileNam e");
        }
        >
        lock (SyncRoot)
        {
        queue.Enqueue(f ileName);
        Monitor.Pulse(S yncRoot);
        }
        }
        >
        private void Execute()
        {
        lock (SyncRoot)
        {
        while (true)
        {
        if (queue.Count 0)
        {
        string fileName = queue.Dequeue() ;
        >
        if (fileName == null)
        {
        break;
        }
        >
        ProcessFile(fil eName);
        }
        else
        {
        Monitor.Wait(Sy ncRoot);
        }
        }
        >
        }
        }
        >
        public void Terminate()
        {
        lock (SyncRoot)
        {
        queue.Enqueue(n ull);
        Monitor.Pulse(S yncRoot);
        }
        }
        >
        // Subscribe to the FileFound event to be able to process filenames
        // registered with this class
        public event Action<stringFi leFound;
        >
        private void ProcessFile(str ing fileName)
        {
        Action<stringha ndler = FileFound;
        >
        if (handler != null)
        {
        handler(fileNam e);
        }
        }
        }
        >
        Of course, I make no guarantees that this will compile either. :) Just
        because I fixed some of the previous mistakes, that doesn't mean I didn't
        introduce my own.
        >
        To the OP: I think the point of the above is that Pete M. is suggesting
        that you don't do lengthy processing in your event handler(s) for the
        FileSystemWatch er event(s), but rather just queue filenames to a separate
        consumer class and let it handle things as it can. The above would allow
        that.
        >
        Keep in mind, however, that while doing this should improve things, by
        keeping the FileSystemWatch er class ready and available to monitor other
        changes, it simply cannot be relied upon to give 100% notification for
        every change that you are trying to look for. Part of your processing
        when the FileSystemWatch er class notifies you of a change should be to
        look for other changes in the same directory, in case changes came too
        quickly for the FileSystemWatch er class to notice them all (it relies on a
        fixed-sized buffer than if filled before additions can be processed will
        cause some notifications to be lost).
        >
        So even with a producer/consumer setup as Pete M. suggests, you are not
        guaranteed 100% reliable behavior. But you should be able to get a lot
        closer than if you are doing lengthy processing in your FileSystemWatch er
        event handlers.
        >
        Pete
        >

        Comment

        • Peter Morris

          #5
          Re: FileSystemWatch er question

          Hi Pete

          Any chance you could explain a few things?
          private object SyncRoot = new object();
          private Queue<stringque ue = new Queue<string>;
          >
          public FileProcessor()
          {
          new Thread(Execute) .Start();
          }
          Yeah, I was tempted to do the same thread start but couldn't remember if I
          needed ThreadStart or not, and I wasn't willing to start up VS :_)

          public void QueueFile(strin g fileName)
          {
          if (fileName == null)
          {
          throw new ArgumentNullExc eption("fileNam e");
          }
          hehe. You work too hard :-) For newsgroups I just can't be bothered with
          stuff like this, it's bad enough that I have to write it in my real code let
          alone newsgroups. I suppose I should try to be a perfectionist in NG source
          as well as my own, but at the end of the day I think the person reading has
          the responsibility to learn some good techniques of their own rather than
          responders being responsible for teaching them.
          private void Execute()
          {
          lock (SyncRoot)
          {
          while (true)
          This is where I am starting to disagree with you :-)

          {
          if (queue.Count 0)
          {
          string fileName = queue.Dequeue() ;
          >
          if (fileName == null)
          {
          break;
          That's it, we are now officially in disagreement :-) Why would you go for
          such a vague way of terminating the thread rather than my crystal clear
          "Terminated " member?

          Other than the change to how I terminate the thread + the check for null I
          don't see how yours is functionally different from mine, what were your
          thoughts at the time of reading my source?
          // Subscribe to the FileFound event to be able to process filenames
          // registered with this class
          public event Action<stringFi leFound;
          >
          private void ProcessFile(str ing fileName)
          {
          Action<stringha ndler = FileFound;
          >
          if (handler != null)
          {
          handler(fileNam e);
          }
          }
          }
          I think I see now! You are showing him how to handle ProcessFile! There
          was just no way I was going to bother :-)

          To the OP: I think the point of the above is that Pete M. is suggesting
          that you don't do lengthy processing in your event handler(s) for the
          FileSystemWatch er event(s), but rather just queue filenames to a separate
          consumer class and let it handle things as it can. The above would allow
          that.
          Exactly. In my service that does something similar I have a class that
          decides what to do with the file based on its location + extension. It goes
          into one of three of these queues based on how long the operation takes

          Fast - Typically moves the file elsewhere
          Intermediate - Typically unzips a small file
          Long - Generates a MySQL db and zips it

          So that the long process doesn't hold up the stuff that should occur
          immediately despite the fact that there is a long term operation in
          progress.
          FileSystemWatch er class to notice them all (it relies on a fixed-sized
          buffer than if filled before additions can be processed will cause some
          notifications to be lost).
          Any idea what size this buffer is? For me there will be a maximum of 10
          people connecting at once so I doubt I will need to be concerned, but it's
          always interesting to know.



          --
          Pete
          ====



          Comment

          • Peter Duniho

            #6
            Re: FileSystemWatch er question

            On Thu, 20 Nov 2008 09:37:53 -0800, Peter Morris
            <mrpmorrisNO@sp amgmail.comwrot e:
            Hi Pete
            >
            Any chance you could explain a few things?
            >
            > private object SyncRoot = new object();
            > private Queue<stringque ue = new Queue<string>;
            >>
            > public FileProcessor()
            > {
            > new Thread(Execute) .Start();
            > }
            >
            Yeah, I was tempted to do the same thread start but couldn't remember if
            I needed ThreadStart or not, and I wasn't willing to start up VS :_)
            Since C# 2.0, the compiler will correctly infer the necessary delegate
            type and create it for you.

            Your approach isn't wrong, I just prefer the more concise syntax.
            >
            > public void QueueFile(strin g fileName)
            > {
            > if (fileName == null)
            > {
            > throw new ArgumentNullExc eption("fileNam e");
            > }
            >
            hehe. You work too hard :-) For newsgroups I just can't be bothered
            with stuff like this,
            Understandable. To each his own. Note, however, to some extent I put
            that in to make it very clear that, with my change to the termination
            condition, the _only_ way a null should be able to get into the queue is
            via the Terminate() method. The same thing would apply to your own code
            as well, though the consequences of a mistake would be less drastic in
            your version (the processing thread could pause prematurely, whereas in my
            mine it would just exit altogether).
            [...]
            > {
            > if (queue.Count 0)
            > {
            > string fileName = queue.Dequeue() ;
            >>
            > if (fileName == null)
            > {
            > break;
            >
            That's it, we are now officially in disagreement :-) Why would you go
            for such a vague way of terminating the thread rather than my crystal
            clear "Terminated " member?
            Why do you say it is vague? There's no ambiguity at all here.
            Other than the change to how I terminate the thread + the check for null
            I don't see how yours is functionally different from mine, what were
            your thoughts at the time of reading my source?
            I've removed a variable for one. With the removal of that variable, not
            only have I saved as many as 4 bytes from the size of my class, I have
            avoided the risk of making the mistake you did: failing to either put the
            use of that variable inside a synchronized block of code, or making the
            variable "volatile". :)

            Another maintainance advantage is that by putting the termination
            condition in the queue itself, I can then keep the "wait or dequeue" code
            completely separate from the "dequeue then process" code. In your
            version, those two functions of the block of code are intertwined. In
            mine, they are clearly delineated, separate from each other.

            Assuming one fixes your bug, there is no _functional_ difference between
            the two. Both techniques would successfully allow the thread to exit.
            But, I find my way simpler and thus more maintainable. The fewer
            variables you have, the fewer chances you have to mess up how you use
            those variables, and I like to be able to understand the code in smaller
            pieces where possible.

            I admit that the difference is relatively minor (assuming the other
            approach is implemented correctly ;) ), and I wouldn't spend a lot of time
            trying to argue in favor of one or the other. But hopefully the above
            adequately describes why I do in fact have a preference of one over the
            other.
            > // Subscribe to the FileFound event to be able to process filenames
            > // registered with this class
            > public event Action<stringFi leFound;
            >>
            > private void ProcessFile(str ing fileName)
            > {
            > Action<stringha ndler = FileFound;
            >>
            > if (handler != null)
            > {
            > handler(fileNam e);
            > }
            > }
            >}
            >
            I think I see now! You are showing him how to handle ProcessFile!
            There was just no way I was going to bother :-)
            lol...well, in that case you just guarantee that your code example
            wouldn't compile. :)
            [...]
            >FileSystemWatc her class to notice them all (it relies on a fixed-sized
            >buffer than if filled before additions can be processed will cause
            >some notifications to be lost).
            >
            Any idea what size this buffer is? For me there will be a maximum of 10
            people connecting at once so I doubt I will need to be concerned, but
            it's always interesting to know.
            No, I don't know. Last time I looked at this, I did take a look to see if
            it was documented, but couldn't find any specifics. Just the admonition
            to try to process the notifications as quickly as possible. I'm not sure
            the exact number would be very useful anyway, since whether the buffer
            gets filled depends not only on how fast one processes the notifications,
            but how frequently changes are happening to the file system. In addition,
            the two relate according to things that are independent of the buffer
            size: CPU speed, disk i/o bandwidth, stuff like that.

            Pete

            Comment

            • Peter Duniho

              #7
              Re: FileSystemWatch er question

              On Fri, 21 Nov 2008 02:58:32 -0800, Peter Morris
              <mrpmorrisNO@sp amgmail.comwrot e:
              Hi Pete
              >
              >
              >Why do you say it is vague? There's no ambiguity at all here.
              >
              In your case I see a null entry in the list as a kind of "magic number"
              approach. If the value is null you end, or maybe you would put in the
              string "STOP" or "FINISH" or "QUIT". My point is that stopping based on
              the value of an entry is vague because what that value should be is
              entirely up to the person writing the code, different people would have
              different opinions on what that value should be. However, having a
              Terminated boolean member to indicated that the Terminate has been
              called is obvious, the value will be "true" if terminated, it is
              indisputible.
              >
              >
              >I've removed a variable for one. With the removal of that variable,
              >not only have I saved as many as 4 bytes from the size of my class, I
              >have avoided the risk of making the mistake you did: failing to either
              >put the use of that variable inside a synchronized block of code, or
              >making the variable "volatile". :)
              >
              Yes, I did miss that, didn't I :-) I'd still rather go with the boolean
              approach though :-)
              >
              >Assuming one fixes your bug, there is no _functional_ difference
              >between the two.
              >
              Except when you terminate :-)
              Ah, I see what you mean. Yes, you're right...there's a slight
              difference. Though, I'll point out that one reason it's hard to detect
              (you had trouble yourself :) ) is that you've got your "dequeue" mixed up
              with your "terminate" . :)

              That behavior isn't precluded by using a sentinel value (and for the
              record, sentinel values are quite common in software...ther e's nothing
              fundamentally wrong with them), but you'd have to use an implementation of
              Queue<Tthat allows for inserting at the head (i.e. not .NET's
              implementation) . You could write your own, but I agree that doing so
              _just_ to avoid the flag wouldn't make sense.

              Also note that the code you posted doesn't actually behave exactly as you
              say. In particular, this statement isn't true:
              My approach will stop processing items in the loop immediately,
              The code you posted doesn't terminate when you call Terminate(). It
              simply goes back and waits again. It won't terminate until the _next_
              time the monitor is pulsed (whether that happens due to another call to
              Terminate() or someone trying to enqueue something).

              Again, I think this illustrates how a simpler design is easier to code
              correctly. By my count, we're up to three bugs in your implementation
              just related to termination alone. I got my implementation right the
              first try, and I don't think you can say it's because I'm a better
              programmer. I'm not...I just know better than to complicate things when I
              don't have to. :)

              Now, if you have to -- that is, you have that specific requirement to
              terminate immediately -- I suppose you have to implement it that way. But
              neither of us have the actual specification to work with, and it's
              entirely possible my version would work fine in this context.
              [...]
              I'm more of a "here's the general idea, now write it youself" kind of
              helper. The best lesson you can learn is how to teach yourself :-)
              Well, I agree with that generally. In fact, you'll note that I rarely
              actually post actual code. I prefer to just describe a solution, or point
              someone to the documentation. But, I'm of the mind that when I do post
              code, I should try to get it as close to correct as possible. Just
              because I'm not being paid, that doesn't mean I shouldn't take pride in my
              work. :)
              >>Any idea what size this buffer is? For me there will be a maximum of
              >>10 people connecting at once so I doubt I will need to be concerned,
              >>but it's always interesting to know.
              >>
              >No, I don't know. Last time I looked at this, I did take a look to see
              >if it was documented, but couldn't find any specifics.
              >
              I should be okay with around 10 at the time time, maximum, I hope! This
              should definitely be documented, and obvious!
              As long as in your case, 10 connections translates directly to no more
              than 10 file system changes at once, then I think you're definitely safe.
              That's an odd guarantee to be able to have, but if you have it, no
              problem. :)

              Pete

              Comment

              • =?Utf-8?B?Tmljaw==?=

                #8
                Re: FileSystemWatch er question

                Thanks again to both of you for the excellent discussion.

                "Peter Duniho" wrote:
                On Fri, 21 Nov 2008 02:58:32 -0800, Peter Morris
                <mrpmorrisNO@sp amgmail.comwrot e:
                >
                Hi Pete

                Why do you say it is vague? There's no ambiguity at all here.
                In your case I see a null entry in the list as a kind of "magic number"
                approach. If the value is null you end, or maybe you would put in the
                string "STOP" or "FINISH" or "QUIT". My point is that stopping based on
                the value of an entry is vague because what that value should be is
                entirely up to the person writing the code, different people would have
                different opinions on what that value should be. However, having a
                Terminated boolean member to indicated that the Terminate has been
                called is obvious, the value will be "true" if terminated, it is
                indisputible.

                I've removed a variable for one. With the removal of that variable,
                not only have I saved as many as 4 bytes from the size of my class, I
                have avoided the risk of making the mistake you did: failing to either
                put the use of that variable inside a synchronized block of code, or
                making the variable "volatile". :)
                Yes, I did miss that, didn't I :-) I'd still rather go with the boolean
                approach though :-)
                Assuming one fixes your bug, there is no _functional_ difference
                between the two.
                Except when you terminate :-)
                >
                Ah, I see what you mean. Yes, you're right...there's a slight
                difference. Though, I'll point out that one reason it's hard to detect
                (you had trouble yourself :) ) is that you've got your "dequeue" mixed up
                with your "terminate" . :)
                >
                That behavior isn't precluded by using a sentinel value (and for the
                record, sentinel values are quite common in software...ther e's nothing
                fundamentally wrong with them), but you'd have to use an implementation of
                Queue<Tthat allows for inserting at the head (i.e. not .NET's
                implementation) . You could write your own, but I agree that doing so
                _just_ to avoid the flag wouldn't make sense.
                >
                Also note that the code you posted doesn't actually behave exactly as you
                say. In particular, this statement isn't true:
                >
                My approach will stop processing items in the loop immediately,
                >
                The code you posted doesn't terminate when you call Terminate(). It
                simply goes back and waits again. It won't terminate until the _next_
                time the monitor is pulsed (whether that happens due to another call to
                Terminate() or someone trying to enqueue something).
                >
                Again, I think this illustrates how a simpler design is easier to code
                correctly. By my count, we're up to three bugs in your implementation
                just related to termination alone. I got my implementation right the
                first try, and I don't think you can say it's because I'm a better
                programmer. I'm not...I just know better than to complicate things when I
                don't have to. :)
                >
                Now, if you have to -- that is, you have that specific requirement to
                terminate immediately -- I suppose you have to implement it that way. But
                neither of us have the actual specification to work with, and it's
                entirely possible my version would work fine in this context.
                >
                [...]
                I'm more of a "here's the general idea, now write it youself" kind of
                helper. The best lesson you can learn is how to teach yourself :-)
                >
                Well, I agree with that generally. In fact, you'll note that I rarely
                actually post actual code. I prefer to just describe a solution, or point
                someone to the documentation. But, I'm of the mind that when I do post
                code, I should try to get it as close to correct as possible. Just
                because I'm not being paid, that doesn't mean I shouldn't take pride in my
                work. :)
                >
                >Any idea what size this buffer is? For me there will be a maximum of
                >10 people connecting at once so I doubt I will need to be concerned,
                >but it's always interesting to know.
                >
                No, I don't know. Last time I looked at this, I did take a look to see
                if it was documented, but couldn't find any specifics.
                I should be okay with around 10 at the time time, maximum, I hope! This
                should definitely be documented, and obvious!
                >
                As long as in your case, 10 connections translates directly to no more
                than 10 file system changes at once, then I think you're definitely safe.
                That's an odd guarantee to be able to have, but if you have it, no
                problem. :)
                >
                Pete
                >

                Comment

                Working...