File.Move Access Issue

  • mcfly1204
    New Member
    • Jul 2007
    • 233

    I am attempting to move a file, but I am receiving an exception stating that the file cannot be moved because it is in use. I realize this is a common error, and also that my application is the one that is keeping the file open, but what I don't know is where.

    I am opening the file (.html) using the Html Agility Pack, which has a .Load method. There is not a way, that I know of, to unload or close a document once open. Other than that, I scrape some data from the .html document but not a lot more. I am also downloading a file (using webclient) from a link from the .html file, but I cannot move that file either. Note my code below.

    Code:
    private void DeleteFiles(string strPODoc)
    {
        string strDestination;
        string strPOFile = ParseString(strPODoc, @"c:\");

        strDestination = @"\\epsa\orders\PROCESSED\";

        if (File.Exists(strPODoc))
        {
            File.Move(strPODoc, strDestination + strPOFile);
            Console.WriteLine(strPODoc + " has been processed");
        }
        else
        {
            Console.WriteLine(strPODoc + " does not exist");
            Console.Read();
        }
    }
  • balabaster
    Recognized Expert Contributor
    • Mar 2007
    • 798

    #2
    If the Agility pack is loading the file and not allowing you to unload it and that's what's causing the lock, then my view is that you need to ditch the agility pack and use something more productive... either that or you need to email the producers of this pack and let them know that there's a shortfall in their API that needs addressing in short order.

    How exactly are you scraping the file? In the past, I've always used regular expressions to scrape files, regardless of the format. This gives you far more control in your application, including over when files are opened and closed.

    • mcfly1204
      New Member
      • Jul 2007
      • 233

      #3
      With the agility pack, I can collect all similar tags into an array, and then just select which one I need to use. I briefly looked at using regular expressions to parse the html, but I was very pleased with the ease of use of the Html Agility Pack.

      • mcfly1204
        New Member
        • Jul 2007
        • 233

        #4
        I should also note that I am not able to move the file I download with WebClient. Even if I call WebClient.Dispose(), I cannot gain access to the file.

        • nukefusion
          Recognized Expert New Member
          • Mar 2008
          • 221

          #5
          Originally posted by mcfly1204
          I should also note that I am not able to move the file I download with WebClient. Even if I call WebClient.Dispose(), I cannot gain access to the file.
          How are you downloading the file? Using the DownloadFile() method of the WebClient?
          Do you access the file using any sort of FileStream?
          If possible, a small code sample of your download function would certainly be helpful.

          • balabaster
            Recognized Expert Contributor
            • Mar 2007
            • 798

            #6
            Originally posted by mcfly1204
            With the agility pack, I can collect all similar tags into an array, and then just select which one I need to use. I briefly looked at using regular expressions to parse the html, but I was very pleased with the ease of use of the Html Agility Pack.
            By similar, do you mean, for example, all the anchor tags, or all the paragraph tags, and so on?

            <a href="firstlink.aspx">First Link Text</a>
            <a href="secondlink.aspx">Second Link Text</a>
            <a href="thirdlink.aspx">Third Link Text</a>
            ...

            Using Regex you can grab them:

            Code:
            Dim MatchCol = Regex.Matches(HtmlString, "(?i:<a.*?</a>)")
            I'd probably use that...MatchCol is then a collection of all my anchor tags. You could do something similar for other tags changing up some of the inside pattern.

            There are a whole bunch of example patterns on the net for this, but most of them reference a longer pattern that doesn't allow for things like anchor tags that contain images instead of text.
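
            For a C# project, a minimal sketch of the same idea might look like the following. AnchorScraper and FindAnchors are names made up for illustration. The pattern also adds \b after <a (so that tags such as <abbr> are not picked up) and uses the Singleline option (so anchors spanning several lines still match); both are assumptions on top of the pattern above, not part of it.

            ```csharp
            using System;
            using System.Text.RegularExpressions;

            static class AnchorScraper
            {
                // Returns every <a ...>...</a> element found in the markup, in document order.
                public static string[] FindAnchors(string html)
                {
                    // Lazy match from an opening <a ...> to the nearest </a>, ignoring case.
                    MatchCollection matches = Regex.Matches(
                        html, @"<a\b.*?</a>", RegexOptions.IgnoreCase | RegexOptions.Singleline);

                    string[] anchors = new string[matches.Count];
                    for (int i = 0; i < matches.Count; i++)
                    {
                        anchors[i] = matches[i].Value;
                    }
                    return anchors;
                }

                static void Main()
                {
                    // Hypothetical markup: one text link and one image link.
                    string html = "<p><a href=\"first.aspx\">First</a> <A HREF=\"second.aspx\"><img src=\"x.png\"></A></p>";
                    foreach (string anchor in FindAnchors(html))
                    {
                        Console.WriteLine(anchor);
                    }
                }
            }
            ```

            Because the array is sized from matches.Count, it cannot overflow when a page contains more links than expected.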

            • mcfly1204
              New Member
              • Jul 2007
              • 233

              #7
              Originally posted by nukefusion
              How are you downloading the file? Using the DownloadFile() method of the WebClient?
              Do you access the file using any sort of FileStream?
              If possible, a small code sample of your download function would certainly be helpful.
              I download the file using the following:

              Code:
              private void GetVectorArt(string strPODoc)
              {
                  string strURL = strBaseURL + strArtURL;
                  string strPath = @"c:\" + strPO + "download.zip";

                  WebClient client = new WebClient();

                  client.DownloadFile(strURL, strPath);

                  this.SendToArtwork(strPath, strPODoc);
              }
              Once downloaded, I send an email with the file as an attachment. After that, I am simply trying to move or delete the downloaded file.

                • mcfly1204
                  New Member
                  • Jul 2007
                  • 233

                  #9
                  Originally posted by balabaster
                  By similar, do you mean, for example, all the anchor tags, or all the paragraph tags, and so on?

                  <a href="firstlink.aspx">First Link Text</a>
                  <a href="secondlink.aspx">Second Link Text</a>
                  <a href="thirdlink.aspx">Third Link Text</a>
                  ...

                  Using Regex you can grab them:

                  Code:
                  Dim MatchCol = Regex.Matches(HtmlString, "(?i:<a.*?</a>)")
                  I'd probably use that...MatchCol is then a collection of all my anchor tags. You could do something similar for other tags changing up some of the inside pattern.

                  There are a whole bunch of example patterns on the net for this, but most of them reference a longer pattern that doesn't allow for things like anchor tags that contain images instead of text.
                  I am in the process of changing from the Agility Pack to regular expressions as you have recommended, but I seem to be having some trouble with nested tags. Do you have much regex experience with HTML? I have been using this expression for TDs:

                  (?s)(?<=<td[^>]*>.*?)<td[^>]*>.*?</td>

                  • balabaster
                    Recognized Expert Contributor
                    • Mar 2007
                    • 798

                    #10
                    Originally posted by mcfly1204
                    I am in the process of changing from the Agility Pack to regular expressions as you have recommended, but I seem to be having some trouble with nested tags. Do you have much regex experience with HTML? I have been using this expression for TDs:

                    (?s)(?<=<td[^>]*>.*?)<td[^>]*>.*?</td>
                    Ah yes, this particular element of HTML can be a bit more of a chore when it comes to regular expressions. Not to say it can't be done if you know what you're doing.

                    The following regular expression will find the corresponding closing tag for your opening tag:

                    (?x:<td>(?>(?!<td>|</td>).|<td>(?<Depth>)|</td>(?<-Depth>))*(?(Depth)(?!))</td>)
                    You would then need to recurse through each nested item that contained other nested items. This will pull out your top level cells from the table as a collection. You'd then need to figure out which (if any) of those contained nested tables and then do the same thing.

                    Now, this might be easier with something like the Agility pack, I don't know, I've not used it.

                    With XElement and XQuery you can basically strip out a collection of all <td> elements, which you can then work on as an entity. However, the XML objects aren't great with HTML. I've not done an awful lot with HTML; I've done a bunch of XML, and while the two are remarkably similar, there are aspects of HTML that the XML objects just fall over and die on, such as elements that don't have closing tags, like <IMG>, <BR> and <HR>. While those old forms have been deprecated in favour of the XML-style <IMG />, <HR /> and <BR />, there are still many websites using the old forms that will trip your application up.

                    To be honest though, if the Agility pack is locking the file and not allowing you to copy it, I would get onto them and report it as a bug and tell them it needs fixing. I hate having to code around bugs, and for something as potentially complex as nested tags, like tables, spans, divs, fonts etc, Regular Expressions could have the potential to become a nightmare if you don't know what you're doing or if you need to start getting complicated with what you're trying to pull out of the HTML.
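
                    As a rough C# sketch of how the balancing-group pattern above could be applied: TdScraper and TopLevelCells are hypothetical names, and the opening tag is loosened to <td\b[^>]*> so that cells with attributes also count, which is an assumption rather than part of the pattern as posted. It only copes with well-formed markup where every cell is closed.

                    ```csharp
                    using System;
                    using System.Collections.Generic;
                    using System.Text.RegularExpressions;

                    static class TdScraper
                    {
                        // Depth is pushed for each nested <td ...> and popped for each nested </td>;
                        // the final conditional rejects a match that still has unclosed nested cells.
                        private const string CellPattern =
                            @"<td\b[^>]*>" +
                            @"(?>(?!<td\b|</td>).|<td\b[^>]*>(?<Depth>)|</td>(?<-Depth>))*" +
                            @"(?(Depth)(?!))</td>";

                        // Returns the outermost <td>...</td> elements, each including any nested cells.
                        public static List<string> TopLevelCells(string html)
                        {
                            List<string> cells = new List<string>();
                            foreach (Match match in Regex.Matches(
                                html, CellPattern, RegexOptions.IgnoreCase | RegexOptions.Singleline))
                            {
                                cells.Add(match.Value);
                            }
                            return cells;
                        }

                        static void Main()
                        {
                            // Hypothetical input: one cell containing a nested cell, then a plain cell.
                            string html = "<td>a<td class=\"x\">b</td>c</td><td>d</td>";
                            foreach (string cell in TopLevelCells(html))
                            {
                                Console.WriteLine(cell);
                            }
                        }
                    }
                    ```

                    To reach the nested cells, strip the outer tags from each result and run TopLevelCells again on what is left.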

                    • mcfly1204
                      New Member
                      • Jul 2007
                      • 233

                      #11
                      So I am now parsing all of my data using regex, but I am still having the issue of not being able to move the original file.

                      Code:
                      public void SelectFiles()
                      {
                          string Pickup;
                          string[] strFiles;
                          string[] HTMLs;

                          Pickup = @"\\epsa\orders";
                          strFiles = Directory.GetFiles(Pickup, "*.htm");

                          foreach (string htm in strFiles)
                          {
                              string HTML;

                              HTML = ParseString(htm.ToString(), ".htm");
                              HTML = ParseString(HTML, @"\\epsa\orders\");

                              File.Copy(htm, @"c:\orders\" + HTML + ".html");
                          }

                          HTMLs = Directory.GetFiles(Pickup, "*.html");

                          if (HTMLs.Length < 1)
                          {
                              return;
                          }
                          else
                          {
                              Console.WriteLine("Total HTML Files: " + HTMLs.Length + System.Environment.NewLine);
                          }

                          foreach (string x in HTMLs)
                          {
                              this.GetContents(x);
                          }
                      }
                      I cannot figure out what is locking the file. GetContents simply reads the contents of the file with a StreamReader enclosed in a using statement.

                      • balabaster
                        Recognized Expert Contributor
                        • Mar 2007
                        • 798

                        #12
                        I don't see anything wrong with this code... I need to see the processing code for where you open the file and parse it. This code is fine... I suspect you're not closing and disposing the file stream where you're parsing the file which is what is causing your problem.

                        • mcfly1204
                          New Member
                          • Jul 2007
                          • 233

                          #13
                          Code:
                          private void GetContents(string HTML)
                          {
                              using (StreamReader reader = new StreamReader(HTML))
                              {
                                  string htmlContent = reader.ReadToEnd();

                                  this.GetLinks(HTML, htmlContent);
                                  this.GetTDs(HTML, htmlContent);
                                  this.GetSpans(HTML, htmlContent);
                              }
                          }

                          private void GetLinks(string HTML, string htmlContent)
                          {
                              string linkRegEx = "(?i:<a.*?</a>)";
                              link = new string[4];

                              Regex rLinks = new Regex(linkRegEx);

                              MatchCollection links = rLinks.Matches(htmlContent);

                              for (int i = 0; i < (links.Count - 1); i++)
                              {
                                  link[i] = links[i].Value;
                              }

                              ArtURL = SplitString(link[2], '"', 1);

                              BaseURL = @"https://springboard.4imprint.com/PO/view/";
                              ArtURL = BaseURL + ArtURL;
                          }
                          
                          private void GetTDs(string HTML, string htmlContent)
                          {
                              string tdRegEx = @"<td(?:\s[^>]*)?>(?:(?>[^<]+)|<(?!td(?:\s[^>]*)?>))*?</td>";
                              td = new string[95];

                              Regex rTds = new Regex(tdRegEx);

                              MatchCollection tds = rTds.Matches(htmlContent);

                              for (int i = 0; i < (tds.Count - 1); i++)
                              {
                                  td[i] = tds[i].Value;
                              }

                              PONum = SplitString(td[1], '>', 2);
                              PONum = ParseString(PONum, @"</span");
                              PONum = ParseString(PONum, "Purchase Order");
                              DistAddr1 = "101 Commerce St.";
                              DistAddr2 = "PO Box 320";
                              DistAddr3 = "Oshkosh, WI 54901";
                              DistPhone = "920-236-7272";
                              DistFax = "902-236-7282";
                              OppCon = SplitString(td[22], '>', 3);
                              OppCon = ParseString(OppCon, @"</td");

                              ShipToName = SplitString(td[81], '>', 1);
                              ShipToName = ParseString(ShipToName, @"</td");
                              ShipToAddr1 = SplitString(td[82], '>', 1);
                              ShipToAddr1 = ParseString(ShipToAddr1, @"</td");
                              ShipToAddr2 = SplitString(td[83], '>', 1);
                              ShipToAddr2 = ParseString(ShipToAddr2, @"</td");
                              ShipToAddr3 = SplitString(td[84], '>', 1);
                              ShipToAddr3 = ParseString(ShipToAddr3, @"</td");
                              ShipCity = SplitString(td[85], '>', 1);
                              ShipCity = ParseString(ShipCity, @"</td");
                              ShipState = SplitString(td[86], '>', 1);
                              ShipState = ParseString(ShipState, @"</td");
                              ShipZip = SplitString(td[87], '>', 1);
                              ShipZip = ParseString(ShipZip, @"</td");

                              OrderDate = SplitString(td[21], '>', 3);
                              OrderDate = ParseString(OrderDate, @"</td");
                              ReqShipDate = SplitString(td[92], '>', 2);
                              ReqShipDate = ParseString(ReqShipDate, @"</span");
                              InHandsDate = SplitString(td[94], '>', 2);
                              InHandsDate = ParseString(InHandsDate, @"</span");

                              ItemNo = SplitString(td[32], '>', 1);
                              ItemNo = ParseString(ItemNo, @"</td");
                              ItemQty = SplitString(td[91], '>', 2);
                              ItemQty = ParseString(ItemQty, @"</span");
                              UnitCost = SplitString(td[34], '>', 1);
                              UnitCost = ParseString(UnitCost, @"</td");
                              TotalCost = SplitString(td[35], '>', 1);
                              TotalCost = ParseString(TotalCost, @"</td");

                              ImpPhrase = "";
                              Notes = ParseString(td[63], @"</td>");
                              Notes = ParseString(Notes, @"<br>");
                              Notes = ParseString(Notes, @"<td>");
                          }

                          • mcfly1204
                            New Member
                            • Jul 2007
                            • 233

                            #14
                            I think I have it. It has nothing to do with what I have posted; it is caused by a method where I send the file as an attachment:

                            Code:
                            private void SendToArtwork(string Artwork, string PODoc)
                            {
                                PONum = PONum.Trim();

                                MailMessage message = new MailMessage();
                                message.To.Add("toemail");
                                message.From = new MailAddress("fromemail");

                                message.Priority = MailPriority.Normal;
                                message.Subject = "Artwork and Purchase Order For PO# " + PONum;
                                message.Body = "Attached is the original purchase order number " + PONum + ", as well as the included vector artwork.";

                                Attachment attPO = new Attachment(PODoc);
                                Attachment attArt = new Attachment(Artwork);
                                message.Attachments.Add(attPO);
                                message.Attachments.Add(attArt);

                                SmtpClient smtp = new SmtpClient("exchangeserver");
                                smtp.Credentials = new NetworkCredential("username", "password");

                                try
                                {
                                    smtp.Send(message);
                                }
                                catch
                                {
                                    System.Environment.Exit(5);
                                }
                            }

                            • balabaster
                              Recognized Expert Contributor
                              • Mar 2007
                              • 798

                              #15
                              Add this finally block after the catch block of the try statement in your SendToArtwork method and see if it helps:
                              Code:
                              finally{
                                attArt.Dispose();
                                attPO.Dispose();
                                message.Dispose();
                              }
                              You want to do it as a finally so that resources are always closed and memory reclaimed, even if the try fails.

                              It looks to me like the failure to dispose of the attachment objects and the message object properly may be what's causing your issue.
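
                              An alternative way to get the same cleanup is to wrap the message in a using statement. The following is only a sketch under some assumptions: SendThenMove, the example.com addresses and the pickup directory are all made up for illustration, and the mail is written to a local pickup directory instead of being sent through a real server, purely so the sketch can run anywhere.

                              ```csharp
                              using System;
                              using System.IO;
                              using System.Net.Mail;

                              static class MailCleanupSketch
                              {
                                  // Sends attachmentPath as an attachment, then moves it to destinationPath.
                                  public static void SendThenMove(string attachmentPath, string destinationPath, string pickupDir)
                                  {
                                      using (MailMessage message = new MailMessage("from@example.com", "to@example.com"))
                                      using (SmtpClient smtp = new SmtpClient())
                                      {
                                          message.Subject = "Artwork";
                                          message.Body = "See attachment.";
                                          message.Attachments.Add(new Attachment(attachmentPath));

                                          // Assumption for the sketch: write the mail to a folder instead of a server.
                                          smtp.DeliveryMethod = SmtpDeliveryMethod.SpecifiedPickupDirectory;
                                          smtp.PickupDirectoryLocation = pickupDir;
                                          smtp.Send(message);
                                      } // Disposing the message also disposes its attachments, closing the file.

                                      File.Move(attachmentPath, destinationPath);
                                  }

                                  static void Main()
                                  {
                                      string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
                                      string pickup = Path.Combine(dir, "pickup");
                                      Directory.CreateDirectory(pickup);

                                      string source = Path.Combine(dir, "download.zip");
                                      string destination = Path.Combine(dir, "processed.zip");
                                      File.WriteAllText(source, "placeholder");

                                      SendThenMove(source, destination, pickup);
                                      Console.WriteLine(File.Exists(destination));
                                  }
                              }
                              ```

                              The using form runs the same Dispose() calls as the finally block, including when Send throws, so the file is released before File.Move is attempted.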

                              Your GetContents() method already wraps the StreamReader in a using statement, which disposes the reader (and so closes the file) when the block exits. If you prefer to make the close explicit, add this as the last line inside the using (StreamReader...) block:
                              Code:
                              reader.Close();
                              This releases the file as soon as you have finished with it, just before the reader is disposed at the closure of your using statement.
                              So the full block of code would look like:
                              Code:
                              using (StreamReader reader = new StreamReader(HTML))
                              {
                                string htmlContent = reader.ReadToEnd();
                                this.GetLinks(HTML, htmlContent);
                                this.GetTDs(HTML, htmlContent);
                                this.GetSpans(HTML, htmlContent);
                                reader.Close();
                              }
                              Given that the whole file has already been read into a string by that point, it won't make any real-world difference, but it is good coding practice to close files and connections when you've finished with them so that no adverse side effects of locked resources occur later. Other than that, I don't see anywhere else your code could be locking objects and not releasing resources before the classes are disposed.
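
                              To check whether the reader is what holds the file, a small sketch like the following can help. It reads a file inside a using block with no explicit Close() call and then moves it. ReadThenMove and the temporary file names are invented for the example.

                              ```csharp
                              using System;
                              using System.IO;

                              static class ReaderReleaseSketch
                              {
                                  // Reads the whole file, then moves it once the reader has been disposed.
                                  public static string ReadThenMove(string sourcePath, string destinationPath)
                                  {
                                      string content;
                                      using (StreamReader reader = new StreamReader(sourcePath))
                                      {
                                          content = reader.ReadToEnd();
                                      } // Dispose() runs here and closes the underlying file handle.

                                      File.Move(sourcePath, destinationPath);
                                      return content;
                                  }

                                  static void Main()
                                  {
                                      string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
                                      Directory.CreateDirectory(dir);

                                      string source = Path.Combine(dir, "order.html");
                                      string destination = Path.Combine(dir, "processed.html");
                                      File.WriteAllText(source, "<td>PO</td>");

                                      Console.WriteLine(ReadThenMove(source, destination));
                                  }
                              }
                              ```

                              If the move succeeds here but fails in the real application, the lock is being held somewhere other than the reader.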

                              One thing I'm not entirely sure of: the using keyword automatically disposes of objects that implement the IDisposable interface by calling Dispose() when the block exits. If a class implements IDisposable properly, everything it holds is released at the closure of your using statement. The assumption is that Microsoft implements all their classes properly, but if Dispose() is not implemented properly, there's a chance some resource may still be held after the using statement closes. So while it is expected that:
                              Code:
                              using (TargetObject obj = new TargetObject())
                              {
                              }
                              would release everything the TargetObject instance holds at the completion of the code block, if TargetObject doesn't implement IDisposable properly, then whatever it still holds must be released manually, using something like an explicit obj.Close() before the block ends.
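
                              One way to prove to yourself what using does is a throwaway class that records whether Dispose() was called. TrackedResource and UsingSketch below are invented for this sketch; they stand in for any IDisposable type.

                              ```csharp
                              using System;

                              // Hypothetical resource that only remembers whether Dispose() has run.
                              sealed class TrackedResource : IDisposable
                              {
                                  public bool IsDisposed { get; private set; }

                                  public void Dispose()
                                  {
                                      IsDisposed = true;
                                  }
                              }

                              static class UsingSketch
                              {
                                  // Normal exit from the using block.
                                  public static bool DisposedAfterUsing()
                                  {
                                      TrackedResource resource = new TrackedResource();
                                      using (resource)
                                      {
                                          // Work with the resource here.
                                      }
                                      return resource.IsDisposed;
                                  }

                                  // Exit from the using block through an exception.
                                  public static bool DisposedAfterException()
                                  {
                                      TrackedResource resource = new TrackedResource();
                                      try
                                      {
                                          using (resource)
                                          {
                                              throw new InvalidOperationException("simulated failure");
                                          }
                                      }
                                      catch (InvalidOperationException)
                                      {
                                          // Swallowed on purpose: only the disposal state matters here.
                                      }
                                      return resource.IsDisposed;
                                  }

                                  static void Main()
                                  {
                                      Console.WriteLine(DisposedAfterUsing());
                                      Console.WriteLine(DisposedAfterException());
                                  }
                              }
                              ```

                              Both methods return true, because the compiler expands using into a try/finally that calls Dispose(). What Dispose() actually releases is still up to the class, which is the part worth testing for any type you don't trust.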

                              Don't assume that things work the way they should be expected to work until you've proved to yourself that they do...

                              I've not used the mail and SMTP classes, so I've not got any experience with their glitches or gaps, I'm afraid.
