This is very new territory for me and, judging by how little I can find online, it's new territory for most people.
I am using the tessnet2 OCR library to read a PDF so I can extract some words/information.
When tessnet2 reads the file, it compiles the results into a list of objects of this class:
Code:
public class Word
{
    public int Blanks;
    public int Bottom;
    public List<Character> CharList;
    public double Confidence;
    public int FontIndex;
    public int Formating;
    public int Left;
    public int LineIndex;
    public int PointSize;
    public int Right;
    public object Tag;
    public string Text;
    public int Top;
    public Word();
    public override string ToString();
}
Most of that is useless for what I'm trying to accomplish. The one I'm focusing on is
Code:
string Text
That is where each distinct word the OCR finds is stored.
OK, so what I need help with is using LINQ to search through the result and find everything between
Code:
string word1 = "Number;";
and
Code:
string word2 = "Vendor:";
This code is as far as I've gotten:
Code:
List<tessnet2.Word> specificWord1 = result.FindAll(x => x.Text == "Number;");
List<tessnet2.Word> specificWord2 = result.FindAll(x => x.Text == "Vendor:");
But that doesn't give me the range, just the two individual words.
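One possible direction, as a rough sketch rather than a tested tessnet2 solution: find the index of each marker word instead of the words themselves, then take the slice between the two indexes. This assumes the result list is in reading order. The Word class below is a stand-in with only the Text field so the snippet compiles on its own, and the names WordRange, Between and BetweenLinq are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for tessnet2.Word (assumption: only Text matters here).
public class Word
{
    public string Text;
}

// Hypothetical helper, not part of tessnet2.
public static class WordRange
{
    // Returns the words strictly between the first startText and the
    // next endText after it. Returns an empty list if either is missing.
    public static List<Word> Between(List<Word> words, string startText, string endText)
    {
        int start = words.FindIndex(w => w.Text == startText);
        if (start < 0) return new List<Word>();

        // Search for the end marker only after the start marker.
        int end = words.FindIndex(start + 1, w => w.Text == endText);
        if (end < 0) return new List<Word>();

        return words.GetRange(start + 1, end - start - 1);
    }

    // Same idea with LINQ operators. Note the difference: if endText is
    // never found, this returns everything after startText.
    public static List<Word> BetweenLinq(IEnumerable<Word> words, string startText, string endText)
    {
        return words
            .SkipWhile(w => w.Text != startText) // drop everything before the start marker
            .Skip(1)                             // drop the start marker itself
            .TakeWhile(w => w.Text != endText)   // stop at the end marker
            .ToList();
    }
}
```

In the real project the stand-in class would be removed and the methods would take tessnet2.Word instead, so the call would look something like WordRange.Between(result, "Number;", "Vendor:"). Joining the Text values of the returned words with spaces would then give the text between the two markers.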