Visual Basic Programming Course - How to Extract Text from PDF File in VB.NET

Visual Basic Tips on How to Manipulate PDF Files
This post discusses working with the Portable Document Format (PDF). In my view Adobe Acrobat Reader is still the best PDF viewer, so I will use it to view the files. iTextSharp, the .NET port of iText, is one of the best-known libraries for manipulating PDF files, so I will use it to process them.
As stated here, there is a newer release called iText 7, which adds new security features. If you move to iText 7, you can get it under either a commercial license or the AGPL from the official website.
Unless you purchase a commercial license, you are not permitted to change the 'PDF Producer' entry in the PDF document information.
You can find the iTextSharp 5.0 documentation here.
Requirements
- Microsoft Windows, 32-bit or 64-bit
- Microsoft Visual Studio 2010
- iTextSharp (download it from NuGet), AGPLv3 license (open source)
- Awesomium 1.7.5 (download it from NuGet)
Visual Basic Project Logic
We will create a Visual Basic project that accomplishes these tasks:
  1. Open a PDF file using iTextSharp
  2. Search for a word within the PDF text
  3. Display the PDF file with the search results highlighted
  4. Stamp the new PDF file, which contains the highlighted search results, with a watermark
Our output will look like this:
[Screenshot: iTextSharp Visual Basic .NET Project]
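Task 2, searching for a word within the PDF text, comes down to counting matches once the text of each page is available as a string. The following is a minimal sketch of that step, not the code from the original project. The names SearchSketch, CountOccurrences and CountInPages are hypothetical, and the choice of a case-insensitive, non-overlapping match is an assumption.

```vb
Imports System
Imports System.Collections.Generic

Module SearchSketch
    ' Counts non-overlapping, case-insensitive occurrences of term in text.
    ' (Hypothetical helper; the matching rules are an assumption.)
    Public Function CountOccurrences(text As String, term As String) As Integer
        If String.IsNullOrEmpty(text) OrElse String.IsNullOrEmpty(term) Then Return 0

        Dim count As Integer = 0
        Dim index As Integer = text.IndexOf(term, StringComparison.OrdinalIgnoreCase)
        While index >= 0
            count += 1
            ' Continue searching just after the match that was found.
            index = text.IndexOf(term, index + term.Length, StringComparison.OrdinalIgnoreCase)
        End While
        Return count
    End Function

    ' Sums the matches over the text of every page.
    Public Function CountInPages(pageTexts As IEnumerable(Of String), term As String) As Integer
        Dim total As Integer = 0
        For Each pageText As String In pageTexts
            total += CountOccurrences(pageText, term)
        Next
        Return total
    End Function
End Module
```

Because the match is a plain substring search, a term such as "free" is also counted inside "freedom". A whole-word search would need a different approach, for example a regular expression with word boundaries.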
The application life cycle is as follows:
  • User opens the application.
  • User chooses the PDF file.
  • User searches for a word or a phrase.
  • Application opens the PDF file and gets all the text within it.
  • Application returns the number of pages and the number of search results in the PDF file.
  • Application replaces the search results with highlighted search results.
  • Application creates a new PDF file (the temp file) with the highlighted search results.
  • Application stamps the temp PDF file with the watermark image.
  • Application displays the temp PDF file with the highlighted search results.
  • Application copies the temp PDF file into the project directory, overwriting any existing copy.
  • User closes the application.
  • Application removes the temp PDF file.
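The watermark step in the life cycle above can be sketched with iTextSharp 5's PdfStamper. This is only an illustration under stated assumptions, not the code from the original project. The names WatermarkSketch and StampWatermark and the three path parameters are hypothetical, and centering the image underneath the page content is my choice.

```vb
Imports System.IO
Imports iTextSharp.text.pdf

Module WatermarkSketch
    ' Copies sourcePath to targetPath and draws the image at imagePath,
    ' centered, underneath the content of every page.
    ' (Hypothetical helper; names and placement are assumptions.)
    Public Sub StampWatermark(sourcePath As String, targetPath As String, imagePath As String)
        Using reader As New PdfReader(sourcePath)
            Using output As New FileStream(targetPath, FileMode.Create, FileAccess.Write)
                Using stamper As New PdfStamper(reader, output)
                    ' Fully qualified to avoid a clash with System.Drawing.Image in WinForms.
                    Dim watermark As iTextSharp.text.Image = iTextSharp.text.Image.GetInstance(imagePath)

                    For pageNumber As Integer = 1 To reader.NumberOfPages
                        Dim pageSize As iTextSharp.text.Rectangle = reader.GetPageSizeWithRotation(pageNumber)

                        ' Center the image on the page.
                        Dim x As Single = (pageSize.Width - watermark.ScaledWidth) / 2.0F
                        Dim y As Single = (pageSize.Height - watermark.ScaledHeight) / 2.0F
                        watermark.SetAbsolutePosition(x, y)

                        ' GetUnderContent draws below the existing page content.
                        stamper.GetUnderContent(pageNumber).AddImage(watermark)
                    Next
                End Using
            End Using
        End Using
    End Sub
End Module
```

Drawing under the page content keeps the text readable, but the watermark will not show on pages that are fully covered by an opaque image such as a scan. In that case GetOverContent is the alternative.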
Visual Basic Project Design
- We will need a sample PDF file for our search task (call it GNU GPL.pdf).
- Create a new Visual Basic WinForms project in Visual Studio 2010.
- Target .NET Framework 4.0.
- Form1.vb design is as follows:
[Screenshot: iTextSharp Visual Basic .NET Project example]
Visual Basic Code

Visual Basic code for choosing a PDF file with OpenFileDialog and showing its path in TxtPdfPath.Text on Form1

    ' Let the user pick a single PDF file and show its path in TxtPdfPath.
    Try
        Using OFD As New OpenFileDialog With {
            .CheckPathExists = True,
            .Filter = "PDF File Format (*.pdf)|*.pdf",
            .DefaultExt = "pdf",
            .Multiselect = False,
            .RestoreDirectory = True,
            .InitialDirectory = Application.StartupPath,
            .SupportMultiDottedExtensions = False}
            If OFD.ShowDialog() = System.Windows.Forms.DialogResult.OK Then
                TxtPdfPath.Text = OFD.FileName
            End If
        End Using
    Catch ex As Exception
        ' Report any dialog or file system error to the user.
        MsgBox(ex.Message)
    End Try

Visual Basic Code for starting PDF File Reader using iTextSharp

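    ' Requires: Imports iTextSharp.text.pdf
    ' Class-level field; PDF_FILE_LOCATION is a placeholder for the chosen path.
    Private strSource As String = PDF_FILE_LOCATION
    ' Inside a method: Using disposes the reader when the block ends.
    Using pdfFileReader As New PdfReader(strSource)
    End Using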
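Once the reader is open, the text itself can be pulled out page by page. The following sketch assumes iTextSharp 5.x and uses its PdfTextExtractor class with a SimpleTextExtractionStrategy. The names PdfTextSketch and ExtractPageTexts are hypothetical, and the code in the linked repository may do this differently.

```vb
Imports System.Collections.Generic
Imports iTextSharp.text.pdf
Imports iTextSharp.text.pdf.parser

Module PdfTextSketch
    ' Returns the extracted text of each page; index 0 holds page 1.
    ' (Hypothetical helper, assuming iTextSharp 5.x.)
    Public Function ExtractPageTexts(pdfPath As String) As List(Of String)
        Dim pages As New List(Of String)

        Using reader As New PdfReader(pdfPath)
            ' iTextSharp numbers pages starting at 1.
            For pageNumber As Integer = 1 To reader.NumberOfPages
                ' A fresh strategy per page keeps text from earlier pages out of the result.
                Dim strategy As New SimpleTextExtractionStrategy()
                pages.Add(PdfTextExtractor.GetTextFromPage(reader, pageNumber, strategy))
            Next
        End Using

        Return pages
    End Function
End Module
```

The page count reported to the user is then simply the Count of the returned list. Scanned PDFs that contain only images return empty strings, because there is no text layer to extract.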
GitHub Repo - Full VB .NET Project iTextSharp