PDF Extraction To Excel Automation Using Zappy

The following steps create a Zappy automation task that extracts data from PDF files into Excel.

Step 1: Create an Excel sheet with the fields you want to extract, such as Company Name and Country, as shown below.

image

Step 2: Right-click the Zappy icon at the bottom-right corner of your window and select Task Editor.

image

Step 3: Click the Open Folder icon, select the file ExtractValues.zappy, and click Open as shown. This is a saved Zappy task that comes with the demo installation.

(You can find this zappy file at: %userprofile%\documents\ZappySamples\Python)

image

Step 4: Click on Get Files In directory, set the path of the folder where the PDF files are saved, and set FileSearchPattern to *. followed by the file extension you want.

(To pick up all PDF files in the folder, set the search pattern to *.pdf.)
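As a rough illustration of what a wildcard pattern such as *.pdf selects, here is a minimal Python sketch. It is not Zappy's implementation; the function name `filter_files` and the sample file names are hypothetical, and matching without regard to case is an assumption made for the example.

```python
import fnmatch


def filter_files(file_names, search_pattern):
    """Return the names that match a wildcard pattern such as '*.pdf'.

    Both sides are lower-cased so the match ignores case (an assumption
    for this sketch, not documented Zappy behaviour).
    """
    pattern = search_pattern.lower()
    return [name for name in file_names if fnmatch.fnmatch(name.lower(), pattern)]


# Hypothetical folder contents.
names = ["invoice1.pdf", "invoice2.PDF", "notes.txt", "summary.xlsx"]
print(filter_files(names, "*.pdf"))
```

Only the two PDF names pass the filter; the text file and the workbook are ignored.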

image

Step 5: Click on the action “String Data Search” and change its properties in the property panel to match your PDF file.

image

Step 6: Set the following properties:

  • DataSearchOrientation – Select the direction in which to search: “Right”, “Left”, “Top”, “Bottom” or “CustomOffset”.
  • ResultTextPosition – Select which part of the text found in that direction is returned.
    • Allwords – Returns the full string on the “Right”, “Left”, “Top” or “Bottom”, according to the selection in DataSearchOrientation.
    • SingleWordFirst – Returns the first word on the “Right”, “Left”, “Top” or “Bottom”, according to the selection in DataSearchOrientation.
    • SingleWordLast – Returns the last word on the “Right”, “Left”, “Top” or “Bottom”, according to the selection in DataSearchOrientation.
    • CustomOffsetHorizontal – Returns the text up to the separator selected in ResultValueSplitChar.
  • SearchText – Enter the text you want to search for.
  • SourceText – The source text in which the data is searched.

(In the property panel on the right-hand side, enter the text you want to search for in SearchText. In DataSearchOrientation, use the drop-down box to select where the data to be extracted sits relative to that text in your input document. In this example Right is selected, because the company's name is to the right of the field Company Name.)
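The idea behind a “Right” search with the different ResultTextPosition values can be sketched in a few lines of Python. This is only an approximation for plain text laid out line by line, not how Zappy works internally; the function `search_right`, the sample text, and the stripping of a trailing colon after the label are all assumptions made for illustration.

```python
def search_right(source_text, search_text, result_position="Allwords"):
    """Return the text to the right of search_text on the same line.

    result_position mirrors the option names from the property panel:
    "Allwords", "SingleWordFirst" or "SingleWordLast".
    Returns None when search_text does not occur at all.
    """
    for line in source_text.splitlines():
        index = line.find(search_text)
        if index == -1:
            continue
        # Everything after the label, without the separator and padding.
        remainder = line[index + len(search_text):].strip(" :\t")
        words = remainder.split()
        if not words:
            return ""
        if result_position == "SingleWordFirst":
            return words[0]
        if result_position == "SingleWordLast":
            return words[-1]
        return remainder
    return None


# Hypothetical text extracted from a PDF.
sample = "Company Name: Acme Trading Ltd\nCountry: India"
print(search_right(sample, "Company Name"))
print(search_right(sample, "Company Name", "SingleWordFirst"))
print(search_right(sample, "Company Name", "SingleWordLast"))
```

With the sample text, the three calls return the full company name, its first word, and its last word respectively.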

image

Step 7: Click on Excel Set Property and, on the right-hand side, enter the WorkbookName and SheetName of the Excel file you created, as shown below.

image

In this way you can create actions for all the fields you want to extract from your PDF files. Then save your automation by clicking the Save icon.

Step 8: Click the Execute icon.

Step 9: On successful execution, Zappy displays the message “Successfully Executed Task”.

image

Step 10: Check your Excel sheet. The required data is transferred into the Excel sheet after extraction from the PDF files.

image

In this way you can modify the automation for your PDFs and save it. The saved automation can be reused on any number of PDFs to extract the same fields. You can also delete actions, or add them by dragging the required action (e.g. StringDataSearch) from the activity panel onto the Task Editor, and so adapt the saved automation to other PDFs.
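The overall shape of the task (one row per PDF, one column per field) can be sketched in Python as follows. This is an illustrative stand-in, not the Zappy task itself: the PDF text is assumed to be already extracted into strings, the helper names are hypothetical, and CSV output from the standard library is used in place of the Excel Set Property action.

```python
import csv
import io

# Column headers, matching the fields prepared in the sheet in Step 1.
FIELDS = ["Company Name", "Country"]


def extract_field(text, label):
    """Return the text to the right of label on its line, or '' if absent."""
    for line in text.splitlines():
        if label in line:
            return line.split(label, 1)[1].strip(" :\t")
    return ""


def build_rows(documents):
    """Build one row per document; documents maps a file name to its text."""
    return [
        {field: extract_field(text, field) for field in FIELDS}
        for text in documents.values()
    ]


def write_csv(rows, stream):
    """Write a header line followed by one line per row."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical text already extracted from two PDF files.
documents = {
    "a.pdf": "Company Name: Acme Trading Ltd\nCountry: India",
    "b.pdf": "Company Name: Globex\nCountry: Germany",
}
rows = build_rows(documents)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

Adding a field to the extraction then amounts to adding one more entry to `FIELDS`, much as adding one more String Data Search action does in the Task Editor.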