Amazon AWS - Textract¶
The BotCity plugin is one of the simplest ways to interact with Amazon Textract.
The BotCity plugin for AWS Textract lets you quickly analyze and extract text from hundreds of documents, whether typed or handwritten.
pip install botcity-aws-textract-plugin
Importing the Plugin¶
After installing the package, the next step is to import it into your code and start using its functions.
from botcity.plugins.aws.textract import BotAWSTextractPlugin
Setting up connection¶
There are two different ways to authenticate.
1. Creating the .aws folder in your home directory, containing the following two files:
# ~/.aws/config
[default]
region=<region_code>

# ~/.aws/credentials
[default]
aws_access_key_id=<your_aws_access_key_id>
aws_secret_access_key=<your_aws_secret_access_key>
2. Passing credentials in the class constructor.
# Using the `.aws` folder
textract = BotAWSTextractPlugin()

# Alternative using the credentials as constructor arguments
textract = BotAWSTextractPlugin(
    region_name='<region_code>',
    use_credentials_file=False,
    access_key_id='<your_aws_access_key_id>',
    secret_access_key='<your_aws_secret_access_key>',
)
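If you would rather not hard-code credentials, one option is to read them from environment variables and pass them to the constructor. The sketch below is only an illustration: `textract_kwargs_from_env` is a hypothetical helper, not part of the plugin, and it assumes the standard AWS variable names `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` and `AWS_DEFAULT_REGION` are set. When any of them is missing it returns an empty dict, so the plugin is created without arguments and uses the `.aws` folder as shown above.

```python
import os


def textract_kwargs_from_env(env=None):
    """Build constructor keyword arguments from environment variables.

    Hypothetical helper (not part of the plugin). Returns an empty dict
    when any value is missing, so the caller ends up using the `.aws` folder.
    """
    env = os.environ if env is None else env
    key_id = env.get("AWS_ACCESS_KEY_ID")
    secret = env.get("AWS_SECRET_ACCESS_KEY")
    region = env.get("AWS_DEFAULT_REGION")

    # All three values are needed for the constructor-argument form.
    if not (key_id and secret and region):
        return {}

    # Parameter names below are the ones used in the constructor example above.
    return {
        "region_name": region,
        "use_credentials_file": False,
        "access_key_id": key_id,
        "secret_access_key": secret,
    }


# Intended usage (requires the plugin to be installed):
#   from botcity.plugins.aws.textract import BotAWSTextractPlugin
#   textract = BotAWSTextractPlugin(**textract_kwargs_from_env())
```

Keeping the lookup in a small function makes it easy to check the fallback behaviour without touching real credentials.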
To demonstrate the library, let's build a simple example that reads the text from the following image:
Reading text from the image¶
Now let's read the text from the image.
# Read the text from the image
textract.read("otter_crossing.jpg")

# Print the text from the image
print(textract.full_text())
The output should look like this:
CAUTION Otters crossing for next 6 miles
Let's take a look at the complete code:
# Import the plugin
from botcity.plugins.aws.textract import BotAWSTextractPlugin

# Instantiate the plugin using the `.aws` folder
textract = BotAWSTextractPlugin()

# Read the text from the image
textract.read("otter_crossing.jpg")

# Print the text from the image
print(textract.full_text())
This plugin allows you to use method chaining, so the code above could be written as:
text = BotAWSTextractPlugin() \
    .read("otter_crossing.jpg") \
    .full_text()

# Print the text from the image
print(text)
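The same chained call can be applied to several documents in a loop. The following is a minimal sketch, not an official recipe: `extract_texts` and `read_with_textract` are hypothetical helper names, and only `read()` and `full_text()` come from the plugin. The reader function is passed in as an argument so the loop can be tried with a stand-in before any AWS calls are made.

```python
from pathlib import Path


def extract_texts(paths, read_text):
    """Run `read_text` on each path and collect the results.

    Returns two dicts keyed by file name: extracted texts and errors.
    """
    texts, errors = {}, {}
    for path in paths:
        name = Path(path).name
        try:
            texts[name] = read_text(str(path))
        except Exception as exc:
            # Record the failure and continue with the remaining documents.
            errors[name] = exc
    return texts, errors


def read_with_textract(path):
    """Read one file using the chained form shown above."""
    # Imported here so extract_texts can be tested without the plugin installed.
    from botcity.plugins.aws.textract import BotAWSTextractPlugin

    return BotAWSTextractPlugin().read(path).full_text()


# Intended usage, assuming a hypothetical "scans" folder of images:
#   texts, errors = extract_texts(Path("scans").glob("*.jpg"), read_with_textract)
```

This version creates a new plugin instance per file, mirroring the chained example. Reusing a single instance may also work, but that is not shown here.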