
Oct 01, 2019

Knowledge Mining Series: Using Form Recognizer in Azure Search

Posted by Kevin Jackson

In this installment of BlueGranite’s Knowledge Mining Showcase – a series on how the human-like reasoning capabilities of Microsoft’s Azure Cognitive Search can maximize insight from all kinds of data – we’ll integrate one of Microsoft’s newest cognitive services, Form Recognizer, a comprehensive AI-powered document mining tool, into Azure Search to create a powerful discovery solution.


First, let’s review Azure Search’s revolutionary capabilities. This intelligent, cloud-based search-as-a-service is reinventing knowledge mining – the practice of using cognitive search to extract facts from unstructured data.

Azure Search recognizes and extracts text from images; discovers key talking points in text; and identifies and classifies people, places, and things in both text and images. Trailblazers are using it to discover and leverage important, enterprise-driving data.

By incorporating Form Recognizer into the dynamic Azure Search tool, we can pull key-value pairs from forms and make them searchable as metadata, adding another layer to Search’s capability and sophistication.

Say we need to comb thousands of invoices to uncover those associated with a specific company – let’s call it “Acme Inc.” Surfacing those with Azure Search is a breeze. But we’re going to further complicate things; let’s assume that, in addition to invoicing Acme Inc., our company also resells its products. However, we only want to surface invoices in which we’ve billed Acme Inc. Adding Form Recognizer to Azure Search allows us to do just that, rather than showing all associated billing (such as invoices where we’ve charged others for Acme Inc. products). 

Form Recognizer in Azure Search

The fast, accurate Form Recognizer service uses machine learning to deliver precise, consistent results that once required data science know-how. However, Form Recognizer is currently available only in limited-access preview. To try out our tutorial below, you’ll need to request Form Recognizer access by completing and submitting Microsoft’s form here. If Microsoft’s Azure Cognitive Services team approves the request, you'll receive an email with access instructions.

We’re going to review and elaborate on this Microsoft “Analyze form sample skill for cognitive search” training module. You can follow along with this exercise simply by reading, but if you’d like to take a more hands-on approach, you’ll need:

  1. An Azure subscription
  2. Access to the Form Recognizer resource
  3. Microsoft’s Visual Studio 2019 or a recent version of a C# compiler
  4. A REST API tool, such as cURL or Postman (we will be using Postman)
  5. A local clone of the GitHub azure-search-power-skills repository (e.g., git clone https://github.com/Azure-Samples/azure-search-power-skills.git)

Create a Form Recognizer Resource

When you're granted access to use Form Recognizer, you'll receive a “Welcome” email with several links and resources. Use the "Azure portal" link in that message to open the Azure portal and create a Form Recognizer resource. In the Create pane, provide the following information:

Name

A descriptive name for your resource. We recommend using a straightforward format, such as MyNameFormRecognizer.

Subscription

Select the Azure subscription that has been granted access.

Location

The location of your cognitive service instance. Different locations may introduce latency, but have no impact on the runtime availability of your resource.

Pricing tier

The cost of your resource depends on the pricing tier you choose and your usage. For more information, see the Cognitive Services API pricing details.

Resource group

The Azure resource group that will contain your resource. You can create a new group or add the resource to an existing one.

 

Important

Normally, when you create a Cognitive Service resource in the Azure portal, you have the option to create a multi-service subscription key (used across multiple cognitive services) or a single-service subscription key (used only with a specific cognitive service). However, because Form Recognizer is a preview release, it is not included in the multi-service subscription, and you cannot create the single-service subscription unless you use the link provided in the Welcome email.

For this example, I created the following Form Recognizer service:

[Screenshot: the BGFormRecognizer service overview in the Azure portal]

Note that I used the Free tier, that the name of my service is BGFormRecognizer, and that the endpoint is https://bgformrecognizer.cognitiveservices.azure.com.

Train the Form Recognizer Model

To begin training our Form Recognizer model, we'll need a set of training data in an Azure Storage blob container. We need a minimum of five filled-in forms (PDF documents and/or images); alternatively, we can use one empty form plus two completed forms, in which case the empty form’s file name must include the word “empty”. (See Microsoft’s “Build a training data set for a custom model” for tips and options if you are putting together your own training data.)

For this example, we’ll use the sample forms included in the aforementioned GitHub azure-search-power-skills repository. You’ll find these in the SampleData folder, directly under the root folder.

1. Create a blob container in an Azure Storage account and upload the five forms from the SampleData folder into it.

[Screenshot: the five sample forms in the form-recognizer blob container]

In this example, I’ve uploaded the five sample forms into a blob container named “form-recognizer”, under a storage account named “kmdatasource”. The blob container access level is set to Public – container. We wouldn’t do this in a production environment; we’d use a shared access signature (SAS) instead. For simplicity’s sake, we’ll keep it public for this example.


2. We’ll train the Form Recognizer model using an API call in the following format:

Request:
POST https://[form recognizer endpoint]/formrecognizer/v1.0-preview/custom/train

Headers:
Content-Type: application/json
Ocp-Apim-Subscription-Key: [form recognizer subscription key]

Body:
{
    "source": "[blob storage end point]"
}

Where:

[form recognizer endpoint]

The endpoint of your Form Recognizer service; you’ll find this on the Overview and Quickstart tabs for your Form Recognizer service.

[form recognizer subscription key]

The API key of your Form Recognizer service; you’ll find this on the Keys and Quickstart tabs for your Form Recognizer service.

[blob storage endpoint]

The endpoint of the blob container where you stored your sample forms.


Using the Form Recognizer service and blob container I created earlier, my call looks like this:

Request:
POST https://bgformrecognizer.cognitiveservices.azure.com/formrecognizer/v1.0-preview/custom/train

Headers:
Content-Type: application/json
Ocp-Apim-Subscription-Key: 32-character hexadecimal value

Body:
{
    "source": "https://kmdatasource.blob.core.windows.net/form-recognizer"
}

3. Train the model using our correctly formed POST API call. When I run the train API call with all of the above settings, I get a 200 OK response with the following body:

 

{
    "modelId": "794f2366-045a-4fbd-81cb-8af57f541e6f",
    "trainingDocuments": [
        {
            "documentName": "Invoice_1.pdf",
            "pages": 1,
            "errors": [],
            "status": "success"
        },
        {
            "documentName": "Invoice_2.pdf",
            "pages": 1,
            "errors": [],
            "status": "success"
        },
        {
            "documentName": "Invoice_3.pdf",
            "pages": 1,
            "errors": [],
            "status": "success"
        },
        {
            "documentName": "Invoice_4.pdf",
            "pages": 1,
            "errors": [],
            "status": "success"
        },
        {
            "documentName": "Invoice_5.pdf",
            "pages": 1,
            "errors": [],
            "status": "success"
        }
    ],
    "errors": []
}

Take special note of the “modelId” value returned. We will be using that value as we progress.
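
If you’d rather script the train call than use Postman, the same request is easy to make in C#. Here’s a minimal sketch; the endpoint, key, and blob container URL are placeholders to replace with your own values:

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class TrainFormRecognizerModel
{
    static async Task Main()
    {
        // Placeholders -- substitute your own endpoint, key, and blob container URL.
        const string endpoint = "https://<your-service>.cognitiveservices.azure.com";
        const string subscriptionKey = "<your-form-recognizer-key>";
        const string blobSource = "https://<your-account>.blob.core.windows.net/<container>";

        using var client = new HttpClient();
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        // The body is the same JSON shown above: { "source": "<blob container>" }.
        var body = new StringContent(
            $"{{ \"source\": \"{blobSource}\" }}", Encoding.UTF8, "application/json");

        var response = await client.PostAsync(
            $"{endpoint}/formrecognizer/v1.0-preview/custom/train", body);

        // The response body contains the modelId you'll need in the next step.
        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}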

Validate Your Trained Model

Now that we’ve trained our model, let’s validate it by returning some key-value pairs from the sample data. We’re going to do that by making another API call in the format of:

Request:
POST https://[form recognizer endpoint]/formrecognizer/v1.0-preview/custom/models/[modelId]/analyze

Headers:
Content-Type: multipart/form-data
Ocp-Apim-Subscription-Key: [form recognizer subscription key]

Body:
form: [form to validate]
type: application/pdf

Where:

[form recognizer endpoint]

The endpoint of your Form Recognizer service; you’ll find this on the Overview and Quickstart tabs for your Form Recognizer service.

[modelId]

The ID of the model returned when you trained your Form Recognizer model in the previous step.

[form recognizer subscription key]

The API key of your Form Recognizer service; you’ll find this on the Keys and Quickstart tabs for your Form Recognizer service.

[form to validate]

The actual form to validate against the model; this is not a link to the file. In Postman, you are prompted to select a file.

 

Using the Form Recognizer service, blob container, and trained model I created earlier, my call looks like this:

Request:
POST https://bgformrecognizer.cognitiveservices.azure.com/formrecognizer/v1.0-preview/custom/models/794f2366-045a-4fbd-81cb-8af57f541e6f/analyze

Headers:
Content-Type: multipart/form-data
Ocp-Apim-Subscription-Key: 32-character hexadecimal value

Body:
form: Invoice_3.pdf   // Uploaded from my local drive
type: application/pdf

When I run the analyze API call with all of the above settings, I get a 200 OK response with the following body:

{
    "status": "success",
    "pages": [
        {
            "number": 1,
            "height": 792,
            "width": 612,
            "clusterId": 0,
            "keyValuePairs": [
                {
                    "key": [
                        {
                            "text": "Address:",
                            "boundingBox": [
                                57.4,
                                683.1,
                                100.5,
                                683.1,
                                100.5,
                                673.7,
                                57.4,
                                673.7
                            ]
                        }
                    ],
                    "value": [
                        {
                            "text": "1111 8th st.",
                            "boundingBox": [
                                 … cut for space
                            ],
                            "confidence": 0.86
                        },
                        {
                            "text": "Bellevue, WA 99501",
                            "boundingBox": [
                                … cut for space
                            ],
                            "confidence": 0.86
                        }
                    ]
                },
                {
                    "key": [
                        {
                            "text": "Invoice For:",
                            "boundingBox": [
                                … cut for space
                            ]
                        }
                    ],
                    "value": [
                        {
                            "text": "Alpine Ski House",
                            "boundingBox": [
                                … cut for space
                            ],
                            "confidence": 1.0
                        },
                        {
                            "text": "1025 Enterprise Way",
                            "boundingBox": [
                                … cut for space
                            ],
                            "confidence": 1.0
                        },
                        {
                            "text": "Sunnyvale, CA 94024",
                            "boundingBox": [
                                … cut for space
                            ],
                            "confidence": 1.0
                        }
                    ]
                },
        … cut for space
}
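
Incidentally, if you’d rather run this validation from code than from Postman, the multipart analyze call can be scripted the same way. Here’s a minimal C# sketch; the endpoint, key, model ID, and file path are placeholders:

using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class ValidateFormRecognizerModel
{
    static async Task Main()
    {
        // Placeholders -- substitute your own values.
        const string endpoint = "https://<your-service>.cognitiveservices.azure.com";
        const string subscriptionKey = "<your-form-recognizer-key>";
        const string modelId = "<modelId-from-the-train-call>";
        const string formPath = @"C:\SampleData\Invoice_3.pdf"; // local file, not a link

        using var client = new HttpClient();
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);

        // Build the multipart/form-data body with the PDF as the "form" part.
        using var content = new MultipartFormDataContent();
        var filePart = new ByteArrayContent(File.ReadAllBytes(formPath));
        filePart.Headers.ContentType = MediaTypeHeaderValue.Parse("application/pdf");
        content.Add(filePart, "form", Path.GetFileName(formPath));

        var response = await client.PostAsync(
            $"{endpoint}/formrecognizer/v1.0-preview/custom/models/{modelId}/analyze",
            content);

        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}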


Now we’ve validated the model and can see that the Form Recognizer service is correctly picking up key-value pairs for the model we trained. Next, let’s look at integrating our trained Form Recognizer model into Azure Search.

Create the Search Service

Now we need to create an Azure Search service. If you are unfamiliar with this process, please see the previous blog post on Azure Search in our Knowledge Mining Showcase series.

I created an empty Azure Search service called bgformrecognizer-search. For now, don’t create an index, data source, or skillset. We first need to create an Azure Function App to call our Form Recognizer service.

Create an Azure Function App to Call the Form Recognizer Service

Go to the local repository that you cloned from azure-search-power-skills and open the PowerSkills.sln file in Visual Studio 2019. Set the Vision/AnalyzeForm project as the startup project. It should look something like this:

[Screenshot: the PowerSkills solution in Visual Studio with the Vision/AnalyzeForm project]

There are only two files we are concerned with at this point: the AnalyzeForm.cs file and the launchSettings.json file under the Properties folder.

First, let’s look at the launchSettings.json file.

[Screenshot: the launchSettings.json file in the AnalyzeForm project]

You’ll want to replace the value for “FORMS_RECOGNIZER_API_KEY” with the key from the Quickstart or Keys tab of your Form Recognizer service in the Azure portal. Replace the “FORMS_RECOGNIZER_MODEL_ID” value with the modelId that was returned from the train API call you made earlier.
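
For orientation, the relevant portion of that file looks roughly like the sketch below. The profile name and surrounding structure are an approximation – check them against your own clone rather than copying verbatim; only the two values shown need to change:

{
  "profiles": {
    "AnalyzeForm": {
      "environmentVariables": {
        "FORMS_RECOGNIZER_API_KEY": "<your Form Recognizer key>",
        "FORMS_RECOGNIZER_MODEL_ID": "<modelId from the train call>"
      }
    }
  }
}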

In the AnalyzeForm.cs file, you’ll need to update the static string formsRecognizerApiEndpoint variable with the endpoint of the Form Recognizer service you created.

[Screenshot: the formsRecognizerApiEndpoint variable in AnalyzeForm.cs]

Note the section right under where our settings variables are being set:

[Screenshot: the fieldMappings section of AnalyzeForm.cs]

You can update fieldMappings to return any of the key-value pairs you want from your forms. The first value in the dictionary is the key from the form itself; the second value is the name it will be mapped to and exposed to Azure Search as.

For this exercise, we’ll leave the mappings as they are.
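
For reference, the mapping is just a dictionary from the key text on the form to the output field name. Here’s a hypothetical sketch of its shape – the exact variable name and key strings in your copy of AnalyzeForm.cs may differ:

using System.Collections.Generic;

static class FieldMappingSketch
{
    // Hypothetical illustration -- check AnalyzeForm.cs for the actual entries.
    // Left side: the key text as it appears on the form.
    // Right side: the field name exposed to Azure Search.
    public static readonly Dictionary<string, string> FieldMappings =
        new Dictionary<string, string>
        {
            { "Address:", "address" },
            { "Invoice For:", "recipient" }
        };
}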

Let’s test our function app and make sure it works. Run it locally, and the Azure Functions Core Tools will launch and display the URL for the API call you need to make.

[Screenshot: the Azure Functions Core Tools console showing the local function URL]

Copy the URL for the call and create a new request in Postman. You’ll see a README.md file in the AnalyzeForm project; that file contains a sample POST body that we can use to test the function.

{
    "values": [
        {
            "recordId": "record1",
            "data": {
   "formUrl": "https://github.com/Azure-Samples/azure-search-power-skills/raw/master/SampleData/Invoice_4.pdf",
                "formSasToken":  "?st=sasTokenThatWillBeGeneratedByCognitiveSearch"
            }
        }
    ]
}

My setup in Postman looks like this:

[Screenshot: the Postman request for testing the local function]

Send the request and you should get back the results shown in the README.md file.
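
If you’d rather test from code, the same request can be scripted with the HttpClient pattern we used for the train call. A minimal sketch – note that the localhost URL below is only the typical Azure Functions default; use whatever URL the Core Tools actually printed:

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class TestAnalyzeFormFunction
{
    static async Task Main()
    {
        // Assumed default local URL -- replace with the one the Core Tools printed.
        const string functionUrl = "http://localhost:7071/api/analyze-form";

        // Sample body from the AnalyzeForm project's README.md.
        const string body = @"{
            ""values"": [
                {
                    ""recordId"": ""record1"",
                    ""data"": {
                        ""formUrl"": ""https://github.com/Azure-Samples/azure-search-power-skills/raw/master/SampleData/Invoice_4.pdf"",
                        ""formSasToken"": ""?st=sasTokenThatWillBeGeneratedByCognitiveSearch""
                    }
                }
            ]
        }";

        using var client = new HttpClient();
        var response = await client.PostAsync(
            functionUrl, new StringContent(body, Encoding.UTF8, "application/json"));

        Console.WriteLine(await response.Content.ReadAsStringAsync());
    }
}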

Now we need to deploy our function app to Azure. Right-click the AnalyzeForm project and select Publish. You should see a screen that looks like this:

[Screenshot: the Publish target selection dialog in Visual Studio]

Select the Azure Functions Consumption Plan option and then press the Publish button. You’ll then see a screen like this:

[Screenshot: the Azure Functions publish configuration screen]


Fill out the fields on the left side of the screen. Azure will suggest a default Name that isn’t very descriptive, so you may want to change it. The function app needs blob storage to store metadata, so create a new blob container or select one that already exists. Then press the Create button. Deployment can take a while, so don’t worry if it seems like nothing is happening.


Once the function app has successfully deployed, you should see it in the Portal under App Services.

[Screenshot: the deployed function app listed under App Services in the Azure portal]

Now that our function app’s connection to our Form Recognizer service is configured and deployed, let’s configure our Azure Search service to use it.

Update the Search Service

Now we’ll take the search service that we created earlier and update it to make use of our trained Form Recognizer model. You will need the Service Name and the API Key of your search service to make the following API calls. The Service Name is simply the name of the search service; you can get the API key from the Keys tab on the search service’s portal page.

We’ll give some examples of the API calls used to create these search resources. For a full description of how to create them, please reference the BlueGranite Knowledge Mining Showcase article here.
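
All four of the calls below follow the same pattern – send a JSON body to the service with an api-key header – so if you’re scripting them, one small helper covers the lot. A minimal sketch, assuming the 2019-05-06 API version used throughout this post:

using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

static class SearchRestHelper
{
    // Sends a JSON payload to the Azure Search REST API and returns the raw response.
    // serviceName is the search service name; apiKey comes from its Keys tab.
    public static async Task<string> SendAsync(
        HttpMethod method, string serviceName, string apiKey,
        string resourcePath, string jsonBody)
    {
        using var client = new HttpClient();
        client.DefaultRequestHeaders.Add("api-key", apiKey);

        var request = new HttpRequestMessage(
            method,
            $"https://{serviceName}.search.windows.net/{resourcePath}?api-version=2019-05-06")
        {
            Content = new StringContent(jsonBody, Encoding.UTF8, "application/json")
        };

        var response = await client.SendAsync(request);
        return await response.Content.ReadAsStringAsync();
    }
}

The datasource below, for example, could then be created with SendAsync(HttpMethod.Post, "bgformrecognizer-search", apiKey, "datasources", dataSourceJson).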

Create the Datasource

Using the blob container and search service I created earlier, my call looks like this:

Request:
POST https://bgformrecognizer-search.search.windows.net/datasources?api-version=2019-05-06

Headers:
Content-Type:    application/json
Api-key: 32-character hexadecimal value    // From Search Service keys

Body:
{
    "name": "pdf-datasource", 
    "description": "pdf files for searching.", 
    "type": "azureblob",
    "credentials":
    {"connectionString":
      "DefaultEndpointsProtocol=https;AccountName=kmdatasource;AccountKey=r/Ue6aX23+OqOt0SxmZhvzt97xrVjiki8JyhLrlTW59QyAnppsdE1fEQFMA63xxxxxxxxxxxxxxxx;EndpointSuffix=core.windows.net"

    }, 
    "container": {"name": "form-recognizer"}

Create the Skillset

The skillset is how we add cognitive services to Azure Search. It defines how data is processed during indexing using external tools and functions. You can find more information on creating skillsets here.

I used the following API call to create the skillset for my search; it invokes the function app we deployed to integrate the trained Form Recognizer model.

Using the search service created earlier, my call looks like this:

Request:
POST https://bgformrecognizer-search.search.windows.net/skillsets/formskillset?api-version=2019-05-06

Headers:
Content-Type:    application/json
Api-key: 32-character hexadecimal value    // From Search Service keys

Body:
{
    "name": "formskillset",
    "description": "Skillset for Form Recognition",
    "skills": [
        {
           "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
            "name": "formrecognizer",
            "description": "Extracts fields from a form using a pre-trained form recognition model",
            "uri": "https://formrecognition.azurewebsites.net/api/analyze-form?code=d4MaIrcb2PP7pmIMqC7E8UcjK2zEhEJxle1ivtyi3xxxxxxxxxx==",
            "context": "/document",
            "batchSize": 1,
            "inputs": [
                {
                    "name": "formUrl",
                    "source": "/document/metadata_storage_path"
                },
                {
                    "name": "formSasToken",
                    "source": "/document/metadata_storage_sas_token"
                }
            ],
            "outputs": [
                {
                    "name": "address",
                    "targetName": "address"
                },
                {
                    "name": "recipient",
                   "targetName": "recipient"
                }
            ]
        }
    ],
    "cognitiveServices": {
        "@odata.type": "#Microsoft.Azure.Search.CognitiveServicesByKey",
        "key": "29261034dde2452db3deexxxxxxxxxx"
    }
}

Create the Index

The index defines our searchable content. For this exercise, I’ve made a very simple one. This is what my Create Index request looks like:

Request:
PUT https://bgformrecognizer-search.search.windows.net/indexes/formindex?api-version=2019-05-06

Headers:
Content-Type: application/json
Api-key: 32-character hexadecimal value    // From Search Service keys

Body:
{
    "fields": [
        {
            "name": "id",
            "type": "Edm.String",
            "key": true,
            "searchable": true,
            "filterable": false,
            "facetable": false,
            "sortable": true
        },
        {
            "name": "content",
            "type": "Edm.String",
            "sortable": false,
            "searchable": true,
            "filterable": false,
            "facetable": false
        },
        {
            "name": "address",
            "type": "Edm.String",
            "searchable": true,
            "filterable": false,
            "retrievable": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": null,
            "synonymMaps": []
        },
        {
            "name": "recipient",
            "type": "Edm.String",
            "searchable": true,
            "filterable": false,
            "retrievable": true,
            "sortable": false,
            "facetable": false,
            "key": false,
            "indexAnalyzer": null,
            "searchAnalyzer": null,
            "analyzer": null,
            "synonymMaps": []
        }
    ]
}

Create the Indexer

The indexer does the work of cracking the documents, applying our skillset, and extracting the document data into our index. This is what my Indexer request looks like. Note that you will get a 201 Created response back from your call, but the indexer will take a little longer to complete.

Request:
PUT https://bgformrecognizer-search.search.windows.net/indexers/formindexer?api-version=2019-05-06

Headers:
Content-Type:    application/json
Api-key: 32-character hexadecimal value    // From Search Service keys

Body:
{
    "dataSourceName": "pdf-datasource",
    "targetIndexName": "formindex",
    "skillsetName": "formskillset",
    "fieldMappings": [
        {
            "sourceFieldName": "metadata_storage_path",
            "targetFieldName": "metadata_storage_path"
        },
        {
            "sourceFieldName": "metadata_storage_path",
            "targetFieldName": "id",
            "mappingFunction": {
                "name": "base64Encode"
            }
        }
    ],
    "outputFieldMappings": [
        {
            "sourceFieldName": "/document/address",
            "targetFieldName": "address"
        },
        {
            "sourceFieldName": "/document/recipient",
            "targetFieldName": "recipient"
        }
    ],
    "parameters": {
        "maxFailedItems": -1,
        "maxFailedItemsPerBatch": -1,
        "configuration": {
            "dataToExtract": "contentAndMetadata",
            "imageAction": "generateNormalizedImages"
        }
    }
}
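
Because the indexer runs asynchronously, you can poll the Get Indexer Status API to see when it finishes rather than guessing:

Request:
GET https://bgformrecognizer-search.search.windows.net/indexers/formindexer/status?api-version=2019-05-06

Headers:
Content-Type: application/json
Api-key: 32-character hexadecimal value    // From Search Service keys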


Searching on Form Key-Value Pairs

Now let’s test the cognitive search we’ve just created. We can do that by opening our index in the Azure portal or by making an API call to return results. Since our whole dataset consists of only five documents, I ran a quick query to return all of them so we can see the results. Here’s what my query looked like:

Request:
GET https://bgformrecognizer-search.search.windows.net/indexes/formindex/docs?search=*&$select=address,recipient&api-version=2019-05-06

Headers:
Content-Type:    application/json

Api-key: 32-character hexadecimal value    // From Search Service keys

This is what was returned:

{
    "@odata.context": "https://bgformrecognizer-search.search.windows.net/indexes('formindex')/$metadata#docs(*)",
    "value": [
        {
            "@search.score": 1.0,
            "address": "22 1st way Suite 4000 Redmond, WA 99243",
            "recipient": "Contoso 456 49th st New York, NY 87643"
        },
        {
            "@search.score": 1.0,
            "address": "1 Redmond way Suite 6000 Redmond, WA 99243",
            "recipient": "Microsoft 1020 Enterprise Way Sunnayvale, CA 87659"
        },
        {
            "@search.score": 1.0,
            "address": "1020 Enterpirse Way. Sunnyvale, CA 94088",
            "recipient": "Wingtip Toys 1010 El Camino Real Cupertino, CA 98024"
        },
        {
            "@search.score": 1.0,
            "address": "1111 8th st. Bellevue, WA 99501",
            "recipient": "Southridge Video 1060 Main St. Atlanta, GA 65024"
        },
        {
            "@search.score": 1.0,
            "address": "1111 8th st. Bellevue, WA 99501",
            "recipient": "Alpine Ski House 1025 Enterprise Way Sunnyvale, CA 94024"
        }
    ]
}

We can now search our form fields directly with a search like this:

Request:
GET https://bgformrecognizer-search.search.windows.net/indexes/formindex/docs?search=Contoso&$select=address,recipient&searchFields=recipient&api-version=2019-05-06

Headers:
Content-Type:    application/json
Api-key: 32-character hexadecimal value    // From Search Service keys

 

This results in the following response:

{
    "@odata.context": "https://bgformrecognition-search.search.windows.net/indexes('formindex')/$metadata#docs(*)",
    "value": [
        {
            "@search.score": 0.095891505,
            "address": "22 1st way Suite 4000 Redmond, WA 99243",
            "recipient": "Contoso 456 49th st New York, NY 87643"
        }
    ]
}

Congratulations! If you’ve followed along, you’ve created an Azure Cognitive Search solution using the new Form Recognizer.

Stay Tuned

We’ll continue to explore ways to use Azure Cognitive Search to uncover knowledge from previously challenging data sources. BlueGranite can also help you make the most of your data. Contact us today to learn how.

Editor's note: This post was edited 10/2020 to reflect system updates.


About The Author

Kevin Jackson

Kevin Jackson has worked in software development for almost 30 years, specializing in architecture and technical leadership. He has been part of numerous successful projects working with teams that varied in size from just a few people to hundreds working across different time zones and continents. He enjoys working closely with clients to design and deliver world-class systems on time and on budget.
