Text Analysis (NLP)

The Text Analysis action leverages Tonkean's proprietary natural language processing (NLP) functionality to read, recognize, and extract relevant content from raw text.

Data coming into your workflow often contains large amounts of text, whether in an email, a document, a PDF, or a similar file. Rather than assigning a person to read through and understand this content, you can use the Text Analysis action to identify and extract the relevant words or phrases so you can use them in your solution.

If you plan on reusing the same NLP model in other modules, we recommend building the model as an NLP search training set. NLP models built using the Text Analysis action are specific to the module you create them in, and each model requires creating and maintaining an individual branch. With a training set, you can handle all of that configuration and maintenance in a single location.


Best Practices

To get the most out of this action, there are several best practices we recommend:

  • Always create a branch for situations where no match is found. Even if you don't want to take any action in those situations, a "failed to match" branch makes the module history easier to read and understand. Additionally, we recommend adding a Train action to that branch to help improve the accuracy of the NLP branches and avoid missed keywords in future workflow runs.

  • Consider whether branches should be exclusive. That is, decide whether only one branch should run or all matching branches should run. In general, we recommend running only the best matched branch, but there are use cases where you may want multiple matching branches to run. For example, if you receive a request for multiple documents in a single email and you have a branch dedicated to fulfilling requests for each document, you would want each of those branches to run and provide all the requested documents.

  • If you have two branches with the same or similar keywords, you can add conditions to those branches to further specify when each branch should run. Select the conditions icon in the lower left corner of the branch block to add a condition.


Common Use Cases

  • The Text Analysis action is often used in workflows that monitor an email inbox and automate responses to various requests. For example, a Legal Ops team might build a module that monitors a shared email inbox for contract template requests. They could include a Text Analysis action to identify keywords (such as "NDA") and run a workflow that sends the requested template back to the requester.

    If the email inbox receives a wider range of requests, you could use the Text Analysis action to determine the content of each incoming message and initiate workflows in different branches where each branch is a dedicated workflow for that type of request.

  • The training set option in the Text Analysis action is often used in conjunction with OCR to process read-only documents like PDFs. By converting a PDF to text, you can use a training set to read and extract important fields from the documents and then use those fields in various workflows.


The configuration for each section in the action panel is detailed below.

Input text to process

Specify the text you want to analyze. Select the insert field icon to choose from available fields.

Insert spaces between the keywords.


Then, define the keywords that, when found in the specified text, create a new workflow branch in the module builder. Select + Add Branch to create a new branch block.


A new branch block is created in the module builder.


Configure Branch Block

Each branch block defines the keywords that trigger the workflow to continue from that block. It's common, depending on the use case, to have two or three branches, with each branch looking for a separate keyword or phrase. You can have many branches, but as you add more, the configuration of the Text Analysis action that determines matching becomes increasingly important.


Name the Branch Block

Select the branch title or the blue pencil icon and provide a descriptive title for the branch. We recommend a title that describes the keywords used in the branch.

Define keywords

Define the words and phrases, separated by commas, that activate the workflow that follows the branch block.
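
As a rough illustration only, a comma-separated keyword field can be thought of as a list of individual words and phrases. The sketch below is not Tonkean's implementation. The function name `parse_keywords` is hypothetical and is used here only to show how such a field might be split.

```python
from typing import List


def parse_keywords(raw: str) -> List[str]:
    """Split a comma-separated keyword field into trimmed, non-empty entries.

    Hypothetical helper for illustration; not part of Tonkean.
    """
    return [part.strip() for part in raw.split(",") if part.strip()]


# Phrases keep their internal spaces; only commas separate entries.
keywords = parse_keywords("NDA, non-disclosure agreement, cease and desist")
```

Under this reading, a multi-word phrase such as "cease and desist" stays a single entry.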


Training examples

Training examples leverage Tonkean machine learning to teach the matching algorithm which words or phrases constitute a match and which don't.

Enter a word or phrase that should count as a match for the keyword, select Positive example in the dropdown, then select Add to add the training example to the branch block. Alternatively, enter a word or phrase that resembles the keyword but should not count as a match (common misspellings, for example), select Negative example in the dropdown, then select Add. Providing several positive and negative examples improves the accuracy of the text analysis.


Which Branches should run

Determine whether to activate all branches whose keywords match the analyzed text or only the branch that matches most closely. Open the dropdown and select one of the options:

  • All matching Branches (default) - Activate all branches that have a keyword match, running the workflow that follows the branch block. This option is useful when there is little chance that more than one branch will have a match on a given item, or where the activation of multiple branches is acceptable. If you have multiple branches with similar keywords, or where substantially different (or even conflicting) workflows follow different branch blocks, select Best matched Branch instead.

  • Best matched Branch - Activate only one branch that most closely matches the defined criteria, running the workflow that follows only that branch block. This option is useful when you have multiple branches with similar keywords, or where the activation of multiple branches isn't acceptable and may cause errors.
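
The difference between the two options can be sketched in a few lines. This is a conceptual model only, assuming each branch receives a numeric match score. The function `select_branches`, the `threshold` value, and the score scale are all hypothetical and do not describe how Tonkean computes or exposes matches.

```python
from typing import Dict, List


def select_branches(scores: Dict[str, float], mode: str = "all",
                    threshold: float = 0.5) -> List[str]:
    """Return the names of the branches that would run.

    scores: hypothetical match score per branch name (higher is closer).
    mode: "all" for All matching Branches, "best" for Best matched Branch.
    """
    matching = [name for name, score in scores.items() if score >= threshold]
    if not matching:
        # Nothing matched: fall through to the dedicated no-match branch.
        return ["No match was found"]
    if mode == "all":
        return matching
    if mode == "best":
        return [max(matching, key=lambda name: scores[name])]
    raise ValueError(f"unknown mode: {mode!r}")


# Illustrative scores for three branches on one incoming email.
example_scores = {"NDA": 0.9, "MSA": 0.6, "W-9": 0.2}
all_result = select_branches(example_scores, mode="all")
best_result = select_branches(example_scores, mode="best")
```

With these illustrative scores, the "all" mode runs both the NDA and MSA branches, while the "best" mode runs only the NDA branch.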

The Text Analysis action uses loose matching, which considers two values that are relatively close to be a match. For example, minor deviations in spelling ("cease and desist" matches "cease and desisted"), word position ("cease and desist" matches "desist and cease"), and the use of alternate symbols or abbreviations ("cease and desist" matches "cease & desist") are all acceptable.
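
To make the idea of loose matching concrete, here is a minimal sketch that tolerates the three kinds of deviation described above. Tonkean's matching algorithm is proprietary, so this is not its implementation. The approach (normalizing symbols, ignoring word order, and comparing with Python's standard `difflib`), the function names, and the 0.85 threshold are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, expand '&' to 'and', and sort the words to ignore order."""
    tokens = re.findall(r"[a-z0-9]+", text.lower().replace("&", " and "))
    return " ".join(sorted(tokens))


def loosely_matches(keyword: str, candidate: str,
                    threshold: float = 0.85) -> bool:
    """Return True when the normalized strings are similar enough.

    The threshold is an arbitrary illustrative value.
    """
    ratio = SequenceMatcher(None, normalize(keyword),
                            normalize(candidate)).ratio()
    return ratio >= threshold


keyword = "cease and desist"
spelling = loosely_matches(keyword, "cease and desisted")  # minor spelling deviation
position = loosely_matches(keyword, "desist and cease")    # different word position
symbol = loosely_matches(keyword, "cease & desist")        # alternate symbol
unrelated = loosely_matches(keyword, "purchase order")     # not a match
```

In this sketch the first three comparisons count as matches and the unrelated phrase does not. A stricter or looser threshold would shift where that line falls.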

No match was found Branch

This option provides a path for the module if no match is found. Select the No match was found Branch toggle to create a dedicated branch block for this situation.

We recommend including a branch block and logic for situations when no match is found.