Text Analysis (NLP)

The Text Analysis action leverages Tonkean's proprietary natural language processing (NLP) functionality to read, recognize, and extract relevant content from raw text.

In many use cases, the data coming into your workflow contains large amounts of text, whether that's in an email, a document, a PDF, or a similar file. Rather than assigning a person to read through and understand this content, you can use the Text Analysis action to identify and extract the relevant words or phrases so you can use them in your solution.

If you plan on reusing the same NLP model in other modules, we recommend building the model as an NLP search training set. NLP models built using the Text Analysis action are specific to the module you create them in, and each model requires creating and maintaining an individual branch. With a training set, you can handle all of that configuration and maintenance in a single location.

text_analysys_overview.png

Best Practices

To get the most out of this action, there are several best practices we recommend:

  • You should always create a branch for situations where no match is found. Even if you don't want to take any action in those situations, incorporating a "failed to match" branch makes reading and understanding the module history easier. Additionally, we recommend adding in a Train action to help improve the accuracy of the NLP branches and avoid missed keywords in future workflow runs.

  • It's important to consider whether branches should be exclusive. That is, you may want only one branch to run or all matching branches to run. In general, we recommend running only the best-matched branch, but there are use cases where you may want multiple matching branches to run. For example, if you receive a request for multiple documents in a single email and you have a branch dedicated to fulfilling requests for each document, you would want each of those branches to run and provide all the requested documents.

  • If you have two branches with the same or similar keywords, you can add conditions to those branches to further specify when each branch should run. Select the conditions icon, conditions.png, in the lower left corner of the branch block to add a condition.

    best_practices_branch_conditions.png

Common Use Cases

  • The Text Analysis action is often used in workflows that monitor an email inbox and automate responses to various requests. For example, a Legal Ops team might build a module that listens to a shared email inbox for contract templates. They could include a Text Analysis action to identify keywords (like "NDA", for example) and run a workflow that sends the requested template back to the requester.

    If the email inbox receives a wider range of requests, you could use the Text Analysis action to determine the content of each incoming message and initiate workflows in different branches where each branch is a dedicated workflow for that type of request.

  • The training set option in the Text Analysis action is often used in conjunction with OCR to process read-only documents like PDFs. By converting a PDF to text, you can use a training set to read and extract important fields from the documents and then use those fields in various workflows.

Configuration

The configuration for each section in the action panel is detailed below.

Input text to process

Specify the text you want to analyze. Select the insert field icon, insert_field.png, to choose from available fields.

Insert spaces between the keywords.

input_text_to_process.png

Then, define the keywords that, when found in the specified text, create a new workflow branch in the module builder.

Text analysis type

Select whether to configure text analysis manually in the module builder or using an existing NLP search training set:

Manual

Set up text analysis manually, creating branches with defined keywords in the module builder.

text_analysis_type_select.png

Training Set

Use an existing NLP search training set for text analysis. When leveraging a training set for text analysis, your NLP branches are defined by the models you create in your training set.

text_analysis_type_training_set.png

Choose NLP Searcher

Select the NLP search training set to use.

choose_nlp_searcher.png

Add and Configure Branch Blocks for a Manual Text Analysis Action

Each branch block defines the keywords that trigger the workflow to continue from that block. It's common, depending on the use case, to have two or three branches, with each branch looking for a separate keyword or phrase. You can have many branches, but as you add more, the configuration of the Text Analysis action that determines matching becomes increasingly important.

Select + Add Branch to create a new branch block.

define_keywords_add_branch.png

A new branch block is created in the module builder.

new_branch_block.png

Name the Branch Block

Select the branch title or the blue pencil icon, blue_edit_pencil.png, and provide a descriptive title for the branch. We recommend a title that describes the keywords used in the branch.

Define keywords

Define the words and phrases, separated by a comma, that activate the workflow that follows the branch block.

branch_define_keywords.png
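
To picture how a comma-separated keyword field breaks down into individual words and phrases, here is a minimal Python sketch. It is an illustration only, not Tonkean's implementation: the function name parse_keywords is hypothetical, and the whitespace trimming and case-insensitive de-duplication are assumptions rather than documented behavior.

```python
def parse_keywords(raw: str) -> list[str]:
    """Split a comma-separated keyword field into clean phrases.

    Assumptions (not confirmed by the product documentation):
    surrounding whitespace is ignored, empty entries are dropped,
    and duplicates are detected case-insensitively.
    """
    phrases: list[str] = []
    seen: set[str] = set()
    for part in raw.split(","):
        phrase = " ".join(part.split())  # trim and collapse inner whitespace
        if phrase and phrase.lower() not in seen:
            seen.add(phrase.lower())
            phrases.append(phrase)
    return phrases


keywords = parse_keywords("NDA, non-disclosure agreement,  confidentiality   agreement, nda,")
print(keywords)
```

Note that a multi-word phrase such as "non-disclosure agreement" stays together as one entry, because only the comma acts as a separator.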

Training examples

Training examples leverage Tonkean machine learning to teach the matching algorithm which words or phrases constitute a match and which don't.

Enter example words or phrases that represent matches with the keyword, select Positive example in the dropdown, then select Add to add the training example to the branch block. Alternatively, enter words or phrases that are likely to appear and are close to the keyword (common misspellings, for example), select Negative example in the dropdown, then select Add. Providing several positive examples and several negative examples helps improve the accuracy of the text analysis.

branch_training_examples.png

Which Branches should run

Determine whether to activate every branch whose keywords match the analyzed text or only the branch that matches most closely. Select the dropdown and choose one of the options:

  • All matching Branches (default) - Activate all branches that have a keyword match, running the workflow that follows the branch block. This option is useful when there is little chance that more than one branch will have a match on a given item, or where the activation of multiple branches is acceptable. If you have multiple branches with similar keywords, or where substantially different (or even conflicting) workflows follow different branch blocks, select Best matched Branch instead.

  • Best matched Branch - Activate only one branch that most closely matches the defined criteria, running the workflow that follows only that branch block. This option is useful when you have multiple branches with similar keywords, or where the activation of multiple branches isn't acceptable and may cause errors.
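
The difference between the two options can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Tonkean's matching logic: the Branch class, the score function (a plain count of keywords found by naive substring search), and the mode values "all" and "best" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    keywords: list[str]


def score(branch: Branch, text: str) -> int:
    """Assumed scoring: number of the branch's keywords found in the text."""
    lowered = text.lower()
    # Naive substring search; a real matcher would be more careful.
    return sum(1 for kw in branch.keywords if kw.lower() in lowered)


def select_branches(branches: list[Branch], text: str, mode: str = "all") -> list[str]:
    """Return the names of the branches that would run.

    mode="all"  -> every branch with at least one keyword match
    mode="best" -> only the single highest-scoring branch
    """
    matches = [(score(b, text), b) for b in branches]
    matches = [(s, b) for s, b in matches if s > 0]
    if not matches:
        return []  # this is where a "no match" branch would take over
    if mode == "best":
        _, best = max(matches, key=lambda pair: pair[0])
        return [best.name]
    return [b.name for _, b in matches]


branches = [
    Branch("NDA", ["nda", "non-disclosure"]),
    Branch("MSA", ["msa", "master services agreement"]),
]
email = "Please send the NDA and the MSA; the non-disclosure form is urgent."

print(select_branches(branches, email, mode="all"))
print(select_branches(branches, email, mode="best"))
```

In this sketch, an email that mentions two document types activates both branches in "all" mode, while "best" mode keeps only the branch with the most keyword hits.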

The Text Analysis action uses loose matching, which considers two values that are relatively close to be a match. For example, minor deviations in spelling ("cease and desist" matches "cease and desisted"), word position ("cease and desist" matches "desist and cease"), and the use of alternate symbols or abbreviations ("cease and desist" matches "cease & desist") are all acceptable.
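
To make the idea of loose matching concrete, here is a small Python sketch that tolerates the three kinds of deviation described above. It is one possible approximation built on the standard library's difflib module, not the algorithm Tonkean uses. The function names and the 0.8 similarity threshold are assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def tokens(text: str) -> list[str]:
    """Lowercase the text, expand '&' to 'and', and split into words."""
    return re.findall(r"[a-z0-9]+", text.lower().replace("&", " and "))


def loosely_matches(keyword: str, text: str, threshold: float = 0.8) -> bool:
    """Return True if every keyword word has a close counterpart in the text.

    Word order is ignored, and small spelling differences are accepted
    when the similarity ratio reaches the (assumed) threshold.
    """
    keyword_tokens = tokens(keyword)
    text_tokens = tokens(text)
    if not keyword_tokens:
        return False
    return all(
        any(SequenceMatcher(None, kw, t).ratio() >= threshold for t in text_tokens)
        for kw in keyword_tokens
    )


for candidate in ["cease and desisted", "desist and cease", "cease & desist"]:
    print(candidate, loosely_matches("cease and desist", candidate))
```

A stricter threshold reduces false positives at the cost of missing more misspellings, which is the same trade-off that positive and negative training examples help tune.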

No match was found Branch

This option provides a path for the module if no match is found. Select the No match was found Branch toggle to create a dedicated branch block for this situation.

We recommend including a branch block and logic for situations when no match is found. For example, you might include a Send Notification action that forwards the details of the item to a team member so they can triage it and determine the next steps.

no_match_found_branch.png

Add and Configure Branch Blocks for a Training Set Text Analysis Action

With your NLP search training set doing most of the work, it's relatively simple to configure branch blocks for a Text Analysis action that's leveraging a training set. In addition to the two branch blocks created by default, No match was found and Other active models, you can add a branch block for each relevant training set model you have configured.

training_set_default_branch_blocks.png

Define NLP branches

Create NLP branch blocks that correlate with the models in your NLP search training set. From each branch block, you can build additional logic that's activated when the corresponding model matches.

Select + Add Branch to create a new branch block.

define_nlp_branches_add_new_block.png

In the new branch block, select the dropdown to choose a training set model. Any matches to this model activate the branch block and initiate any workflow logic that follows it.

Once you select a training set model, the branch block is automatically renamed to the model name.

training_set_branch_block_select_model.png

Once you add a new branch block, you can choose its corresponding training set model in the Text Analysis configuration panel as well.

After you select a model for the branch block, select the plus icon, add_block.png, to add a new action block and continue the module workflow for any matches with that NLP model.

No match was found Branch

This option provides a path for the module if no match is found. We recommend including a branch block and logic for situations when no match is found. For example, you might include a Send Notification action that forwards the details of the item to a team member so they can triage it and determine the next steps.

For Text Analysis actions using the NLP search training set, this option is turned on by default.

no_match_found_toggled_on.png

Other active models Branch

This option provides a path for the module if a match is found for a model that doesn't have a corresponding branch block. That is, if your training set includes a model for the phrase "NDA" but you don't have a branch block configured for that model in the module, the Other active models branch block activates.

For Text Analysis actions using the NLP search training set, this option is turned on by default.

other_active_models_branch.png
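
The relationship between model branches and the two default branch blocks can be summarized with a short Python sketch. This is a conceptual illustration only. The route function is hypothetical, and the case where one item matches both a configured model and an unconfigured one is handled by an assumption (both paths activate) that the product documentation does not confirm.

```python
NO_MATCH = "No match was found"
OTHER_MODELS = "Other active models"


def route(matched_models: list[str], configured_branches: set[str]) -> list[str]:
    """Decide which branch blocks activate for one analyzed item.

    matched_models      -- training set models that matched the text
    configured_branches -- models that have a branch block in the module
    """
    if not matched_models:
        return [NO_MATCH]
    activated = [m for m in matched_models if m in configured_branches]
    # Assumption: any matched model without its own branch block
    # falls through to the "Other active models" branch.
    if any(m not in configured_branches for m in matched_models):
        activated.append(OTHER_MODELS)
    return activated


print(route(["NDA"], {"MSA"}))
print(route([], {"MSA"}))
print(route(["MSA"], {"MSA"}))
```

Here, a match on an "NDA" model with no "NDA" branch block lands in Other active models, while an item with no matching model at all lands in No match was found.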