Named Entity Recognition (NER)

Named Entity Recognition (NER) is a subtask of information extraction that locates named entities mentioned in unstructured text and classifies them into predefined categories, such as persons, organizations, and locations.

For instance, when a user searches for "Eiffel Tower," NER recognizes "Eiffel Tower" as a location. This allows the search engine not only to fetch documents containing the exact phrase but also to prioritize content that is relevant to the Eiffel Tower as a landmark in Paris.

NER and SearchUnify’s Cognitive Search

By incorporating NER, SearchUnify-powered search goes beyond keyword matching to understand the context of user queries. This lets it handle nuanced and complex queries, leading to a more intuitive and effective search experience.

How NER Works in SearchUnify

NER systems use models trained on large corpora of annotated text, where entities are manually labeled. In SearchUnify, Taxonomy serves as the dictionary on which the NER model is trained.
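To make the dictionary idea concrete, here is a minimal Python sketch of dictionary-based entity lookup. It is an illustration only, not SearchUnify's implementation: the TAXONOMY data and the find_entities function are hypothetical, and a trained NER model does considerably more than exact matching.

```python
import re

# Hypothetical taxonomy: entity name -> list of values (illustrative data only).
TAXONOMY = {
    "Locations": ["Eiffel Tower", "Paris"],
    "Organizations": ["SearchUnify"],
}

def find_entities(text, taxonomy):
    """Return (matched_text, entity_name) pairs in order of appearance."""
    # Flatten to (value, entity) pairs, longest value first,
    # so multi-word values are matched before shorter ones.
    pairs = sorted(
        ((value, entity) for entity, values in taxonomy.items() for value in values),
        key=lambda pair: len(pair[0]),
        reverse=True,
    )
    found = []
    taken = []  # character spans already claimed by an earlier match
    for value, entity in pairs:
        pattern = r"\b" + re.escape(value) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            overlaps = any(match.start() < end and start < match.end()
                           for start, end in taken)
            if overlaps:
                continue
            taken.append((match.start(), match.end()))
            found.append((match.start(), match.group(0), entity))
    found.sort()
    return [(matched, entity) for _, matched, entity in found]

print(find_entities("tickets for the eiffel tower in Paris", TAXONOMY))
```

Matching is case-insensitive in this sketch, so a lowercase query such as "eiffel tower" is still labeled as a location.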

NOTE.

Keep Named-entity recognition (NER) toggled off until you have completed the following steps.

Add Taxonomy Entity

Add one or more entities and add values to each. For example, an entity can be “Organizations,” with the names of different organizations as its values. For instructions on setting up Taxonomy, see: Taxonomy

Add Content Annotation Rule

After creating taxonomy entities, link them to content annotation. Content Annotation sorts data into categories, which are defined by applying a set of tags to content source data. To learn how to add content annotation rules, see: Automatic Text Classification with Content Annotation

NOTE.

Make sure that the Entity Name and the Content field name are the same (as shown in the image below).

Depending on data size, annotation rule processing can take anywhere from a few minutes to a few hours. Wait a while, then check whether the annotation request has been approved.
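As a rough conceptual picture of what an annotation rule produces, the Python sketch below tags each document with the taxonomy values found in its text, stored under a content field that carries the same name as the entity. Everything here is an assumption made for illustration (the sample values, the document shape, and the annotate function); the actual processing happens inside SearchUnify and is not exposed this way.

```python
# Hypothetical taxonomy entity and values (illustrative data only).
ENTITY_NAME = "Organizations"
ENTITY_VALUES = ["Acme Corp", "Globex"]

def annotate(document, entity_name, values):
    """Return a copy of document with a field listing the values found in its body."""
    body = document["body"].lower()
    # Simple case-insensitive substring check; a real system would be more careful.
    tags = [value for value in values if value.lower() in body]
    annotated = dict(document)
    # The content field name is the same as the entity name.
    annotated[entity_name] = tags
    return annotated

docs = [
    {"id": 1, "body": "Acme Corp released a new connector."},
    {"id": 2, "body": "Release notes for the spring update."},
]
annotated_docs = [annotate(doc, ENTITY_NAME, ENTITY_VALUES) for doc in docs]
for doc in annotated_docs:
    print(doc["id"], doc[ENTITY_NAME])
```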

Toggle On Named-entity recognition (NER)

After the content annotation request has been approved, toggle on Named-entity recognition (NER).

NER should now be working in your search client.

Add Content Annotation Rules as Filters on the Search Page

  1. After enabling NER, navigate to Search Clients and open a search client for editing.

  2. Go to Content Sources and select a content source and an object.

  3. In Filters, select the content field name of the annotation request.

  4. Save the settings.

On the search results page, the content field name appears as a filter.
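For readers who want a mental model of how an annotated field behaves as a filter, here is a small Python sketch. It assumes search results that already carry an annotated field; the sample results and the facet_counts and apply_filter functions are hypothetical and do not represent SearchUnify's API.

```python
from collections import Counter

# Hypothetical search results that already carry an annotated content field.
results = [
    {"title": "Connector guide", "Organizations": ["Acme Corp"]},
    {"title": "Partner overview", "Organizations": ["Acme Corp", "Globex"]},
    {"title": "Release notes", "Organizations": []},
]

def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    counts = Counter()
    for result in results:
        counts.update(result.get(field, []))
    return dict(counts)

def apply_filter(results, field, value):
    """Keep only the results whose field contains the selected value."""
    return [result for result in results if value in result.get(field, [])]

print(facet_counts(results, "Organizations"))
print([r["title"] for r in apply_filter(results, "Organizations", "Globex")])
```

The counts correspond to the filter options a user would see, and apply_filter corresponds to selecting one of them.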