Use a Website As a Content Source

Search your internal and external websites from one search box. This article explains how to prepare a website for indexing.

Establish a Connection

  1. Navigate to Content Sources.

  2. Click Add new content source.

  3. Select Website.

  4. Name your content source.

  5. Insert the website URL.

  6. Set the Depth. Depth is the distance of a page from the home page. The home page has a depth of 1, any page linked from the home page has a depth of 2, and so on. You can enter a number between 1 and 5.

  7. Select an Authentication Method.

  8. Toggle JavaScript Enabled Crawling on if your website relies on JavaScript for crucial functionality. Otherwise, keep the default setting.

  9. Click Connect.
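
To illustrate how Depth limits what gets crawled, here is a minimal Python sketch that assigns a depth to each page with a breadth-first walk over a made-up link map. The `site` dictionary and the `page_depths` function are hypothetical and are not part of SearchUnify. The sketch only mirrors the rule described in the steps: the home page has a depth of 1 and each further link adds 1.

```python
from collections import deque


def page_depths(links, home, max_depth=5):
    """Return {page: depth} for every page reachable within max_depth.

    The home page has a depth of 1. `links` maps a page to the pages it links to.
    """
    depths = {home: 1}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        # Pages already at the limit are kept, but their links are not followed.
        if depths[page] >= max_depth:
            continue
        for target in links.get(page, []):
            if target not in depths:  # the first visit gives the shortest distance
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site structure, used only for illustration.
site = {
    "/": ["/docs", "/blog"],
    "/docs": ["/docs/install"],
    "/docs/install": ["/docs/install/linux"],
    "/blog": ["/"],
}

print(page_depths(site, "/", max_depth=2))
# {'/': 1, '/docs': 2, '/blog': 2}
```

With a Depth of 2, only the home page and the pages it links to are reached. Raising the value to 4 would also reach `/docs/install` and `/docs/install/linux` in this example.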

Set Up Crawl Frequency

  1. Click to open a calendar and select a date. Only web pages dated after the selected date are indexed.

  2. Use the Frequency dropdown to select how often SearchUnify should index the website.

  3. Click Set.

Select Fields for Indexing

SearchUnify indexes a website by capturing and storing the data inside HTML elements.

⚠ IMPORTANT

A website is not indexed if no HTML element is specified.

Admins can write CSS selectors to specify the elements for indexing. The CSS selectors are stored in an Object, which we will create next.

  1. Give your Object a Name and a Label. The name and the label do not have to be valid HTML tags.

  2. Click Add Object to create an empty object.

  3. Click to add content fields for indexing.

  4. Selector is the most important field. HTML tags, classes, and IDs are valid selectors. IDs are preceded by a # (hash), classes are preceded by a . (dot), and HTML tags are written without angle brackets.

  5. Assign the Selector a Type.

  6. Give the selector a Name and Label.

  7. Select Multiple to treat each instance of a matched HTML element separately in search, or Single to combine their data into one field.

  8. Select isFilterable to use the field as a facet, or leave the default value to make the field available only in search results.

  9. Press Add and then Save.

  10. Press Save.
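
To show what the three selector forms match, here is a minimal Python sketch built on the standard library's `html.parser`. It is not SearchUnify's crawler. The `SelectorText` class, the `extract` function, and the sample `PAGE` are hypothetical, and the sketch handles only a bare tag, a single `.class`, or a single `#id`. Joining values with a space for Single is also an assumption made for illustration.

```python
from html.parser import HTMLParser


class SelectorText(HTMLParser):
    """Collect the text inside elements matching a tag, .class, or #id selector."""

    def __init__(self, selector):
        super().__init__()
        self.selector = selector
        self.open_tag = None  # tag name of the element currently being captured
        self.depth = 0        # nesting count of that tag name inside the capture
        self.matches = []     # one text string per matched element

    def _is_match(self, tag, attrs):
        attrs = dict(attrs)
        if self.selector.startswith("#"):  # ID selector, e.g. #title
            return attrs.get("id") == self.selector[1:]
        if self.selector.startswith("."):  # class selector, e.g. .tag
            return self.selector[1:] in (attrs.get("class") or "").split()
        return tag == self.selector.lower()  # tag selector, no angle brackets

    def handle_starttag(self, tag, attrs):
        if self.open_tag is None:
            if self._is_match(tag, attrs):
                self.open_tag, self.depth = tag, 1
                self.matches.append("")
        elif tag == self.open_tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.open_tag:
            self.depth -= 1
            if self.depth == 0:
                self.open_tag = None

    def handle_data(self, data):
        if self.open_tag is not None:
            self.matches[-1] += data


def extract(html, selector):
    """Return the whitespace-normalized text of every element matching selector."""
    parser = SelectorText(selector)
    parser.feed(html)
    parser.close()
    return [" ".join(text.split()) for text in parser.matches]


# Hypothetical page, used only for illustration.
PAGE = """
<html><body>
  <h1 id="title">Install Guide</h1>
  <p class="summary">Set up the agent.</p>
  <span class="tag">linux</span>
  <span class="tag">setup</span>
</body></html>
"""

print(extract(PAGE, "h1"))      # tag selector   -> ['Install Guide']
print(extract(PAGE, "#title"))  # ID selector    -> ['Install Guide']

multiple = extract(PAGE, ".tag")  # Multiple: one value per matched element
single = " ".join(multiple)       # Single: values combined into one field
print(multiple)                   # ['linux', 'setup']
print(single)                     # linux setup
```

Testing selectors against a saved copy of a page in this way can help confirm that each one matches the intended elements before it is entered in the Selector field.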

You have successfully added Website as a content source.

Last updated: Tuesday, June 23, 2020