How do I configure the SharePoint Remote Connector to crawl only Files and List Item Attachments?

over 1 year ago

The content in SharePoint is organized in the following structure.

SPStructure.PNG

The IndexSites, IndexLists, and IndexFolders connector cfg settings specify whether to index a metadata-only document for each container object (highlighted in yellow above).


To prevent these metadata-only container objects from being indexed, set all three to false, either in the [FetchTasks] section or directly in the repository's task section. The list items and files that the containers hold are still indexed.

[FetchTasks]

IndexSites=false
IndexFolders=false
IndexLists=false

The screenshot below shows some List Items in my SharePoint repository when I don't have IndexLists=false (this setting defaults to true if it is not in your cfg file).

SP1.PNG

If I add IndexLists=false, the metadata-only list documents are no longer indexed. However, as mentioned above, List Items are still indexed (highlighted in the screenshot below).

Sp2.PNG

List Items are just metadata in SharePoint. They can have attachments, and those will be of value to you, but the List Items themselves are not binary documents, so you cannot perform any actions on them, nor would you want to. If you want to ignore List Items but still index their Attachments, you need to add a custom Lua script to your configuration.

I have created a file called ExcludeListItems.lua with the following content:

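function handler(config, document, params)
    -- Read the document's TYPE field; list items are reported as "LISTITEM".
    local docType = document:getFieldValue("TYPE")

    -- Returning false drops the document, so list items are not ingested.
    if docType == "LISTITEM" then
        return false
    end

    -- Everything else is ingested as normal.
    return true
end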
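If you later need to drop more than one document type, the same check can be written against a lookup table. This is only a sketch, not part of the script I actually use: EXCLUDED_TYPES is a name I made up, "LISTITEM" is the only TYPE value I have confirmed, and the upper-case comparison assumes the casing of TYPE could vary.

```lua
-- Sketch only: a table-driven variant of the handler above.
-- EXCLUDED_TYPES is a hypothetical name; "LISTITEM" is the only
-- TYPE value taken from the original script.
local EXCLUDED_TYPES = {
    LISTITEM = true,
}

function handler(config, document, params)
    local docType = document:getFieldValue("TYPE")

    -- A document without a TYPE field is kept.
    if docType == nil then
        return true
    end

    -- Upper-casing before the lookup is an assumption, in case casing varies.
    if EXCLUDED_TYPES[string.upper(docType)] then
        return false
    end

    return true
end
```

Adding another entry to the table is then the only change needed to exclude a further type, but check the TYPE values in your own index before relying on any value other than LISTITEM.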

Drop the Lua file into your SharePoint Remote Connector working directory (\Program Files\Micro Focus\ControlPoint\Indexer\SharePoint Remote Connector\).
Before you scan a new repository, edit its task section and add the Lua script to the IngestActions:

[TaskSP_Test]
IndexSites=false
IndexFolders=false
IndexLists=false
...
IngestActions=META:CPREPOSITORYTYPEID=9,META:SecurityType=SharePointSecurity,META:AUTN_CATEGORIZE=false,META:AUTN_EDUCTION=true,LUA:ExcludeListItems.lua
...

With this Lua script and the IndexSites, IndexLists, and IndexFolders settings in place, only SharePoint Files and List Item Attachments are indexed.
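If you want to sanity-check the filter logic before scanning a repository, you can run it in a standalone Lua interpreter against a stand-in document object. This is a rough sketch under a few assumptions: mockDocument is a hypothetical helper that only imitates getFieldValue, and "FILE" and "ATTACHMENT" are placeholder TYPE values for illustration, not values confirmed from the connector.

```lua
-- Same logic as the handler in ExcludeListItems.lua.
function handler(config, document, params)
    local docType = document:getFieldValue("TYPE")
    if docType == "LISTITEM" then
        return false
    end
    return true
end

-- Hypothetical stand-in for the connector's document object.
-- It only supports getFieldValue, which is all the handler calls.
local function mockDocument(fields)
    return {
        getFieldValue = function(self, name)
            return fields[name]
        end,
    }
end

-- Only "LISTITEM" comes from the script; the other values are placeholders.
local samples = { "LISTITEM", "FILE", "ATTACHMENT" }

for _, docType in ipairs(samples) do
    local keep = handler(nil, mockDocument({ TYPE = docType }), nil)
    print(docType, keep and "kept" or "dropped")
end
```

This only exercises the return value of the handler. It does not prove how the connector itself populates the TYPE field, so a small test scan is still worth doing.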

 

 
