Support Tip: Dealing with Orphaned documents in ControlPoint


ControlPoint 5.x


Situation

When you delete connector_TASK<reponame>_datastore.db for a given repository, the next scan of that repository is always a full scan. If the source has changed since it was last ingested, this can leave orphaned documents behind in ControlPoint. To clean these up, ControlPoint provides a built-in scheduled task called Delete Orphaned Documents, available under Administration - Scheduled tasks - System. The task identifies items that are in Metastore but no longer in the source repository (by checking the scanID) and deletes them from Metastore/IDOL.
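To make the scanID check more concrete, here is a minimal conceptual sketch in Python. It is not ControlPoint code: the IndexedDoc class, the last_scan_id field, and the find_orphans function are hypothetical names, and the exact comparison the scheduled task performs is an assumption. The sketch treats an item as orphaned when the most recent full scan did not see it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexedDoc:
    """Hypothetical stand-in for one item held in Metastore."""
    path: str
    last_scan_id: int  # assumption: ID of the last scan that saw this item


def find_orphans(indexed_docs, latest_full_scan_id):
    """Return items that the latest full scan did not see.

    Assumption: an item still present in the source has its scan ID
    refreshed by every full scan, so an older ID means the item is gone.
    """
    return [d for d in indexed_docs if d.last_scan_id < latest_full_scan_id]


docs = [
    IndexedDoc("share/original.txt", last_scan_id=1),  # deleted from the source
    IndexedDoc("share/new.txt", last_scan_id=2),       # seen by the latest scan
]

orphans = find_orphans(docs, latest_full_scan_id=2)
print([d.path for d in orphans])
```

In this model only share/original.txt is reported, which corresponds to the items the Delete Orphaned Documents task would remove from Metastore/IDOL.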


Cause

The following example shows how a document can become orphaned:

Scan a repository to ingest one file.
Check that the file is visible in the ControlPoint UI.
Delete the file in the source.
Add a new file to the source.
Delete connector_TASK<reponame>_datastore.db for this repository.
Run a scan. Because connector_TASK<reponame>_datastore.db no longer exists, this will be a full scan.
The new file is picked up and ingested into ControlPoint.
The original file is NOT removed from ControlPoint and becomes an orphaned document.
If you try to open the original file from ControlPoint, it fails because the file no longer exists in the source.
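The steps above can be modelled with a small Python simulation. This is only a sketch under stated assumptions: the scan function, the sets standing in for the source, the ControlPoint index, and the datastore file are all hypothetical, and the incremental branch (deletions detected by comparing against the datastore) is an assumed behaviour rather than documented connector logic.

```python
def scan(source, index, datastore):
    """Hypothetical connector scan.

    source:    set of paths currently in the repository
    index:     set of paths already ingested (stands in for Metastore/IDOL)
    datastore: set of paths remembered from earlier scans, or None when
               connector_TASK<reponame>_datastore.db has been deleted
    Returns the datastore contents after the scan.
    """
    if datastore is None:
        # Full scan: nothing is remembered, so only additions can be sent.
        index |= source
    else:
        # Assumed incremental behaviour: compare against what was seen before.
        index |= source - datastore   # newly added files
        index -= datastore - source   # files removed from the source
    return set(source)


source = {"original.txt"}
index = set()

datastore = scan(source, index, None)      # first scan ingests one file
source.discard("original.txt")             # delete the file in the source
source.add("new.txt")                      # add a new file to the source
datastore = None                           # datastore file is deleted
datastore = scan(source, index, datastore) # full scan

orphans = index - source
print(sorted(index))    # ['new.txt', 'original.txt']
print(sorted(orphans))  # ['original.txt']
```

In this model the full scan has no record of original.txt, so it never issues a delete for it. The entry stays in the index even though the file is gone from the source.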

Find resolution here
