Performing a Data Verification Operation on Deduplicated Data

Use the data verification feature on a deduplication engine to verify the deduplicated data managed by the deduplication database (DDB). By default, the deduplicated data verification operation is automatically associated with the System Created DDB Verification schedule policy. This schedule policy runs an incremental deduplicated data verification job every day at 11:00 AM on all the active DDBs in the CommCell that have the Verification of Existing Jobs on Disk and Deduplication Database check box selected.

Note

You can run up to 50 data verification jobs concurrently. Any additional jobs are queued.

Before You Begin

During the data verification process, the DDB is consulted until the verification process is complete. Therefore, before you run a data verification job, make sure that the following jobs are not running against the storage policy that is in effect for the data verification:

  • DDB Move

  • DDB Reconstruction

  • DDB Space Reclamation

If any of the jobs listed above is running, the data verification job does not start and an appropriate error message is generated. Wait for those jobs to complete, and then run the data verification job.
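
The precondition above can be sketched as a simple check. This is a minimal illustration, not a Commvault API: the names `BLOCKING_JOB_TYPES`, `blocking_jobs`, and `can_start_verification` are hypothetical, and the sketch assumes you already have the types of the jobs running against the storage policy as plain strings (for example, copied from the Job Controller window).

```python
# Hypothetical helper, not part of any Commvault API.
# Job type names mirror the list in "Before You Begin".
BLOCKING_JOB_TYPES = frozenset(
    {"DDB Move", "DDB Reconstruction", "DDB Space Reclamation"}
)


def blocking_jobs(running_job_types):
    """Return the sorted job types that prevent data verification from starting."""
    return sorted(set(running_job_types) & BLOCKING_JOB_TYPES)


def can_start_verification(running_job_types):
    """Return True only when none of the blocking job types is running."""
    return not blocking_jobs(running_job_types)
```

For example, `can_start_verification(["Backup", "DDB Move"])` is false, and `blocking_jobs` names the job to wait for.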

Procedure

  1. From the CommCell Browser, expand Storage Resources > Deduplication Engines > storage_policy_copy.

  2. Right-click the appropriate deduplication database, and then click All Tasks > Run Data Verification.

  3. In the Data Verification dialog box, select the appropriate options to verify deduplicated data:

    1. Run a full or an incremental data verification job.

      • For a full data verification job, clear the Run Incremental Verification check box.

      • For an incremental data verification job, select the Run Incremental Verification check box.

        Default: Selected.

        Note

        Incremental DDB data verification runs only if the DDB and the data mover MediaAgents are on v11. The DDB store version can be v9.0, v10.0, or v11.0.

    2. In the Data Verification Options area, choose one of the following options to run deduplicated data verification:

      • Option: Quick verification of existing jobs on disk and deduplication database

        When to use: Use this option for a quick verification of all the deduplicated jobs (unique data blocks and all references to the blocks) on the disk with the DDB and on the CommServe database.

        This option validates if the existing backup jobs are valid for restores and can be copied during Auxiliary Copy operations.

        In comparison with the Verification of existing jobs on disk and deduplication database option, this option is faster because it does not read the data blocks on the disk. Instead, it ensures that both the DDB and disk are in sync.

        Applies to: Full and Incremental Data Verification Job

      • Option: Verification of existing jobs on disk and deduplication database

        When to use: Use this option if you want to verify all existing backups and to ensure that the new backups refer only to valid data blocks.

        This option validates if the existing backup jobs are valid for restores and can be copied during Auxiliary Copy operations.

        Applies to: Full and Incremental Data Verification Job

    3. In the No of Streams to be used in Parallel area, choose one of the following options:

      • To configure a specific number of streams for which backups are verified during the data verification operation, click Number of Streams and type the number.

        If the number of streams specified is less than 50, for example 15, then 15 streams are used during the Verify Data phase and 15 streams are used during the Validate Data phase.

        If the number of streams specified is more than 50, then 50 streams are used during the Verify Data phase and 50 streams are used during the Validate Data phase.

      • To use the maximum number of streams during the data verification operation, click Allow Maximum.

        If no streams are specified and the Allow Maximum check box is selected, then 20 streams are used during the Verify Data phase and 50 streams are used during the Validate Data phase.

  4. To enhance the scalability of the data verification operation and to optimize the processes of scaling the resources, select the Use Scalable Resource Allocation option. For more information, see Scalable Resource Allocation.

    This option is enabled by default.

  5. Click OK.

For more information, see Data Verification Options.
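
The stream rules described in step 3 can be summarized in a short sketch. It only restates the numbers given above and does not call any Commvault interface; the function name `effective_streams` and its parameters are hypothetical. The source does not say what happens when exactly 50 streams are specified; the sketch assumes that 50 streams are used in each phase.

```python
# Illustrative only: restates the stream rules from step 3.
MAX_STREAMS = 50                   # cap applied to a user-specified stream count
ALLOW_MAX_VERIFY_STREAMS = 20      # Verify Data phase when Allow Maximum is selected
ALLOW_MAX_VALIDATE_STREAMS = 50    # Validate Data phase when Allow Maximum is selected


def effective_streams(number_of_streams=None, allow_maximum=False):
    """Return (verify_streams, validate_streams) for a data verification job."""
    if number_of_streams is not None:
        if number_of_streams < 1:
            raise ValueError("number_of_streams must be at least 1")
        # A specified count applies to both phases, capped at 50.
        used = min(number_of_streams, MAX_STREAMS)
        return used, used
    if allow_maximum:
        return ALLOW_MAX_VERIFY_STREAMS, ALLOW_MAX_VALIDATE_STREAMS
    raise ValueError("specify a number of streams or select Allow Maximum")
```

With these rules, 15 specified streams give 15 streams in both phases, 80 specified streams are capped at 50 in both phases, and Allow Maximum with no number gives 20 streams for Verify Data and 50 for Validate Data.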

Result

A deduplicated data verification job is displayed in the Job Controller window. You can view the deduplicated data verification job history at the CommCell level or the DDB level, and the data verification status for the backup jobs at the storage policy level.

Note

  • When the DDB data verification job is running, you can run backups and auxiliary copy operations if the DDB and the Data Mover MediaAgents are on v11. The DDB store version can be v9.0, v10.0, or v11.0.

  • If a deduplicated data verification job goes into the pending state, the job attempts to run up to five times, once every 20 minutes. If the job is still pending after five attempts, the job status is marked as failed.

  • If a deduplicated data verification job is killed during the Verify Data phase, the data verification status is not updated for any backup job that was not yet verified in that phase. In this scenario, rerun the deduplicated data verification job.

  • There may be read and egress charges involved for cloud storage based on the cloud vendor and the location of the MediaAgent used for the data verification job.
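
To estimate how long a pending job can wait before it is marked as failed, the retry rule in the note above can be sketched as follows. The function name `retry_times` is hypothetical, and the sketch assumes that the first attempt happens one 20-minute interval after the job goes into the pending state, which the note does not state explicitly.

```python
from datetime import datetime, timedelta

MAX_ATTEMPTS = 5                        # attempts before the job is marked as failed
RETRY_INTERVAL = timedelta(minutes=20)  # spacing between attempts


def retry_times(pending_since):
    """Return the assumed times of each retry attempt for a pending job."""
    return [pending_since + RETRY_INTERVAL * n for n in range(1, MAX_ATTEMPTS + 1)]


if __name__ == "__main__":
    # Example: a job that goes pending at the default 11:00 AM schedule time.
    for attempt in retry_times(datetime(2024, 1, 1, 11, 0)):
        print(attempt.strftime("%H:%M"))
```

Under that assumption, the five attempts span 100 minutes from the moment the job goes pending.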
