Datasets

Register and track datasets used by your AI systems, with PII detection and bulk upload.

Overview

The Datasets page is where you register and track every dataset your organization uses for AI systems. Each dataset record captures what the data contains, where it came from, whether it includes personal information, and what biases have been identified.

Datasets can be linked to models and projects, so you have a clear picture of which data feeds into which system.

Viewing datasets

The main page shows all datasets in a table with columns for name, description, status, type, classification, owner, and source. Above the table, summary cards show how many datasets are in each status (Draft, Active, Deprecated, Archived).

You can search by name or description, filter by status, type, or classification, and group datasets by any of those fields to organize them into collapsible sections.
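The summary cards, search, and grouping described above can be pictured with a small sketch. This is only an illustration with hypothetical in-memory records and helper names (`status_summary`, `search`, `group_by`); it does not reflect how the product implements the page.

```python
from collections import Counter, defaultdict

# Hypothetical sample records, used only for illustration.
datasets = [
    {"name": "Loan applications 2023", "description": "Historical loan data",
     "status": "Active", "type": "Training", "classification": "Confidential"},
    {"name": "Public census extract", "description": "Regional census figures",
     "status": "Active", "type": "Reference", "classification": "Public"},
    {"name": "Chat logs sample", "description": "Support chat transcripts",
     "status": "Draft", "type": "Testing", "classification": "Internal"},
]

# The four statuses shown on the summary cards.
STATUSES = ["Draft", "Active", "Deprecated", "Archived"]


def status_summary(records):
    """Count datasets per status, keeping statuses that have no datasets."""
    counts = Counter(r["status"] for r in records)
    return {status: counts.get(status, 0) for status in STATUSES}


def search(records, query):
    """Case-insensitive match against name or description."""
    q = query.lower()
    return [r for r in records
            if q in r["name"].lower() or q in r["description"].lower()]


def group_by(records, field):
    """Group dataset names by status, type, or classification."""
    groups = defaultdict(list)
    for r in records:
        groups[r[field]].append(r["name"])
    return dict(groups)


print(status_summary(datasets))
print(group_by(datasets, "type"))
```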

Adding a dataset

  1. Click Add dataset in the top right.
  2. Fill in the basic fields: name, description, version, and owner.
  3. Set the type (Training, Validation, Testing, Production, or Reference) and classification (Public, Internal, Confidential, or Restricted).
  4. If the dataset contains personal data, check Contains PII and list the PII types present.
  5. Document any known biases and mitigation steps.
  6. Link the dataset to relevant models and projects.
  7. Click Save.

Dataset fields

| Field | What it captures |
| --- | --- |
| Name | A short, recognizable name for the dataset |
| Description | What the dataset contains and what it is used for |
| Version | Version identifier (e.g., 1.0.0) |
| Owner | Person or team responsible for the dataset |
| Type | Training, Validation, Testing, Production, or Reference |
| Classification | Public, Internal, Confidential, or Restricted |
| Source | Where the data came from |
| Format | File format (CSV, JSON, Parquet, etc.) |
| License | Data license or usage terms |
| Contains PII | Whether the dataset includes personally identifiable information |
| PII types | Specific PII categories present (email, SSN, phone, etc.) |
| Known biases | Documented bias issues in the data |
| Bias mitigation | Steps taken to address identified biases |
| Collection method | How the data was gathered |
| Preprocessing | Cleaning or transformation steps applied |
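If you mirror dataset records in your own tooling, the fields above map naturally onto a typed record. The following is a minimal sketch, not the product's schema: the class and attribute names are hypothetical, which fields are optional is an assumption, and so is the consistency check between `contains_pii` and `pii_types`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DatasetType(Enum):
    TRAINING = "Training"
    VALIDATION = "Validation"
    TESTING = "Testing"
    PRODUCTION = "Production"
    REFERENCE = "Reference"


class Classification(Enum):
    PUBLIC = "Public"
    INTERNAL = "Internal"
    CONFIDENTIAL = "Confidential"
    RESTRICTED = "Restricted"


@dataclass
class DatasetRecord:
    # Basic fields from the form.
    name: str
    description: str
    version: str
    owner: str
    dataset_type: DatasetType
    classification: Classification
    # Treated as optional here; that choice is an assumption.
    source: str = ""
    file_format: str = ""
    license: str = ""
    contains_pii: bool = False
    pii_types: List[str] = field(default_factory=list)
    known_biases: str = ""
    bias_mitigation: str = ""
    collection_method: str = ""
    preprocessing: str = ""

    def __post_init__(self):
        # Assumed rule: listing PII types implies the PII flag must be set.
        if self.pii_types and not self.contains_pii:
            raise ValueError("pii_types listed but contains_pii is False")


record = DatasetRecord(
    name="Loan applications 2023",
    description="Historical loan data used to train a scoring model",
    version="1.0.0",
    owner="Data team",
    dataset_type=DatasetType.TRAINING,
    classification=Classification.CONFIDENTIAL,
    contains_pii=True,
    pii_types=["email", "phone"],
)
print(record.dataset_type.value, record.classification.value)
```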

Bulk upload

If you have multiple dataset files to register at once, use the bulk upload feature instead of adding them one by one.

  1. Click Bulk upload in the top right.
  2. Drag and drop files (CSV, XLS, or XLSX, up to 30 MB each) or click to browse.
  3. The system scans column headers for potential PII (email, phone, SSN, address, etc.) and flags any matches.
  4. Review the auto-detected metadata for each file. Edit names, types, or classifications before uploading.
  5. Click Upload to register all files as dataset records.
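
Before dragging files into the dialog, you may want to check locally that they meet the limits from step 2. The sketch below is a hypothetical pre-check, not part of the product; it assumes "30 MB" means 30 × 1024 × 1024 bytes, and the function name `check_upload` is made up for this example.

```python
from pathlib import Path

# File types accepted by bulk upload, per step 2.
ALLOWED_EXTENSIONS = {".csv", ".xls", ".xlsx"}
# Assumption: the 30 MB limit is interpreted as 30 MiB.
MAX_SIZE_BYTES = 30 * 1024 * 1024


def check_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported file type: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file is larger than {MAX_SIZE_BYTES} bytes")
    return problems


print(check_upload("customers.CSV", 1024))
print(check_upload("notes.pdf", 31 * 1024 * 1024))
```

To check real files, pass `path.name` and `path.stat().st_size` for each `Path` you plan to upload.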

PII auto-detection

During bulk upload, the system checks column headers against 49 known PII keywords (email, ssn, phone, salary, credit_card, etc.). If any header matches a keyword, the dataset is automatically flagged as containing PII. You can override the flag in the review step.
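
To give a feel for header-based detection, here is a rough sketch. It is not the product's detector: the keyword set is only the handful of examples named in this guide rather than the full list of 49, and both the header normalization and the substring matching rule are assumptions.

```python
import csv
import io
import re

# Partial, illustrative keyword set taken from the examples in this guide.
# The full list of 49 keywords is not reproduced here.
PII_KEYWORDS = {"email", "ssn", "phone", "salary", "credit_card", "address"}


def normalize(header):
    """Lowercase a header and collapse spaces and punctuation to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", header.strip().lower()).strip("_")


def detect_pii_headers(headers):
    """Return the headers whose normalized form contains a PII keyword.

    Substring matching is an assumption; it also flags headers such as
    "ip_address", which is why a manual review step matters.
    """
    flagged = []
    for header in headers:
        norm = normalize(header)
        if any(keyword in norm for keyword in PII_KEYWORDS):
            flagged.append(header)
    return flagged


def read_headers(csv_text):
    """Read only the first row of a CSV document."""
    return next(csv.reader(io.StringIO(csv_text)), [])


sample = "customer_id,Email Address,Phone,order_total\n1,a@example.com,555-0100,10\n"
headers = read_headers(sample)
flagged = detect_pii_headers(headers)
print(flagged)
print("contains PII:", bool(flagged))
```

Because only header names are inspected in this sketch, a column named `col_7` that holds email addresses would go unflagged. Reviewing the flag by hand remains worthwhile.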

Linking datasets to models and projects

Each dataset can be linked to one or more models from the model inventory and to one or more projects. These links create a traceable chain from data to model to use case, which is the kind of evidence auditors look for when verifying data governance.

Set these links when creating or editing a dataset. You can also view all datasets linked to a specific model from the model inventory page.
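
The many-to-many links can be thought of as two lookup tables keyed by dataset. The sketch below is a conceptual model only; `LinkRegistry` and its methods are hypothetical names, and the dataset, model, and project names are made-up examples.

```python
from collections import defaultdict


class LinkRegistry:
    """Hypothetical in-memory model of dataset-to-model and dataset-to-project links."""

    def __init__(self):
        self._models_by_dataset = defaultdict(set)
        self._projects_by_dataset = defaultdict(set)

    def link_model(self, dataset, model):
        self._models_by_dataset[dataset].add(model)

    def link_project(self, dataset, project):
        self._projects_by_dataset[dataset].add(project)

    def datasets_for_model(self, model):
        """Reverse lookup: every dataset linked to the given model."""
        return sorted(
            dataset
            for dataset, models in self._models_by_dataset.items()
            if model in models
        )

    def trace(self, dataset):
        """Return the chain from one dataset to its models and projects."""
        return {
            "dataset": dataset,
            "models": sorted(self._models_by_dataset.get(dataset, set())),
            "projects": sorted(self._projects_by_dataset.get(dataset, set())),
        }


links = LinkRegistry()
links.link_model("Loan applications 2023", "Credit scoring v2")
links.link_model("Public census extract", "Credit scoring v2")
links.link_project("Loan applications 2023", "Retail lending")

print(links.datasets_for_model("Credit scoring v2"))
print(links.trace("Loan applications 2023"))
```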

Dataset statuses

| Status | When to use |
| --- | --- |
| Draft | Dataset is being documented but not yet in use |
| Active | Dataset is currently used by one or more models or systems |
| Deprecated | Dataset is being phased out and should not be used for new work |
| Archived | Dataset is no longer in use but kept for audit records |

Change history

Every edit to a dataset is tracked in a change history log. This gives auditors a record of what changed, when, and by whom.
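
A change history of this kind boils down to recording, for each edited field, the old value, the new value, the editor, and a timestamp. The following sketch illustrates that idea under stated assumptions; `ChangeEntry` and `record_changes` are hypothetical names, and the actual log format in the product may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeEntry:
    field_name: str       # what changed
    old_value: object
    new_value: object
    changed_by: str       # by whom
    changed_at: datetime  # when


def record_changes(before, after, changed_by, now=None):
    """Compare two versions of a record and return one entry per changed field."""
    timestamp = now or datetime.now(timezone.utc)
    entries = []
    for key in sorted(set(before) | set(after)):
        if before.get(key) != after.get(key):
            entries.append(
                ChangeEntry(key, before.get(key), after.get(key), changed_by, timestamp)
            )
    return entries


before = {"status": "Draft", "owner": "Data team", "version": "1.0.0"}
after = {"status": "Active", "owner": "Data team", "version": "1.1.0"}

for entry in record_changes(before, after, changed_by="alice"):
    print(entry.field_name, entry.old_value, "->", entry.new_value, "by", entry.changed_by)
```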

Who can do what

| Action | Required role |
| --- | --- |
| View datasets | Any authenticated user |
| Add or edit datasets | Admin or Editor |
| Bulk upload | Admin or Editor |
| Delete datasets | Admin or Editor |
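
The permission table can be expressed as a simple lookup, which may help if you script checks around it. This is a sketch only: the action keys and the `can` function are hypothetical, and "Viewer" stands in for any authenticated role other than Admin or Editor.

```python
# Roles allowed to modify datasets, per the table above.
WRITE_ROLES = {"Admin", "Editor"}

# None means "any authenticated user". The action keys are made up for this sketch.
PERMISSIONS = {
    "view": None,
    "add_or_edit": WRITE_ROLES,
    "bulk_upload": WRITE_ROLES,
    "delete": WRITE_ROLES,
}


def can(action, role, authenticated=True):
    """Return True if a user with the given role may perform the action."""
    if action not in PERMISSIONS:
        raise KeyError(f"unknown action: {action}")
    if not authenticated:
        return False
    allowed_roles = PERMISSIONS[action]
    return allowed_roles is None or role in allowed_roles


print(can("view", "Viewer"))
print(can("delete", "Viewer"))
print(can("bulk_upload", "Editor"))
```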
Datasets - AI governance - VerifyWise User Guide