Creating a New Dataset

The Dataset creation workflow lets you isolate specific segments of spend (by time, source, or business unit) and prepare them for AI classification.

🛠️ Configuration Workflow

  1. Initiate Creation: Navigate to the Datasets module in the sidebar and click the + Create Dataset button.

     Figure: Accessing Dataset Creation

  2. Define Basic Information: In the Basic Info tab, provide a recognizable identity for your data segment:

    • Dataset Name: Enter a unique identifier (e.g., FY2025_Q1_Indirect).
    • Description: Add context regarding the scope of data included.
  3. Select & Filter Data Sources: Click + Add Source to link your raw transaction data. In the popup dialog:

    • Source Name: Select your configured data source (e.g., ERP_Export_Production).
    • Date Range: Set the start and end dates to filter specific transactions (e.g., Jan 2024 – Aug 2025).

     Figure: Source Selection & Filtering
  4. Review & Save: Navigate to the Review Data tab to inspect a preview of the filtered transaction lines. Ensure the column headers and row counts align with your expectations, then click Save Dataset.

     Figure: Dataset Review and Finalization
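To make the effect of the Date Range filter in step 3 more concrete, here is a minimal Python sketch of that kind of filtering. It is an illustration only, not the product's implementation: the `Transaction` record, its field names (`line_id`, `posting_date`, `amount`), and the assumption that both ends of the range are inclusive are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Transaction:
    # Hypothetical shape of one raw transaction line.
    line_id: str
    posting_date: date
    amount: float


def filter_by_date_range(rows, start, end):
    """Keep rows whose posting_date falls within [start, end].

    Assumption: both bounds are inclusive.
    """
    if start > end:
        raise ValueError("start date must not be after end date")
    return [r for r in rows if start <= r.posting_date <= end]


# Made-up sample lines around the edges of a Jan 2024 - Aug 2025 range.
rows = [
    Transaction("T-001", date(2023, 12, 31), 120.0),
    Transaction("T-002", date(2024, 1, 1), 75.5),
    Transaction("T-003", date(2025, 8, 31), 310.0),
    Transaction("T-004", date(2025, 9, 1), 42.0),
]

kept = filter_by_date_range(rows, date(2024, 1, 1), date(2025, 8, 31))
excluded = len(rows) - len(kept)
print(f"kept {len(kept)} of {len(rows)} lines, excluded {excluded}")
```

Comparing the kept and excluded counts against what you expect from the source system is the same check you perform by eye in the Review Data tab.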

💡 Best Practices

  • Naming Conventions: Use a standard format like [Region]_[Department]_[Year] to make searching easier for your team.
  • Granularity: If you plan to use different AI Agents for different regions, create a separate dataset for each region rather than one large global dataset.
  • Validation: Always check the Review Data tab to confirm that your date filters haven't excluded expected records.
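
If your team adopts the [Region]_[Department]_[Year] format, a small script can check proposed names before anyone creates the dataset. The sketch below is one possible interpretation of that format, not a rule enforced by the product: it assumes the region is a 2 to 5 letter uppercase code, the department contains only letters, and the year has four digits. The function name `parse_dataset_name` is hypothetical.

```python
import re

# Assumed reading of [Region]_[Department]_[Year]; adjust to your own standard.
NAME_PATTERN = re.compile(
    r"(?P<region>[A-Z]{2,5})_(?P<department>[A-Za-z]+)_(?P<year>\d{4})"
)


def parse_dataset_name(name):
    """Return the name's parts as a dict, or None if it breaks the convention."""
    match = NAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None


for candidate in ["EMEA_Procurement_2025", "global spend 2025"]:
    parts = parse_dataset_name(candidate)
    status = "ok" if parts else "does not follow the convention"
    print(f"{candidate}: {status}")
```

Returning the parsed parts rather than a plain boolean makes it easy to group or sort datasets by region or year later.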

Next Step: Once saved, your data is ready for the next phase. Proceed to Vendor Normalization to clean your supplier list.
