Data sources

The Datasets Behind NAC AIHub Verification

NAC AIHub combines program review inputs with state, federal, and industry datasets. Together they help reviewers validate providers, evaluate programs, compare policies, and assess workforce alignment with clearer evidence.

Overview

What these datasets are used for

NAC AIHub does not rely on a single dataset. It uses a connected evidence environment made of institutional reference records, provider licensure data, workforce approval lists, labor-market intelligence, education datasets, policy sources, skills mappings, and quality analysis.

Some sources verify whether a provider or program is valid and properly described. Others determine whether a program aligns with in-demand occupations, state priorities, or employer-relevant job skills. Still others provide operational or outcome context that supports reviewer judgment without replacing it.

The goal is to move reviewers from fragmented manual checking to a structured process where evidence sits side by side in one place.

Use cases

Main review workflows

  1. Institution and provider verification

    Compare provider submissions against licensure, accreditation, institutional reference, and ETPL records to confirm the organization is represented accurately and is in good standing.

  2. Program validation and classification review

    Check program records against CIP and SOC mappings, workforce-board approvals, and occupation lists to evaluate whether a training program is described consistently and aligned to the correct workforce category.

  3. Policy and compliance review

    Compare submitted policy materials against state cancellation and refund requirements and other operational checkpoints so reviewers can flag missing language or inconsistencies before final action.

  4. Labor-market and outcome context

    Frame program relevance through wages, occupations, skills, and broader labor-market indicators, providing additional context when evaluating training value and workforce alignment.
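
As an illustration of the second workflow, the check below is a minimal sketch of how a program's CIP and SOC codes might be compared against a crosswalk. The `ProgramRecord` shape, the `check_cip_soc` function, the finding labels, and the sample rows are all hypothetical; a real check would load the published CIP-SOC crosswalk rather than hard-code pairs.

```python
from dataclasses import dataclass

# Illustrative (CIP, SOC) pairs only. Treat these as sample rows, not as an
# authoritative extract of the published crosswalk.
CROSSWALK = {
    ("51.3801", "29-1141"),
    ("11.0901", "15-1244"),
}


@dataclass(frozen=True)
class ProgramRecord:
    """Hypothetical shape of a submitted program record."""
    name: str
    cip: str
    soc: str


def check_cip_soc(program, crosswalk):
    """Return a short finding label a reviewer could act on."""
    known_cips = {cip for cip, _ in crosswalk}
    if program.cip not in known_cips:
        return "cip_not_found"      # CIP code absent from the crosswalk
    if (program.cip, program.soc) in crosswalk:
        return "consistent"         # submitted pair appears in the crosswalk
    return "soc_mismatch"           # CIP is known, but not paired with this SOC


if __name__ == "__main__":
    record = ProgramRecord("Sample nursing program", "51.3801", "15-1244")
    print(check_cip_soc(record, CROSSWALK))
```

A finding such as `soc_mismatch` is a prompt for reviewer follow-up, not a decision in itself.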

Live integration

College Scorecard starter layer

This is the first integration slice: one upstream source, one normalized server wrapper, one internal API route, and one search surface. The same shape will host ACS, O*NET, BLS, and IPEDS as they come online.

The integration returns institution name, location, size, cost, completion, and median earnings.
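
The normalizing wrapper described above could look roughly like the sketch below. It assumes the upstream record arrives as a flat dictionary keyed by dotted field names. The specific field names in `FIELD_MAP`, and the `normalize` function itself, are assumptions for illustration and should be checked against the College Scorecard data dictionary before use.

```python
from typing import Any

# Assumed mapping from internal names to upstream dotted field names.
# These upstream names are illustrative and must be verified against the
# College Scorecard data dictionary.
FIELD_MAP = {
    "name": "school.name",
    "city": "school.city",
    "state": "school.state",
    "size": "latest.student.size",
    "cost": "latest.cost.tuition.in_state",
    "completion_rate": "latest.completion.rate_suppressed.overall",
    "median_earnings": "latest.earnings.10_yrs_after_entry.median",
}


def normalize(raw: dict) -> dict:
    """Map one upstream record to the internal shape.

    Fields missing upstream become None, so callers can tell "not reported"
    apart from a real value instead of hitting a KeyError.
    """
    result: dict = {}
    for internal_name, upstream_name in FIELD_MAP.items():
        value: Any = raw.get(upstream_name)
        result[internal_name] = value
    return result


if __name__ == "__main__":
    # Placeholder record; not real institution data.
    sample = {"school.name": "Example College", "school.state": "XX"}
    print(normalize(sample))
```

Keeping the mapping in one table is one way to let later sources reuse the same wrapper shape with a different `FIELD_MAP`.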

Dataset catalog

Source groups used across verification and review

Client-provided and state reference datasets

These sources support core ETPL verification workflows by grounding provider, program, policy, and approval checks in state-specific records already used by reviewers and administrators.

State Colleges and Universities
Public-facing state licensure and institutional reference list used to confirm institutional standing and basic eligibility context.
Licensed Proprietary Schools
State-published licensure directory used to validate proprietary school status and support provider verification.
Local Workforce Board approved program list
Reference list used to confirm whether a program has local workforce board approval where relevant to ETPL review.
Priority Employment Opportunities (PEO) List
State-published priority occupations list used to evaluate whether training aligns with state-recognized workforce demand.
State cancellation and refund regulations
Policy source used to compare submitted provider policies against state regulatory requirements.
CIP-SOC crosswalk
Classification mapping used to validate instructional program codes, occupational codes, and consistency between training and workforce outcomes.
Program QA evaluation layer
Smart quality review layer intended to evaluate program data, identify inconsistencies, and help structure reviewer-facing findings.
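
To make the policy comparison concrete, the sketch below flags required elements that do not appear in a submitted refund policy. The element names and patterns in `REQUIRED_ELEMENTS` are hypothetical placeholders, not actual state requirements, and `missing_policy_elements` is an illustrative helper rather than the platform's method.

```python
import re

# Hypothetical required elements and the patterns used to detect them.
# Real requirements would come from the state cancellation and refund
# regulations, not from this list.
REQUIRED_ELEMENTS = {
    "cancellation_window": r"cancel\w*[^.]*\b\d+\s+(?:business\s+|calendar\s+)?days",
    "refund_timeline": r"refund\w*[^.]*\bwithin\b",
    "written_notice": r"written\s+notice",
}


def missing_policy_elements(policy_text, required=REQUIRED_ELEMENTS):
    """Return the names of required elements not found in the policy text."""
    text = policy_text.lower()
    return [
        name
        for name, pattern in required.items()
        if re.search(pattern, text) is None
    ]


if __name__ == "__main__":
    policy = (
        "Students may cancel within 3 business days of signing. "
        "Refunds are issued within 30 days."
    )
    print(missing_policy_elements(policy))
```

Pattern matching of this kind can only flag language that appears to be missing; a reviewer still reads the policy to confirm.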

State agency datasets

These sources broaden the platform beyond surface-level provider checks by connecting review activity to statewide education and workforce evidence where available.

State Longitudinal Data
State-level longitudinal information intended to support deeper program and outcome analysis across time.

Federal workforce and education datasets

Federal sources help the platform compare provider and program claims against national education, labor market, wage, and occupational reference systems.

BLS Occupational Employment and Wage Statistics
Used to support wage, employment, and occupational demand analysis tied to program outcomes and workforce relevance.
O*NET
Used for occupational descriptions, skill context, job task alignment, and related occupation analysis.
IPEDS
Used to validate postsecondary institution reference data and support institutional context checks.
College Scorecard
Used as an additional education outcome and institution reference source for comparative review.
American Community Survey (ACS)
Used for broader population and regional context that can inform market, access, and workforce analysis.

Industry and market intelligence datasets

These sources add employer, market, and credential transparency context so the platform can assess whether a program is relevant, current, and aligned to real workforce demand.

GrayDI PES Economics and Outcomes
Aggregated and cleaned data products intended to support economic and program outcome review.
GrayDI PES Markets
Market-oriented reference layer for workforce and training demand context.
GrayDI PES Academic Management
Academic management-oriented data used to support broader program evaluation workflows.
LinkedIn aggregate profile analytics
Used only in aggregate analytics to understand skills and labor-market patterns, not for individual-level review decisions.
Credential Engine / CTDL
Used to support credential transparency, credential description structure, and comparability across training offerings.
U.S. Chamber of Commerce JEDx
Used to support jobs and employment data exchange functions relevant to skills and workforce alignment.
National Labor Exchange (NLx)
Used as an additional labor-market and job-demand reference source.

Trust, security, and institutional datasets

These sources help the platform include trust, compliance, and enterprise-readiness context where relevant to provider or operational review.

Coleridge Initiative / FedRAMP
Reference source for federally recognized trust and compliance context where applicable.
Coleridge Initiative / StateRAMP
Reference source for state-oriented trust and compliance context where applicable.
Institution curricular mapping to in-demand skills
Institution-level curricular mapping integrated with AristAI to compare program content against employer-demanded job skills.

Review principles

How the data should be understood

  1. The platform supports review by organizing, summarizing, and comparing information across datasets. Final decisions remain with staff.

  2. Not every source serves the same purpose. Some are used for verification, some for market context, some for classification mapping, and some for aggregate analytics.

  3. The platform brings fragmented sources into one workspace so reviewers do not have to search across disconnected systems.

  4. Aggregate and external datasets should strengthen reviewer understanding, not replace evidence validation or policy judgment.

Continue

Value to reviewers and administrators

Bringing these datasets together reduces repetitive manual searching and gives staff a stronger starting point for program review. The platform summarizes, compares, and organizes evidence across sources so teams can see where records align, where they conflict, and where more verification is needed.
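
The align, conflict, and needs-verification outcomes described above can be sketched as a simple comparison of one submitted value against several reference sources. The `compare_field` function, the status labels, and the source names are hypothetical, and the matching rule (case and whitespace insensitive equality) is an assumption chosen for brevity.

```python
def _norm(value):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(str(value).lower().split())


def compare_field(submitted, sources):
    """Compare a submitted value against reference sources.

    `sources` maps a source name to the value found there, or None when the
    source has no matching record.
    """
    found = {name: value for name, value in sources.items() if value is not None}
    if not found:
        # No source could confirm or contradict the submission.
        return {"status": "needs_verification", "conflicts": []}
    conflicts = sorted(
        name for name, value in found.items() if _norm(value) != _norm(submitted)
    )
    status = "conflict" if conflicts else "aligned"
    return {"status": status, "conflicts": conflicts}


if __name__ == "__main__":
    result = compare_field(
        "Example Technical College",
        {"licensure_list": "Example Tech College", "ipeds": None},
    )
    print(result)
```

The output only points to where records disagree; resolving the disagreement remains a staff decision.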

The result is faster, more traceable decision-making, with reviewer judgment and the compliance trail intact.