Data sources

The Datasets Behind NAC AIHub Verification

NAC AIHub combines program review inputs with state, federal, and industry datasets. Together they help reviewers validate providers, evaluate programs, compare policies, and assess workforce alignment with clearer evidence context.


Overview

What these datasets are used for

NAC AIHub does not rely on a single dataset. It uses a connected evidence environment made of institutional reference records, provider licensure data, workforce approval lists, labor-market intelligence, education datasets, policy sources, skills mappings, and quality analysis.

Some sources verify whether a provider or program is valid and properly described. Others help determine whether a program aligns with in-demand occupations, state priorities, or employer-relevant job skills. Still others provide operational or outcome context that supports reviewer judgment without replacing it.

The goal is to move reviewers from fragmented manual checking to a structured process where evidence sits side by side in one place.

Use cases

Main review workflows

Institution and provider verification: Compare provider submissions against licensure, accreditation, institutional reference, and ETPL records to confirm the organization is represented accurately and is operating in acceptable standing.
Program validation and classification review: Check program records against CIP and SOC mappings, workforce-board approvals, and occupation lists to evaluate whether a training program is described consistently and aligned to the correct workforce category.
Policy and compliance review: Compare submitted policy materials against state cancellation and refund requirements and other operational checkpoints so reviewers can flag missing language or inconsistencies before final action.
Labor-market and outcome context: Frame program relevance through wages, occupations, skills, and broader labor-market indicators, providing additional context when evaluating training value and workforce alignment.
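As an illustration of the policy and compliance workflow, a missing-language check could be sketched roughly as follows. The topic names and trigger phrases are hypothetical placeholders, not actual regulatory text; a real check would be derived from the state cancellation and refund regulations, and a keyword match only suggests where a reviewer should look.

```python
# Hypothetical required topics and trigger phrases (assumptions for
# illustration only; real requirements come from the state regulations).
REQUIRED_TOPICS = {
    "cancellation": ["cancel"],
    "refund": ["refund"],
    "refund_timeline": ["days"],
}


def check_policy_language(policy_text: str) -> dict[str, bool]:
    """Report, per required topic, whether any trigger phrase appears."""
    text = policy_text.lower()
    return {
        topic: any(phrase in text for phrase in phrases)
        for topic, phrases in REQUIRED_TOPICS.items()
    }


def missing_topics(policy_text: str) -> list[str]:
    """List the required topics with no matching phrase, for reviewer follow-up."""
    return [
        topic
        for topic, present in check_policy_language(policy_text).items()
        if not present
    ]
```

In this sketch the output is a list of topics to flag for the reviewer, not a pass or fail decision.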

Live integration

College Scorecard starter layer

College Scorecard is the first integration slice: one upstream source, one normalized server wrapper, one internal API route, and one search surface. The same shape will host ACS, O*NET, BLS, and IPEDS as they come online.

The integration returns institution name, location, size, cost, completion, and median earnings.
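A minimal sketch of what the normalization step in such a wrapper might look like is shown below. The internal field names are hypothetical, and the upstream keys are assumptions based on the dotted-key convention of the public College Scorecard API; they should be verified against the current data dictionary. No network call is made here.

```python
from typing import Any, Optional

# Internal name -> assumed upstream key. Both sides are assumptions for
# illustration; check upstream keys against the College Scorecard data dictionary.
FIELD_MAP = {
    "name": "school.name",
    "city": "school.city",
    "state": "school.state",
    "size": "latest.student.size",
    "cost": "latest.cost.avg_net_price.overall",
    "completion_rate": "latest.completion.rate_suppressed.overall",
    "median_earnings": "latest.earnings.10_yrs_after_entry.median",
}


def normalize_institution(raw: dict[str, Any]) -> dict[str, Optional[Any]]:
    """Map one upstream record to the internal shape; absent fields become None."""
    return {internal: raw.get(upstream) for internal, upstream in FIELD_MAP.items()}


# Usage with a made-up record (not real data):
sample = {"school.name": "Example College", "school.state": "XX", "latest.student.size": 1200}
record = normalize_institution(sample)
```

Keeping the mapping in one table is one way the same shape could be reused for later sources, each with its own field map.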

Dataset catalog

Source groups used across verification and review

Client-provided and state reference datasets

These sources support core ETPL verification workflows by grounding provider, program, policy, and approval checks in state-specific records already used by reviewers and administrators.

State Colleges and Universities: Public-facing state licensure and institutional reference list used to confirm institutional standing and basic eligibility context.
Licensed Proprietary Schools: State-published licensure directory used to validate proprietary school status and support provider verification.
Local Workforce Board approved program list: Reference list used to confirm whether a program has local workforce board approval where relevant to ETPL review.
Priority Employment Opportunities (PEO) List: State-published priority occupations list used to evaluate whether training aligns with state-recognized workforce demand.
State cancellation and refund regulations: Policy source used to compare submitted provider policies against state regulatory requirements.
CIP-SOC crosswalk: Classification mapping used to validate instructional program codes, occupational codes, and consistency between training and workforce outcomes.
Program QA evaluation layer: Smart quality review layer intended to evaluate program data, identify inconsistencies, and help structure reviewer-facing findings.
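To make the CIP-SOC crosswalk check concrete, a consistency test for a submitted code pair might look like the sketch below. The single crosswalk row is illustrative only, and the function name and return labels are hypothetical; a real check would load the published crosswalk file.

```python
# Illustrative crosswalk row only (assumption); a real check would load the
# published CIP-SOC crosswalk rather than a hand-written dictionary.
CIP_TO_SOC = {
    "51.3801": {"29-1141"},  # Registered Nursing -> Registered Nurses
}


def check_cip_soc(cip_code: str, soc_code: str) -> str:
    """Return 'match', 'mismatch', or 'unknown_cip' for a submitted code pair."""
    allowed = CIP_TO_SOC.get(cip_code.strip())
    if allowed is None:
        return "unknown_cip"
    return "match" if soc_code.strip() in allowed else "mismatch"
```

A "mismatch" or "unknown_cip" result would be surfaced as a finding for the reviewer, who decides whether the program record needs correction.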

State agency datasets

These sources broaden the platform beyond surface-level provider checks by connecting review activity to statewide education and workforce evidence where available.

State Longitudinal Data: State-level longitudinal information intended to support deeper program and outcome analysis across time.

Federal workforce and education datasets

Federal sources help the platform compare provider and program claims against national education, labor market, wage, and occupational reference systems.

BLS Occupational Employment and Wage Statistics: Used to support wage, employment, and occupational demand analysis tied to program outcomes and workforce relevance.
O*NET: Used for occupational descriptions, skill context, job task alignment, and related occupation analysis.
IPEDS: Used to validate postsecondary institution reference data and support institutional context checks.
College Scorecard: Used as an additional education outcome and institution reference source for comparative review.
American Community Survey (ACS): Used for broader population and regional context that can inform market, access, and workforce analysis.

Industry and market intelligence datasets

These sources add employer, market, and credential transparency context so the platform can assess whether a program is relevant, current, and aligned to real workforce demand.

GrayDI PES Economics and Outcomes: Aggregated and cleaned data products intended to support economic and program outcome review.
GrayDI PES Markets: Market-oriented reference layer for workforce and training demand context.
GrayDI PES Academic Management: Academic management-oriented data used to support broader program evaluation workflows.
LinkedIn aggregate profile analytics: Used only in aggregate analytics to understand skills and labor-market patterns, not for individual-level review decisions.
Credential Engine / CTDL: Used to support credential transparency, credential description structure, and comparability across training offerings.
U.S. Chamber of Commerce JEDx: Used to support jobs and employment data exchange functions relevant to skills and workforce alignment.
National Labor Exchange (NLx): Used as an additional labor-market and job-demand reference source.

Trust, security, and institutional datasets

These sources help the platform include trust, compliance, and enterprise-readiness context where relevant to provider or operational review.

Coleridge Initiative / FedRAMP: Reference source for federally recognized trust and compliance context where applicable.
Coleridge Initiative / StateRAMP: Reference source for state-oriented trust and compliance context where applicable.
Institution curricular mapping to in-demand skills: Institution-level curricular mapping integrated with AristAI to compare program content against employer-demanded job skills.

Review principles

How the data should be understood

The platform supports review by organizing, summarizing, and comparing information across datasets. Final decisions remain with staff.
Not every source serves the same purpose. Some are used for verification, some for market context, some for classification mapping, and some for aggregate analytics.
The platform brings fragmented sources into one workspace so reviewers do not have to search across disconnected systems.
Aggregate and external datasets should strengthen reviewer understanding, not replace evidence validation or policy judgment.

Continue

Value to reviewers and administrators

Bringing these datasets together reduces repetitive manual searching and gives staff a stronger starting point for program review. The platform summarizes, compares, and organizes evidence across sources so teams can see where records align, where they conflict, and where more verification is needed.
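The align, conflict, and needs-verification distinction described above can be sketched as a simple field-by-field comparison. The function names, status labels, and sample records are hypothetical, and exact string matching is a simplifying assumption; real records would likely need fuzzier matching.

```python
from typing import Optional


def compare_field(submitted: Optional[str], reference: Optional[str]) -> str:
    """Classify one field as 'aligned', 'conflict', or 'needs_verification'."""
    a = (submitted or "").strip().casefold()
    b = (reference or "").strip().casefold()
    if not a or not b:
        # One side is missing, so there is nothing to compare against yet.
        return "needs_verification"
    return "aligned" if a == b else "conflict"


def compare_records(submitted: dict, reference: dict) -> dict:
    """Compare every field present in either record, in sorted field order."""
    fields = sorted(set(submitted) | set(reference))
    return {f: compare_field(submitted.get(f), reference.get(f)) for f in fields}


# Usage with made-up records (not real data):
result = compare_records(
    {"name": "Example Institute", "state": "XX", "license_no": "A-1"},
    {"name": "example institute ", "state": "YY"},
)
```

The result only organizes the evidence; in keeping with the review principles, the decision on each flagged field stays with staff.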

The result is faster, more traceable decision-making, with reviewer judgment and the compliance trail intact.
