Data sources

Datasets powering the Arkansas AI-Verify workspace

The Arkansas AI-Verify project expands NAC AIHub into an AI-assisted ETPL verification and compliance platform by combining Arkansas program review inputs with state, federal, and industry datasets. Together, these sources help reviewers validate providers, evaluate programs, compare policies, assess workforce alignment, and support higher-volume verification work with clearer evidence context.

Coverage
State, federal, and industry sources
Primary purpose
Verification, compliance, and workforce alignment
Review model
AI-assisted, human-in-the-loop
Overview

What these datasets are used for

This website does not rely on a single dataset. It uses a connected evidence environment made up of institutional reference records, provider licensure data, workforce approval lists, labor-market intelligence, education datasets, policy sources, skills mappings, and AI-supported quality analysis.

The goal is to help staff move from fragmented manual checking to a more structured review process in which evidence can be compared in one place.

Some datasets are used to verify whether a provider or program is valid and properly described. Others help determine whether a program aligns with in-demand occupations, state priorities, or employer-relevant job skills. Still others provide broader operational or outcome context that can support reviewer judgment without replacing it.

Quick summary

What the platform is doing

Verifies providers against licensure and institutional sources
Checks programs against classification and approval references
Compares policy materials against compliance requirements
Adds labor-market and outcome context for staff review
Core workflows

Main review use cases

01

Institution and provider verification

The platform compares provider submissions against licensure, accreditation, institutional reference, and ETPL-related records to help reviewers confirm that the organization is represented accurately and is in acceptable standing.
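A minimal sketch of the matching step behind this check, assuming a licensure directory loaded as simple name/status records (the record shape and status values here are illustrative, not the platform's actual schema):

```typescript
// Illustrative provider-verification sketch: match a submitted provider
// name against a state licensure directory. Shapes are hypothetical.
interface LicensureRecord {
  name: string;
  status: "licensed" | "expired" | "revoked";
}

// Normalize names so "ABC Technical Institute, Inc." and
// "abc technical institute inc" compare equal.
function normalizeName(name: string): string {
  return name
    .toLowerCase()
    .replace(/[.,']/g, "")
    .replace(/\b(inc|llc|co)\b/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Return the matching directory record, or undefined if not found,
// so the reviewer sees both presence and standing in one lookup.
function findProvider(
  submitted: string,
  directory: LicensureRecord[]
): LicensureRecord | undefined {
  const key = normalizeName(submitted);
  return directory.find((r) => normalizeName(r.name) === key);
}
```

In practice a fuzzier match (edit distance, DBA aliases) would sit behind the same interface; the point is that the reviewer gets a record to inspect, not an automated decision.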

02

Program validation and classification review

Program records can be checked against CIP and SOC mappings, workforce-board approvals, occupation lists, and related evidence to help reviewers evaluate whether a training program is described consistently and aligned to the appropriate workforce categories.
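The CIP/SOC consistency portion of this check can be sketched as a lookup against the crosswalk. The two mappings below are a tiny hypothetical sample; the real CIP-SOC crosswalk is a published reference dataset and would be loaded in full:

```typescript
// Illustrative CIP→SOC consistency check. Sample entries only; the full
// crosswalk would be loaded from the reference dataset.
const cipToSoc: Record<string, string[]> = {
  // CIP 51.3901 (Licensed Practical/Vocational Nurse Training) →
  // SOC 29-2061 (Licensed Practical and Licensed Vocational Nurses)
  "51.3901": ["29-2061"],
  // CIP 49.0205 (Truck and Bus Driver/Commercial Vehicle Operation) →
  // SOC 53-3032 (Heavy and Tractor-Trailer Truck Drivers)
  "49.0205": ["53-3032"],
};

// Flag a program whose claimed SOC code is not listed for its CIP code.
function socConsistentWithCip(cip: string, claimedSoc: string): boolean {
  const allowed = cipToSoc[cip];
  return allowed !== undefined && allowed.includes(claimedSoc);
}
```

A failed check is surfaced as a reviewer finding, not an automatic rejection, since some programs legitimately map to multiple occupations.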

03

Policy and compliance review

Submitted policy materials can be compared against state cancellation and refund requirements and other operational review checkpoints so staff can identify missing language, inconsistencies, or follow-up needs before final action.
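One simple way to structure this comparison is a checkpoint list the platform scans submitted policy text against. The checkpoint phrases below are hypothetical placeholders, not actual Arkansas regulatory language:

```typescript
// Illustrative compliance-checkpoint sketch. Phrases are placeholders;
// real checkpoints would come from the state cancellation and refund
// regulations described below.
const requiredCheckpoints = [
  { id: "cancellation-window", phrase: "right to cancel" },
  { id: "refund-schedule", phrase: "refund" },
  { id: "refund-timeline", phrase: "within 30 days" },
];

// Return the ids of checkpoints whose phrase is absent from the policy,
// so staff see exactly which follow-ups are needed.
function missingCheckpoints(policyText: string): string[] {
  const text = policyText.toLowerCase();
  return requiredCheckpoints
    .filter((c) => !text.includes(c.phrase))
    .map((c) => c.id);
}
```

A production check would use semantic comparison rather than literal substrings, but the output shape, a list of named gaps for staff to resolve, is the useful part.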

04

Labor-market and outcome context

Federal and industry datasets help the platform frame program relevance through wages, occupations, skills, and broader labor-market indicators, giving staff additional context when evaluating training value and workforce alignment.

First live integration

College Scorecard starter layer

Live source

Search schools from the official Scorecard API

This is the first integration slice: one upstream source, one normalized server wrapper, one internal API route, and one simple UI card to prove the pattern.

Returns institution name, location, size, cost, completion rate, and earnings.
Integration pattern

How to expand this cleanly

Keep the API key on the server only and call Scorecard through your own route.
Normalize raw fields now so the UI never depends on upstream field names.
Add filters one by one: school name, state, zip radius, degree level, cost, and earnings.
Reuse the same wrapper pattern later for ACS, O*NET, BLS, and IPEDS.
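The wrapper pattern above can be sketched as two server-side functions: one that builds the upstream request (keeping the key off the client) and one that maps raw fields to an internal shape. The dotted field names follow the Scorecard API's convention but should be verified against the current API documentation:

```typescript
// Sketch of the server wrapper pattern: build the upstream request on
// the server, normalize fields so the UI never depends on upstream
// names. Field names are assumptions to verify against the API docs.
const SCORECARD_BASE = "https://api.data.gov/ed/collegescorecard/v1/schools";

interface School {
  name: string;
  state: string;
  size: number | null;
}

// Server-side only: the api_key never reaches the browser.
function buildSearchUrl(apiKey: string, schoolName: string): string {
  const params = new URLSearchParams({
    api_key: apiKey,
    "school.name": schoolName,
    fields: "school.name,school.state,latest.student.size",
  });
  return `${SCORECARD_BASE}?${params.toString()}`;
}

// Map raw upstream keys to the internal shape the UI depends on,
// tolerating missing fields.
function normalizeSchool(raw: Record<string, unknown>): School {
  return {
    name: String(raw["school.name"] ?? ""),
    state: String(raw["school.state"] ?? ""),
    size:
      typeof raw["latest.student.size"] === "number"
        ? (raw["latest.student.size"] as number)
        : null,
  };
}
```

Adding a filter then means extending `buildSearchUrl` with one more parameter, while `normalizeSchool` and the UI stay unchanged, which is what makes the later ACS/O*NET/BLS/IPEDS wrappers cheap to add.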
Suggested next fields
Programs · Admission rate · SAT/ACT · Debt · Earnings by major · Predominant degree
Dataset catalog

Source groups used across verification and review

Dataset group

Client-provided and Arkansas reference datasets

These sources support core Arkansas ETPL verification workflows by grounding provider, program, policy, and approval checks in state-specific records already used by reviewers and administrators.

State Colleges and Universities

Public-facing state licensure and institutional reference list used to confirm institutional standing and basic eligibility context.

Licensed Proprietary Schools

State-published licensure directory used to validate proprietary school status and support provider verification.

Local Workforce Board approved program list

Reference list used to confirm whether a program has local workforce board approval where relevant to ETPL review.

Priority Employment Opportunities (PEO) List

Arkansas priority occupations list used to evaluate whether training aligns with state-recognized workforce demand.

State cancellation and refund regulations

Policy source used to compare submitted provider policies against Arkansas regulatory requirements.

CIP-SOC crosswalk

Classification mapping used to validate instructional program codes, occupational codes, and consistency between training and workforce outcomes.

Program QA evaluation layer

AI-assisted quality review layer intended to evaluate program data, identify inconsistencies, and help structure reviewer-facing findings.

Dataset group

State agency datasets

These sources broaden the platform beyond surface-level provider checks by connecting review activity to statewide education and workforce evidence where available.

Arkansas State Longitudinal Data

State-level longitudinal information intended to support deeper program and outcome analysis over time.

Dataset group

Federal workforce and education datasets

Federal sources help the platform compare provider and program claims against national education, labor market, wage, and occupational reference systems.

BLS Occupational Employment and Wage Statistics

Used to support wage, employment, and occupational demand analysis tied to program outcomes and workforce relevance.

O*NET

Used for occupational descriptions, skill context, job task alignment, and related occupation analysis.

IPEDS

Used to validate postsecondary institution reference data and support institutional context checks.

College Scorecard

Used as an additional education outcome and institution reference source for comparative review.

American Community Survey (ACS)

Used for broader population and regional context that can inform market, access, and workforce analysis.

Dataset group

Industry and market intelligence datasets

These sources add employer, market, and credential transparency context so the platform can assess whether a program is relevant, current, and aligned to real workforce demand.

GrayDI PES Economics and Outcomes

Aggregated and cleaned data products intended to support economic and program outcome review.

GrayDI PES Markets

Market-oriented reference layer for workforce and training demand context.

GrayDI PES Academic Management

Academic management-oriented data used to support broader program evaluation workflows.

LinkedIn aggregate profile analytics

Used only in aggregate analytics to understand skills and labor-market patterns, not for individual-level review decisions.

Credential Engine / CTDL

Used to support credential transparency, credential description structure, and comparability across training offerings.

U.S. Chamber of Commerce JEDx

Used to support jobs and employment data exchange functions relevant to skills and workforce alignment.

National Labor Exchange (NLx)

Used as an additional labor-market and job-demand reference source.

Dataset group

Trust, security, and institutional datasets

These sources help the platform include trust, compliance, and enterprise-readiness context where relevant to provider or operational review.

Coleridge Initiative / FedRAMP

Reference source for federally recognized trust and compliance context where applicable.

Coleridge Initiative / StateRAMP

Reference source for state-oriented trust and compliance context where applicable.

Institution curricular mapping to in-demand skills

Institution-level curricular mapping integrated with AristAI to compare program content against employer-demanded job skills.
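The comparison this mapping enables can be sketched as a coverage score: what fraction of an employer-demanded skill set the curriculum teaches. Skill names below are hypothetical; real inputs would come from the curricular mapping and the labor-market sources above:

```typescript
// Illustrative skill-coverage sketch: score how much of a demanded
// skill set a program's curriculum covers. Skill names are hypothetical.
function skillCoverage(
  curriculumSkills: string[],
  demandedSkills: string[]
): number {
  if (demandedSkills.length === 0) return 1;
  const taught = new Set(curriculumSkills.map((s) => s.toLowerCase()));
  const covered = demandedSkills.filter((s) => taught.has(s.toLowerCase()));
  return covered.length / demandedSkills.length;
}
```

A real implementation would match skills through a controlled vocabulary (for example, O*NET descriptors) rather than exact strings, but the reviewer-facing output, a coverage ratio plus the uncovered skills, stays the same.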

Review principles

How the data should be understood

1

AI supports review by organizing, summarizing, and comparing information across datasets, but final decisions remain with staff.

2

Not every data source serves the same purpose: some are used for verification, some for market context, some for classification mapping, and some for aggregate analytics.

3

The platform is designed to bring fragmented sources into one review workspace so users do not need to manually search across disconnected systems.

4

Aggregate and external datasets should strengthen reviewer understanding, not replace evidence validation or policy judgment.

Why it matters

Value to reviewers and administrators

Bringing these datasets together helps reduce repetitive manual searching and gives staff a stronger starting point for program review.

The platform can use AI to summarize, compare, and organize evidence across sources, helping teams see where records align, where they conflict, and where more verification is needed.

This supports faster, more traceable decision-making while preserving reviewer judgment and maintaining a clear compliance trail.