Datasets powering the Arkansas AI-Verify workspace
The Arkansas AI-Verify project expands NAC AIHub into an AI-assisted ETPL verification and compliance platform by combining Arkansas program review inputs with state, federal, and industry datasets. Together, these sources help reviewers validate providers, evaluate programs, compare policies, assess workforce alignment, and handle higher-volume verification work with clearer evidence context.
What these datasets are used for
This website does not rely on a single dataset. It uses a connected evidence environment made up of institutional reference records, provider licensure data, workforce approval lists, labor-market intelligence, education datasets, policy sources, skills mappings, and AI-supported quality analysis.
The goal is to help staff move from fragmented manual checking to a more structured review process in which evidence can be compared in one place.
Some datasets are used to verify whether a provider or program is valid and properly described. Others help determine whether a program aligns with in-demand occupations, state priorities, or employer-relevant job skills. Still others provide broader operational or outcome context that can support reviewer judgment without replacing it.
What the platform is doing
Main review use cases
Institution and provider verification
The platform compares provider submissions against licensure, accreditation, institutional reference, and ETPL-related records to help reviewers confirm that the organization is represented accurately and is operating in good standing.
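One common snag in this kind of comparison is that the same provider appears with slightly different formatting in the submission and the licensure list. A minimal sketch of a normalized-name match is shown below; the record shape, field names, and normalization rules are illustrative assumptions, not the platform's actual matching logic.

```typescript
// Illustrative provider-verification helper: normalize names before comparing
// so punctuation and spacing differences do not block a match.
// LicensureRecord is an assumed shape, not a real schema from this project.
type LicensureRecord = { name: string; status: string };

// Lowercase, strip punctuation, and collapse whitespace.
function normalizeName(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9 ]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Find the licensure record matching a submitted provider name, if any.
function matchProvider(
  submitted: string,
  records: LicensureRecord[]
): LicensureRecord | undefined {
  const key = normalizeName(submitted);
  return records.find((r) => normalizeName(r.name) === key);
}
```

A real implementation would likely layer fuzzy matching and reviewer confirmation on top of this exact-match baseline, since normalization alone cannot resolve renamed or merged institutions.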
Program validation and classification review
Program records can be checked against CIP and SOC mappings, workforce-board approvals, occupation lists, and related evidence to help reviewers evaluate whether a training program is described consistently and aligned to the appropriate workforce categories.
Policy and compliance review
Submitted policy materials can be compared against state cancellation and refund requirements and other operational review checkpoints so staff can identify missing language, inconsistencies, or follow-up needs before final action.
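A checkpoint comparison like this can be sketched as a checklist scan over the submitted policy text. The required-topic patterns below are placeholders, not Arkansas's actual regulatory wording, and a production check would be reviewed against the real cancellation and refund rules.

```typescript
// Hedged sketch of a policy checklist: flag required topics that have no
// matching language in the submitted policy text. Patterns are illustrative.
const requiredTopics: Record<string, RegExp> = {
  cancellation: /cancel(lation)?/i, // placeholder for cancellation-rights language
  refund: /refund/i, // placeholder for refund-policy language
};

// Return the list of required topics the policy text does not mention.
function missingTopics(policyText: string): string[] {
  return Object.entries(requiredTopics)
    .filter(([, pattern]) => !pattern.test(policyText))
    .map(([topic]) => topic);
}
```

Keyword scanning only surfaces candidates for follow-up; whether the language actually satisfies the regulation remains a reviewer judgment, consistent with how the platform frames AI support elsewhere.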
Labor-market and outcome context
Federal and industry datasets help the platform frame program relevance through wages, occupations, skills, and broader labor-market indicators, giving staff additional context when evaluating training value and workforce alignment.
College Scorecard starter layer
Search schools from the official Scorecard API
This is the first integration slice: one upstream source, one normalized server wrapper, one internal API route, and one simple UI card to prove the pattern.
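The normalized server wrapper in that slice can be sketched as below. The endpoint URL and dotted field names come from the public College Scorecard API; the School shape, function names, and error handling are illustrative assumptions rather than this project's actual code.

```typescript
// Sketch of a normalized server wrapper over the College Scorecard API.
// The School type is an assumed normalized shape for the internal API route.
type School = {
  id: number;
  name: string;
  city: string;
  state: string;
};

// Map one raw Scorecard result (dotted field keys) into the normalized shape.
function normalizeSchool(raw: Record<string, unknown>): School {
  return {
    id: Number(raw["id"]),
    name: String(raw["school.name"] ?? ""),
    city: String(raw["school.city"] ?? ""),
    state: String(raw["school.state"] ?? ""),
  };
}

// Server-side search: one upstream call, normalized output for the UI card.
async function searchSchools(query: string, apiKey: string): Promise<School[]> {
  const url = new URL("https://api.data.gov/ed/collegescorecard/v1/schools");
  url.searchParams.set("api_key", apiKey);
  url.searchParams.set("school.name", query);
  url.searchParams.set("fields", "id,school.name,school.city,school.state");
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Scorecard API error: ${res.status}`);
  const body = (await res.json()) as { results: Record<string, unknown>[] };
  return body.results.map(normalizeSchool);
}
```

Keeping normalization in a pure function separate from the fetch call makes the wrapper easy to test offline and gives later integrations (IPEDS, licensure lists) the same pattern to copy.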
How to expand this cleanly
Source groups used across verification and review
Client-provided and Arkansas reference datasets
These sources support core Arkansas ETPL verification workflows by grounding provider, program, policy, and approval checks in state-specific records already used by reviewers and administrators.
State Colleges and Universities
Public-facing state licensure and institutional reference list used to confirm institutional standing and basic eligibility context.
Licensed Proprietary Schools
State-published licensure directory used to validate proprietary school status and support provider verification.
Local Workforce Board approved program list
Reference list used to confirm whether a program has local workforce board approval where relevant to ETPL review.
Priority Employment Opportunities (PEO) List
Arkansas priority occupations list used to evaluate whether training aligns with state-recognized workforce demand.
State cancellation and refund regulations
Policy source used to compare submitted provider policies against Arkansas regulatory requirements.
CIP-SOC crosswalk
Classification mapping used to validate instructional program codes, occupational codes, and consistency between training and workforce outcomes.
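A consistency check against the crosswalk can be sketched as a simple lookup: given a program's CIP code, does its claimed SOC code appear among the occupations the crosswalk maps it to? The two crosswalk rows below are illustrative examples, not an excerpt of the full NCES CIP-SOC table.

```typescript
// Minimal sketch of a CIP-to-SOC consistency check, assuming a crosswalk
// table loaded from the CIP-SOC crosswalk file. Rows here are examples only.
const cipToSoc: Record<string, string[]> = {
  "11.0701": ["15-1252"], // Computer Science -> Software Developers (example)
  "51.3801": ["29-1141"], // Registered Nursing -> Registered Nurses (example)
};

// Classify a submission's code pairing for reviewer follow-up.
function checkProgramCodes(
  cip: string,
  claimedSoc: string
): "consistent" | "mismatch" | "unknown-cip" {
  const socs = cipToSoc[cip];
  if (!socs) return "unknown-cip";
  return socs.includes(claimedSoc) ? "consistent" : "mismatch";
}
```

A "mismatch" result would not automatically reject a program; it flags the record for the reviewer, since many CIP codes legitimately map to several SOC codes and crosswalk coverage varies.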
Program QA evaluation layer
AI-assisted quality review layer intended to evaluate program data, identify inconsistencies, and help structure reviewer-facing findings.
State agency datasets
These sources broaden the platform beyond surface-level provider checks by connecting review activity to statewide education and workforce evidence where available.
Arkansas State Longitudinal Data
State-level longitudinal information intended to support deeper program and outcome analysis across time.
Federal workforce and education datasets
Federal sources help the platform compare provider and program claims against national education, labor market, wage, and occupational reference systems.
BLS Occupational Employment and Wage Statistics
Used to support wage, employment, and occupational demand analysis tied to program outcomes and workforce relevance.
O*NET
Used for occupational descriptions, skill context, job task alignment, and related occupation analysis.
IPEDS
Used to validate postsecondary institution reference data and support institutional context checks.
College Scorecard
Used as an additional education outcome and institution reference source for comparative review.
American Community Survey (ACS)
Used for broader population and regional context that can inform market, access, and workforce analysis.
Industry and market intelligence datasets
These sources add employer, market, and credential transparency context so the platform can assess whether a program is relevant, current, and aligned to real workforce demand.
GrayDI PES Economics and Outcomes
Aggregated and cleaned data products intended to support economic and program outcome review.
GrayDI PES Markets
Market-oriented reference layer for workforce and training demand context.
GrayDI PES Academic Management
Academic management-oriented data used to support broader program evaluation workflows.
LinkedIn aggregate profile analytics
Used only in aggregate analytics to understand skills and labor-market patterns, not for individual-level review decisions.
Credential Engine / CTDL
Used to support credential transparency, credential description structure, and comparability across training offerings.
U.S. Chamber of Commerce JEDx
Used to support jobs and employment data exchange functions relevant to skills and workforce alignment.
National Labor Exchange (NLx)
Used as an additional labor-market and job-demand reference source.
Trust, security, and institutional datasets
These sources help the platform include trust, compliance, and enterprise-readiness context where relevant to provider or operational review.
Coleridge Initiative / FedRAMP
Reference source for federally recognized trust and compliance context where applicable.
Coleridge Initiative / StateRAMP
Reference source for state-oriented trust and compliance context where applicable.
Institution curricular mapping to in-demand skills
Institution-level curricular mapping integrated with AristAI to compare program content against employer-demanded job skills.
How the data should be understood
AI supports review by organizing, summarizing, and comparing information across datasets, but final decisions remain with staff.
Not every data source serves the same purpose: some are used for verification, some for market context, some for classification mapping, and some for aggregate analytics.
The platform is designed to bring fragmented sources into one review workspace so users do not need to manually search across disconnected systems.
Aggregate and external datasets should strengthen reviewer understanding, not replace evidence validation or policy judgment.
Value to reviewers and administrators
Bringing these datasets together helps reduce repetitive manual searching and gives staff a stronger starting point for program review.
The platform can use AI to summarize, compare, and organize evidence across sources, helping teams see where records align, where they conflict, and where more verification is needed.
This supports faster, more traceable decision-making while preserving reviewer judgment and maintaining a clear compliance trail.