Understanding Academic Performance & Growth Patterns in VPK

Since 2018, the Early Childhood Policy Research Group (ECPRG) has been developing the Sunshine State Early Childhood Information Portal and the Early Childhood Integrated Data System. These resources use comprehensive administrative data to understand and describe child and family development in the state of Florida. Specifically, the research team has used these data, which reflect child, household, and classroom characteristics, to identify patterns of experiences that result in different outcomes for young children enrolled in Florida’s Voluntary Prekindergarten Education Program (VPK). Using machine learning and artificial intelligence, the ECPRG has developed and implemented a research program that describes the heterogeneity of child experiences.

Study Overview

The ECPRG used the Early Childhood Integrated Data System linked dataset to investigate learning outcomes among students enrolled in VPK across Florida. Specifically, the ECPRG used machine learning and artificial intelligence to detect and describe kindergarten readiness (KR) growth patterns characterized by individual-, household-, and classroom-level features. Children have different experiences—both negative and positive—as members of their respective families and peer groups.¹,² Within education and numerous other fields, Bronfenbrenner’s bioecological systems framework attributes differential outcomes to different combinations of exposures, such that each exposure is considered in context.¹,² Children who attended VPK exhibited differential learning trajectories depending on family, peer and classroom context, with varying effects on development and preparation for kindergarten.
In this study, we investigated the effects of children’s home- and school-based learning environments on their academic growth, operationalized by (1) children’s initial scores on the Florida Assessment of Student Thinking (FAST) upon entering VPK and (2) their growth (difference between initial and end-of-year FAST assessment). Our machine learning and artificial intelligence analyses identified contexts under which children are likely to be ready for kindergarten.

1. Bronfenbrenner, U., & Morris, P. A. (2006). The bioecological model of human development. In W. Damon & R. M. Lerner (Eds.), Handbook of Child Psychology, Vol. 1: Theoretical models of human development (6th ed., pp. 793-828). New York: Wiley.
2. Elder, G. H., Jr. (1998). The life course as developmental theory. Child Development, 69(1), 1-12.

Key Domains

Starting with a broad collection of child and family characteristics and scores on the Classroom Assessment Scoring System (CLASS), our machine learning analyses identified five domains of predictors related to kindergarten readiness. Each of these five domains includes directly observed characteristics of children, families, and classrooms. These individual characteristics, which constitute the broader domains, were included in the analyses that follow.

Mother’s Highest Education

Mother’s Birth Country

Health

Social Services

Attendance

Key Findings


    • Lower initial scores are often associated with accelerated growth rates, which occur in specific contexts
    • Combinations of family and sociodemographic factors contribute to initial scores
    • Complex interaction patterns were identified between child, family, and classroom characteristics
    • CLASS dimensions and composite had negligible associations with VPK growth, and only within narrowly defined contexts

    Figure 1. Machine Learning Framework (layers from top to bottom: Terminal Node Random Forests; Ctree for Subgroup Identification; Random Forest for Initial Predictor Performance)

    Step 1: Identify globally important predictors across the entire population

    Methodological Details:
    • Variable importance scores represent population-level effects
    • The threshold for variable selection becomes particularly meaningful as it determines the complexity of subsequent stages
    • Selected variables represent robust, generalizable relationships since they emerge in the full population
    • No need for external validation of variable importance since this represents the true population structure

    Step 2: Identify subpopulations from the population based on the important predictors

    Methodological Details:
    • Splits identified using conditional inference trees (ctree) represent true population-level heterogeneity rather than sample-specific patterns
    • Terminal nodes define genuine subpopulations rather than sample-based groupings
    • The hierarchical structure reveals how key predictors interact to create naturally occurring subgroups
    • Node size becomes a direct measure of subpopulation prevalence
    • The sequence of splits shows the relative strength of different contextual effects in the population

    Step 3: Identify conditional importance patterns within defined subpopulations

    Methodological Details:
    • Results reveal true conditional relationships within subpopulations
    • Including all predictors (even those below the Stage 1 threshold) allows detection of context-specific importance
    • Variable importance within nodes shows genuine heterogeneity in predictor relationships
    • The comparative analysis across nodes reveals how context modifies predictor importance
    • These analyses reveal second-order relationships after accounting for primary population structure
    Figure 1 illustrates a three-layer approach to data analysis using Ctree Terminal Node Random Forests. The bottom layer is a random forest for initial predictor performance, the middle layer is a ctree for robust subgroup identification, and the top layer is the set of ctree terminal node random forests. Each layer builds on the previous one for more specialized analysis.

    Methodology

    The analysis draws on comprehensive data collected across multiple domains and focuses on both baseline scores and subsequent growth.

    Kindergarten Readiness Scores

    Each of the children included in these analyses had one to three screening windows, with the number of assessments and time between first and last available assessment varying across children. We simultaneously predicted initial FAST score and change score (multivariate analysis) to consider not only the level of child ability before the start of VPK, but simultaneously the trajectory of learning while each child participated in VPK.
    To calculate the latter, the team first restricted the sample to children with at least two FAST scores, where there were at least 45 days between the first and last assessment dates (final N=93,584). We then used all available FAST scores to calculate the average increase in FAST score per month for each child.
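The growth calculation described above can be sketched as follows. This is a minimal illustration, not the team's code: the text does not say how the "average increase per month" was estimated from all available scores, so the least-squares slope used here, the 30.44-day month, and the function name `monthly_growth` are all assumptions.

```python
from datetime import date

DAYS_PER_MONTH = 30.44  # assumption: average month length for converting days to months
MIN_SPAN_DAYS = 45      # from the text: at least 45 days between first and last assessment


def monthly_growth(assessments):
    """Return FAST points gained per month, or None if the child is excluded.

    `assessments` is a list of (date, score) pairs for one child.
    """
    if len(assessments) < 2:
        return None  # fewer than two FAST scores
    ordered = sorted(assessments)
    first_day = ordered[0][0]
    span_days = (ordered[-1][0] - first_day).days
    if span_days < MIN_SPAN_DAYS:
        return None  # assessments too close together
    # Assumed estimator: least-squares slope of score on elapsed months,
    # which uses every available score rather than only the first and last.
    months = [(day - first_day).days / DAYS_PER_MONTH for day, _ in ordered]
    scores = [score for _, score in ordered]
    mean_m = sum(months) / len(months)
    mean_s = sum(scores) / len(scores)
    sxx = sum((m - mean_m) ** 2 for m in months)
    sxy = sum((m - mean_m) * (s - mean_s) for m, s in zip(months, scores))
    return sxy / sxx


# Hypothetical child with two assessments 91 days apart.
example = monthly_growth([(date(2022, 9, 1), 600.0), (date(2022, 12, 1), 630.0)])
```

With only two scores the slope reduces to the score difference divided by the elapsed months; with three scores it weights all of them.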

    Sample

    The analytic sample included children born in Florida, enrolled in VPK during the 2022-2023 program year, and with at least two FAST assessments during that school year.

    Analysis Framework

    • Conditional Inference Tree
    • Machine learning method for detecting complex interactions
    • Multivariate outcome
    • Initial assessment scores
    • Monthly growth

    Machine Learning Pipeline

    After multiple imputation to account for missing data, analyses proceeded in three successive steps (Figure 1): (1) selection of analysis variables with random forest regression³; (2) conditional inference trees⁴ to explain kindergarten readiness subgroups; and (3) random forests within each ctree terminal node for subgroup-specific prediction. First, random forest regression trimmed the large set of predictor variables to only the most important variables for predicting KR subgroups³. Then, a conditional inference regression tree model was used to predict KR subgroups; the tree was pruned and examined to characterize these profiles in terms of data subpopulations. Lastly, within each terminal node (subgroup) identified in the multivariate conditional inference tree, we used a subgroup-specific random forest to compute and summarize Shapley values⁵, quantifying the relationship between each predictor and FAST growth within that subgroup.
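To make the last step more concrete, the sketch below computes exact Shapley values for a single prediction by averaging each predictor's marginal contribution over every ordering. It is an illustration only: the toy model, the predictor names `high_attendance` and `mother_college`, and the coefficients are hypothetical stand-ins for a subgroup-specific random forest, and exact enumeration is practical only for a handful of predictors.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance relative to a baseline.

    Predictors that have not yet been "switched on" keep their baseline value.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            value = predict(current)
            totals[name] += value - previous  # marginal contribution in this ordering
            previous = value
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_growth_model(x):
    """Hypothetical stand-in for a subgroup model of monthly growth (MGU scale)."""
    return (1.0
            + 0.2 * x["high_attendance"]
            - 0.1 * x["mother_college"]
            + 0.1 * x["high_attendance"] * (1 - x["mother_college"]))


child = {"high_attendance": 1, "mother_college": 0}
reference = {"high_attendance": 0, "mother_college": 1}
contributions = shapley_values(toy_growth_model, child, reference)
```

The contributions sum to the difference between the child's prediction and the reference prediction, which is the property that lets Shapley values be summarized predictor by predictor within a subgroup.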

    Monthly Growth Units

    MGU = Subgroup Growth / Average Growth
    MGU Scale interpretation:

    • 1.0 = Average expected growth
    • >1.0 = Accelerated growth rate
    • <1.0 = Below expected growth rate

    For example, an MGU equal to 1.2 indicates that a subgroup experienced 20% greater growth per month than average, whereas an MGU equal to 0.80 indicates that a subgroup experienced 20% less growth per month than average.
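The MGU definition and its interpretation can be written out as a short calculation. This is a minimal sketch; the function names and the example growth values (6.0 and 4.0 points per month against an average of 5.0) are illustrative assumptions, not figures from the study.

```python
def monthly_growth_units(subgroup_growth, average_growth):
    """MGU = subgroup growth per month divided by average growth per month."""
    if average_growth == 0:
        raise ValueError("average growth must be non-zero")
    return subgroup_growth / average_growth


def growth_vs_average(mgu):
    """Percent difference from average growth implied by an MGU."""
    return (mgu - 1.0) * 100.0


def interpret(mgu):
    """Label an MGU using the scale given above."""
    if mgu > 1.0:
        return "accelerated growth rate"
    if mgu < 1.0:
        return "below expected growth rate"
    return "average expected growth"


faster = monthly_growth_units(6.0, 5.0)  # hypothetical subgroup growing faster than average
slower = monthly_growth_units(4.0, 5.0)  # hypothetical subgroup growing slower than average
```

Here `faster` is 1.2 (about 20% more growth per month than average) and `slower` is 0.8 (about 20% less), matching the worked example in the text.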

    Understanding Classroom Context

    To understand the potential associations between the Classroom Assessment Scoring System (CLASS) and kindergarten readiness among VPK attendees, the ECPRG included the individual CLASS dimension scores in the machine learning model. This allowed us to discover associations between specific CLASS dimensions and differential kindergarten readiness among subgroups of children. Leveraging machine learning and artificial intelligence, we investigated how much the constituent dimensions of the CLASS, and the CLASS composite score, interacted in (potentially complex) ways with child, family and classroom characteristics in predicting FAST initial and growth scores.

    3. Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32.

    4. Hothorn, T., Hornik, K., & Zeileis, A. (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651-674.

    5. Shapley, L. (1953). A value for n-person games. In H. Kuhn & A. Tucker (Eds.), Contributions to the Theory of Games, Volume II (pp. 307-318). Princeton: Princeton University Press.

    Key Findings for Pre-VPK Performance and Monthly Growth Units (MGUs)

    Education Level Growth Comparison

    Education Level       Initial FAST   Growth Rate (MGU)   Growth vs Average
    ≤8th Grade            609.7          1.18                +17%
    9th-12th Grade        624.9          1.10                +10.3%
    HS/GED                639.0          1.05                +5.2%
    Some College          650.2          1.00                -0.4%
    Associate’s Degree    657.6          1.00                -0.2%
    Bachelor’s Degree     672.2          0.94                -5.6%
    Master’s Degree       683.1          0.92                -7.9%
    Doctoral Degree       689.9          0.90                -9.7%

    Definitions

    • A Monthly Growth Unit (MGU) expresses a subgroup’s average monthly growth relative to the overall average; values greater than 1 indicate higher-than-average growth, while values less than 1 indicate lower-than-average growth.
    • For Growth vs Average, growth at least 1% higher than average is presented in green, growth at least 1% below average is in red, and growth in between these values is presented in black.

    Key Findings

    • Growth-Education Inverse Relationship
      – Higher education → Lower growth rates
      – Lower education → Higher growth rates
    • Initial-Education Positive Relationship
      – Higher education → Higher initial scores
      – Lower education → Lower initial scores
    • Maximum Growth Differential
      – 0.28 MGU gap (≤8th Grade, 1.18, vs Doctoral Degree, 0.90)
      – About 27 percentage points in growth vs average (+17% vs -9.7%)
    • Pattern suggests strong compensatory effect

    Education Level and Attendance Relationship to Initial FAST Scores

    Education Level       Low Attendance   Medium Attendance   High Attendance
    ≤8th Grade            594.2            610.3               634.0
    9th-12th Grade        618.6            626.8               623.8
    HS/GED                629.9            640.4               643.2
    Some College          641.0            650.6               666.3
    Associate’s Degree    647.1            658.6               663.7
    Bachelor’s Degree     657.6            671.9               692.6
    Master’s Degree       672.5            682.7               699.8
    Doctoral Degree       685.4            688.5               707.9

    Education Level and Attendance Effects on Growth

    Education Level       Average MGU by Attendance       Growth vs Average by Attendance
                          Low      Medium   High          Low       Medium    High
    ≤8th Grade            0.97     1.21     1.23          -3.4%     +21.0%    +23.4%
    9th-12th Grade        1.02     1.10     1.36          +1.9%     +10.0%    +35.9%
    HS/GED                0.95     1.05     1.26          -5.1%     +5.3%     +26.4%
    Some College          0.91     1.00     1.11          -9.4%     +0.3%     +11.0%
    Associate’s Degree    0.96     0.99     1.20          -3.9%     -1.1%     +19.8%
    Bachelor’s Degree     0.94     0.94     0.97          -5.5%     -5.9%     -2.9%
    Master’s Degree       0.83     0.93     0.94          -16.8%    -7.3%     -5.6%
    Doctoral Degree       0.84     0.89     1.05          -15.6%    -10.7%    +5.2%

    Definitions

    • A Monthly Growth Unit (MGU) expresses a subgroup’s average monthly growth relative to the overall average; values greater than 1 indicate higher-than-average growth, while values less than 1 indicate lower-than-average growth.
    • Attendance Levels:
      – Low: Less than 50 hours per month
      – Medium: Between 50 and 60 hours per month
      – High: Greater than 60 hours per month
    • For Growth vs Average, growth at least 1% higher than average is presented in green, growth at least 1% below average is in red, and growth in between these values is presented in black.
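The attendance bands and the color rule in these definitions can be expressed as two small lookup functions. This is a sketch under stated assumptions: the text does not say which band the boundary values of exactly 50 and 60 hours fall into, so placing both in "Medium" is an assumption, and the function names are illustrative.

```python
def attendance_level(hours_per_month):
    """Map monthly attendance hours to the Low / Medium / High bands."""
    if hours_per_month < 50:
        return "Low"
    if hours_per_month <= 60:  # assumption: 50 and 60 are both treated as Medium
        return "Medium"
    return "High"


def growth_color(growth_vs_average_pct):
    """Display color for a Growth vs Average value, in percent."""
    if growth_vs_average_pct >= 1.0:
        return "green"   # at least 1% higher than average
    if growth_vs_average_pct <= -1.0:
        return "red"     # at least 1% below average
    return "black"       # within 1% of average
```

For example, the Some College row at medium attendance (+0.3%) falls in the black band, while the ≤8th Grade row at low attendance (-3.4%) falls in the red band.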

    Key Findings

    • Greater attendance associated with higher growth
      – Consistent across education levels
      – Strongest at lower education levels
    • Compensatory Power
      – Highest MGUs: High attendance + lower education

    SNAP Status and Education Level on Initial and Growth

    Education Level       SNAP Status   Initial FAST   Average MGU   Growth vs Average
    ≤8th Grade            No            606.2          1.22          +21.8%
    ≤8th Grade            Yes           621.2          1.15          +14.8%
    9th-12th Grade        No            624.3          1.11          +11.1%
    9th-12th Grade        Yes           625.0          1.10          +10.0%
    HS/GED                No            644.4          1.05          +5.0%
    HS/GED                Yes           636.2          1.05          +5.3%
    Some College          No            655.3          0.97          -2.9%
    Some College          Yes           645.9          1.02          +1.6%
    Associate’s Degree    No            663.3          0.97          -3.2%
    Associate’s Degree    Yes           648.9          1.04          +4.4%
    Bachelor’s Degree     No            674.9          0.93          -7.0%
    Bachelor’s Degree     Yes           658.8          1.01          +1.4%
    Master’s Degree       No            684.7          0.92          -7.8%
    Master’s Degree       Yes           667.1          0.91          -9.3%
    Doctoral Degree       No            692.0          0.89          -10.5%
    Doctoral Degree       Yes           648.8          1.07          +6.6%

    Definitions

    • A Monthly Growth Unit (MGU) expresses a subgroup’s average monthly growth relative to the overall average; values greater than 1 indicate higher-than-average growth, while values less than 1 indicate lower-than-average growth.
    • SNAP = Supplemental Nutrition Assistance Program
    • For Growth vs Average, growth at least 1% higher than average is presented in green, growth at least 1% below average is in red, and growth in between these values is presented in black.

    Key Findings

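    • SNAP use is associated with lower initial scores at the HS/GED level and above; at the two lowest education levels, children who received SNAP had similar or higher initial scores.
    • From Some College upward, children who received SNAP generally show higher MGUs (Master’s Degree is the exception), consistent with a compensatory effect of VPK.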

    Initial Score and Growth Rate (MGU) Comparison

    ≤8th Grade
    Initial Score Range
    608-624
    Growth Rate (MGU)
    1.32
    Growth vs Average
    32%
    9th-12th Grade
    Initial Score Range
    608-624
    Growth Rate (MGU)
    1.17
    Growth vs Average
    17%
    HS/GED
    Initial Score Range
    627-653
    Growth Rate (MGU)
    1.09
    Growth vs Average
    9%
    Some College/Associate’s
    Initial Score Range
    638-671
    Growth Rate (MGU)
    1.00
    Growth vs Average
    Average
    Bachelor’s Degree
    Initial Score Range
    657-680
    Growth Rate (MGU)
    0.94
    Growth vs Average
    -6%
    Graduate Degree
    Initial Score Range
    680-708
    Growth Rate (MGU)
    0.95
    Growth vs Average
    -5%

    Key Findings

    • Growth-Education Inverse Relationship
      – Higher education → Lower growth rates
      – Lower education → Higher growth rates
    • Maximum Growth Differential
      – 0.38 MGU gap (≤8th vs Bachelor’s)
      – 38% growth rate difference
    • Pattern suggests strong compensatory effect

    Initial Score and Growth Rate (MGU) Comparison

    ≤8th Grade
    Attendance Effect: High: 1.44 vs Low: 1.08
    Growth Advantage: ~36%

    9th-12th Grade
    Attendance Effect: High: 1.52 vs Low: 1.12
    Growth Advantage: ~40%

    HS/GED
    Attendance Effect: High: 1.45 vs Low: 0.95-1.05
    Growth Advantage: 40-50%

    Some College
    Attendance Effect: High: 1.20 vs Low: 1.00
    Growth Advantage: ~20%

    Associates
    Attendance Effect: High: 1.35 vs Low: 1.05
    Growth Advantage: ~30%

    Bachelors
    Attendance Effect: High: 1.25 vs Low: 0.95
    Growth Advantage: ~30%

    Graduate
    Attendance Effect: High: 1.15 vs Low: 0.90
    Growth Advantage: ~25%
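The "Growth Advantage" figures above are consistent with the difference between the high-attendance and low-attendance MGUs, expressed in percentage points of average growth. The source does not state how the figure was derived, so the sketch below is an assumption that reproduces the listed values; HS/GED is left out because its low-attendance MGU is given as a range.

```python
def growth_advantage_points(high_mgu, low_mgu):
    """Assumed derivation: MGU difference expressed in percentage points."""
    return round((high_mgu - low_mgu) * 100)


# (high-attendance MGU, low-attendance MGU) copied from the cards above.
attendance_cards = {
    "≤8th Grade": (1.44, 1.08),
    "9th-12th Grade": (1.52, 1.12),
    "Some College": (1.20, 1.00),
    "Associates": (1.35, 1.05),
    "Bachelors": (1.25, 0.95),
    "Graduate": (1.15, 0.90),
}

advantages = {
    level: growth_advantage_points(high, low)
    for level, (high, low) in attendance_cards.items()
}
```

Under this reading, a child in the ≤8th Grade group with high attendance grows about 36 percentage points of average monthly growth faster than a peer with low attendance.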

    Key Findings

    • Greater attendance associated with higher growth
      – Consistent across education levels
      – Strongest at lower education levels
    • Compensatory Power
      – Highest MGUs: High attendance + lower education

    Initial Score Comparison

    ≤8th
    US Range: No Data
    Other Range: 608-609
    US Advantage: No Data
    9th-12th
    US Range: 624-641
    Other Range: 624-630
    US Advantage: ~0-11 points
    HS/GED
    US Range: 631-653
    Other Range: 608-650
    US Advantage: ~20-25 points
    Some College
    US Range: 649-664
    Other Range: 635-650
    US Advantage: ~14 points
    Associates
    US Range: 655-668
    Other Range: 646-657
    US Advantage: ~9-11 points
    Bachelor’s
    US Range: 668-680
    Other Range: 657-669
    US Advantage: ~10 points
    Graduate
    US Range: 682-694
    Other Range: 670-680
    US Advantage: ~12-14 points

    Growth Rate Comparison (MGU)

    ≤8th
    US MGU: No Data
    Other MGU: 1.19
    Other Advantage: No Data
    9th-12th
    US MGU: 0.98-1.05
    Other MGU: 1.15-1.18
    Other Advantage: ~10-20%
    HS/GED
    US MGU: 0.95-1.05
    Other MGU: 1.18-1.24
    Other Advantage: ~20-30%
    Some College
    US MGU: 0.90-1.01
    Other MGU: 1.08-1.16
    Other Advantage: ~15-25%
    Associates
    US MGU: 0.92-0.98
    Other MGU: 1.05-1.10
    Other Advantage: ~12-18%
    Bachelor’s
    US MGU: 0.89-0.95
    Other MGU: 1.01-1.03
    Other Advantage: ~8-12%
    Graduate
    US MGU: 0.88-0.92
    Other MGU: 0.98-1.02
    Other Advantage: ~10-14%

    Compensatory Learning

    • Among education levels with data for both groups, children whose mothers are from the US have higher initial scores.
    • Among education levels with data for both groups, children whose mothers are not from the US have higher growth rates.

      Early Health — Prenatal Visits (Kotelchuck Index)

      International Bachelor’s Holders
      Kotelchuck Level:>2
      Initial Score:669
      MGU:1.03
      Kotelchuck Level:≤2
      Initial Score:657
      MGU:1

      Key Findings

      • Higher Kotelchuck Index scores show persistent developmental benefits in terms of higher initial scores and greater MGUs.

      Current Health Status (BMI)

      Graduate Level
      BMI Range:>24.1
      Initial Score:682
      MGU:0.59
      BMI Range:≤2
      Initial Score:657
      MGU:1
      HS/GED Level
      BMI Range:>25.8
      Initial Score:631
      MGU:1.01
      BMI Range:≤25.8
      Initial Score:641
      MGU:0.99

      Key Findings

      • Lower maternal BMI is associated with higher initial scores.
      • Children whose mothers had higher BMIs show a compensatory effect from VPK.

      Initial Score and Growth Rate (MGU) Comparison

      ≤8th
      SNAP Status:Yes
      Initial Score:631
      MGU Range:1.01
      9th-12th
      SNAP Status:No
      Initial Score:624-632
      MGU Range:1.01-1.15
      SNAP Status:Yes
      Initial Score:636-640
      MGU Range:0.95-1.05
      HS/GED
      SNAP Status:No
      Initial Score:640-653
      MGU Range:0.95-1.00
      SNAP Status:Yes
      Initial Score:624-641
      MGU Range:1.00-1.16
      Some College
      SNAP Status:No
      Initial Score:653-677
      MGU Range:0.88-1.06
      SNAP Status:Yes
      Initial Score:645-649
      MGU Range:0.97-1.16
      Associates
      SNAP Status:No
      Initial Score:657-680
      MGU Range:0.90-1.00
      SNAP Status:Yes
      Initial Score:646-667
      MGU Range:0.95-1.10
      Bachelor’s
      SNAP Status:No
      Initial Score:668-694
      MGU Range:0.88-0.95
      SNAP Status:Yes
      Initial Score:650-665
      MGU Range:0.95-1.05
      Graduate
      SNAP Status:No
      Initial Score:682-708
      MGU Range:0.85-0.92
      SNAP Status:Yes
      Initial Score:655-670
      MGU Range:0.90-1.00

      Key Findings

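      • SNAP use is associated with lower initial scores at the HS/GED level and above; at the 9th-12th Grade level, children who received SNAP had higher initial scores.
      • At the HS/GED level and above, children who received SNAP show higher MGU ranges, consistent with a compensatory effect of VPK.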

      Initial Score and Growth Rate (MGU) Comparison

      ≤8th
      WIC Status:Yes
      Initial Score:608-614
      MGU Range:1.18-1.20
      9th-12th
      WIC Status:No
      Initial Score:636-644
      MGU Range:0.95-1.02
      WIC Status:Yes
      Initial Score:630-635
      MGU Range:1.12-1.18
      HS/GED
      WIC Status:No
      Initial Score:640-653
      MGU Range:~0.95
      WIC Status:Yes
      Initial Score:630-640
      MGU Range:1.00-1.24
      Some College
      WIC Status:No
      Initial Score:653-671
      MGU Range:0.88-1.02
      WIC Status:Yes
      Initial Score:635-650
      MGU Range:0.91-1.16
      Associates
      WIC Status:No
      Initial Score:657-680
      MGU Range:0.90-0.95
      WIC Status:Yes
      Initial Score:646-667
      MGU Range:1.02-1.15
      Bachelor’s
      WIC Status:No
      Initial Score:668-680
      MGU Range:0.89-0.92
      WIC Status:Yes
      Initial Score:635-650
      MGU Range:0.97-1.15
      Graduate
      WIC Status:No
      Initial Score:682-708
      MGU Range:0.85-0.90
      WIC Status:Yes
      Initial Score:655-665
      MGU Range:0.95-1.10

      Key Findings

      • WIC use is associated with consistently lower initial scores across education levels.
      • There is a compensatory effect of VPK for children who received WIC.

      Initial Score and Growth Rate (MGU) Comparison

      Associate
      Status:Married
      Initial Score:667-680
      MGU Range:0.94-0.97
      Status:Not Married
      Initial Score:655-668
      MGU Range:0.95-0.96
      Bachelor’s
      Status:Married
      Initial Score:670-682
      MGU Range:0.92-0.95
      Status:Not Married
      Initial Score:657-669
      MGU Range:0.94-0.96
      Graduate
      Status:Married
      Initial Score:680-694
      MGU Range:0.90-0.92
      Status:Not Married
      Initial Score:668-679
      MGU Range:0.92-0.94

      Key Findings

      • Children whose mothers are married demonstrate consistently higher initial scores and have similar MGUs across the observed education levels.

      Note: Marital status data were available only for the Associate’s level and above in the provided dataset.