3.3 Athletics
A student’s athletics rating is the weighted sum of three variables: participation in a team sport, participation in an individual sport, and being named as a most valuable player (MVP). The record was discarded if either participation variable was missing, but the MVP variable was set to zero if missing. Once again the range was used for normalization.
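The rating calculation can be sketched as follows. This is a minimal illustration only: the weights are hypothetical placeholders (the text does not state them), and "range normalization" is read here as min-max scaling onto an assumed 0 to 10 scale.

```python
# Hypothetical weights: the actual values are not given in the text.
WEIGHTS = {"team": 1.0, "individual": 1.0, "mvp": 2.0}


def raw_athletics_score(team, individual, mvp):
    """Weighted sum of the three variables, or None if the record is discarded."""
    if team is None or individual is None:
        return None  # either participation variable missing: discard the record
    if mvp is None:
        mvp = 0  # a missing MVP flag is set to zero
    return (WEIGHTS["team"] * team
            + WEIGHTS["individual"] * individual
            + WEIGHTS["mvp"] * mvp)


def normalize_by_range(scores, scale=10.0):
    """Min-max scaling onto [0, scale]; the 0-10 scale is an assumption."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [scale * (s - lo) / (hi - lo) for s in scores]


# Usage: drop discarded records before normalizing.
raw = [raw_athletics_score(1, 0, None), raw_athletics_score(None, 1, 1)]
kept = [s for s in raw if s is not None]
```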
4. Segmentation
The next step was to use the three performance variables to create the segmentation structure. The decision rule was derived with one eye on the logic behind the student segment names and the other on getting the kind of membership distribution described below. An empirically optimized decision rule may be achievable and this would certainly be desirable if time and resources permit.
The decision rule is as follows:
stuSeg = 1 if v1 ≥ 5.5 and v2 ≥ 5.5 else (Blue Chip)
stuSeg = 2 if v1 ≥ 7.0 else (Scholar)
stuSeg = 3 if v1 ≥ 4.0 and v2 ≥ 5.5 else (Extracurricular)
stuSeg = 4 if v3 ≥ 5.0 else (Athlete)
stuSeg = 5 if (v1 ≥ 4.0 and v2 ≥ 5) or v1 ≥ 5.5 else (Balanced)
stuSeg = 6 if v1 ≥ 4.0 else (Average)
stuSeg = 7. (Stretch)
The diagram illustrates the decision rule graphically as it applies to the academic and extracurricular indices. Membership in the "Athlete" segment (#4) requires an athletic rating of five or more, and it applies only to students who fail to qualify for the three preceding segments.
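The cascade can be written as a short function. The sketch below assumes that v1, v2, and v3 are the academic, extracurricular, and athletic ratings respectively, and it reads the inner condition of the "Balanced" test as "v1 ≥ 4.0 and v2 ≥ 5"; both points are interpretations rather than statements taken from the text.

```python
SEGMENT_NAMES = {
    1: "Blue Chip", 2: "Scholar", 3: "Extracurricular", 4: "Athlete",
    5: "Balanced", 6: "Average", 7: "Stretch",
}


def student_segment(v1, v2, v3):
    """Return stuSeg (1-7); the tests are applied in order, first match wins."""
    if v1 >= 5.5 and v2 >= 5.5:
        return 1  # Blue Chip
    if v1 >= 7.0:
        return 2  # Scholar
    if v1 >= 4.0 and v2 >= 5.5:
        return 3  # Extracurricular
    if v3 >= 5.0:
        return 4  # Athlete: reached only if segments 1-3 did not match
    if (v1 >= 4.0 and v2 >= 5.0) or v1 >= 5.5:
        return 5  # Balanced (inner condition read as "and": an assumption)
    if v1 >= 4.0:
        return 6  # Average
    return 7      # Stretch


# Usage: a strong athlete with high academics is still classed as a Scholar.
label = SEGMENT_NAMES[student_segment(8.0, 2.0, 9.0)]
```

Because the tests are ordered, the athletic rating matters only for students who fall through the first three tests.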

The segment membership percentages came out as shown in the figure. We had no normative information to guide us, but started with the idea that the four specialized segments (1-4) should aggregate roughly to the membership of the two general ones (5-6) and that the "Stretch" segment should be somewhat smaller. In retrospect, it may be desirable to revise the decision rule to produce more evenly balanced segments. This would mitigate the small-sample problems, discussed later, that have arisen in subcategories.
5. Student and Institutional Behavior by Segment
With the segment definitions in hand, the next step was to estimate the distribution of applications, admissions offers, matriculations, financial aid requests, and financial aid awards, for each institutional segment, by student segment and gender-ethnic group. While all these results are derived from the NELS database, one must remember that applications, matriculations, and financial aid requests represent student behavior while admissions offers and aid awards are the consequences of institutional decisions.
5.1 Data Extraction Procedure
To understand the data structure, imagine a table whose columns are the institutional segments and rows the student segments and gender-ethnic groups. Figures 3 and 4 have this structure. The data are extracted cell by cell, according to the following procedure:
a. Select all records in the ratings file that meet the student segment and gender-ethnic criteria.
b. From these, select all records for which the institutional segment code of "institution attended" or either of the two "applied to" schools equals the target institutional segment. ("Institution attended" may have been listed as one of the first two "applied-to" choices, but if different we know the student must have applied.)
c. Sum the respondent weights for all the selected records. (These weights are provided by NELS as projection factors to the universe.)
d. Divide each cell by the grand total for the column to get the fraction of an institutional segment’s applications (or admissions offers, etc.) accounted for by the student segment and gender-ethnic group.
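Steps a through d can be sketched as below. The record layout (field names `stu_seg`, `group`, `weight`, `attended_seg`, `applied_segs`) is hypothetical and is not the actual layout of the ratings file; the sketch shows the applications case only.

```python
from collections import defaultdict


def applied_to(record, inst_seg):
    """Step b: the attended school or either applied-to school is in inst_seg."""
    segs = [record["attended_seg"], *record["applied_segs"]]
    return inst_seg in segs


def weighted_cells(records, inst_segs):
    """Steps a-c: sum respondent weights for each (row, column) cell.

    A row is a (student segment, gender-ethnic group) pair. A record is
    counted once per institutional segment, however many of its schools match.
    """
    cells = defaultdict(float)
    for r in records:
        row = (r["stu_seg"], r["group"])
        for seg in inst_segs:
            if applied_to(r, seg):
                cells[(row, seg)] += r["weight"]
    return cells


def column_fractions(cells):
    """Step d: divide each cell by the grand total of its column."""
    totals = defaultdict(float)
    for (_row, seg), w in cells.items():
        totals[seg] += w
    return {(row, seg): w / totals[seg] for (row, seg), w in cells.items()}
```

Cells with no matching respondents simply do not appear in the result, which corresponds to a zero in the table.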
We recognize that this procedure truncates the distribution of applications to the respondent’s first two preferences, or these two plus the institution attended, but must accept it given the NELS survey design. Similar problems arise in connection with the other variables. Further truncation occurs for the financial aid variable, because our NELS data do not include information about aid offers and awards for the institutions actually attended if different from the two "applied to" schools. As shown in the chart, however, the NELS data for number of applications per respondent average less than two for every student segment except segment two. These data were obtained by asking directly about the number of applications submitted, so they are not subject to truncation. They suggest that truncation is not a serious problem.
5.2 Projecting Results to the Average Institution
Our institutional database contains school-by-school data for undergraduate applications, admissions offers, and matriculations. The first three lines of Figure 2 present the average of these figures for each institutional segment. While equivalent numbers can be calculated from the NELS data, the institution-based estimates are sure to be more reliable. That is why step "d," above, divides the extracted sums by their column totals. Multiplying the resulting fractions by the overall average for each institutional segment (in Figure 2) produces the requisite per-institution figures for applications, offers, and matriculations by student segment and gender-ethnic group.
Figure 2: Average Applications, Offers, and Matriculations per School
(based on institutional data)

Figure 2 also shows the applications ratios and yield rates computed from the aforementioned averages. They play no role in the subsequent calculations but they do provide a useful reference point.
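The projection described in this subsection is a single multiplication per cell. The sketch below assumes the column fractions are held in a dictionary keyed by (row, institutional segment); that layout and the function name are illustrative only.

```python
def project_to_average_institution(fractions, segment_average):
    """Scale column fractions by the institution-based average for each segment.

    fractions:       {(row, inst_seg): share of that segment's column total}
    segment_average: {inst_seg: average count per school, as in Figure 2}
    """
    return {
        (row, seg): frac * segment_average[seg]
        for (row, seg), frac in fractions.items()
    }
```

The same function serves for applications, offers, and matriculations; only the fractions and the averages passed in change.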
5.3 Illustration of Results
Figure 3 presents the results for student applications. Similar tables have been prepared for admissions offers and matriculations. These tables will be stored in the CyberCampus database.
Figure 3: Distributions of Applications Across Student Categories
According to these results, Segment-1 institutions (the so-called "super-medallions") get an average of 119 applications from white male blue-chip students, 2519 from scholars, and none from extracurriculars. Some 350 applications come from white female blue-chips, 32 from minority male blue-chips, and none from minority female blue-chips.
The zero for white male extracurriculars illustrates the small-sample problem mentioned earlier. Despite the large size of the NELS dataset (8,018 respondents, of which more than 6,000 met our data acceptance criteria), some cells in the table are sparsely populated and thus subject to large sampling errors. For example, even one person reporting an application to a medallion institution would have changed the projection by a material amount. Figure 4 provides the information on sample sizes for the applications results. The data show, for example, that the difference between the blue chip and extracurricular super-medallion cells is only three respondents. It may prove desirable to reset the segmentation criteria to more evenly balance the segment memberships and/or smooth the results judgmentally before entering them in the CyberCampus database.
Figure 4: Sample Sizes for the Applications Results

Figure 5 uses the NELS data to address a different question: "How do members of a given student segment and gender-ethnic group distribute their applications among schools in the seven institutional segments?" Here the division is row by row rather than in terms of column totals. There is no reason to supplement the NELS data because the results need not be projected to the universe of institutions.
Figure 5: Distributions of Applications Across Institutional Segments

According to the table, the NELS white male blue chip students submitted 1.1% of their applications to super medallion schools (instSeg 1), 4.6% to medallion schools (instSeg 2), 25.8% to name brand schools (instSeg 3), and so on. Scholars, on the other hand, were much more likely to favor the super-medallions and medallions. We doubt that the differences between the blue chips and scholars are as great as indicated, and once again suspect sampling errors due to sparse data. On the other hand, the table does illuminate the broad patterns of student application behavior. In particular, the mass of applications shifts rightward as one moves down the segmentation hierarchy.
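The row-by-row division behind Figure 5 can be sketched as follows, again assuming a hypothetical dictionary of weighted sums keyed by (row, institutional segment).

```python
from collections import defaultdict


def row_shares(cells):
    """Divide each cell by its row total, so every row sums to one."""
    totals = defaultdict(float)
    for (row, _seg), w in cells.items():
        totals[row] += w
    return {(row, seg): w / totals[row] for (row, seg), w in cells.items()}
```

Multiplying the shares by 100 gives percentages of the kind quoted above.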
6. Application Ratios and Yield Rates
Section 7 of Technical Document 3.1 describes how CyberCampus will use the results of Section 5.2 to determine the applications ratios and yield rates for the player-generated institution. These calculations are not strictly relevant to the present discussion. However, it will be useful to illustrate the result in juxtaposition with Figure 3.
Calculation of the PGI’s application figures and yield rates proceeds in four stages, which are illustrated in Figure 6. First we compute a weighted average of the applications, offers, and matriculations for the seven institutional segments as shown in the first three columns. Next we compute the PGI’s overall applications ratio, shown at the bottom of the fourth column, based on the totals for applications and matriculations. The third step computes the percentage distribution of applications (also in the fourth column) by dividing each cell in the first column by the column total. The final step computes the yield rates (column 5) by dividing each row’s matriculations by the associated figure for offers.
Figure 6: Illustration of Application Ratios and Yield Rates for the PGI
(based on an assumed set of player specifications for the PGI)

The CyberCampus player determines targets for undergraduate matriculations and admissions through the initial conditions or by his or her decisions as the game progresses. The target matriculations number gets multiplied by the overall applications ratio to determine total applications, which are then distributed across student segments and gender-ethnic groups using the calculated percentages. The game's admissions algorithm produces figures for each student segment and gender-ethnic group. Applying the computed yield rates to these results produces the requisite figures for matriculations. Changes in the PGI initial specification and its evolution over time will change the weights, which will produce interesting variations in the number and distribution of applications, offers, and yields.
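The four stages, and the way the game then uses the results, can be sketched as below. This is an illustration under stated assumptions, not the CyberCampus implementation: the data layout and function names are hypothetical, and the applications ratio is taken to be total applications divided by total matriculations, which is how the text goes on to use it.

```python
def weighted_average(per_segment, weights):
    """Stage 1: blend each institutional segment's (apps, offers, matrics)
    by the PGI's segment weights. per_segment[seg][category] is a 3-tuple."""
    total_w = sum(weights.values())
    blended = {}
    for seg, rows in per_segment.items():
        w = weights.get(seg, 0.0) / total_w
        for cat, (apps, offers, matrics) in rows.items():
            a, o, m = blended.get(cat, (0.0, 0.0, 0.0))
            blended[cat] = (a + w * apps, o + w * offers, m + w * matrics)
    return blended


def pgi_rates(blended):
    """Stages 2-4: overall applications ratio, application shares, yield rates."""
    total_apps = sum(a for a, _, _ in blended.values())
    total_matrics = sum(m for _, _, m in blended.values())
    app_ratio = total_apps / total_matrics                              # stage 2
    app_share = {c: a / total_apps for c, (a, _, _) in blended.items()}  # stage 3
    yield_rate = {c: (m / o if o else 0.0)                               # stage 4
                  for c, (_, o, m) in blended.items()}
    return app_ratio, app_share, yield_rate


def applications_by_category(target_matrics, app_ratio, app_share):
    """Target matriculations -> total applications -> applications per category."""
    total_apps = target_matrics * app_ratio
    return {c: total_apps * share for c, share in app_share.items()}
```

Matriculations per category then follow by multiplying the offers produced by the admissions algorithm by the corresponding entries of `yield_rate`.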
7. Family Income Statistics
The undergraduate financial aid algorithm, described in Technical Document 2.1 on Enrollment Management, requires the mean and standard deviation of family income by student segment and gender-ethnic group. These data have been derived from the NELS database, and the results are illustrated in Figure 7.
Figure 7: Average Income and Standard Deviation of Income

Two transformations were applied to the raw data before Figure 7 was calculated. First, the NELS income codes were transformed to dollar amounts, in thousands, by using the mid-points of cells. (The open-ended upper cell, "above $150,000," was transformed to 250.) Then we took the square root of the decoded figures in order to compensate for skewness in the income distribution. Hence the numbers should be interpreted as the mean and standard deviation of the square root of income expressed in thousands of dollars.
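The two transformations can be sketched as follows. The code-to-midpoint table is a hypothetical placeholder, since the actual NELS income brackets are not reproduced here; only the value of 250 for the open-ended top cell comes from the text. The sketch also assumes unweighted statistics and a sample standard deviation, neither of which is specified above.

```python
import math
import statistics

# Hypothetical income codes -> bracket midpoints in thousands of dollars.
# Only the 250 assigned to the open-ended top cell is taken from the text.
MIDPOINT_THOUSANDS = {1: 5.0, 2: 15.0, 3: 27.5, 4: 62.5, 5: 125.0, 6: 250.0}


def sqrt_income_stats(income_codes):
    """Mean and standard deviation of sqrt(income in $000) for one category."""
    roots = [math.sqrt(MIDPOINT_THOUSANDS[c]) for c in income_codes]
    return statistics.mean(roots), statistics.stdev(roots)


# Usage: one student category with three respondents.
mean_root, sd_root = sqrt_income_stats([2, 4, 6])
```

Note that squaring the mean of the square roots does not recover mean income, so the figures should be used on the square-root scale.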
We may wish to smooth some of the figures, although rebalancing the student segments may stabilize them somewhat. (Income data from independent sources to help with the smoothing would be welcome.) Stabilizing the data also may allow us to break out the averages by institutional segment, based on student applications, rather than using the overall figure for each student category. (The standard deviations probably would remain pooled.) This would add another interesting element to the game’s representation of the market for students.3
Endnotes:
- The ratings are calculated and stored, and the segmentation decision rule is applied, in '.NELS-stuSegs.Setup'. The files 'NELS-Academics', 'NELS-Extracurricular Activities', and 'NELS-Athletics' must be open when working with the 'Setup' file. Then PASTE-SPECIAL-values the segment assignments column (SCORES: H) to the same column in '.NELS-stuSegs'.
- Memory limitations led us to calculate the regressions based on only the first 6013 of the 8018 records in the dataset. This shouldn't change the coefficients materially but it would be good to correct the problem if the system is recomputed.
- The results were calculated in '.NELS-stuSegs'. File '.NELS-Applied_Admit_Fin Aid' must be open to run '.NELS-stuSegs'. Follow the "run-macro" instructions above the tables.