Does Cooperative Learning Improve Student Learning Outcomes?

What is the effect of small-group learning on student learning outcomes in economic instruction? In spring 2002 and fall 2004, the author applied cooperative learning to one section of intermediate macroeconomics and taught another section using a traditional lecture format. He identified and then tracked measures of student learning outcomes. Using multivariate regression analysis, he found that students taught by cooperative learning achieved greater academic performance in the form of higher exam scores.

Of particular relevance, Brawner et al. (2002) reported that 60 percent of engineering faculty surveyed used assigned group learning at some point in their classes.
My objective in this article is to understand better the effect of small-group learning on student learning results in economic instruction. I applied cooperative learning to a course in intermediate macroeconomics in spring 2002 and again in fall 2004. In each of those semesters, I also taught another section of the course, using a traditional lecture approach. I tracked five types of student learning outcomes: interest, preparation, participation, attendance, and academic performance. Using multivariate regression analysis, I then tested the effect of cooperative learning on learning outcomes.
Researchers in undergraduate science, mathematics, engineering, and technology (SMET) instruction have found that group learning can improve learning performances. For example, Felder (1995) and Felder, Felder, and Dietz (1998) taught five consecutive chemical engineering classes to a cohort of students using cooperative and other active learning techniques. Felder reported that the experimental cohort achieved higher academic performance (retention, grades) and interest levels compared to an instructor-taught cohort. Springer, Stanne, and Donavan (1999) conducted a meta-analysis of 39 studies and found that "various forms of small-group learning are effective in promoting greater academic achievement, more favorable attitudes toward learning, and increased persistence through SMET courses and programs" (p. 21).
In economic instruction, the evidence on the effectiveness of group learning has been promising but far from conclusive. Johnston (1997) found that introductory microeconomics students tutored in a group setting performed better on examinations. Similarly, Moore (1998) showed that students who participated in cooperative learning labs outside of the classroom reported that the labs were worthwhile and enjoyable. In more recent articles, Johnston et al. (2000) and Brooks and Khandker (2002) incorporated a cooperative-learning approach in weekly recitations (labs). Johnston et al. found that students in cooperative-learning recitations spent more time preparing for the tutorials and were more interested but did not perform any better on the examinations. However, Brooks and Khandker found that students in small cooperative-learning labs scored higher on the final exam. In a survey of 34 liberal arts colleges, Jensen and Owen (2001) reported that less lecture and more group learning in the classroom encouraged students to take more economics courses and become economics majors.
My application of group learning differed from earlier studies of economics instruction in that I paid more careful attention to group formation and group dynamics. First, I established cooperative base groups of three to four students. Base groups stayed together during the entire course and thus could provide each student the support, encouragement, and assistance needed to progress academically (Johnson, Johnson, and Smith 1991). Second, I had the students work in their base groups both inside and outside the classroom. Inside the classroom, students worked together on problem-solving exercises, whereas outside the classroom, students solved problem sets. Third, I established the groups, facilitated the cooperative-learning exercises, and evaluated the outcomes. In most of the previous economic studies, the group-learning component was incorporated in the recitation periods by the teaching assistants. I believe that these three modifications increased the credibility and effectiveness of cooperative learning and thus had the potential to yield better learning outcomes.
To test the effect of cooperative learning, I estimated an empirical model where each learning outcome depended upon teaching pedagogy, demographic factors, economic knowledge, and other academic factors. I found that the use of cooperative learning increased academic performance. The point estimates, which were statistically significant, implied that group learning raised combined (midterm plus final) exam scores by five to six points, ceteris paribus. As a percentage of the mean score on the exams, the point estimates translated into a 3 to 4 percent improvement in exam performance. As a consequence, the results of my study suggested that small-group learning could raise academic achievement in economics instruction similar to that found in other disciplines.

COOPERATIVE LEARNING
Cooperative learning is a teaching method where students work in small groups to help one another learn academic material. In the groups, students are expected to help each other find answers to questions, rather than seeking answers from the instructor. Cooperative work rarely replaces teacher instruction but rather replaces individual lecture and drill. If implemented properly, students in cooperative groups work with each other to make sure that everyone in the group understands the concepts being taught. Ultimately, the success of the group depends on its ability to make certain that everyone grasps the key ideas (Slavin 1995). Johnson, Johnson, and Smith (1998) listed five elements essential for successful cooperative learning groups. First, there must be positive interdependence in that members of the group understand that they should learn together to accomplish their goal. Second, there must be promotive interaction in that students interact face-to-face in the group. Third, there must be individual and group accountability in that members are held responsible for their own contribution to the group's success. Fourth, there must be group processing in that members reflect on their collaborative efforts and decide on ways to improve effectiveness. Fifth, there must be the development of small-group interpersonal skills such as giving constructive feedback, involving each member, and reaching a consensus.
Some educational researchers such as Bruffee (1995) have insisted there is a clear and important distinction between cooperative learning and collaborative learning. To these educators, cooperative learning supports the traditional role of teacher as a subject matter expert and classroom authority whereas collaborative learning has the instructor work directly with students to discover and create knowledge. In cooperative learning, the instructor sets the task and has the students work in groups to find the "correct" answer. In collaborative learning, however, knowledge is not set by the instructor but rather is acquired through consensus among students and the teacher (Barkley, Cross, and Major 2005).

Cooperative Learning Application
In my experiment, I applied cooperative learning to a course in intermediate macroeconomics. I used the group-learning method of Johnson and Johnson (1987), which requires the instructor to: (1) make a number of preinstructional decisions, (2) explain the task and the positive interdependence, (3) monitor students' learning and intervene to provide task assistance, and (4) assess students' learning (Johnson and Johnson 1999). I set aside nine class sessions for group problem-solving exercises. The task and reward system for the groups was established at the beginning of the course. The students knew that they were to produce a group answer to each question and that each answer would then be presented to the rest of the class. I monitored the task by circulating throughout the classroom and answering questions of clarification.
I established base groups of three to four students in classes that ranged from 22 to 35 students. On the first day of class, I administered a 10-question test to gauge mathematical skills. In the third class, I gave the preexperiment questionnaire and asked for demographic and scholastic information. I asked students to e-mail me their individual preferences for group members. Using the math test results, demographic information, and e-mail responses, I established groups in the fifth class. I attempted to form groups that were heterogeneous in aptitude and demographics but also fulfilled some of the students' preferences.
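As an illustration only, one simple way to build groups that are heterogeneous in aptitude is a serpentine ("snake") draft over the math test scores. The sketch below is an assumption, not the procedure used in the study: the function name `form_groups`, the student labels, and the scores are hypothetical, and it ignores the demographic information and student preferences that were also weighed when the actual groups were formed.

```python
import math

def form_groups(scores, max_size=4):
    """Deal students into groups in serpentine order of their test score.

    scores: dict mapping a student label to a math-test score (hypothetical data).
    max_size: largest allowed group size.
    """
    n_groups = math.ceil(len(scores) / max_size)
    # Rank students from highest to lowest score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    groups = [[] for _ in range(n_groups)]
    for i, student in enumerate(ranked):
        round_no, pos = divmod(i, n_groups)
        # Reverse direction on every other pass so strong and weak students mix.
        idx = pos if round_no % 2 == 0 else n_groups - 1 - pos
        groups[idx].append(student)
    return groups

# Hypothetical scores out of 10 on the first-day math test.
example_scores = {"A": 10, "B": 9, "C": 8, "D": 7, "E": 6, "F": 5, "G": 4}
groups = form_groups(example_scores)
print(groups)
```

For a class of 22 with `max_size=4`, this produces six groups of three or four students. An instructor would still adjust the result by hand for demographics and stated preferences.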
The students met in their groups both inside and outside the classroom. In class, the groups applied economic theory to a new situation. The new economic situation was presented in the form of additional readings and handouts made available beforehand. Each handout contained a series of questions and was posted on the class homepage. The groups went over the questions in class. Initially, some students were somewhat hesitant to interact with fellow group members. In certain groups, students worked individually on the answers. However, as the semester progressed, students became more comfortable in the group setting and began to truly collaborate in their groups. At the end of the class period, a representative from each group presented the group answer to the rest of the class. An eight-sided die was used to determine which group presented which answer.
Outside the classroom, the groups worked on the problem sets. For each problem set, one group member was designated as the leader. The leader was responsible for coordinating the group and making sure everyone understood the answers before they were handed in. The role of leader rotated through the group members. There is, of course, a huge potential for free riding on the problem sets. I had the students report on their fellow group members at the end of the semester but was aware of the limitations of this type of self-reporting.

Experimental Design
I conducted the experiment in spring 2002 and fall 2004. In each semester, students could register for one of four sections of intermediate macroeconomics: Monday and Wednesday (MW), early afternoon or late afternoon; or Tuesday and Thursday (TTh), morning or early afternoon. I taught one MW section (control) in a traditional lecture style and the other MW section (experimental) using cooperative learning. Because the early afternoon sections were more popular, the times of the control and experimental sections were switched in 2004 to keep the sample balanced. The two TTh sections were taught by other instructors who chose not to participate in the experiment. The students were only aware of the time and not the control/experiment grouping when registering.
I used the same course organization and content in the control and experimental sections. The course was divided into five main parts: introduction (with national income and product accounts), long-run static model, long-run growth model, short-run IS-LM model (with business cycle theory), and macroeconomic policy. Mankiw's (2003) Macroeconomics was the required text in both sections. In addition, I assigned identical additional readings, handouts, and problem sets, and proctored nearly identical exams. The final grade for the course was calculated as 10 percent for attendance, 20 percent for problem sets, 30 percent for the midterm exam, and 40 percent for the final exam. This breakdown followed the recommendation of Cross and Steadman (1996) that grades from peer group work make up a small part of the overall course grade.
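The grade breakdown above amounts to a weighted average. A minimal sketch follows, assuming each component is scored on a 0-100 scale (the text does not state the scale); the names `WEIGHTS` and `final_grade` and the example scores are hypothetical.

```python
# Component weights in percent, as stated in the text.
WEIGHTS = {"attendance": 10, "problem_sets": 20, "midterm": 30, "final": 40}

def final_grade(scores):
    """Weighted course grade; each score is assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[part] * scores[part] for part in WEIGHTS) / 100

# Hypothetical student record.
example = {"attendance": 100, "problem_sets": 90, "midterm": 80, "final": 70}
grade = final_grade(example)
print(grade)
```

Only the problem sets (20 percent) reflect group work, so at least 80 percent of the grade rests on individual effort.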
The purpose of the additional readings and handouts was to provide students with a relevant economic situation to apply and extend the theoretical models developed in class. There were nine problem-solving exercises in the course: three for the long-run static model, two for the long-run growth model, two for the short-run IS-LM model, and two for macroeconomic policy. In the problem-solving exercises, the students first answered objective-type questions and then moved on to application-type questions.
The teaching method practiced in each section differed. In both sections, I developed the theoretical model on the blackboard. The development of the model was then followed by a problem-solving exercise in the form of an additional reading and a handout of questions. In the traditional lecture section, I led the problem-solving exercise. I spent the first 15 minutes introducing the topic and then asked for volunteers from the entire class to answer the questions on the handout. In the experimental section, the students discussed the reading and handout in their groups. I circulated throughout the classroom to encourage interaction and answer questions of clarification, but the students themselves answered the questions in the handout. During the final 15 minutes of the class, a representative from each group presented the answer to the rest of the class.

Limitations of the Study
A few weaknesses in the study design limit the generality of the results. First, there is a real possibility that unmeasured biases can occur when one instructor teaches both sections of the same course. For example, differences in instructor excitement across the two groups could have contributed to student outcomes. Moreover, the uniqueness and possible awkwardness of being in an experiment could have affected student outcomes. In their meta-analysis of SMET instruction, Springer, Stanne, and Donavan (1999) found that studies with the investigator as the instructor reported significantly greater effects for small-group learning. Second, the application of cooperative learning to one course limits its generality. The students in the sample were typically sophomores and juniors with three prior economic classes, on average. One wonders if the impact of cooperative learning would be the same on a principles class with mostly freshmen or an upper-level elective with mostly seniors. Third, cooperative learning may not work for all instructors. For any teaching method to prove effective, the instructor must be convinced of its merits. These limitations can be addressed if the experiment is repeated by a number of instructors; the learning results could then be compared across different teaching environments and instructors.

Student Learning Outcomes
I tracked five different types of student learning results: interest, participation, preparation, attendance, and performance. The first three results were collected from a preexperiment and postexperiment questionnaire. I measured preparation as the willingness to talk to other students, the willingness to talk to the instructor outside the classroom, and the amount of time spent reading and studying the material outside the classroom. Participation was measured by the preference for working together, the regularity of the students' discussion of their work with other students, and the number of times they spoke in the last three classes. Interest was measured by whether the student enjoyed the intellectual challenge of economics, was interested in economics, enjoyed economic theory, and applied economics to real-life situations.
I collected the outcomes for attendance and academic performance from daily attendance records and graded course material. I measured class attendance as the percentage of total classes attended. The midterm exam covered the first three sections of the course (introduction, long-run static model, and long-run growth model), and the final exam covered the last two sections of the course (short-run IS-LM model and macroeconomic policy). I measured academic performance as the sum of the raw scores on the midterm exam and the final exam to minimize the potential measurement error of each exam. Because of the potential for free riding, I excluded the problem set grade as a learning outcome.

Sample
A total of 116 students were enrolled in the four sections of intermediate macroeconomics that I taught. In 2002, 35 students were in the early afternoon control section and 22 in the late afternoon experimental section. In 2004, 35 students were in the early afternoon experimental section and 24 in the late afternoon control section. In each semester, one student from the control section dropped the course after taking the midterm exam. In addition, one student from the experimental section did not take the midterm exam in 2004. For student interest, participation, and preparation measures, 104 students completed the preexperiment questionnaire, 103 completed the postexperiment questionnaire, and 93 completed both.

Student Interest, Preparation, and Participation
The sample means of the student interest, preparation, and participation outcomes are presented in Table 1, with sets of columns for preexperiment, postexperiment, and the difference between the post- and preexperiment values. Each set of columns contains the values for the control group, the treatment group, and the difference between the treatment and control groups.
To test for a significant difference in means across the control and treatment groups, I estimated the following regression equation:
$$\text{outcome}_i = \beta_0 + \beta_1\,(\text{experimental section})_i + \varepsilon_i \qquad (1)$$
where experimental section is a dummy variable recording cooperative learning, β's are the coefficients to be estimated, and ε is the error term. The point estimate (and t statistic) of β1 measures the difference in mean values, which is reported in each Diff column. The last column reports the "difference in differences" result where the population average difference over time in the control group is subtracted from the population average difference over time in the treatment group (Meyer 1995).
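The link between the dummy-variable regression and a difference in means can be checked numerically. The sketch below uses made-up questionnaire responses (not the study's data) and hypothetical variable names. It shows that the ordinary least squares slope on a 0/1 experimental-section dummy equals the treatment mean minus the control mean, and that the difference in differences is the same slope computed on the post-minus-pre changes.

```python
import numpy as np

# Hypothetical 1-5 questionnaire responses; illustrative only.
pre_control = np.array([3.0, 4.0, 3.0, 2.0])
post_control = np.array([3.0, 4.0, 4.0, 3.0])
pre_treat = np.array([4.0, 4.0, 3.0, 5.0])
post_treat = np.array([5.0, 4.0, 4.0, 5.0])

def dummy_regression(control, treated):
    """Regress the outcome on a constant and a 0/1 treatment dummy."""
    y = np.concatenate([control, treated])
    d = np.concatenate([np.zeros(len(control)), np.ones(len(treated))])
    X = np.column_stack([np.ones_like(d), d])
    # Returns [intercept, slope]: control mean and treatment-minus-control gap.
    return np.linalg.lstsq(X, y, rcond=None)[0]

pre_diff = dummy_regression(pre_control, pre_treat)[1]
post_diff = dummy_regression(post_control, post_treat)[1]
# Difference in differences: the same regression on each student's change.
did = dummy_regression(post_control - pre_control, post_treat - pre_treat)[1]
print(pre_diff, post_diff, did)
```

In this toy data the treatment group scores one point higher both before and after, so the difference in differences is zero: a gap in levels does not by itself indicate an effect of the treatment.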
There are some significant differences between the treatment and the control groups in the pre- and postexperimental values but only one significant difference in differences between the treatment and control groups (Table 1). In the preexperiment set of columns, the treatment group was more interested in economics (items 1 and 4) and more inclined to study in groups (items 11 and 12). Similarly, in the postexperiment set of columns, the treatment group was more interested in economics (items 1-3) and better prepared (item 10). However, in the difference set of columns, there was only one significant difference in differences (item 14). As a result, the use of cooperative learning in my experiment appears to have had little to no impact on student interest, participation, and preparation outcomes.
I also estimated a proportional-odds ordered logit model (McKelvey and Zavoina 1975), using the difference between the pre- and postexperimental values as the dependent variable. The dependent variable ranged from −4 (strongly agree to strongly disagree) to 4 (strongly disagree to strongly agree). Along with the dummy for the experimental section, I included data for class time, demographic factors, current academic factors, and economic knowledge as independent variables. For the 14 outcomes, I found that the experimental dummy variable was significant only once (item 2). Consequently, the ordered logit results confirmed the difference in differences results of Table 1 that cooperative learning had a negligible effect on student interest, participation, and preparation outcomes.
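For reference, the textbook form of a proportional-odds ordered logit for this setting can be written as follows. The notation is an assumption introduced here for illustration: $\Delta y_i$ is the change in a student's response, $x_i$ collects the explanatory variables (including the experimental dummy), and $\kappa_j$ are the cutpoints. The exact specification used in the study may differ.

```latex
\Pr(\Delta y_i \le j \mid x_i)
  = \frac{\exp(\kappa_j - x_i'\beta)}{1 + \exp(\kappa_j - x_i'\beta)},
  \qquad j = -4, -3, \dots, 3,
  \qquad \kappa_{-4} < \kappa_{-3} < \dots < \kappa_{3}.
```

With nine possible values of the dependent variable there are eight cutpoints, and a single coefficient vector $\beta$ shifts all cumulative log-odds equally, which is the proportional-odds restriction.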

Attendance and Academic Performance
Sample statistics of the students in the experiment are given in Table 2. Recall that students were only aware of the time and not the control/experiment grouping when registering. There are three sets of columns for the control group, the treatment group, and the difference between the two groups. The results for the difference column are from estimating equation (1) for each variable. The attendance and academic performance results for the two groups are quite close in mean values. In fact, the control and treatment groups are similar in most characteristics, but there are a few exceptions. First, there were more upperclassmen in the treatment group. Second, there were more African-American, Latino, and international students in the control group. Third, and potentially important, the students in the control group had higher values for the prior (or pretest) economic knowledge measures: cumulative GPA, economics grades, and principles of macroeconomics grade.
Notes (Table 2): Each entry is the sample mean and standard deviation (or t statistic) of the control, treatment, and difference between the two. The data are drawn from classroom and academic records. *p = .10. **p = .05. ***p = .01.
The lack of a statistically significant difference in mean attendance and test scores between students in the control and treatment groups does not necessarily imply that cooperative learning had no effect on attendance and academic performance. Consider the difference in the prior economic knowledge measures. The lower value in prior economic knowledge in the treatment group is likely to be correlated with reduced attendance and test scores. If cooperative learning had a positive effect on the treatment group, then that positive effect could have been offset by the negative effect of the lower prior economic knowledge, resulting in an indistinguishable difference in the mean outcomes of the control and treatment groups.
To control for this possibility, I estimated the following multivariate regression equation:
$$\text{outcome}_i = \beta_0 + \beta_1\,(\text{experimental section})_i + \beta_2' X_i + \varepsilon_i \qquad (2)$$
The independent variables X measure classroom factors, demographic information, current academic factors, economic knowledge, and prior levels. I used two dummy variables for students in 2004 and in an early afternoon, smaller size class to capture classroom factors. For demographic information, I included dummy variables for students who were African-American, Asian, Latino, international, upperclassman, and transfer. I included the number of concurrent classes and total attendance for current academic factors. For prior economic knowledge, I used cumulative grade point average (GPA), the number of past economic courses, grades of past economic courses, and the grade for principles of macroeconomics. To measure the prior (or pretest) levels of attendance and knowledge, I also included the preexperiment values for all preparation questions in the class attendance regressions and the cumulative GPA in the academic performance regressions.
For the OLS estimates for attendance and academic performance (Table 3), I considered three different specifications where prior economic knowledge was measured as: (1) cumulative GPA, (2) number and grades of past economic courses, and (3) grade for principles of macroeconomics. The R-squared values indicate that my model explained around 50 percent of the variation in class attendance and 70 percent of the variation in exam scores. For classroom factors, students in 2004 attended class more often. For demographic factors, Latino students did better on the exams, and African-American and transfer students did worse. For current academic factors, the number of concurrent classes was insignificant whereas attendance was positively linked to exam performance. All measures of prior economic knowledge were positively linked to attendance and exam performance. For attendance, three of the preexperiment questionnaire coefficients were positive and two were negative, suggesting that work outside of class served more as a complement to than a substitute for attending class.
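The offsetting argument above can be illustrated with a small constructed example. The numbers are invented for illustration and are not the study's data: exam scores are generated without noise as 50 + 10 × GPA + 5 × treatment, and the treatment group's GPA is set half a point lower. The raw difference in mean scores is then exactly zero, while a regression that controls for GPA recovers the five-point effect.

```python
import numpy as np

# Hypothetical data: three control and three treated students.
treat = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
gpa = np.array([3.0, 3.5, 4.0, 2.5, 3.0, 3.5])  # treated group has lower prior GPA
score = 50.0 + 10.0 * gpa + 5.0 * treat         # assumed true effect of 5 points

# Raw comparison of means hides the effect.
raw_diff = score[treat == 1].mean() - score[treat == 0].mean()

# Multivariate OLS: constant, treatment dummy, and prior GPA as a control.
X = np.column_stack([np.ones_like(treat), treat, gpa])
beta = np.linalg.lstsq(X, score, rcond=None)[0]

print(raw_diff)  # difference in means
print(beta)      # [intercept, treatment effect, GPA slope]
```

Because the example has no noise, the fit is exact. With real data the treatment coefficient would carry a standard error, and its significance would need to be tested as in Table 3.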
Turning to the variable of interest, I found that cooperative learning was positively related to exam scores. For attendance, the coefficient for the experimental section was insignificant under all three specifications. For academic performance, however, the coefficient for the experimental section was positive and significant in all three instances. The point estimates implied that the use of cooperative learning was associated with an increase in combined exam scores of 4.4 to 5.5 points. As a percentage of the mean score on the exams, the point estimates translated into a 5.5 to 7.0 percent improvement in exam performance.
Next, I checked the sensitivity of the results to potentially influential outliers. I ran the second specification and excluded either (1) African-American, (2) Asian, (3) Latino, (4) international, (5) freshman, or (6) transfer students. The results for the experimental dummy are provided in Table 4. In each instance but one, the coefficient for cooperative learning was positive and significant at the .10 Type I error level. The point estimates ranged from a low of 4.4 (without Latino students) to a high of 6.7 (without Asian students). Although the results suggest that the impact of cooperative learning may differ across demographic and ethnic lines, the positive coefficients argue that cooperative learning is associated with an increase in exam scores.
There is potential endogeneity of attendance in the academic performance regressions. Romer (1993) recognized that student attendance is not exogenous because students "choose whether to attend class" (p. 170). To control for the endogeneity of attendance, Schmidt (1983) used a latent variable approach where time and ability are treated as unobservable variables in the academic performance equation. Devadoss and Foltz (1996), on the other hand, posited a recursive system and then used a seemingly unrelated regression estimator.
I tested for the effects of cooperative learning on exam scores with attendance treated as endogenous in Table 5. I used a two-stage least squares (2SLS) estimator where attendance was instrumented with the answers to the preexperiment questions #8-14 and the other explanatory variables (included in the second stage). I considered the three earlier specifications in columns (1) to (3) and a preferred specification in column (4). I obtained the preferred specification by using the automated Hendry/LSE general-to-specific model selection criteria in PcGets (see Hendry and Krolzig 2001). The relatively high partial R squared of excluded instruments indicated that the instruments were relevant (i.e., highly correlated with attendance). Moreover, the results for the Hansen-Sargan test of overidentifying restrictions showed that the instruments were valid (i.e., uncorrelated with the error term and correctly specified).
Results shown in Table 5 confirm the results found earlier. Latino and lower-level students scored higher on the exams whereas African-American and transfer students scored lower. Attendance and prior economic knowledge were still positively correlated with exam scores. Most important, though, the coefficient for the experimental section was positive and significant. The point estimates again suggested that, all else equal, students in the experimental group scored five to six points higher on the combined exams than those in the control group.
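The mechanics of the two-stage estimator can be sketched on simulated data. Everything in the block is an assumption made for illustration: the variable names, the single instrument `pre_survey` (standing in for the preexperiment questionnaire answers), the unobserved `ability` term, and the assumed true coefficients (5 for the experimental dummy, 2 for attendance). It is not the study's specification or data.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X b + e."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, endog, exog, instruments):
    """Manual 2SLS; returns coefficients ordered as [exog columns..., endog]."""
    # Stage 1: project the endogenous regressor on instruments and exogenous variables.
    Z = np.column_stack([exog, instruments])
    endog_hat = Z @ ols(Z, endog)
    # Stage 2: replace the endogenous regressor with its fitted values.
    return ols(np.column_stack([exog, endog_hat]), y)

rng = np.random.default_rng(0)
n = 5000
treat = rng.integers(0, 2, n).astype(float)  # experimental-section dummy
ability = rng.normal(size=n)                 # unobserved; drives attendance and scores
pre_survey = rng.normal(size=n)              # instrument: affects attendance only
attendance = pre_survey + ability + rng.normal(size=n)
score = 5.0 * treat + 2.0 * attendance + 3.0 * ability + rng.normal(size=n)

exog = np.column_stack([np.ones(n), treat])
b_ols = ols(np.column_stack([exog, attendance]), score)
b_2sls = two_stage_least_squares(score, attendance, exog, pre_survey)
print("OLS  [const, treat, attendance]:", b_ols)
print("2SLS [const, treat, attendance]:", b_2sls)
```

In this simulation OLS overstates the attendance coefficient because unobserved ability raises both attendance and scores, while the 2SLS estimate should land near the assumed value of 2. The naive second-stage standard errors from this manual approach are not valid, so applied work normally relies on a dedicated instrumental-variables routine.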

Discussion
Why did students in the cooperative learning group perform better on the exams than students in the lecture group? I discuss three possible reasons for this result. First, cooperative learning raised student-instructor interaction. Students seemed less inhibited about asking questions in the small groups. I observed that students in the cooperative learning class (when working on assignments in groups) asked questions more often than students in the lecture class, even though I frequently asked for questions during lecture. Furthermore, students in the cooperative learning group often came to my office as a group whereas students in the control group came individually. As a result, more students from the cooperative learning sections sought help from me outside of class.
Second, cooperative learning increased group studying for the exams. Students in the cooperative learning sections were more likely to develop study groups. These study groups were parts of or the entire base groups used in the classroom. Furthermore, the rapport inside the study groups for the experimental group appeared better than that for the control group. As one student stated, "I liked the use of groups in class because it gave me students to study with for the exams." Third, the novelty of working in small groups sparked greater interest in the material. In end-of-course evaluations, several students expressed positive attitudes toward cooperative learning. One student said that "[cooperative learning] was a great idea because it allowed me to learn the material from both the instructor and the other students." Another student found it "very worthwhile and helpful . . . I learned the material much better by discussing it with my fellow group members." However, the three reasons are mostly speculative and, as a result, further study is needed to decipher why group work raised student achievement.

CONCLUSION
My objective was to investigate the impact of cooperative learning in economic instruction. I used a two-group experimental design where one section of intermediate macroeconomics was taught using cooperative learning and the other section was taught using a traditional lecture format. I tracked five types of student learning outcomes: interest, preparation, participation, attendance, and performance. Using multivariate regression analysis, I found that the experimental section scored four to six points higher on the combined exams when I controlled for classroom, demographic, and academic factors.