Evaluation and Research

Eleanor C Sayre

Abstract

Research and evaluation both generate new knowledge, and they can use overlapping methods and theories. However, the purposes of evaluation and research are different, and their outcomes are different as well. While not all education projects require research, all projects – education or otherwise – require evaluation. In this article, I outline some of the major similarities and differences between evaluation and research, and discuss some of the common elements of evaluation for education research projects.

Oh hey!

I’ve been writing this Handbook since 2014. It’s time to finish. In the next few months, you can expect:

formatting changes as I prepare to package the Handbook into book format.
content updates as I clean up chapters to make them more coherent

Do you want to suggest edits or new chapters for this Handbook? You have until June 1, 2025.

What is research? What is evaluation?

Research is a process for generating new general knowledge. When you do education research, you discover new things about how humans learn, grow, interact, and develop. You share your discoveries with the scientific community as well as with your participant populations so that your discoveries can become new knowledge and your new knowledge can help shape future research projects as well as policies, programs, or projects. Your research may be case-based or pattern-based; your discoveries can have small scales or large ones; and the mechanisms for your sharing can vary (but often include peer-reviewed papers).

One kind of research, Action Research, tends to engage in small projects centered on the researcher’s own classroom. It’s still research because it’s developing that classroom as a case of something more general.

In contrast, evaluation is a process for generating new local knowledge, tied tightly to a local context or activity, and for the purposes of figuring out how (and in what ways) the activity meets its goals and meets the needs of the participants. When you grade your students, you’re evaluating their work in light of the work you asked them to perform and how well they performed it. You’re not conducting research on student understanding; you’re giving specific feedback to individual students about their performance.

One kind of evaluation, sometimes called “Institutional Assessment” or “Institutional Research”, focuses on how programs within a college or university – or the university as a whole – are meeting their goals. Common goals for this kind of work include improving student graduation or enrollment rates, for example. Even though this work sometimes uses the word “research”, it’s focused on generating local knowledge and centers local goals.

In education projects, it can be hard to navigate the roles of research and evaluation. That’s because sometimes education research and evaluation might share data streams or analysis methods. Additionally, sometimes research questions in education share features with evaluation questions. Lastly, in the process of generating new general knowledge, research projects often generate a great deal of local knowledge which might be used to evaluate programs.

aspect	Research	Evaluation
Central goal	Generate new general knowledge	Generate new local knowledge
Impetus	Answering research questions	Evaluating how this program meets its goals
Shared among	a research community	stakeholders for the program

Table 1: Comparing research and evaluation

This kind of flexible overlap between research and evaluation activities can be difficult for emerging researchers to navigate. If you’re coming to education research from a strong background in classroom teaching, it can be frustrating to feel like your expertise in the classroom isn’t valued by education researchers. If your projects are strongly grounded in the classes you teach, it can be difficult to articulate the scope of the knowledge you’re generating.

Case study: Sanjay and Science Club

Sanjay is the faculty coordinator for Science Club, an after school activity for elementary school children ages 6-10. Children meet in Science Club, where undergraduate students are responsible for different activity stations, such as building with Lego or writing code or checking on plants in the garden. Sanjay wants more undergraduates to participate in Science Club more regularly, but he’s not sure how to make that happen given his limited budget.

Sanjay decides to investigate why undergraduates participate in Science Club so that he can develop new advertising materials on campus to attract more students and maybe get a little more money from the Dean. He plans to conduct focus group interviews with undergraduates at the end of Science Club meetings about why they are here. He will synthesize his findings about the value of Science Club in a report to the Dean, and also hang up flyers on campus.

We’ll spend more time with Sanjay’s project in this article, because he needs to make decisions about whether this project is evaluation or research, as well as how he’s going to structure his project to get the most out of it.

IRB, research, and evaluation

Some people think that the intent to publish is what distinguishes research and evaluation, but that’s not accurate. The actual distinction is not about whether you want to publish your work; it’s about what kind of knowledge you intend to generate.

If you are only generating local knowledge, then you are not doing research and you do not need IRB approval. If your project is about developing a curriculum, the curriculum development is not research, even if you intend to publish your curriculum later.

What about curriculum development?

Some people engage in curriculum development. Curriculum developers – some of whom are also classroom instructors or education researchers – design, develop, test, and refine materials to help people learn particular topics. While everyone writes materials for their own classes, usually “curriculum development” refers to developing materials to be used in other people’s classes as well. Because of this intent, a fully-fledged curriculum development project usually includes dissemination of the curriculum to other instructors.

Curriculum development is an intensely creative process informed by rich information about how people learn and the subject at hand. Curriculum development projects often include a strong evaluation component to see how people interact with existing materials, and they might include a research component about how people learn the subject at hand. However, curriculum development is not research. You don’t need to do research to do curriculum development, and you don’t need to do curriculum development to do research.

Case study: Rory and Students’ Old Work

Rory has been teaching upper-level quantum mechanics for several years and always scans written homework, quizzes, and exams for their records before grading them. They have decided that they would like to use this written work to explore what resources students bring to quantum mechanics in order to design activities to build on these resources. They hope to publish their findings in The Physics Teacher so other instructors can also benefit from their work.

Because their project is growing out of long engagement with their course and their teaching, Rory is struggling to figure out how to formalize their work and think about doing research in the future.

What are some things that Rory can do to clarify their goals and think more formally about this project?
What would help Rory decide how/if this project involves research?

What needs evaluation? Everything.

All work is better if:

you enter into it mindfully: articulate some goals for what you want to accomplish, and develop a timeline of activities and milestones.
you reflect on how it proceeds: check how/if you’re meeting your goals, adjust your goals in light of your actual activities, and update your timelines accordingly.
you debrief on your accomplishments: reflect on what you’ve learned in terms of the results and the processes of doing this project.

All of these activities – setting goals, developing milestones, reflecting on activities, and debriefing on a project – are evaluation activities.

Case study: Sanjay and Science Club

Sanjay thinks really hard about his project. He is hoping to generate new local knowledge about why students participate in this club so that he can develop new recruiting materials and argue for more resources. These are great goals for evaluation!

What projects need evaluation? All of them.

In a research project, you should evaluate your interim results to decide which avenues to pursue, whether you need more data (or different data), and how your research questions are changing. You should evaluate the human processes of doing research: are we (collectively) proceeding apace? are some people overwhelmed and others frustrated? how are we balancing the load of this research with our individual capacities for it?

This need for evaluation is independent of the kind of research you’re doing: bench science needs it, education research needs it, and the humanities need it.

Not all projects in education are research projects. If your project is about developing new curricula, improving a program to promote student retention, or evaluating whether your undergraduates are adequately prepared for graduate school, then you’re (probably) not doing research. You still need to evaluate whether (and how) your project is achieving its goals. You still need evaluation.

Case study: Sanjay and Science Club

Is Sanjay also doing research?

Sanjay would like to maybe publish in the future, but realistically he doesn’t think he will have the time to treat the information from the focus groups with enough rigor to identify generalizable trends for research purposes. Plus, he really just wants to make Science Club better, not figure out how Science Club is a case study for some bigger problem.

Sanjay decides that this project is not a research project.

What kinds of evaluation are possible?

There are lots of ways to evaluate whether (and how) your work is meeting its goals or how its goals are changing in response to new developments. Across all of these kinds of evaluation, there are some big points to keep in mind:

Evaluation needs to consider both the processes of doing research and the results. It’s tempting to only check whether you have achieved your milestones, but that’s only part of the story. How you make your milestones is just as important as whether.
Evaluation is driven by goals. It’s hard to know if you have been successful if you don’t know what success means.
All projects, even classic bench science projects, need evaluation. However, different fields conceptualize evaluation differently, and different projects will use different emphases on how or why to do it.

Evaluation is goal-driven

Evaluation is goal-driven, so you need to have goals.

If you aren’t sure yet what you’re trying to accomplish, learn, discover, or develop, you aren’t ready to think about evaluation yet. Spend some time with your research questions or design problem, think about your own professional goals related to this project, and engage in generative writing around possible goals for your project.

Let’s look at some of the major ways to conceptualize evaluation.

Formative and Summative evaluation

aspect	`Formative`	`Summative`
when	throughout the project	at the end of the project
intent	learn about processes and interim results	reflect on processes and results
so that	you can make good choices of how to proceed	you can assess success and generalize for next time
academic example	grading your students’ rough drafts	final grades in a course

Table 2: Formative and summative evaluation

Formative evaluation occurs throughout a project to help you reflect on what has happened and make decisions about how to proceed, while summative evaluation happens at the end of a project to help you assess how and in what ways you have met your goals.

Especially for formative evaluation, it’s important that you’re getting enough information to know why and how you are succeeding (or not succeeding). You need mechanistic or causal information because formative evaluation needs to inform how you proceed in the next iteration or phase of your project.

Case study: Sanjay and Science Club

Sanjay decides that he has two formative outcomes: using the focus groups, he’ll identify 3-5 benefits and attractors for undergraduates to participate in Science Club; and because of the flyers, more students will sign up for Science Club.

Sanjay identifies two summative outcomes: in addition to signing up for Science Club, more students will actually participate at least twice; and the Dean will increase the budget for Science Club.

I’ve presented formative and summative evaluation as if they are exclusive categories, but in practice the lines can be a bit blurry. For example, your external evaluation might include presenting your preliminary results at a conference and soliciting feedback from other researchers. On the one hand, this is formative: you’re using their feedback to make changes in the next stage of your project. On the other hand, this could be summative: perhaps the conference presentation marks the capstone of your student’s research experience, and other students will be taking up the project later. This kind of blurry line between summative and formative evaluation is pretty common, especially for large projects which span multiple people and multiple years.

Internal and external evaluation

aspect	`Internal`	`External`
who	you and members of your team	someone outside of your team
intent	reflect and record	get feedback
so that	you can be mindful of what’s happening	you can connect with stakeholders and community norms
academic example	post-lecture notes to yourself for next time	meeting with your chair about your mid-tenure review

Table 3: Internal and external evaluation

Internal evaluation is work that you (together with the members of your team) perform to assess the processes and results of your project, while external evaluation is work that someone outside of your team performs to give you feedback.

Case study: Sanjay and Science Club

Sanjay’s plan to identify strengths and attractors for participating in Science Club is internal: he’s collecting this information and analyzing it himself, so only members of his team are engaged in this.

Because the success of other goals hinges on this one, Sanjay decides to add an external evaluation component to this goal as well. He decides to check in with Science Club students: he will show them his aggregated interim results, and see if they resonate with the students.

For some goals, whether or not you achieve them is objectively obvious, so it doesn’t matter who performs the evaluation. For example, if your research project has a goal of presenting a poster at a conference at the end of year 1, and you present a poster: congratulations, you have achieved your goal. You don’t need someone external to your project to tell you if you have presented this poster. For other goals, however, you’ll get different information if they are internally or externally evaluated. Consider your poster: you thought that it accurately and enticingly showed off your project, but did the community respond well to it? Did you get good questions that might help you make good decisions about how your project should evolve? Was everyone confused by one aspect of it? This external feedback on your poster speaks to a bigger goal: you want your work to contribute to a scholarly conversation.

What does `external` mean?

How far outside your project team is “far enough” to count as external evaluation? Well, it depends on the scope and aims of your project, and what kind of evaluative feedback you’re looking for. Generally speaking, though, the larger your project is, the more you should engage with people outside, and the further outside you should look. For many projects, there are people at your institution who are outside the project and can act as external evaluators; for other projects, you’ll need to look outside your institution. A co-PI or thesis committee member is generally not external to the project.

As a rough rule of thumb, someone is outside the project if they aren’t working on it and wouldn’t directly benefit from its success.

Comparing kinds of evaluation

These two axes – formative and summative evaluation, and internal and external evaluation – are both useful and important for assessing the processes and successes of your projects. You can think of them as making a 2x2 grid, and mindfully select kinds of evaluation to fill in each box of the grid. In Table 4, I’ve put together some common (and minimal) evaluation choices for a research project.

type	`Formative`	`Summative`
`Internal`	Lab meetings about research progress	Project debrief
`External`	Quarterly meetings with advisory board	Publication of results in peer-reviewed journals

Table 4: A 2x2 grid of assessment types.

Formal and informal

A third major axis of evaluation is about how formal the evaluation is. Formal mechanisms for evaluation, such as peer review, tend to be easier to identify. They’re often linked to specific milestones or events: peer review of submitted papers hinges on writing papers and submitting them for peer review. They often have explicit criteria by which you can judge your progress: your tenure review, for example, is conducted following specific policies and procedures.

In contrast, informal evaluations may not follow explicit rules, and they may be more relationship-driven. Going out for coffee with a mentor might yield great informal evaluation of your progress with guidance for the future. Asking a presenter questions after their talk is another informal route, as is checking in with your research students about their progress.

Case study: Sanjay and Science Club

Sanjay’s explicit plans for evaluation are all quite formal. As he thinks about this project, he realizes that he also spends some time each week chatting with Science Club students about their experiences. This informal feedback is really valuable for him to reflect on how the advertising campaign is going.

Ahead of time, it can be difficult to notice and plan for informal evaluation mechanisms. Planning for them in advance can tip them over into more formal mechanisms: a quarterly review of research students, for example, is a formal mechanism to review their progress; checking in with them in the hallways can be more informal (but no less valuable). As you get informal feedback – from students, mentors, community members, or other stakeholders – write it down so that you can reflect on it later.

Do everything

Generally speaking, you will want to build opportunities for all of these kinds of evaluation: formative, summative, internal, external, formal, and informal.
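If you like to keep your plans in a script or notebook, one way to check that an evaluation plan touches every axis is sketched below. This is a minimal illustration, not a prescribed tool: the `Activity` class, the axis names, and the `missing_axis_values` helper are all made up for this example, and labeling the Table 4 activities as "formal" is my assumption.

```python
from dataclasses import dataclass

# Axis values come from the article; the axis names themselves are illustrative.
AXES = {
    "timing": ("formative", "summative"),
    "source": ("internal", "external"),
    "formality": ("formal", "informal"),
}


@dataclass(frozen=True)
class Activity:
    """One planned evaluation activity, tagged on each axis (hypothetical structure)."""
    name: str
    timing: str
    source: str
    formality: str


def missing_axis_values(plan):
    """Return the axis values that no activity in the plan covers."""
    missing = []
    for axis, values in AXES.items():
        covered = {getattr(activity, axis) for activity in plan}
        missing.extend(v for v in values if v not in covered)
    return missing


# The four activities from Table 4, all assumed here to be formal mechanisms.
plan = [
    Activity("Lab meetings about research progress", "formative", "internal", "formal"),
    Activity("Project debrief", "summative", "internal", "formal"),
    Activity("Quarterly meetings with advisory board", "formative", "external", "formal"),
    Activity("Publication in peer-reviewed journals", "summative", "external", "formal"),
]

print(missing_axis_values(plan))  # ['informal']
```

Under those assumptions, the check flags that the plan has no informal mechanism yet, which is the same realization Sanjay has about his own plan.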

Measurement: what will you measure? How will you know?

Central to the practice of evaluation is the question of measurement: what is happening, and how do you know? Because evaluation needs to reflect on processes as well as outcomes, your measurements need to include both.

Some measures are quantitative: how many students enrolled or graduated? how many papers did we publish, and where? Other measures are qualitative: what do our participants cite as being impactful, and why? Common data types in education research are often applied to evaluation as well.

In general, a robust evaluation will include both qualitative and quantitative measures. It is ok to have only qualitative measures, especially for projects which are small or short. It is never enough to have only quantitative measures: you must contextualize and augment quantitative information with qualitative information about what the numbers mean and why.

Setting up for success

As you plan a research project, you can set yourself up for success by including regular mechanisms for evaluation which span all axes of evaluation: formative, summative, internal, external, formal, and informal.

There are two common avenues here: identifying milestones, and setting procedures.

Identify milestones

Identifying milestones in your project is a calendar-based activity. It’s generally used for formal evaluation, both internal and external.

Starting with a rough project timeline, identify milestones in your project. What will you accomplish and when?
For each iterative cycle of your project, identify what new features you will develop or activities you will undertake.
If there are external deadlines (e.g. for semester breaks or grant deadlines), connect your planned milestones with their dates.
If publication is a goal, where do you want to publish? when do you want to submit?
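The steps above can be sketched as a small dated list that you revisit during formative check-ins. This is only one possible way to keep track: the milestone names, dates, and the `overdue` helper are hypothetical, chosen for illustration.

```python
from datetime import date

# Hypothetical milestones and dates, for illustration only.
milestones = [
    {"what": "Finish focus group interviews", "due": date(2025, 10, 15), "done": True},
    {"what": "Draft report for stakeholders", "due": date(2025, 11, 30), "done": False},
    {"what": "Submit manuscript", "due": date(2026, 3, 1), "done": False},
]


def overdue(milestones, today):
    """Milestones whose date has passed without being marked done, earliest first."""
    late = [m for m in milestones if not m["done"] and m["due"] < today]
    return sorted(late, key=lambda m: m["due"])


for m in overdue(milestones, today=date(2025, 12, 15)):
    print(f'{m["due"].isoformat()}  {m["what"]}')
# prints: 2025-11-30  Draft report for stakeholders
```

A list like this only tells you whether a milestone was met. You still need to ask how and why, and adjust the timeline in light of what actually happened.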

Set procedures

Setting procedures is about building the practice of evaluation into your project.

If you are working with undergraduates or other team members, schedule regular meetings as a group to check in on progress.
Develop lab practices to support good documentation. What are you doing? why?
If you are working with an advisory board or external evaluator, meet with them regularly.
Set aside specific meetings to check in with each of your project members and major stakeholders about how the project is going for them and any excitements or concerns they may have. Plan to do these meetings regularly.

Research is a human endeavor.

As you think about how to evaluate your project, it’s important to attend to the human aspects of doing research. Are your team members working well together? Do they feel valued? Do they feel like they have the opportunity to grow? When conflicts arise, do they feel safe in articulating what’s happening and trust that the resolution will be fair and appropriate?

These human aspects of doing research are important to evaluate, even if they don’t tend to align well to specific milestones like papers published or students graduated. In an informal sense, you can check in on your team as humans regularly in group meetings and individual ones. In a formal sense, it’s a good idea to have a regular, formalized review process. If you engage in a formal process – and I think you should – then your process needs to include mechanisms for your team to learn the results of the evaluation as well as mechanisms for you (and your team) to make decisions about how to operate together in the future.

Working with an external evaluator is a great choice for getting good formative and summative feedback for how your team works together. They might meet with you about your goals and questions, then meet with or survey each of your team members. Afterwards, they will distill what they’ve learned in a report to you, either verbal or written, so that you and your team can make good choices about how to proceed.

Depending on the scope of your project, you might also engage with an advisory board to bring in outside expertise and perspectives on your project. Advisory boards can be a great element in your evaluation plan, especially to get feedback from external people with a lot of different expertise. Evaluation is often not the only goal of advisory boards, however, so be clear on what you’re asking them to do and why.

History

This article was first written on October 1, 2023, and last modified on May 29, 2025.

Citation

For attribution, please cite this work as:

Sayre, Eleanor C. 2023. “Evaluation and Research.” In Research: A Practical Handbook. https://handbook.zaposa.com/articles/evaluation-and-research/.
