Unit 2: Assessment validity and deciding what to wear

Daniel Muñoz Acevedo, 27 April 2020, 18:24

Most of the things you are going to read about assessment will revolve around the idea of assessment validity. Validity is the property of an assessment procedure of measuring what it is meant to measure. In simple terms, it means that a ruler is valid if it can measure distance, a watch is valid if it can measure time, an applause meter is valid if it can measure applause, a... Well, you get the idea.

This reality of measurement and evaluation is simple in its presentation: rulers measure distance, watches measure time and applause meters, applause. Pretty obvious. However, as you may already suspect, the problem of how to define and observe the validity of measures and evaluations is probably one of the most fascinating problems in the history of human thought.

No. I am not kidding. Not at all.

In this unit, we will start with the very simple basics of the problem of validity. Let us take a simple example of a process of assessment/measurement/evaluation from normal, day-to-day life. This is the scene: you have just woken up, had breakfast and pretty much done all the startup things we all do after we wake up. Now you need to decide what to wear before going out (or not going out, as seems to be the norm today). For that purpose, we are genetically endowed with quite a complex cognitive system. To make such a simple and quick decision, that system enters [evaluation mode].

We can start by finding out what the weather is like right now. As we know, there are plenty of ways to find that out. Here are some examples:

We can check the weather widget on our smart (and not so smart) devices.
We can stick our heads out of a window to check what is going on outside.
We can listen to the weather forecast on the radio.
We can see if ants in the yard are building walls (yep, for some people that used to work, too).

Based on the information we got, we now need to figure out what the weather is like right now and how it is going to be during the day, or at least for the period when we are going to be outside. For that purpose, we start making some comparisons. Things we may do to make that guess include the following:

We can look at the forecast on our computer, then look at the sky from our window and see if they match.
We can stick our heads out of the window, combine our perceptions of smell, temperature, appearance of the sky, etc., and see if the profile corresponds to a previous experience of similar weather.
We can look at how people are dressed and compare it to our mental data bank of clothing-weather matches.

Once we have formed a judgement of what the weather is going to be like, we are ready to make a decision. And decisions can go several ways, too. For example:

You are sure about the weather today and thus you choose the clothing you think will be adequate for the day.
You are not so sure and so you pick stuff that may be adequate for different circumstances (the classic T-shirt and coat combo!).
You have no idea what the weather is going to be like, so you just grab whatever and hope for the best (sunburn and pneumonia can start in the same assessment process, as you can see).
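For those who like to see things written down as steps, the scene so far (collect evidence, compare it, decide) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the source names, the function names and the 0.75 and 0.5 cut-offs are all made up for the example, and nobody is claiming that our cognitive system literally counts votes like this.

```python
from collections import Counter


def judge_weather(readings):
    """Compare what each source says and return (judgement, confidence).

    `readings` maps a hypothetical source name to what it indicates.
    Confidence is simply the share of sources that agree with the
    most common indication (an assumption made for this sketch).
    """
    counts = Counter(readings.values())
    judgement, votes = counts.most_common(1)[0]
    return judgement, votes / len(readings)


def choose_clothes(judgement, confidence):
    """Turn the judgement into a decision; the thresholds are invented."""
    if confidence >= 0.75:
        # Sure about the weather: dress for it.
        return "T-shirt" if judgement == "sunny" else "coat"
    if confidence > 0.5:
        # Not so sure: cover different circumstances.
        return "T-shirt and coat"
    # No idea: grab whatever and hope for the best.
    return "whatever is at hand"


# Hypothetical evidence gathered before going out.
readings = {
    "widget": "sunny",
    "window": "sunny",
    "radio": "sunny",
    "ants": "rainy",
}
judgement, confidence = judge_weather(readings)
print(judgement, confidence, choose_clothes(judgement, confidence))
# prints: sunny 0.75 T-shirt
```

Notice that the sketch treats every source as equally trustworthy, ants included. Whether that is a reasonable thing to do is exactly the kind of question this unit is about.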

The what-to-wear scene is a very good example of assessment/evaluation/scoring processes, as it reflects the main features of such processes. Let me mention some of the most important ones:

1. Assessment processes are part of decision-making processes. We assess stuff because we need to make decisions. Such decisions can be minute and of little consequence ("Should I get another piece of that cranberry cake?"). Many times, decisions can be tough and critical to people's lives ("Should we stop the quarantine in this area?"). Many times, also, decisions fall somewhere between minute and critical ("What score should I give to this oral exam?").

2. Assessment processes are very much like research processes: we have a question, we collect relevant data, and then we compare that data in ways that may help us understand what reality is like and so what decision we can make.

3. Assessment is, fundamentally, a process of comparison of two or more sets of data to establish one particular value (the state of the weather, the level of ability of an English language learner).

4. Since they are about making decisions, assessment processes have consequences. So, assessment procedures can be examined in terms of the consequences they produce.

5. Since assessment processes are about decision-making, and we are pretty much making decisions all the time (I feel uncomfortable, I'm going to shift position in my seat / What should I eat for lunch? / Should I quit this Seminar?), it follows that we are assessing constantly. Yup. Assessment is EVERYWHERE and we are assessing everything, all the time. So much so that, after this Seminar is over, you will not be able to even comb your hair without noticing that there is an assessment process there (otherwise you would never know when your hair is ok and would keep combing forever).

6. Most important, assessment processes are about determining the value of something.

This last point is the one that takes us back to the concept of validity. Whatever the assessment procedure we use in the example, the whole purpose of the procedure is to get a particular value right: the conditions of the weather.

Such a value can be stated in very simple terms (good weather vs bad weather), in more complex terms (a period of rain likely very early, then a chance of showers), or in very complex terms (there is a 27% chance of rain). In all cases, some quality of the reality we want to evaluate is selected (personal appreciation and probability, in the examples). Then values are assigned to measure or evaluate that quality (good vs bad, likely vs unlikely, percentages). However the quality and its values are expressed or conceptualised, the job of the assessment process (measurement, evaluation, etc.) is to provide a value for that quality that reflects how reality really is.

In the example, we want the value produced by the assessment process to be right. We want the prediction on our phone app and the behaviour of the ants to correctly indicate the weather conditions of the moment. This is what we call validity. We want the ruler to indicate actual distance, weight scales to indicate actual weight, ants to indicate rain with precision. A ruler that measures distance is a valid ruler. A scale that measures weight is a valid scale.

Although the idea is simple, validity in assessment procedures or tests is a quality which is very difficult to observe and achieve. In the example of the weather forecast, we can see that the data we can use to get the right value of the weather can be very diverse in nature. The information provided by an application on a computer is very different from the information you gather by just looking from your window at how people are dressed. And then there are the ants building walls. Somehow, we know that these sources of evidence vary widely in terms of how precisely they indicate weather conditions.

The question is: How do we know that? How do we know that there are better (more valid) ways to know weather conditions and not so good ones (less valid)?

The answer to this question has to do with our capacity to understand how the evidence we are observing (some records in the smartphone, people wearing clothes, or ants building walls) relates in reality to the quality that we want to evaluate (the condition of the weather). That connection is one of the key problems in the discussion about assessment and measurement and so deserves its own post, in the next Unit of this series.
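One rough way to picture that connection is to keep a record of what each source indicated and what the weather actually turned out to be, and then count how often they agreed. The sketch below does just that in Python. The five days of records are invented for illustration (they are not real observations), the names are hypothetical, and a simple agreement rate is only a crude stand-in for the much richer idea of validity.

```python
def agreement_rate(indications, observed):
    """Share of days on which a source's indication matched the actual weather."""
    matches = sum(1 for said, real in zip(indications, observed) if said == real)
    return matches / len(observed)


# Invented records for five days (illustrative only, not real data).
observed = ["rain", "dry", "dry", "rain", "dry"]
sources = {
    "phone widget": ["rain", "dry", "dry", "rain", "rain"],
    "people's clothes": ["rain", "dry", "rain", "dry", "dry"],
    "ants": ["dry", "dry", "rain", "dry", "rain"],
}

# How often did each source agree with what really happened?
rates = {name: agreement_rate(said, observed) for name, said in sources.items()}

# Sources ordered from most to least agreement with reality.
ranked = sorted(rates, key=rates.get, reverse=True)

for name in ranked:
    print(f"{name}: {rates[name]:.0%}")
# prints:
# phone widget: 80%
# people's clothes: 60%
# ants: 20%
```

With these made-up records, the widget agrees with reality more often than the ants do. The counting is the easy part. It tells us that a source tends to agree with reality, not why it does.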

This is it for now.

Please do leave comments, questions, suggestions or any response you may see fit. Since we are not having meetings for a while, this is the way to participate in the Seminar.