> HOW do you measure knowledge? And when you decide how, how do you scale it?
I have experienced good tests and bad tests. I studied in France, where tests were open book with no multiple-choice questions, only problems to solve. This approach scales badly and is a lot of work for the professor doing the grading, but it measures knowledge.
The problems were long, with little beyond a description of the problem and perhaps a few questions to guide the student along the path to solving it. We had either 3 or 4 hours to solve them.
Those tests worked very well. I'd often come out of one of them having learned something new.
I was an exchange student in the US, where tests involved multiple-choice questions and were closed book, with questions built around rote memory. While I did feel that some of the education in the US was valuable and interesting, I hated those tests: they correlated less with comprehension of the subject matter and more with memorizing facts that are at best tangentially related to it. I still remember being shocked, in a computer graphics test, at being asked when OpenGL was first released, which companies were involved, and other completely useless trivia.
What's interesting to me is that there are far fewer opportunities to cheat with the former tests, while the latter are pretty much made for cheating. So, IMHO, cheating is a symptom of bad tests.
I don't know if it's popular in France, but another very simple idea that eliminates cheating entirely is oral exams. They're still done a lot in Italy. I once literally inverted a binary tree on a whiteboard :)
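For anyone curious, the whiteboard exercise mentioned above is short enough to sketch here. A minimal Python version (the `Node` class is a hypothetical helper for illustration, not anything from the thread):

```python
class Node:
    """A plain binary tree node."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def invert(node):
    """Mirror the tree by recursively swapping every node's children."""
    if node is None:  # base case: an empty subtree stays empty
        return None
    node.left, node.right = invert(node.right), invert(node.left)
    return node
```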
IMO oral exams and open-ended answers are the kinds of things that genuinely work better for their intended purpose, and everyone knows it. But people still prefer multiple choice because "it scales". The goal isn't simply measuring knowledge, it's doing so in an acceptable/shitty way, with (edit) limited resources.
As a TA we did something similar: we asked students to self-grade their own homework using a provided rubric, and then we spot-checked 1/4 of the students (sampled without replacement) to punish lying about what grade you deserved. We didn't punish a few disagreements over the rubric, but if it was blatant we checked their assignments every time in the future (and told them). I think if it was bad enough we could have reported them.
This saved a bunch of time on actually grading assignments and made us write a very clear and unambiguous rubric (which required a very clear homework) and also demonstrated to the students that grading was not arbitrary.
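As a rough illustration of the spot-checking scheme described above (the function name and numbers are made up for the sketch; `random.sample` draws without replacement, matching the parent comment):

```python
import random

def pick_spot_checks(student_ids, fraction=0.25, seed=None):
    """Randomly select a fraction of students for grading spot checks.

    random.sample draws without replacement, so no student is
    checked twice in the same round.
    """
    rng = random.Random(seed)
    k = max(1, round(len(student_ids) * fraction))
    return rng.sample(student_ids, k)
```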
Several universities [1] scale out personalized instruction and interactive grading by hiring students from previous cohorts and paying them either in course credit (taking a "course" that involves teaching students in the current cohort) or at a low rate (possibly subsidized by financial aid) comparable to other on-campus student jobs.
How do you justify the fact that only some of the students get the pleasure of an in-person grilling? Or, am I completely misunderstanding the process you're going to be using?
In my plan, each student is interviewed at least once. Ideally more than once by the same teacher, so the teacher can get to know them a little better, spot areas where the student needs more help, etc.
There's still a scaling problem, but I think it makes the ~200 student classes we have now more feasible than 100% autograding. I also like the other commenter's suggestion of coming back to interview certain students each time, if they need it.
Is this about pleasure or about measuring knowledge?
A lot of stuff you learn and the way you learn it isn't necessarily pleasant, but frequently you still have to do it and you really discover 20 years later why it was needed.
No, it's about why only a subset of students get singled out for extra scrutiny, literally arbitrarily, as the selection procedure itself is defined as "random sampling."
Random sampling is an effective method for inferring, about the larger population, the same information that is measured in the smaller sample, to a degree of confidence determined by the sample size and the known distribution of what is being measured. These concepts are fundamental to statistics.
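To make that concrete, here's a back-of-the-envelope sketch (the standard normal-approximation formula for a proportion, not something from the thread) of how the margin of error shrinks as the sample grows:

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for an estimated proportion,
    using the normal approximation to the binomial distribution.
    proportion=0.5 is the worst case (widest interval)."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)
```

Spot-checking 25 students gives roughly a ±0.196 estimate of how common a behavior is in the class; quadrupling the sample to 100 only halves that margin, to about ±0.098.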
In college, viva voce is a significant part of non-theory exams. It's another matter that it was not run well by many colleges, but I always loved those chit-chat sessions with some of the good professors. Some professors treat it like a boring Q&A, which reduces its effectiveness.
I think you might be who the top response is responding to. You seem to have inside knowledge that saving money is the top priority without considering any real-world resource constraints.
The top response is the one that brought the constraint of "scale" into this discussion, and that's what I'm addressing. Maybe you should bring your objections to them rather than to me. "Real-world resource constraints" is just a euphemism for "wanting to save money" in this case. I'll edit it to clarify that I mean the same.
And I'm not passing judgement on the choice made, nor saying the constraints aren't there, nor saying anyone should do anything different. I'm just pointing out that the scalability constraint will affect the test possibilities, which will affect the quality of the measurement. Feel free to disagree with this all you want.
EDIT: Also, I do happen to have some inside knowledge, having worked in higher education for about a decade, starting in the mid 00s. Coincidentally, most of my work was on cost-saving measures, designing a few algorithms that allowed universities to reduce their teacher headcount (first at a university, then at a software vendor), so yes, the #1 goal there was saving money. But I don't think having done this affects my answer, nor do I think that I deserve special treatment. I'm merely replying to a chain of comments.
Wanting to save money also falls under the availability of staff trained to do this, and under considerations of whether the massive increase in expense and diversion of people from other economic endeavors is worthwhile.
Good hunch, but in my experience, availability of trained staff was never really an issue in practice. Hiring well-trained university faculty was always a purely economic problem. Universities often already have a surplus of trained faculty working in a highly reduced capacity. Especially in the last 10-15 years, as distance learning became commonplace, a lot of faculty were replaced by low-paid part-time quasi-teachers, who would be more than happy to be offered a permanent position. To further demonstrate that this is an economic problem: those quasi-teachers often have job titles other than "teacher", depending on the jurisdiction, in order to evade labor laws and the reach of (often very powerful) faculty unions.
Oral exams have an entire other bunch of issues.
Just looking at the professor's side, besides time, I imagine it would be very difficult to grade with the same yardstick an arrogant student, a dismissive one, a smelly one, an eloquent one, or even the first and the last one in the same session.
... a male student, a female student, an attractive student ...
And yes, this is actually a well-known problem in Italy - with (typically male) professors being routinely accused (and occasionally convicted) of favouring attractive (and typically female) students.
I don't agree with this. They have different failure modes, but I believe that in aggregate an oral exam affords the candidate a fairer shot, given the minimal assumption that the professor is acting in good faith.
If I say something imprecisely or if I make a non-fundamental mistake, an oral setting gives me the chance to correct myself and prove to the examiner that I have a strong grasp of the material regardless.
Written exams, especially multiple-choice and closed-answer quizzes, reward people who regurgitate the notes; oral exams and long-form written open questions reward actual knowledge.
Of course the "better" methods require a greater time investment, and I can't really blame professors who choose not to employ them. But it's quite clearly a tradeoff.
> If I say something imprecisely or if I make a non-fundamental mistake, an oral setting gives me the chance to correct myself and prove to the examiner that I have a strong grasp of the material regardless
This just further proves the point: in an oral context, the disposition of the examiner matters much more than in a written one, which by definition implies that the oral exam cannot be fairer than the written one.
You yourself are saying that you "have the chance to correct yourself". This is either because you will correct yourself on recognizing a specific (perhaps subconscious) expression or gesture from the examiner, or because the examiner will directly tell you that you are wrong. Both cases present ample opportunity for unfair discrimination. In the first case, perhaps a person is less skilled at reading people, or perhaps the examiner just has a better poker face. In the second case, you are now at the whim of the examiner to decide, based on your body language, whether "you are making a non-fundamental mistake and deserve a second chance" or "have no idea of the material and don't deserve a second chance". And, compared to the written exam, there is absolutely no record of the context that led the examiner to such a conclusion -- which is also kind of important, since evidently written exams are also subject to some discrimination.
Nobody expects you to be 100% on point, it's just impossible; it's not like the spoken variant of a written exam. The kind of "correction" I mean is more along the lines of what would happen during a normal conversation. Imagine I was asked to write a recursive algorithm and I forgot the base case. It's not a fundamental mistake, but the professor might interject to make sure I actually know about termination, inductive sets, etc., which is actually great if you understand the material deeply, because it gives you a chance to prove that you actually just forgot.
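The forgotten-base-case scenario described above, in a deliberately tiny Python sketch (factorial is chosen only as a stand-in example, not the actual exam question):

```python
def factorial(n):
    # Base case: this is the line that's easy to forget at the
    # whiteboard. Without it, the recursion never terminates.
    if n == 0:
        return 1
    return n * factorial(n - 1)
```

In an oral exam the examiner can ask "what happens when n reaches 0?" and let the student supply this line, instead of just marking the answer wrong.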
Obviously this is assuming good faith by the examiner, but if you aren't willing to assume that, there aren't very many examination formats that are going to work very well.
It's not a question of good faith or not. The examiner may be showing completely unintentional bias. But the point is that the oral exam gives that bias a shitton more opportunities to play out. If you describe the oral exam as "a normal informal conversation" rather than something following a very strict protocol, you might as well give up any appearance of fairness. The role bias would play in such a conversation is off the scale.
It's not the examiner deciding whether "you deserve a second chance or not". In a normal oral exam, everyone gets an "I don't think that's correct" or "please explain that to me" kind of response to a wrong answer. They don't silently scribble a note to subtract a point from your score or something like that.
How you deal with that is really where your score comes from. If you know what you're talking about, you'll correct it, and in doing so show that you know a lot of related things; if you have no idea, you can't guess your way out of that type of question.
I don’t know. For example, in music examination, the outcomes change drastically if you blind the examiner from seeing the student or knowing their name. Unless you see something different in the world of music, I’d say the examination is happening at the same level of “good faith”ness.
How would you blind oral examination so that the examiner is unable to distinguish the student’s gender/race/identity?
> For example, in music examination, the outcomes change drastically if you blind the examiner from seeing the student or knowing their name.
FWIW, the study that "proved" that appears to have been a pretty bad study. So, in reality, no: people are not terribly prejudiced, and things don't change significantly when you blind the examination.
All students in a class cannot take an oral exam simultaneously. This means that either:
* everyone gets the same questions meaning later students can cheat by asking earlier students what was on the test, or
* everyone gets different questions meaning much more effort to design the exam and big risks that some students will get easier questions and others will get harder questions
Most of the oral tests I have taken had the questions posted by the lecturer before the exam. I don't understand why it would be a problem for students to share what the question was.
> I don't know if it's popular in France, but another very simple idea that eliminates cheating entirely is oral exams.
As an introvert, I am very happy not to have had too many oral exams during my studies (in France ;) ). I think I agree with you in principle, but to me that would have been torture.
You get used to it. I had the typical weekly oral exam for two years in the French "classe prépa", and it was torture at first. I can definitely say that it changed me and made me less stressed about this kind of situation, even years later at work.
I was a student in France, and during two years of high school I had a bunch of blackboard exams, and yeah, you kind of have to learn the material. Of course it also helps to be comfortable in such situations, but we had enough of them to get trained.
You had enough of them to get trained in that. And it might have taken you few enough to get comfortable that it didn't affect your grades badly enough to make you drop out. I had a friend in university who would completely fall apart in any such situation, even when it wasn't for an exam, and even in a group presentation setting where it wasn't just him up there. Written exams were completely fine, though. Did he not deserve to get a CS degree and just work in some company where he doesn't have to become a team lead or architect where he'd need to speak and present and instead steadily and happily work in his corner, talk to his immediate peers and crank out solutions?
I'm going to go out on a limb here and say that presenting to other humans is (a) a skill that most people can learn (to at least "acceptable" levels of proficiency) & (b) a skill that most people should learn, because it's a huge part of working in the field.
I understand it's incredibly uncomfortable.
I'm a pretty serious introvert and got the shakes and sweat dripping off my hands the first few times I did it. But with exposure and effort to self-improve, it's doable. I didn't like it, but I'm incredibly thankful I was forced to work on it.
Ah yes, the good old fallacy: "I could do it, so it's doable". It's doable by you. That doesn't mean it's doable by someone who is not you, even though they might be otherwise deserving.
It's like LeBron James saying "I learned to dunk, so anybody can dunk!" - but basketball is not just about dunking, and not everyone is LeBron.
Talking to other people is not dunking a basketball 3m into the air.
Frankly, I've been in oral exams; in Romania they're (were?) part of a national exam at the end of high school. You just have to practice.
If hundreds of thousands of high schoolers in a rather poor country could figure out how to do it (and generally not flunk due to the oral part), surely university students can do it.
Anyone not able to do it will not really be able to pass any interview, persuade peers that their idea is good, etc.
I've been in plenty of oral exams too, in Italy. That doesn't mean I ever enjoyed them or felt they did me justice.
> Anyone not able to do it will not really be able to pass any interview, persuade peers that their idea is good, etc.
I strongly disagree there. Orals are a situation of complete knowledge and power imbalance between two parties. That is not the case when it comes to persuasion.
As for interviews - yeah, they are similar, and that's why interviews also are seen as very problematic. A lot of people who can be perfectly productive in day-to-day situations, simply don't do well in interviews. We should be striving to correct that, not accept it as inevitable.
I think the way those examinations were set up helped a lot in getting comfortable (or at least good enough): it was a weekly event, just three students and the teacher in one room, each student working on their own question(s); the teachers were more or less helpful, but most would guide us along, not leaving us stuck at our blackboard for the whole duration.
But if someone in those situations really can't do it, they'd have to switch to a course/class without any oral examination to get their degree. Still, I think it's way better to learn as a student than as a professional (and yes, like the sibling comment, I think most people _can_ learn to an acceptable degree).
> work in his corner, talk to his immediate peers and crank out solutions
I think you should quote more of that sentence and then I can say that yes, definitely these do exist:
where he doesn't have to become a team lead or architect where he'd need to speak and present and instead steadily and happily work in his corner, talk to his immediate peers and crank out solutions?
Yes, companies exist that do not push you out just because you have found the sweet spot of what you can do and are OK with. Of course we're not talking FAANG here, and in general I would assume the HN clientele is skewed towards working at companies where this is not possible. However, I can tell you that way back in the past I personally worked at companies where I met many such employees who had been there for quite some time.
The big thing here being "talk to his immediate peers". The guy I was describing was completely fine working with us, his friends. Put him in front of an audience and he's got a problem. Of course it'd be hard to get a job in the first place, but a lot of places also existed, at least back then, where coding (neither take-home nor whiteboard) wasn't part of the hiring process. Of course you won't make that guy a consultant at Accenture; he's gonna fall apart.
There's only so many issues someone can have until people in general will decide to give them a "fuck off, I don't care" treatment.
You don't have that for verbal communication with other people, but I'm sure if you dug far enough you would find the same reaction to something else that other people think is acceptable.
Just how accommodating should the standard test be?
If the answer is "infinitely", I think you won't find any test that satisfies it
I studied engineering in Italy and all my exams were both written (with exercises, multiple choice didn't exist at all) and oral. No way you could cheat or not engage with the materials.
After a class on data structures and algorithms, a white board interview asking you to invert a binary tree is very different from the same interview when you apply for a job.
The only thing they have in common is "assessing". An exam for a course seeks to assess mastery of the subject matter of the course. An interview for a job seeks to assess skills / aptitude for a particular job.
This.
Moreover, an exam for a course is, to some extent, an assessment on how the course was delivered. And an interview for a job has a much larger scope.
I had an electrical engineering final as an in person oral exam. One question. One hour to solve on whiteboard. It was a hard class to begin with and I got a hard question. I did well, but definitely expected to fail.
Totally agree.
I might be a bit partial here, because I tend to underperform on multiple-choice tests due to overthinking, but I really have the impression that open-ended questions test knowledge much better and make it more difficult to cheat.
Besides that, and having almost nothing to do with cheating, another good thing in the French system is continuous grading: labs were graded, projects were graded, small intermediate tests were graded, so you really don't study just for the exam (actually, you often don't study at all for the exam).
(beware: my experience is limited to a single grande école I attended).
I went to INSA in the early aughts, and we didn't really have continuous grading: labs (TPs) were graded, but the biggest part of the grades (les partiels, exam week) happened twice a year (or four times a year during the first two years of the prépa intégrée).
I do know that since then they've moved to a continuous grading system. I'm not sure if it's the same at other grandes écoles, but I do know that my friends at other grandes écoles had a similar system of 2-4 partiels a year.
I'm currently grading an open book test as you describe. It turns out that someone put their attempt at answers on chegg.com shortly after I posted the test. The temptation to use chegg is too great for students to resist. When chegg has the wrong solution (which is often the case), students will doubt themselves and will go with the wrong chegg answer.
To be clear, the only goal of chegg.com is to help students cheat. The world would be a better place if chegg and its copies did not exist.
My solution to this is to use version control and have them record an explanation of their work. If they copy from chegg they also have to forge a commit history, as well as explain the code line by line. I’d like to see them do that without learning anything.
I suspect that a "certificate of course completion" (or, if you prefer, "a course grade") does not actually require comparing individuals A and B.
It does, however, require gauging that any individual X who has taken the course has acquired enough knowledge to be considered "having passed".
Anything beyond "pass/fail" is merely trying to stack-rank students and impose unneeded competition. But it is good for the gamification of knowledge acquisition, so perhaps not entirely bad.
Yeah, I came from a UK undergrad to US grad school and was shocked to see that even some advanced undergrad classes, with grad students in them, were tested by infantile multiple-choice questions (this was at Harvard). It almost makes one wonder whether the US dominates academia to the extent it does because of the foreign influx.
> I was an exchange student in the US, tests involved multiple choice questions, they were closed books with questions around rote memory.
As a US citizen: many of my tests were open-book, essay style, especially once we got to college.
In public school, however, there were lots of "standardized" multiple-choice tests that were used to grade the school. Some of those tests also included an essay portion.
Teachers in the US aren't paid to do grading; they typically do it at home on their own time, hence very few essay-style tests.