Sanfilippo Metagov 20210818 2119

Speaker 1 0:00 – 0:00

Okay. So, everybody, let's welcome Madelyn. She's here from the University of Illinois at Urbana-Champaign, and she's at the intersection of a lot of exciting, interesting things: online sociology and organizing, the whole Ostrom literature around self-governance, and a lot of legal scholarship. And we're really excited to hear how those things come together in the context of EdTech. Thanks a lot, Madelyn.

Speaker 2 0:15 – 0:15

Thank you. Thank you so much for this opportunity to speak with all of you today. As mentioned, I'm gonna briefly present a series of projects that are all about co-production of governance in the ed tech space. And I'm really excited for discussion because some of this work is still ongoing and I'm sure many of you have super interesting thoughts. So, my research has really broadly focused on governance and sociotechnical systems. I've been specifically interested in how formal policies and laws are translated into informal management or into design and architecture. So I try to understand how the rules on the books, so to speak, are different from what users experience and different from architecture as we experience it, sort of in a code-is-law, Lessig sense, but more so focusing on how translation processes can be improved to prevent inequity and mitigate unexpected externalities. So I've often focused on privacy governance. And as our current extended state of emergency began in 2020, a lot of the governance issues and privacy concerns that I was thinking about became really prominent in thinking about educational technology. We were massively depending on existing and new infrastructures. We were constantly complicating these sociotechnical systems. And this has really stayed at the forefront of my attention. So I had done some previous work on privacy governance amid natural disasters, for example, and was concerned, both as a parent of an elementary school child and as an educator, that this crisis was really going to be used to justify a lot more data collection, surveillance, and bad practices in the name of emergency circumstances, but that they wouldn't go away in the long term. So governance challenges around educational data are not really new or even exclusive to the current pandemic.
There are a lot of thoughtful scholars, including in this group, who think about student privacy, with a few recent papers that are relevant background to my work noted here. And these folks are thinking about a lot of issues around data structures, what trusts and fiduciary obligations look like with respect to educational institutions, the proliferation, for example, of data philanthropy in educational contexts, thinking specifically about what's currently being amassed in terms of a dataset with the Gates Foundation, most notably, and really the limits of educational privacy regulations in the US in particular. One of our most significant oversights in our failure to address educational privacy at large is really our failure to address educational metadata at all in most of these regulations. In this sense I'm talking about all of the data generated as students interact with platforms: have they accessed readings, how much time did they spend on X, is there evidence they may have collaborated on Y, or do they have multiple tabs open, as some of the more digestible examples, but also lots of other data generated in ways that might not be obviously parsable to instructors or educational institutions independently. We tend to focus on traditional student records and often directory information under FERPA, thinking about the constraints we put on specific actors, as those were the relevant actors in the primarily analog educational context at the time FERPA was passed. But there's also a massive proliferation of state regulations, which really complicates compliance and the experiences of individual students or even school districts, as the burden of compliance is often put on schools and on instructors to ensure that the vendors they're working with or the platforms they decide to use are adhering to local protections rather than the more or less floor that's imposed by FERPA.
And in particular, we see issues with the difficulty in making sure that vendors are adhering to what is negotiated contractually in data protection or data use addenda, which I'll come back to. But even when contracting really specific things at this level, they're often still overlooking metadata in most cases. Further posing a challenge around educational metadata is the fact that most people don't understand what this is, its importance, or its use. Educators don't. Schools think this is probably valuable data, and universities and school districts collect it without really knowing its meaning. They also don't necessarily understand the regulations, as evidenced by lots of educational technology and educational privacy research, and they don't necessarily have a sense of what accountability or enforcement avenues might exist, much less how to explain all of this to students and their families. So this really brings me to the first study, which was actually just presented at SOUPS last week. Videos from all of this should actually be available on YouTube as well if anyone wants me to share that later on. But I was a fellow at CITP at Princeton last year, and we had a reading group led by Anne Kohlbrenner and Ross Teixeira on EdTech. And so at the onset of the pandemic, we all put together this Slack on virtual education that any educator at any level of instruction was welcome to join in order to, obviously, communicate about the struggles they were experiencing, but also to learn more about resources that might be valuable to them as they were attempting to bring their classrooms online.
And in particular, there was a thread on privacy and video conferencing highlighting some of the struggles of instructors, but also the concerns their students were articulating about issues of bias or the breach of privacy in asking people to look into their dorm room or their home and potentially see their images repurposed in particular ways. That led us to put together a series of surveys in order to assess the use of video conferencing and instructor understanding. And our main findings were, you know, unsurprisingly focused on the fact that stakeholders had really vastly different interests and understanding. But this was actually leading to really unexpected harms. When instructors and students decided to use personal Zoom accounts, in a lot of cases because they were concerned about university surveillance or ownership of their instructional videos, they were actually losing the added protections that had been negotiated with vendors by many of these educational institutions, because they didn't know they could change default settings, that they could record locally rather than to the cloud or to university servers, etcetera. And further, static and dynamic testing of the platforms they were using revealed that a lot of the data collected and shared wasn't always clearly disclosed or even permitted. Instructors were largely unaware, but IT staff and admin in our surveys and in conversations reported a lot of concerns and really unique governance strategies that we actually followed up on in one of the other studies that I'll touch on. But this first paper really got us a lot of perspective on the educators and the workarounds they were developing, such as alternative configurations and obfuscation approaches, and their loud, loud requests for better governance and more transparency about what data is being collected and how it might be used.
We also got a really good look at what was and wasn't being addressed in data protection and use addenda, DPAs or DUAs, depending on the type of institution, by universities. And this brings us back to the issues around metadata and finally to co-production. So following this study, we have two different projects that I'm working on, one with Madiha Choksi and Yan Shvartzshnaider, and then one with Yan Shvartzshnaider and Noah Apthorpe. The first is really focusing on grassroots protection and production of alternative governance mechanisms. This is primarily associated with all of the student activism, and occasionally involves faculty allies, as they're contesting the surveillance and lack of transparency around the use of metadata for algorithmic decision making in grading and student academic misconduct proceedings. We see in particular the University of Wisconsin case, in which metadata being generated was being used not only for participation and grading, but also to administer academic misconduct proceedings based on things algorithms were suggesting about the ways that students had engaged with the platform, without necessarily disclosing any of those practices to students. The other project is really drawing on EdTech developers and admins, who are oddly working together, despite the fact that often they're at odds in other aspects of the university, to try to govern privacy by design or to contest third party plugins on their campuses, in light particularly of SEC disclosures about the monetization of student data, particularly with Blackboard. So we can look at various LMS systems to see their metadata generation and its transparency to instructors and how that might differ from what students are aware of.
I've personally taught with both Moodle and Canvas, with Moodle really surprising me the first time that I used it by flagging suspected cheating based on the timestamps on exams, identifying groups of students that were likely answering things together, in ways that I didn't anticipate. I certainly didn't want to surveil my students, particularly in that I had given them an open-note collaborative quiz to complete. And so all of this was not only unexpected to me as an instructor, but certainly to students. Canvas is actually providing considerably more analytics in ways that students are not aware of and that the student protests reveal are really unclear and problematic to them. This work is largely drawing on social media data, using content analysis to understand how obfuscation strategies and norms are socially constructed, in a lot of cases for good, thinking about protests against particular types of surveillance, thinking about bias prevention, etcetera. But also in some cases less good, as obviously some of these threads are dealing with how do we successfully collude or how do we successfully cheat on online assignments. We're not really normatively judging the outcomes that might be associated with this in this work, but rather thinking about how they are collaboratively producing workarounds and testing the boundaries of these systems in order to find different governance that they can impose in effect when they may disagree. This work largely has addressed LMS systems but is also taking a look at some of the proctoring software and the use of educational algorithms. This work is ongoing and we would really love feedback or suggestions. The other project I mentioned is focused on a lot of the discrepancies between the rules on the books and those in use and how they can be addressed by design, as well as how EdTech staff are working together to govern third party data flows that they're concerned about.
Here, we have a combination of surveys and interviews as well as document analysis around the evaluation protocols for plugins and for the use of particular platforms, and we're using this to inform our work. We find that there are a lot of, again, stakeholder tensions, unsurprisingly, but in particular that faculty pressure for, like, the new cool thing following their conferences, and business decisions that are being made at the university level, are overriding privacy and security concerns, which might be really legitimate. And so some of the first efforts that we uncovered in our work were, especially at well-funded universities, to institutionalize the process of evaluation, which helped with a lot of the faculty pressures. They could clearly say, these are the concerns we have about this, but also, if what you really want is this feature rather than this specific plugin, we can develop something internally or we can borrow a little bit from various open source repositories to put something together that will work in the same way. But it doesn't stop business decisions, as with textbook publishers or Proctorio, that are going to be negotiated at a higher level, often without ever consulting them. And so an informal network of ed tech staff and developers have found each other and begun to share more code, actually a lot of open source options beyond this, working together to overcome some of the resource constraints, particularly as large public universities are working with community colleges or colleagues and friends at other institutions that might have more constraints, who turn to these institutions as leaders and borrow the solutions they've come up with. I oddly came to this, despite being at a well-resourced university that's sort of a part of this, through some of the survey responses in the previous studies pointing me towards who I should be approaching on my own campus.
But it's nice to recognize that these individuals see their privilege and their opportunity to influence practices and prevent both the monetization of student data as a major concern and a lot of third party flows that might be less controlled and not necessarily conforming to normative expectations either in EdTech or by students or instructors. And so I realized that was a little bit about a lot, so I'm happy to just be quiet now and take any comments or questions.

Speaker 1 0:30 – 0:30

Thank you so much, Madelyn. Zareyam.

Speaker 4 0:45 – 0:45

Yeah. So when you were talking about the cheating and looking at the discussions of how people avoid these systems, you know, the thing that really begs for me is the ways in which the data ends up getting used to, say, use algorithms to identify potential cheating, and some of the issues you raised earlier in the talk. But there's, like, a tight coupling between the workarounds and the sort of, like, fairness with respect to honest actors. Because by definition, it's only an exploit if it kinda looks like an honest actor. And so then you've got honest actors that look similar in the data to the exploiters, and then there's, like, a really nuanced sort of algorithmic policy problem that arises in, say, something as simple as a binary classifier with some probability of "is this person cheating," and you get canonical justice problems out of the box, because even those ML algorithms have sort of sensitivity parameters that trade off between, say, type one and type two error, and you just can't get around it. It's like a fundamental epistemological blocker that in this context appears as, again, like, a justice problem. And I'm curious whether people have raised this. I don't know, like, it's obviously present, but I'm curious whether people acknowledge it or deal with it.
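
The trade-off described here can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the "suspicion scores", the labels, and the thresholds are invented and do not come from any real proctoring or LMS product. It only shows that when honest and cheating students overlap in the data, moving the flagging threshold exchanges one kind of error for the other.

```python
def error_rates(scores, labels, threshold):
    """Return (false_positive_rate, false_negative_rate) for a flagging threshold.

    A student is flagged when score >= threshold.
    labels: 1 = actually cheated, 0 = honest (known here only because the data is invented).
    """
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    false_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return false_pos / labels.count(0), false_neg / labels.count(1)


# Hypothetical suspicion scores; honest and cheating students overlap on purpose.
honest_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.55, 0.60]
cheater_scores = [0.40, 0.50, 0.65, 0.80]
scores = honest_scores + cheater_scores
labels = [0] * len(honest_scores) + [1] * len(cheater_scores)

for threshold in (0.3, 0.5, 0.7):
    fpr, fnr = error_rates(scores, labels, threshold)
    print(f"threshold={threshold}: honest students flagged={fpr:.3f}, cheaters missed={fnr:.3f}")
```

With these invented numbers, a low threshold flags most honest students, a high threshold misses most cheaters, and no threshold drives both errors to zero. Choosing the threshold is therefore a policy decision about who bears the cost of errors, not only a technical setting.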

Speaker 2 1:00 – 1:00

Yeah. I think that's, obviously, a whole host of interesting issues there and interesting things to pursue. But I think, as we've seen in particular with the student protests and the student efforts to work around, there's some sort of recognition that this may work in their favor. Like, the bad actors seem to recognize that this may work in their favor, because there are other sort of legitimate behaviors that look the same, and it is going to simply put pressure on institutions to stop automating things, which I think in the end is perhaps a good point of pressure. But I think that a lot of the instructors are not necessarily thinking about it this way, and a lot of the university administrators, at least in survey responses, since I haven't actually had opportunity to interview many of those folks as of yet down this research path, are not necessarily considering this problem at all. It's oddly sort of the learning analytics specialists, the ed tech specialists, who recognize this, and I think the bad actors, who seem to feel this works in their favor. But it's an interesting, perhaps, basis for direct questions, I think, as we continue to do interviews. It's not in our direct questions right now. No. Please continue.

Speaker 4 1:15 – 1:15

Well, I was just gonna say I've seen it in other applications. Like, basically, anywhere that you have adversarial behavior and data and algorithmic policy, it's like a broad class. And so I encountered it in a different system. But my reason for asking is that I think this is more closely aligned. It's, like, literally, like, kids. So I'm, like, hoping that people would be able to kinda get behind untangling it. So if we're gonna deal with sort of machine learning and AI ethics questions, you know, okay, it's one thing to pick a fight with Google and Facebook. It's another thing to go into sort of ed tech and try to just make the, I don't know, the selling point. Like, you know, with protecting kids from exploitative behavior, it seems a little easier to surface this problem, but it's a fundamental one.

Speaker 2 1:30 – 1:30

Sure. Sure. I think it's also interesting because actually a lot of the discussion of workarounds is not necessarily from kids. I mean, certainly there are a lot of Reddit forums that are high school students talking about these problems. But we also see a good number of these discussions that are adults that are being, you know, surveilled with Proctorio for professional certifications and how they are working around these sorts of things, or even really innocuous things like hobbyists and the certifications with respect to wine, for example, using increasingly invasive surveillance technologies as they're evaluating that. And so it's very interesting to see the differences in dynamics based off of what's at stake and whether this is sort of a gamification problem, in the case of some of these adult hobbyists, versus students who really feel that their academic career and future opportunities are at stake.

Speaker 4 2:00 – 2:00

Well, one of the problems might just be whether we have good measurements of false positives, because these organizations are simply denying credentials or, you know, taking students through some, like, misconduct process based on something an algorithm spit out. The truth is that there's no ground truth on that. You're looking at the same data regardless of whether they were actually cheating or not. And using the algorithm as an objective claim that they were cheating is actually a misstep. And so at least my perception on this was that it's still being treated as though it's ground truth, when in fact it's the estimation of the algorithm conditioned on the data, with an unknown ground truth about whether cheating actually occurred. So a student who wasn't cheating who got flagged as cheating could actually have their future put at risk because some algorithm was particularly sensitive to something about them. And in fact, you get vulnerable populations with high degrees of correlation with some of the flagged behaviors getting underserved, like, structurally, by this kind of system.
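
One way to make the gap between a flag and ground truth concrete is Bayes' rule. The sketch below is illustrative only: the base rate of cheating, the detector's sensitivity, and its false positive rate are hypothetical numbers chosen for the example, and in practice, as noted above, these quantities are usually unknown.

```python
def prob_cheating_given_flag(base_rate, sensitivity, false_positive_rate):
    """Bayes' rule: P(cheated | flagged).

    base_rate: assumed fraction of students who actually cheat
    sensitivity: assumed P(flagged | cheated)
    false_positive_rate: assumed P(flagged | honest)
    """
    flagged_and_cheated = sensitivity * base_rate
    flagged_and_honest = false_positive_rate * (1.0 - base_rate)
    return flagged_and_cheated / (flagged_and_cheated + flagged_and_honest)


# Hypothetical inputs: 2% of students cheat, the detector catches 90% of them,
# and it wrongly flags 5% of honest students.
posterior = prob_cheating_given_flag(base_rate=0.02, sensitivity=0.90, false_positive_rate=0.05)
print(f"P(cheated | flagged) = {posterior:.2f}")
```

Under these assumed numbers, only about 27% of flagged students actually cheated, so treating the flag itself as proof would misjudge most of the flagged students. If the false positive rate is higher for some group of students, that group carries a disproportionate share of the wrongful flags.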

Speaker 2 2:15 – 2:15

Yeah. I think that's a really good point, particularly as many of these systems are literally providing red flags on the digital submissions for quizzes and exams without any explanation. It takes a considerable amount of prodding and reading output that is impenetrable and illegible to the average person to try to understand what on earth produced this red flag and what it could possibly mean, whereas administrators say, oh, red flag, that's, you know, automatically going into our disciplinary proceedings.

Speaker 1 2:30 – 2:30

That that's a nice back and forth. Next, we've got Amy on the queue.

Speaker 5 2:45 – 2:45

Yeah. So my question is actually very much related to this. Thanks for the talk. I think, if I understood you correctly, you said you were either currently involved in or planning specific research on remote proctoring, and I would just love to hear what that project or those projects are gonna be. So, yeah, very much related to this, but just curious about the specific research project there.

Speaker 2 3:00 – 3:00

Yeah. So this sort of middle paper, I think, that was mentioned is in its earliest stages. And we had started by thinking about, particularly, LMSs and some of the protests that had gone on there and exams that were being administered there and what Canvas was doing, what Blackboard was doing. But quickly, as we started to look through all of these social media threads and all these discussions among students, there were equally loud conversations about the proctoring software. And many of them were linking back and forth or directing people or introducing people that they had met on these other threads. And so it became so intertwined that I don't think it's necessarily separable in this case. We're not necessarily looking at this with respect to the administrator or the instructor side. This project is pretty specifically looking at students' experiences and how they're navigating governance they find to be illegitimate, untrustworthy, and non-transparent, and in what ways they're socially constructing workarounds to that. And so there were interesting threads that dealt with things like, for example, the fact that many of these proctoring tools are not designed with Macs in mind. And so you can have a virtual machine running or something, and they have actually no way to tell that. And so students were, you know, at first just sharing workarounds like this, but also finding ways to, you know, align devices or have notes physically on the screen, using a very small window to do this, and other sorts of ways, whether they are analog or digital. But this is still in the collection stage. So any thoughts and advice and directions are much appreciated.

Speaker 5 3:15 – 3:15

I mean, it just sounded really interesting. I'm also trying to start a project on remote proctoring, which is why I'm asking. Of course, our questions are always about our own research. So it's helpful to hear you talk about looking at the student perspective. I think that's a really valuable approach to trying to figure out how students are actually trying to, like, resist and get around and try to counter the sort of seemingly, like, absurd level of surveillance. Like, I've just started getting into it. I don't know if that's fair, but it seems, like, literally absurd how much surveillance there is, and, like, registry access, and just, like, surveilling all of the Internet connections that are happening on the network. It's kinda mind boggling, like, how much access. I mean, yeah, I know you know this. Sorry, it's just, like, very, very disturbing. And then, as the first person said, like, we don't even know if the flags are accurate. Great. Like, it's wild. Anyway, I don't have anything in particular to add. It's just helpful to hear about that. And it sounds like a great project. So hopefully you can keep in touch and tell me more about the findings from the student side.

Speaker 2 3:30 – 3:30

Be great. I'd like to hear about your one

Speaker 4 3:45 – 3:45

as well.

Speaker 5 4:00 – 4:00

Yeah. Yeah. Yeah. We should chat sometime maybe. That that would

Speaker 2 4:15 – 4:15

be Yes. Please. Thanks.

Speaker 5 4:30 – 4:30

Good to meet you.

Speaker 1 4:45 – 4:45

On the student side, I guess I'm working a little bit off of my experience in, like, the Disney context. You know, studying a little bit the whole surveillance thing in the context of being at Disney World, having a device that follows you around constantly, and getting the sense and asking, you know, guests what they think. And they tend, disappointingly, weirdly, bizarrely, to love it. They're crazy about it. They're impressed by the technology. They trust the company. And the technology isn't just used to surveil them. It's definitely used to surveil them. It's also used very explicitly to enhance the guest experience, basically, to give them stuff that makes them feel like it's worth it that they're wearing a wristband. And the effect of that is that we tend to care about this a lot more than the guests do. And I'm wondering, you know, how much of a counterpoint that ends up being in the student context as well. Do they really care? I mean, it's an inherently slightly authoritarian structure, the whole educational system, anyway. So they're so often resigned to a lot of decisions being made about their lives. Is this really even registering with them?

Speaker 2 5:00 – 5:00

I think a lot of these surveillance mechanisms in EdTech are registering more so than they ever had in the past, based on the other constraints students are experiencing. And so students talked about the fact that they never really felt like their privacy was infringed when they were going into a classroom and, you know, by what their classmates or their instructor could see of their face. But turning on their camera in their dorm room or in their apartment was really different, not only because someone could capture their image, but also because it felt so invasive as they thought about the fact that, walking behind their roommate or something, they were suddenly in someone else's class and all of these people they didn't know were seeing them. I teach an undergraduate class, Social Aspects of Information Technology, in the fall with, you know, 200 freshmen and sophomores, and they were just so uncomfortable with all of this. We pretty much opted for cameras off, except for mine, for the whole semester. And they shared, you know, course policies that mandated these things, and had friends at other universities share the types of policies they had, under which you were automatically consenting to have your roommate or your family appear in these videos and you had to have the camera on, and that's just much more invasive than it has been. It's also interesting to me that you bring up the Disney case, because in the past year lots of universities have redone their student ID cards so that they include the same RFID that Disney has, which helps to restrict building access in the event that students don't have access due to a COVID test result or an exposure where they're supposed to be in quarantine.
That's true on my own campus, but also, you know, on campuses with mandatory vaccination, they don't get access to any on-campus space until they're fully vaccinated. And so it's interesting to see how these technologies differ on this point, right? You can choose to go to Disney or not. Even at Disney you can choose to opt out of that and request an analog ticket, although most people don't know that they have a right to do that. But there's really not an option to opt out or to be a Luddite or a conscientious objector in this context.

Speaker 1 5:15 – 5:15

Yeah. And, particularly, I can see that with Zoom, super invasive, and the online proctoring, crazy invasive. I've, critically, definitely been using those technologies. But what about LMSs? Are you seeing that same kind of resistance or awareness in the context of LMSs?

Speaker 2 5:30 – 5:30

Sure.

Speaker 3 5:45 – 5:45

Are are

Speaker 1 6:00 – 6:00

they taking that more? Or because it's a little more subtle. It's a little more a bit more explicitly in the context of familiar processes. Is there resistance to, like, Canvas and Lightform

Speaker 5 6:15 – 6:15

on those

Speaker 3 6:30 – 6:30

two graphs?

Speaker 2 6:45 – 6:45

Yeah. I think that the resistance here is only emerging. I think, you know, anytime something happens, obviously, students become more aware of it. Certainly, there are classes that start to talk about it, as in my case, where the first time we used Moodle, what happened was very unexpected for everyone. It was intended to be a collaborative assignment and all of these people were getting flagged for collaboration, but you couldn't override the system to prevent that. It didn't care what the rules were. And I think that in particular a lot of these universities and these student papers that are shining a spotlight on disciplinary proceedings that are going on, on things like the Dartmouth Medical School exams, on the use of algorithms for various decision making on campuses, are shining a light on things that people previously thought were really innocuous. They previously thought, well, this is just an easy way to access all of these things. It's not particularly meaningful. Students weren't really thinking consciously about how they were engaging with them. And increasingly there are discussions on social media where they're starting to reconsider and they're starting to think, like, what are good practices? What are the ways in which this data might be seen or used by my instructor or by the program or institution?

Speaker 1 7:00 – 7:00

That's encouraging and timely. I guess I see echoes of that increased awareness in, like, Turnitin-type systems, where, well, I guess some tend to be a little better at actually at least communicating that that's gonna happen.

Speaker 3 7:15 – 7:15

But still,

Speaker 1 7:30 – 7:30

I wonder, maybe even to turn it on its head. So one plagiarism problem is students taking all their notes and exams and selling those to a company that will, in turn, let other students subscribe to those notes and exams as a sort of personalized cheating service. And, yeah, I'd be interested in how aware students are of, you know, in some sense, publishing, giving rights to their work to these other people all over the place. I'd love to keep going, but let's just see if anyone else would like to speak up who hasn't already had some questions. I know, Greg, it's a pleasure to have you here. I know you're fairly active also on sensitive data and governance issues. Are there parallels to the educational context, Pete?

Speaker 3 7:45 – 7:45

Hi there. I'm, sorry. I'm, in a grocery store getting my lunch, but this is a fascinating conversation.

Speaker 2 8:00 – 8:00

Hi, Greg.

Speaker 3 8:15 – 8:15

Hi, Matt. Hi, Matt. So, I mean, I also work in the health and human services space, and I think a lot of the conversation among people who are designing and implementing these systems, or making decisions about these systems, is about the lever of, like, individual notice and consent for these problems. And it just strikes me as a woefully insufficient framework in most cases because, like, you know, I consent to things all the time that I don't fully understand, because I trust the organization or I don't, but I don't necessarily understand what it is I'm consenting to. And in these contexts, this data is being collected, potentially in ways that even the providers, in this case the educators, like, might not understand. Consent might not even be possible. Is that right? Like, that you're basically dealing with people who are in this whether they like it or not on some level. I guess my question is, in light of that, like, is there any glimmer of possibility that's being discussed?

Speaker 2 8:30 – 8:30

Yeah. I think

Speaker 3 8:45 – 8:45

stakeholder's journey in any institutional context where where there's, like, coproduction that's not just, like, essentially self produced, but coproduction in

Speaker 5 9:00 – 9:00

a formal way, you know, above

Speaker 3 9:15 – 9:15

the board? Is that anything, is it

Speaker 2 9:30 – 9:30

Yeah. I think, obviously, the efforts in university IT and EdTech to work together and to try to systematically address this, to develop plugins that can be shared and used and that will not facilitate third-party data sharing, is one way to deal with this. But I also think a really interesting example from the now published project is that the state of Connecticut decided that this was just wholesale problematic, and they were going to start negotiating and contracting on behalf of every school, public university, college, and community college in the whole state. And so they have much more stringent guidelines that are aligned with, obviously, regulations in the state. But they make really clear protocols. They have little PDFs that give tips on what the implications of particular settings are. It's a great model for how you can be contextually responsive to state-level regulation and try to think through what the appropriate choices are, particularly in the K-12 setting, where teachers don't know and school districts were having difficulty negotiating this. They don't have, you know, armies of attorneys to think through the contracting specifically. And that's one very different approach, in stark contrast with, I think, what we've seen absolutely everywhere else. I can share some things in the Slack.

Speaker 1 9:45 – 9:45

Go ahead, Sam.

Speaker 6 10:00 – 10:00

Awesome. Thank you, Madelyn. This is great. I have a question that's related to the question of proctoring but maybe steps outside the scope of this a little bit. The thing that I've been thinking a lot about is ways in which disciplinary technology can reproduce racial inequity in education. And in particular, as we talk about the variance in pushback from different schools and different communities, different levels of starting resources and access to information about how systems are being designed also seem to make that inequity worse. So I wonder if that's a piece of the research that you're doing and where that fits in.

Speaker 2 10:15 – 10:15

Yeah. I think these are really important points, and not something I necessarily focused on today, but something that's come up time and time again throughout the different stakeholder groups we've spoken with. And in particular, I think what is perhaps most promising is the recognition in the ed tech sector, obviously not on the commercial side, thinking in universities, that this is a real problem. These cool tools that are demoed by vendors and sponsors at various disciplinary educational conferences are often premised on really problematic things like, oh, we can make grading easier by using all of these models about what appropriate or conforming language will look like in an essay, and we'll make this easy-to-use plug-in that's going to solve all these problems, but it really exacerbates exactly the types of challenges that you're talking about. And so that's, I think, one of the strongest places where those we've interviewed have had success in pushing back against the business interests. They're not able to push back against the textbook publishers. They're not able to push back against, you know, Proctorio, but they are able to push back on this because the universities, I think, are especially conscious of what their liability would be, what would go on if they end up adopting these things and then being the center of some sort of scandal associated with them. It's almost like fear mongering that has been the most successful tactic, at least by the responses of our participants.

Speaker 3 10:30 – 10:30

Daniel? Maybe you're muted.

Speaker 4 10:45 – 10:45

Oh, if we're going next, my thing was actually closely related. I wanted to ask about how the policy and the data science or analytics expertise are, you know, composed or decomposed, and to contextualize it with this discussion here. There are the nuances of building pipelines and algorithms that address this kind of problem: historical data that has latent correlates with certain things, or past data about people who were caught, or manually caught, or treated as having cheated, compounding in a feed-forward way, drives a lot of the vulnerabilities that we were just discussing, as well as other types of biases baked into the data that is being used to train these things in the first place. And so you end up with this really awkward phenomenon where you can't totally decompose the policy making from the actual machine learning expertise. Sometimes you get these weird adverse effects where people think that they're regulating it effectively, but they're making it worse. For example, people will cut certain data out of machine learning training sets where, in fact, cutting it out just makes it, in some ways, harder to see that it's being biased. Whereas if you actually want to reduce sensitivity to factors like race, gender, or income level, you have to both include those things and constrain the algorithms to prevent them from being able to differentiate on those dimensions. So there's that level of technical sophistication that presumably the experts who are building the software may or may not be applying. But then there's the other layer of policy making that relates to going beyond just saying, well, be careful about bias, or you're not allowed to be biased.
And how does the jump work between asserting policy, potentially at the level of a state, and asserting policy at the level of an individual algorithm or class of algorithms? Anyway, sorry, long rant. But the question is really: I know you already said that's not the specific area of research, but do you see a technical literacy in the people, at least in the leaders, who are looking to regulate this? Or is it still kind of coming down from above and left to the individual LMS makers and analytics experts to actually interpret and act within that policy? Is there any transparency into success in the attempts to mitigate this?
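
The point above, that the sensitive attribute has to be present in order to constrain or even check an algorithm, can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the records, the group labels "a" and "b", the scores, the threshold, and the helper names `flag_rates` and `parity_gap` are all hypothetical and do not come from any system discussed in the talk.

```python
from collections import defaultdict

def flag_rates(records, threshold):
    """Share of each group whose score is at or above the threshold."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for group, score in records:
        total[group] += 1
        flagged[group] += score >= threshold  # True counts as 1
    return {g: flagged[g] / total[g] for g in total}

def parity_gap(rates):
    """Largest difference in flag rate between any two groups."""
    return max(rates.values()) - min(rates.values())

# Hypothetical (group, score) pairs; the group column is what makes the audit possible.
records = [("a", 0.9), ("a", 0.7), ("a", 0.4), ("a", 0.2),
           ("b", 0.6), ("b", 0.3), ("b", 0.2), ("b", 0.1)]

rates = flag_rates(records, threshold=0.5)
gap = parity_gap(rates)
print(rates, gap)
```

If the group column were deleted from `records`, neither function could be computed, which is one simple way to see why removing the attribute can hide a disparity rather than remove it.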

Speaker 2 11:00 – 11:00

Yeah. I actually might point you towards work by Alan Rubel, Kyle M. L. Jones, and Chase McCoy. They're respectively at Wisconsin, IUPUI, and Indiana University, and they're all thinking really specifically about learning analytics and what that looks like from an ethical and policy perspective. So I think they've dealt with it a bit more directly and substantively. But I think that there's a growing role that people with technical literacies are playing in policy making at the government level. We see these temporary positions in particular states where academics from computer science and information science, for example, are serving in some of those roles to advise on that. Kentucky and Nebraska, which I would not necessarily otherwise think of as exemplary in dealing with some of these problems, have really drawn on their public university expertise to help inform the discussion and some of the choices that are being made. On the technical side, I think we see a really big difference in the types of values and interests that are associated with types of actors in this space. And so, also to tie to Seth's question in the chat about the open source commitments by Canvas, this is also true for Moodle. We see a lot of these universities who have solved some of these problems themselves making the code available, making the plug-in available, and instructing people to run it locally. They do not want the data from some of these third-party flows, and in particular the problems around monetization. It's the plugins from all of these startups that I think are some of the most problematic in many cases, because they don't have a really good sense of how it will be used, or maybe they didn't even intend for this to happen in the first place.
So I actually spoke Monday, I guess, with someone who developed a plug-in to be used for K-12 to report COVID symptoms for the screening check that has to happen daily, so parents could do that at home and make sure that all of that was there. It was totally intended to be an exercise: is this possible, and how would we do this? It was not ready to be adopted. And people are adopting it. And he's like, I don't know how to secure this data. I don't have a privacy policy. I do not know what to do with this thing. And that sort of speaks to this. So

Speaker 4 11:15 – 11:15

There's a quick question in the chat about the term I used, latent correlate. I'm going to try to explain it concisely. Just imagine that you have a bunch of data about someone, and some of it is not cool to use to predict, like race, gender, etcetera. If there's other data about those people which is highly correlated with those factors, then if you delete the variables that you're not supposed to use, the results actually still end up predicting effectively as if they used the factors you took away, by teasing out the hidden relationships between those factors and the ones that stayed in the set because they weren't in the group that you removed for regulatory reasons. And this actually is a real problem precisely because people raise their hand and say, well, we're not biased on gender. We're not biased on race. Our data doesn't even have that. But what is actually happening is that those factors are correlated with things you do have, and so you get something that's still basically predicting on those factors. You just get to pretend like it's not.
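
The latent correlate effect described above can be simulated with nothing but the Python standard library. The following is a minimal sketch, assuming a made-up population: the 90% agreement between the proxy and the protected group, the biased historical flag rates (30% versus 10%), and all function names are hypothetical choices for illustration, not figures from the discussion.

```python
import random

def simulate(n=10_000, seed=0):
    """Generate hypothetical records: (group, proxy, flagged)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        group = rng.random() < 0.5  # protected attribute (True or False)
        # The proxy matches the group 90% of the time (think zip code or an interest).
        proxy = group if rng.random() < 0.9 else not group
        # Biased history: the True group was flagged three times as often.
        flagged = rng.random() < (0.30 if group else 0.10)
        rows.append((group, proxy, flagged))
    return rows

def fit_rate_by_proxy(rows):
    """A 'model' that never sees the group: the flag rate learned per proxy value."""
    rates = {}
    for value in (True, False):
        subset = [flagged for _, proxy, flagged in rows if proxy == value]
        rates[value] = sum(subset) / len(subset)
    return rates

def mean_score_by_group(rows, rates):
    """Average predicted flag rate for each protected group."""
    out = {}
    for g in (True, False):
        scores = [rates[proxy] for group, proxy, _ in rows if group == g]
        out[g] = sum(scores) / len(scores)
    return out

rows = simulate()
rates = fit_rate_by_proxy(rows)        # trained without the protected attribute
scores = mean_score_by_group(rows, rates)
print(scores)
```

Even though `fit_rate_by_proxy` never reads the group column, the average score for the True group comes out clearly higher than for the False group, because the proxy carries most of the same information.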

Speaker 2 11:30 – 11:30

When I teach this in my class, the example I always use is the lawsuit against Facebook, or rather against the Pennsylvania State Troopers, based on their Facebook ads, which were targeting prospective candidates based on an interest in country music, wrestling, and trucks as proxies for white male, which they couldn't otherwise use. So it's, I think, a pretty clean example of exactly what you're describing.

Speaker 6 11:45 – 11:45

And are there any are there any capacities in place to evaluate this stuff? Like, are there humans whose job it is

Speaker 3 12:00 – 12:00

to think about this and talk to other humans? Or is

Speaker 6 12:15 – 12:15

it just, like, each institution has their lawyers and their decision makers make the decisions? Like, given all

Speaker 3 12:30 – 12:30

of this ambiguity, who's responsible for sorting through the ambiguity? And maybe nobody.

Speaker 1 12:45 – 12:45

The crickets are a very meta answer to your question. Yeah. Maybe that's the answer to the question, the crickets. I like to end before the hour because we often go over, and I think that doesn't respect anybody's time. So with a little bit of a breather before our next Zoom meetings, I would love it if we could all unmute and give Madelyn a hand on three. One, two, three. Alright. Thanks so much. Thanks, Matt. We'll have the link posted before long. Madelyn has been on the Slack, so we can also continue the discussion there. Thanks, everybody.

Speaker 3 13:00 – 13:00

For those of

Speaker 1 13:15 – 13:15

you who stay after, I'm gonna end the call and then start

Speaker 3 13:30 – 13:30

up. Yeah.
