Jo Guldi and Brent Hecht: Maps, Computers, and Other Abstractions - Information Infrastructure and Legitimacy
RadicalxChange(s) | 2021-06-28 | 1:16:05
This conversation between Jo Guldi, Brent Hecht, and Matt Prewitt ended up being a wide-ranging discussion that surfaced essential ideas about getting more thoughtful about the boundary between public and private power by understanding what’s infrastructure and what isn’t.
Top Keywords
- infrastructure 0.018
- wikipedia 0.012
- google 0.008
- maps 0.007
- roads 0.007
- index 0.006
- brent 0.005
- public 0.005
- google index 0.004
- private 0.004
- abstraction 0.004
- data 0.004
Transcript
Speaker 0
0:00 – 1:06
This is a RadicalxChange production. Hello. I am joined in this episode by Brent Hecht and Jo Guldi. Brent is a professor of computer science at Northwestern and a director of applied science at Microsoft. Jo Guldi is a professor of history at SMU and a senior fellow with the RadicalxChange Foundation. The seed for this conversation was the question of whether Google's index of pages should be understood as a form of public infrastructure, and if so, why? It ended up being a wide-ranging discussion that surfaced ideas about how we can get smarter about the boundary between public and private power by getting smarter about what's infrastructure and what isn't. This could hardly be more relevant as investments in public infrastructure are dominating conversation in the United States. But perhaps we need to broaden our view from physical infrastructure to informational infrastructure, which might indeed be even more important. I hope you enjoy this conversation. I am Matt Prewitt, and this is RadicalxChange(s) with Brent Hecht and Jo Guldi.
Speaker 1
1:14 – 1:44
I studied with somebody who is into the built environment at Harvard, and then I was so enchanted with that world that I enrolled in a PhD program at Cambridge. I was in the geography PhD, and then I enrolled in an architecture PhD at the University of California, Berkeley just as that department was collapsing. And half of this went to history and half of this went to geography, and I wound up with a history PhD. Nobody in the history department had any idea why everything was a map, everything was about maps, everything was.
Speaker 2
1:48 – 3:08
What? Well, we can nerd out about that. I love maps. Maps are a wonderful reference system upon which we can organize an enormous amount of knowledge. I will say, to transition us into this discussion, it's, I think, not random that two people who are refugees from the field of geography found ourselves talking about this topic, you know, today. Like, I credit my geographic education with allowing me to have what has ended up being foresight with regards to the biggest problems in computing. You know, geography is a field that eschews abstractions, right? The first law of geography is: everything is related to everything else, but near things are more related than distant things. Usually we pay attention to the latter part, as it helps us define our statistical methods and these types of things. But the first part is super, super important. And it's an explicit rejection of abstraction, which is a core principle of computer science. And being able to think outside of the abstractions that computing has created, or had created, you know, in 2010, when I was doing my graduate education, you know, that has allowed us to get some empirical data in front of decision makers, in front of people who are interested in trying to, you know, bend the arc of societal impact of the tech industry towards something more positive, you know, when they're ready for those questions, versus sort of scrambling afterwards.
Speaker 3
3:10 – 3:29
So, Brent, I wanna push even further on that, because I find that this idea of eschewing abstraction comes up more and more when I talk to people who are interested in bending the arc of the computing industry in an attractive direction. I'm curious if you can say a little bit more about what that means to you.
Speaker 2
3:30 – 5:31
Oh, yeah. For sure. You know, I actually recall, so I'm one of the founding members of the ACM FAccT executive committee, which has been a wild and just amazingly, joyful isn't the right word, amazingly meaningful experience. It also is joyful. But, you know, that community is emerging as, you know, the primary, or a primary, publication-oriented community in responsible AI and, you know, increasingly is interested in the topics that RadicalxChange is interested in. And in 2018, I was in a workshop that preceded that conference where we kind of got into this. I started playing around with the provocation I use a bunch, which is: abstraction, the original sin of computer science. Right? Is that why we got ourselves into this mess? And, you know, a vigorous debate ensued. The provocation was successful, just because it is challenging the core principle that we teach. You know, when I taught my, you know, 200, 300 person intro programming class, that was, you know, a key topic. So we had a vigorous debate. And then some of my colleagues followed up on that with an amazing paper that, you know, we can put in the podcast notes or whatnot, that really unpacked, you know, some of the challenges that abstraction causes for computer science. So folks who are interested in reading that in more detail, you know, they can look at that paper. You know, broadly speaking though, like, we've trained several generations of computer scientists to only think within the, you know, very specific box and quite explicitly say that the other things are not their problems. And that's why you get well-known computer scientists in industry and in academia saying things like, well, what effect will this have on income inequality? Or even worse, like, assuming that they know the effect that it'll have, and it's gonna be positive because, you know, it can't possibly be anything else.
And I don't need to, you know, crack a book or anything to try to figure out any complexities along those lines. It's because, you know, that's a principle: engineering-wise, everyone's supposed to stay in their own lane. And, you know, to extend the analogy too much, that lane has taken us
Speaker 1
5:32 – 7:19
or those lanes have taken us to a place that I think the three of us find very concerning. I agree with everything that Brent is saying, and I'll just add to that. This morning, I was in a conference of digital historians who were talking about the study of nineteenth-century newspapers. They had computer scientists among them, and the same debate was going on there. Neural nets can imitate. They can imitate our photos, they can imitate our writing, they can pretend to be humans very effectively, but can you ask a neural net to analyze Shakespeare and have a conversation with you about what's distinctive about Shakespeare? Can you ask the neural net to show you its homework and what features of Shakespeare it was analyzing when it makes that assessment? That's much more difficult. And there's a lot of frustration in that community of people who are interested in text analysis with a level of abstraction that is so abstract from the text, so abstract from the actual data, that it can be a challenge. So in a conversation about infrastructure, you know, you started off by talking about the map. The map is kind of the primordial infrastructure. You draw a T and an O and look, it's the rivers of creation and the major continents. It's Europe and Asia and Africa. That's the original T-O map. Anybody who's seen a logo for the subway system or for the toll road has seen some rendition of that map. The interest in the map is as an infrastructure for thinking about the big categories that make up our world. And the real categories, Europe and Asia and the Indian Ocean, are symbolized by that map. So this conversation is about the kinds of infrastructures that matter today, and we should probably be as concrete as possible. What is infrastructure today? What is the infrastructure of the twenty-first century? Is it all wires? Is it all programs? Brent, do you have a definition of infrastructure?
Speaker 2
7:20 – 9:44
Oh, boy. That's interesting. You know, from a computing standpoint, you know, it's almost like it's infrastructure all the way down, like it's turtles all the way down. So, you know, it actually closely ties in with what we were saying about abstraction. Right? So one reason why abstraction is important is everyone can kind of focus on their own layer of a technology. And then as long as folks know the interface between this layer they're working on and the next layer that you layer on top of it, then, you know, everything's great. All of the information that's possibly needed can pass through that interface. And then, you know, we're building infrastructure on top of infrastructure. And I think one reason why abstraction can cause so many problems is then you have, like, this compounding means by which certain considerations are not taken into account, and that can have just ripple effects, you know, throughout the tech ecosystem. So to use the example that you were just talking about, one piece of infrastructure that's increasingly important is, you know, what Google calls the knowledge graph, what other companies call, you know, usually different types of graphs. And actually, you know, with regards to geographic applications, you know, this is for me, you know, one of the smaller problems with infrastructure and computing right now, but it is still a problem. There's an assumption. These systems are designed so that they've created an abstraction such that there is a great deal of certainty associated with things that don't have that degree of certainty. So an example, actually, that Evgeniy Gabrilovich at Google talks, at least talked about, you know, a number of years ago with regards to the knowledge graph was, like, does every country have one capital? Right? So, like, you assume, like, yes, that's the case.
But in fact, the notion of the capital is blurred or problematized in at least two countries. And that creates, you know, then downstream, you know, if you're using the knowledge graph as infrastructure, you know, how do you do inference on top of that? You know, what's the capital of South Africa? Well, you have to explain something outside of the simple, you know, one-to-one mapping abstraction, you know, something similar in Bolivia as well. So, yes, I've been talking a lot. I should pass things off to the next person who's gonna define infrastructure. So I think that's you, Matt. Well,
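The one-capital example above can be made concrete with a small sketch. This is only an illustration under stated assumptions: the names `Capital`, `NAIVE_CAPITALS`, `CAPITALS`, and `describe_capital` are hypothetical and do not come from Google's knowledge graph or any real system. It contrasts a one-to-one mapping, which silently discards information, with a one-to-many mapping that can represent South Africa's three capitals and Bolivia's two.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capital:
    """One capital city and the role it plays (hypothetical schema)."""
    city: str
    role: str


# The one-to-one abstraction: every country maps to exactly one city.
# For South Africa and Bolivia this forces an arbitrary choice.
NAIVE_CAPITALS = {
    "France": "Paris",
    "South Africa": "Pretoria",
    "Bolivia": "Sucre",
}

# A one-to-many model that keeps the uncertainty the naive mapping hides.
CAPITALS = {
    "France": [Capital("Paris", "national capital")],
    "South Africa": [
        Capital("Pretoria", "executive"),
        Capital("Cape Town", "legislative"),
        Capital("Bloemfontein", "judicial"),
    ],
    "Bolivia": [
        Capital("Sucre", "constitutional"),
        Capital("La Paz", "seat of government"),
    ],
}


def describe_capital(country: str) -> str:
    """Answer 'what is the capital of X?' without assuming a single answer."""
    capitals = CAPITALS[country]
    if len(capitals) == 1:
        return f"The capital of {country} is {capitals[0].city}."
    parts = ", ".join(f"{c.city} ({c.role})" for c in capitals)
    return f"{country} has {len(capitals)} capitals: {parts}."


if __name__ == "__main__":
    for name in CAPITALS:
        print(describe_capital(name))
```

Anything built on top of `NAIVE_CAPITALS` inherits its single-answer assumption, which is the compounding effect described in the conversation; the richer model pushes the complexity up to whoever has to phrase the answer.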
Speaker 3
9:45 – 10:38
I think we can do more to, you know, come up with a really useful definition of infrastructure here. But first, I want to resolve an interesting thing that has emerged, which is that on the one hand, we're saying abstraction is something to worry about. On the other hand, we're saying that infrastructure is good. Okay? And then we're also saying that maps are infrastructure, but maps are also abstract. Right? So in a way, a map is, like, the perfect illustration of what an abstraction is. So okay. So what are we really trying to say here? Are we saying, you know, do we want to make our abstractions more accurate? Or do we want to resist the temptation to turn everything into abstractions? Super good question,
Speaker 1
10:39 – 12:12
Matt. So, you know, it helps to pay attention to what the words used to mean. And this is something that I know because of the Yale map historian, Bill Rankin, who wrote about this in his wonderful history of twentieth-century maps and how different map making becomes in the era of the satellite. Right? And Bill says that infrastructure was first a French word, and it was a French word from Marxists. It was used to play on the Marxist term superstructure. So a superstructure is an ideology, and it's distinct from a base. The base is all of the kinds of exchanges that make some people rich, that make some people investors, which leave other people poor. So you have a superstructure on top of it that's ideology. And then you have infrastructure. And infrastructure is what happens when the ideology channels certain kinds of capital into structures. So that could be the road system or the train system or the factories. They're actually heavy. They take a long time to build. They tend not to go away. They tend to have some resistance in time. So the house that I build today, you know, 20 families could move into my house ten years from now, but it continues to structure who gets to live where, where capital is invested, for a long time. So an infrastructure is, yeah. I think Matt is really, really, really clever in drawing our attention to the fact that an infrastructure like a map can be simultaneously an abstraction and something very concrete, something linked to the world that's in the nature
Speaker 3
12:12 – 12:48
the nature of this material world around it. But we could look at a map as infrastructure to the extent that it's like a common resource that a bunch of people are using. Right? Yeah. Other examples of infrastructure, like a network of roads, seem a little different to me than a map. Right? Now you could look at a network of roads as a map, but when a network of roads emerges to connect people, you know, to connect settlements for some useful purpose, it's not initially conceived of as an abstraction
Speaker 2
12:48 – 15:23
in the same way that a map is. Well, this might get a little bit too pedantic, but I think, if you're looking through the infrastructural lens and thinking about what types of abstractions infrastructure creates and infrastructure affords, I think they're more similar. So, you know, maps are designed first as infrastructure for specific purposes. Right. So, you know, I'm looking, Jo has a map behind her that's designed probably more for, like, historical purposes. You know, we use Google Maps, designed for, you know, specific purposes. And it's great infrastructure for, you know, many of those purposes and in many cases. With regards to roads, it's somewhat similar, right? So, you know, I'm thinking about infrastructure through a computer scientist's lens, and there are certain interfaces that a road creates. Right? And certain assumptions that are made behind those interfaces. So, you know, for instance, here in Washington, there aren't a lot of sidewalks next to the roads. And so they're just assuming that roads are for cars. And in other places, you know, roads are designed for all sorts of different purposes. And, you know, there's a lot of cleverness that goes into it. Like, one thing that roads are designed for, even in the interstate, right, is safe travel at a reasonable speed for automobiles. And, like, actually there's an argument that one of the most important public health measures that we've ever undertaken as a country is the interstate and all the innovations it has. The curves on interstates are designed intentionally to increase safety, and the interstate itself dramatically reduced the number of people who died in road accidents. But, you know, it's confounded in certain ways when those abstractions fail. Right?
So, you know, people can't travel from point A to point B if there's a protest on the road. You know, these types of things, in the same way that, you know, I don't have a good view into some aspects of the human geography of the places I've lived just by looking at Google Maps. I don't understand, you know, average age and other demographics, economics of the area, which would really help me get to know the area and then the lived experiences of the people who occupy, you know, the various regions. So actually there's a view that both of these are infrastructure that assumes certain purposes and upon which we can build more infrastructure or take
Speaker 3
15:23 – 15:28
specific actions. Okay. I'm sold. So roads are infrastructure and roads are also abstractions.
Speaker 2
15:30 – 17:23
Roads. Yes. Or roads make abstractions in the same way that maps make abstractions. You know, Matt, I think you're making a more important point though, which is, yeah. So maybe we can decide if we wanna leave that as a more philosophical discussion. But the more important point is, yes, you're exactly right. You know, I'm primarily interested in making computing less harmful, making an AI paradigm that produces fewer harms, and I should say this is from a net standpoint, and more benefits, right, for a very large group of people. And in my responsible AI course at Northwestern, this is something I talk about. Chances are doing exactly what we've been doing is not the optimal solution. Chances are throwing out everything and saying we're not gonna use computer science anymore, that's not gonna be the optimal solution in terms of creating net benefits. Right? It's gonna be somewhere in the middle. Right? And therein lies the joy and the quest and the challenge, right, in trying to figure out, in, you know, a detailed way, usually, thinking through all the problems and thinking through how we might mitigate all of those problems and designing a system that works better for more people. So with regard to maps, for instance, like, there is an amazing set of best practices, many of which, you know, can be ignored by people outside the field. But there is an amazing set of best practices for everything from the map projection you use for a specific purpose to, you know, for instance, when you're talking about thematic maps, or what are commonly known as heat maps, ways to make sure that you're not accidentally or intentionally lying to the map reader in terms of the spatial distributions that are being visualized. And that took a long time and a lot of research and then a lot of teaching and a lot more research.
But I'm generally speaking quite optimistic that we can do that with some of the problems that RadicalxChange is interested in.
Speaker 3
17:24 – 17:34
So, Jo, how would you define infrastructure if we wanted to zoom in on this and come up with a workable definition? What would you say?
Speaker 1
17:34 – 23:01
Well, I'd say that one of the key features of infrastructure is that it's connected and that it has a material reality. So we could, you know, we could fancifully say that something like Western philosophy is an infrastructure that allows certain kinds of thought. But let's say that infrastructure has to be a material logic, and it has to be something that connects other people. So the infrastructures of the past that made enormous innovations include the grid system, laying out your city in a grid as the easiest way to plot land, to sell land, to make land fungible so you can grow a city on the frontier. Lay down road networks, and you can connect the economy. Lay down subways or trains, and you can speed up the time of exchanges. And then now you have a conceptual problem, which is, so what about other things that speed up exchanges? So something like the post office. Is the post office an infrastructure? It's not the same thing as a road map. Well, the post office also requires buildings. It requires physical post offices where there are letter sorters. They have maps, and they have ZIP codes. They've got a knowledge infrastructure that helps them to divide up all of those ZIP codes, and then they have actual mail trucks. So, sure, that's an infrastructure. What about the stock exchange, which speeds up exchanges? Well, the stock exchange also has physical systems of connectivity. So if you've ever seen the movie The Wolf of Wall Street, there is that scene where he's in front of an old TRS-80 computer, and he's gazing into the terminal, and he's seeing the numbers of stock prices flicker. And the point is that in the nineteen eighties, if you wanted to be a great stock exchange person, you needed an information infrastructure, and some part of that was physical.
A physical connection where you could get, on your terminal, those prices of exchange, that was part of what made you a trader, made you somebody serious that you had to deal with: you were physically at the point where you could get the information about the stock exchanges in a quicker fashion. So what does that tell us about our own time? That means we would wanna pay attention to the kinds of infrastructure that are enabling exchanges, or more rapid exchanges, today, and we'd be interested in some of the physical connections that make that possible. So that kind of infrastructure usually falls under a heading of broadband or net neutrality or the undersea wire system that connects some places on the globe better than other places. If you have the bad fortune to be a former colony, you probably have less access to the undersea network of connections. Maybe that's something that Brent can tell us about. But then we have other kinds of half-material infrastructures that enable other kinds of exchanges, and these include the online world of portals that make maps available. Now this is a little confusing because it partakes of both the material and the immaterial. So I've got a map. I've got a heat map from Google of all the places nearby where I can order a taco, a taco heat map. Is that infrastructure? Well, it's taco infrastructure. It helps my exchange of money for tacos go more rapidly, which is very important to the quality of my life. On the other hand, it depends on this other infrastructure of wires and broadband, which is mostly managed by private and semi-public entities, which allow us to be connected. So this begs important questions about what an infrastructure is and how we make those exchanges happen more rapidly when you have something like the Google search index, which indexes all of those taco shops and tells me what the best one is.
The Google index is a private index. It's a kind of knowledge infrastructure. It's held in one place. Google made it. Google provides it. It helps me find my local taco shop, but it's owned privately, unlike the system of roads or sidewalks or the post office or many other infrastructures in the past which have enabled more rapid exchanges. And so Matt and I got interested and started having a conversation that we invited Brent into after, about a month ago, the New York Times published an article about the Google index and some of the activists in Silicon Valley who have started to make an argument that the Google index should not be something that Google has the power to exclude other search engines from. Should an infrastructure be private? Should an infrastructure be public and shared? Is the Google index indeed an infrastructure? So that's what we wanted to talk about today. And I would just say provocatively, maybe Brent will agree with this or push back on it, but as a starting position, let me say that the Google index is very much like a map. It has the same effects as materiality. You can build other things on top of it. It makes other exchanges possible. You have to be connected to it in order for those exchanges to work. Is the Google index an infrastructure? Yes. And at the moment, it's a private infrastructure. And so that brings us into another subject of conversation, which is the history of infrastructure, of public and private. How do they work? And this is a big topic of debate and argument among economists and historians and other people who care about exchanges. Should your sidewalk be public or private? Should your road system be run on toll roads or public roads? Should you have a public interstate highway system? Should you have net neutrality and broadband that's operated in the interest of everyone? So next up, should the Google index be privately held or public infrastructure?
Speaker 2
23:02 – 25:27
It's a really good question, you know, and, Jo, I'm already enjoying learning a lot from you. So, one thing I'd say is, like, from a computer scientist's standpoint, your knowledge of this, you know, the theory behind this, you know, exceeds mine. And I think it's actually critical that we as computer scientists listen to you and folks like yourself, given how much you can teach us. But I can just tell you sort of the experience of a computer scientist with a bit of time as a practitioner and then also, you know, as a researcher. The way I view infrastructure is, you know, anything I need to rely on to, you know, get my current job done. And in that way, one can make the argument, actually, that computer science is in some ways one of the more infrastructure-dependent or infrastructure-focused fields, you know, that are out there. I'd have to think about that, you know, I won't stick by that bet and bet my, you know, my salary on that or anything. But, you know, it is interesting to think, like, my thesis, for instance, dozens and dozens and dozens of, if you go all the way down to, like, the ones and zeros, so many people, you know, contributed to the infrastructure that I relied on to add a little bit of knowledge, you know, to the world, now an increasing number of years ago. So, you know, the short answer is, like, yes. Anything that you can build on, build an application on, strikes me as infrastructure, and that would include search indices, that would include knowledge graphs. Trivially speaking, that would include the geographic version of the search index, which is, you know, Google Maps or Apple Maps or Bing Maps. So yes, for sure. And I'm super excited to hear from you about what that means with regards to how we can achieve, you know, as I mentioned, optimal outcomes for society with respect to ownership. One thing I think I maybe want to bring into this conversation.
I'm not sure if we'll get to it later, but it's interesting to reflect on Wikipedia in this context. And Wikipedia, speaking of infrastructure, is central to my thesis. It's essential to a huge percentage of people's theses in, broadly speaking, the AI world, and to the research and the products, you know, more generally. And it's not publicly owned, but it is a nonprofit and it is created by volunteers. And I've always been intrigued by the distribution of resources, and concerned about the distribution of resources, between those who rely on Wikipedia as infrastructure and,
Speaker 3
25:28 – 27:11
the folks who maintain that infrastructure and the non-optimal outcomes that may result. So one thing that I would like to draw out into the open, because it's sort of a subtext that we all probably have that listeners may not, is, you know, when we say something is infrastructure, there's a little bit of a hidden implication that it should be in some sense then controlled by the public. Right? You know, and then there's a whole array of interesting conversations to have about what that looks like. Like, what does it actually mean for control over infrastructure to be broadly distributed in the proper way? And there are very different conceptions of this, and the differences between the different conceptions matter a lot. Yeah. But in the context of Wikipedia, one thing I wanted to ask you, Brent, about is, so Wikipedia has made itself open to anyone who wants to use it in quite a radical way and has created an enormous amount of value that we have all enjoyed, me more than most. I think I've got to be in the most intensive group of Wikipedia users over the past twenty years or something. But at the same time, your research points out that there has been sort of a problem of appropriation, private appropriation of the value of this collective resource or this shared resource that Wikipedia represents. And I'm curious, you know, how do you look at Wikipedia historically at this point? I mean, did they succeed or did they, you know, can you tell us a little bit more about, you know, how it's been appropriated and how you see that? Sure. And I'm super curious to, you know, hear,
Speaker 2
27:11 – 31:05
hear from Jo about this too, but I'll try to reflect a bit quickly. Broadly speaking, you know, we were interested in a, you know, very challenging and important and extremely difficult to address in a short period of time research question, which is: how can we shift some of the power that's concentrating in large tech firms towards a more distributed outcome among the general public? And our hypothesis is this will be broadly very useful for the general public, as well as tech companies, which are currently facing many, many challenges associated with the concentration of power. You know, recent news about Facebook, for instance, concentration of power within, you know, their boardrooms and their product teams. And so we decided to start asking this question by focusing on Wikipedia's relationship to a lot of the, you know, the technologies that we use today, and hopefully measuring and making people aware of the value of that infrastructure, in the context of this conversation, to all of these magical tech products that seem so amazing when you don't know to open up the hood and see, you know, what kind of infrastructure is inside, just to totally scramble all of our metaphors so far today. And once we measure and make people aware of that value, build tools to help people take collective action around that value, as well as help people understand how that value can help them motivate policy actions. So for instance, you know, we hope our research will show, and I'll talk a bit about that, you know, now or in another question, show folks that, like, large tech companies are super reliant on this public resource. So it's not all that unreasonable to ask for regulation or to ask for a different distribution of the technological dividend and these sorts of things. And again, our hypothesis is this will make a healthier long-term tech ecosystem, even for the companies.
So one example study that we did, which I like a lot, is we actually built a Chrome extension that silently hid just Wikipedia links. Like, Wikipedia is super important to all of these AI technologies that are underlying search and, you know, underlying information retrieval more generally. And we decided, look, let's start simple. Let's just look at the links that people actually click on. And we built a Chrome extension that hid those. And we had a treatment group and a control group. The treatment group, you know, I'm simplifying a bit, but essentially used Google without Wikipedia for, you know, really one of the first times since Google existed. They were born, you know, roughly around the same time. And then we had a control group, you know, to make comparisons against. And what we found is that Google is a much less effective search engine for a large percentage of queries when it can't surface Wikipedia data. It's just, like, Google's about addressing user information needs. And ultimately, a lot of the way it does that is just surfacing this infrastructure for people, not doing any sort of real magical AI, you know, with the exception of the relevance assessment and these types of things. But ultimately, people, you know, click on these links. You folks know Nick Vincent. He's done some research trying to understand what percentage of queries search engines are reliant on Wikipedia for. And it ends up being, based on what we can tell, and it's hard to measure these things given that a lot of this is locked up behind closed doors, that they're reliant on Wikipedia more than any other piece of infrastructure, you know, content infrastructure on the web. And that to me is, you know, a great way to measure.
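The treatment-versus-control design described above can be sketched in a few lines. This is not the study's code: the original was a Chrome extension, while this is a Python illustration, and the function names `is_wikipedia_link` and `visible_results` and the group labels are assumptions made for the example. It shows only the core idea of hiding Wikipedia links for one group and leaving results untouched for the other.

```python
from urllib.parse import urlparse


def is_wikipedia_link(url: str) -> bool:
    """Return True if the URL's host is wikipedia.org or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return host == "wikipedia.org" or host.endswith(".wikipedia.org")


def visible_results(urls: list[str], group: str) -> list[str]:
    """Return the result links a participant would see.

    The 'treatment' group has Wikipedia links silently removed;
    any other group (for example 'control') sees the results unchanged.
    """
    if group == "treatment":
        return [u for u in urls if not is_wikipedia_link(u)]
    return list(urls)


if __name__ == "__main__":
    results = [
        "https://en.wikipedia.org/wiki/Taco",
        "https://example.com/best-tacos",
        "https://example.org/notwikipedia.org",
    ]
    print(visible_results(results, "control"))
    print(visible_results(results, "treatment"))
```

Matching on the parsed hostname rather than on the raw string avoids hiding unrelated pages that merely mention "wikipedia.org" somewhere in their path.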
And then hopefully, through this podcast and other means, making people aware of that relationship, the relationship between an application and, in the context of this conversation, infrastructure, will help them use that as fodder for public policy discussions. And then also, you know, this is an interesting one. There's just some news today about this, and I need to read up a bit more before I can talk about it with the expertise with which I'm hopefully talking about other things. But also help Wikipedia get the resources it needs to be the best infrastructure
Speaker 1
31:05 – 32:15
it can be. Yeah. So that's just totally fascinating. The data for Google is public. I mean, you put it in much more precise terms than that. Google uses Wikipedia's data, but it's crowdsourced, human-built data that Google used to make its search results so good. Yes. Just to draw that out a little, I mean, the implications of that are enormous. Can you build something private on top of something public? Can I erect my house in the middle of the public highway? No. No. I would get into lots of trouble if I tried to build a house in the middle of the highway. And it also brings to mind Creative Commons copyright. So I would take a photograph of my favorite Taco Shack and I put it on Flickr. I can choose how I share it with the public. Do I share it as a private piece of property, intellectual property that they need permission to use, or is it public? But if it's public, they have all of these options for making it share and share alike. You can use it, but you have to share. So what we're learning is that Google has effectively been using publicly shared, share-and-share-alike data, but they haven't been sharing alike. And that shapes my moral sense of right. I'm not the lawyer. Matt is the lawyer, but it's
Speaker 3
32:15 – 33:07
it's a question of justice. Many people would be like, wait, you're not supposed to build your house on top of the highway. What are you doing, Google? Yeah. But I think there's just this interesting question of whether the open source movement made a mistake by allowing commercial use. You know? Super interesting question. To illustrate the point that you both have just made vividly, I almost have a distinct memory of somewhere around 2001, 2002. Wikipedia was my home page, and then I realized that Google searched Wikipedia better than Wikipedia's search function searched Wikipedia. Yeah. And then I made Google my home page. And, you know, this is the basis of the entire empire, basically. Right? If Google hadn't been able to build its walls around that house where the primary value of the Internet to me was, which is to say Wikipedia.
Speaker 2
33:07 – 34:20
Right? If they hadn't been able to swallow that up, I wouldn't have helped to build their index, among other things. Right? Yeah. I should note, this notion, did the open source movement adequately consider the implications of commercial use, I think it's an interesting question. I actually, after having thought about this for a while, think that it's important for us to be thinking about other approaches, or ways that we can try to get some of the benefits of the open source licenses that are available while at the same time making sure that that doesn't mean we're establishing power structures that exacerbate existing inequalities in power. But I should note that there are gonna be many Wikipedians who think of this as a massive success and don't see anything wrong with it, right? That essentially their work was scaled. They created this public infrastructure. And we have to remember it's not like a forest in which the forest goes away. Right? The data is available to anyone and is infinitely replicable. They've created this, like, permanent forest where the trees grow back right away as soon as they're sawed down. And through that lens, it is amazing. Right? So they can take a lot of pride and a sense of accomplishment in helping to create some of the most amazing technologies that humans have ever created. Where my concern comes in is that, again, I think there are externalities
Speaker 3
34:21 – 35:36
to that relationship that go beyond pride and accomplishment and are becoming more concerning. Sure. And, you know, this just goes back to that question of control over infrastructure. Right? So if we build these things that lead to other things and then flower and become super amazing, and then we all become reliant on them in the same way that we're reliant on road networks or maps or whatever, the question then arises, essentially, of how we share control over the infrastructure that we all depend on. This is where it connects back up to the question of abstraction. Right? So when we build infrastructure, and also when we create abstractions, there are ideologies that go into that in terms of what we represent on the map or what we don't. And, essentially, when there's private or concentrated control over infrastructure, the exact same thing is going on. Right? There are particular interests that are being represented in the infrastructure. But, you know, I'm the first to acknowledge that a lot of people in the open source movement would call that a success. I think I would push back, you know. I think I would argue that in spite of the many amazing things that have happened, part of the picture has been missed in whatever abstraction we were using to evaluate success. And,
Speaker 2
35:37 – 38:08
that's what we're trying to deal with. Right? We're trying to figure out how to share control over... Yeah. The analogy from, you know, the old geography days is that the open source movement is building roads. It's only gonna empower people who have cars. Right? And I feel like we're getting a bit of that dynamic going on with the current licensing infrastructure. One sort of interesting case, and, Jo, maybe I'll toss things to you to reflect on this as well, because my biggest question for you actually is: when is it good for the public to have full control over infrastructure, or some kind of partial control? You know, which models of infrastructure are best for which types of outcomes in which contexts? I'd love to learn more from you about that. One sort of interesting case here is Google's dependence on Wikipedia. And I should note that Wikipedia is also dependent on Google because of the position everything's in. The paper was called the mutual dependence of Wikipedia and Google, colon, something; I can't remember exactly. But the one case that's been concerning me lately is there's a community and repository of spatial data called OpenStreetMap. And it's very often called the Wikipedia of maps. Incredibly useful dataset for all sorts of things. And if you look at the bottom right of some slippy maps that you'll see on the web or on your phone, you'll see the attribution to OpenStreetMap contributors. It does power a growing number of technologies these days. Their licensing situation is even... well, maybe that's a can of worms we wanna open. But Google, which can use OpenStreetMap if it chooses to, has historically asked, and is now once again asking, people to actually submit edits directly to their map. Right?
So essentially, they have a huge amount of traffic, and that traffic allows them to build better infrastructure. Right? So, for my wedding, I remember doing this. One of the roads to get to the site of my rehearsal dinner was not correctly encoded on Google Maps. At that point, that was five-plus years ago. I can remember that because every year I get an N plus one there. And it put me in a bind, because I knew that everyone would be using Google Maps as infrastructure to get to where the rehearsal dinner was. But I also was beginning to think deeply about these types of situations, like, well, wait a second, if I'm gonna be doing some work, shouldn't I at least do the work in public, you know, help out the public infrastructure, which is OpenStreetMap? So I actually attempted, since I identified this early on, to edit OpenStreetMap and hope that it propagated to these other platforms. It didn't work. I ended up having to make a suggestion for Google Maps, as I recall.
Speaker 1
38:09 – 38:27
Yeah. So go ahead. Rehearsal dinner, you were editing OpenStreetMap. You were so concerned about the public that on your way to get married you're not adjusting your bow tie, you're adjusting OpenStreetMap. I mean, mad respect, Brent. That's wonderful. It was
Speaker 2
38:28 – 38:51
not super far outside of something I would do. In this case, I'd noticed it early, like, a couple of months ahead of time. But that is definitely something that would be a noise. I think another way to frame that is you could say, well, Brent, you were working on editing OpenStreetMap instead of, for instance, you know, helping. I had to stay up late to help out with the invitation list and all these types of things as well. But, yeah, Jo, teach me what you're thinking.
Speaker 1
38:51 – 46:41
Let me teach you about what I'm thinking. Give me that beautiful invitation. Good. So I'm somebody who thinks a lot using historical parables. You know, what was infrastructure like in the past, and how did we have to regulate it in order to get to the common good in the past? That's how I process most of these. That's gonna be my value added. So let me say that one position that we could find in the past and in the present is a kind of hardcore libertarian approach. The hardcore libertarian approach just disbelieves that we need to have any public repositories for anything, because people are selfishly motivated. They need to see that they're having a reward in order to introduce new innovations. And so these people would look at the history of toll roads and say toll roads were a great moment in westward expansion of roads in the early colonial United States. Toll roads were a great moment in building up the infrastructure of the early industrial revolution in Britain. Toll roads are great. We have private companies today that are laying broadband and charging you for the quality, and they should be doing this because this incentivizes them to do it more. Google built infrastructure that nobody else had built. Maybe Wikipedia should be a for-profit service. Maybe that would solve everything. So, you know, there's a seed of truth in that, just like there's a seed of truth in everything, but let's look at some of the other models. So another model is sunsetting. In sunsetting, you take a system of private rights which rewarded somebody for being the first person to invent something amazing, and then you say, but maybe you shouldn't benefit from this forever. So this is the opposite of the Disney paradigm, in which Disney keeps on magically extending its copyright forever and ever and ever so that they can keep on getting rich with Donald Duck. In sunsetting, that's not supposed to happen.
That's supposed to expire, and Donald Duck is supposed to go back in the public domain. And sunsetting is a big phenomenon in the history of infrastructure. So in eighteenth century Britain, they start making the roads, and then they improve the post system, which was previously something that only aristocrats could afford to use, under really exigent circumstances. It was expensive and slow and irregular. After 1785, the post system is standardized, and they have standardized horse-drawn carriages departing on time, arriving on time, punctuated by a special bugle call, special carriages. Okay. So the post coach system is regularized. One of the components that makes the regularization of the post office possible is a map. And the map has all of the information about which roads to use, which roads usually don't get plowed for snow, and how long it takes to use different road systems. And that map is compiled by a man named John Cary, and he does it at the request of, and getting paid by, the post office. But in return for compiling that map, he also gets a copyright on it for twenty years, during which he gets to be the only person who sells the map of Britain which is correlated with the post coach system. So he gets to sell it from London, and does this make him a lot of money? Yes. It makes him a lot of money. Then the copyright goes away, and then everybody else can make maps, and then maps become super cheap. And maps become something that you can get printed on a handkerchief or in a cheap paper edition. And we have travel maps for the first time, which is a big innovation, because maps used to be like this huge heavy volume that you keep on your coffee table and never move because it was heavy and expensive, and there were only a couple of them. Maps become cheap and popular because Cary's copyright expired. Maybe that's a good metaphor. Okay. So we have hardcore libertarianism. We have sunsetting. We also have severing.
So in severing, you say, this thing that we thought was one piece of infrastructure is actually two pieces of infrastructure. Let's pull these apart. So the private railway companies build some railroads, and then they have cars. And the way it used to work is that the railroad company's cars go on that company's railway system. But then what if you want to do cross-country travel and you have to use multiple rail systems, where you have one set of cars and you don't wanna unload it when you have to get on another set of rails? So you decouple those. You sever the different interests and you say the private rails are going to be held in common and the cars are going to be separate. So think about the way this works today. Taxi cabs are private companies, every cab driver has a medallion, and they are completely different from the public entity that paves the road. If it didn't work that way, it would be extremely confusing, and you wouldn't have nearly the independence or freedom to operate or to create Uber. Okay. So you can sever the different rights and resources. You can sunset the rights. You can sever the rights. And then the fourth perspective that's really crucial is maybe the oldest of them all, and that's owning things in common. What we know, of course, thanks to the research of Elinor Ostrom, is that owning things in common doesn't mean that it's a free-for-all. It doesn't mean that Garrett Hardin is right, and it's a tragedy of the commons because everybody's gonna bring in their cow and graze in the commons. No. No. No. Commons have rules. You know, there are rules about who can participate. There are rules about what you do when you participate. So think of the highway as a commons. Not everybody can drive their car. You have to get a driver's license. You have to have your wagon licensed, so it's not gonna break down.
That's one of the first things. Did you know that in the eighteenth century, there were driver's licenses and license plates on wagons? There were. It's an ancient institution, because you have to protect your commons as soon as you have it. So you can have a commons-type system, which is run on behalf of the public. And in the modern world, we tend to assign states, public entities, to run that with accredited forms of experts like engineers. If you're going to be a civil engineer and work on the road system, you have to pass a state-appointed exam and get your engineering stamp. Otherwise, they don't let you near it, because you don't know all of the functions, and the bridge you build over the highway system could fall down. We have many, many, many checks on the driver, on the car, on the engineer who builds the bridge, on who gets to do the engineering, before you can go anywhere near the interstate highway system, and then it serves everybody. So hardcore libertarianism, sunsetting, severing, and commons systems are four of the approaches that we can use to approach public infrastructure. And so, to return to the question that started this conversation, the Google index: we see that the Google index is built upon the data collected by a public entity, which is to say through Wikipedia, and then the Google index advances that somewhat. Do we think, like hardcore libertarians, that Google, well done for being innovative, you should get to keep all of your profits? It sounds like an argument that I might make if I were a corrupt entity. Like, if I had a lot of Google stock, I might make that argument. Do you believe in sunsetting? Like, okay, so Google helped to make Wikipedia indexed, and that was a good thing, but you get to do that for five years, ten years, and then that index needs to return to the public so other people can build on top of it.
Can it be severed, as in, can Google's map of Wikipedia, or Google's index of websites in general, be severed somehow from a different part of the data structure? And maybe this is where Brent's computer science might help us to understand what that severing would look like. Or is it all one big commons, and we need some public entity to come in and mediate this and say, only accredited computer scientists can create data structures? I don't even know what that would look like in this case. But what is the commons? Is the index itself a commons? Wikipedia looks a lot like a commons that's made by the hands of many, but not everyone can make an edit to a web page. You have to be accredited, and you have to go through many, many checkpoints before you can edit the page on the Holy Bible. You can't just say, no, I believe xenon gave us
Speaker 3
46:42 – 48:41
gave us the word. So can I try to put a gloss on that, and you can tell me if I'm getting it wrong? Yes. So it seems like, you know, when you think about those different approaches, the first three approaches are all different variations of basically locking things private or locking them open. So in other words, the libertarian approach is: everything is private property. Everything is, you know, fully proprietary. The sunsetting and the severing approaches are limits on that proprietary approach. So in other words, things are private until they're not, and then they're open. And the fourth approach, the sort of common ownership approach, to me, what that's doing is exploring the territory between those two ends of the spectrum. So there, on one end of the spectrum, we have locked private. On the other end of the spectrum, we have locked open. And there's this whole territory in between them, which is what Ostrom is exploring, which is really what the entire idea of governance, basically, like, that whole subject matter, to me, is about: exploring that space in between locked private and locked open. And there's tremendous richness within that. But, yeah, I mean, I'm just really interested in governance, basically. I'm interested in that space in between those two things because, as I signaled before, I kind of think Wikipedia, by locking itself open, is great in a way. It can do great things, but it can also be a mistake, because it can allow things to get appropriated. So in other words, if all of the value in the world was just laying out there open for anybody to pick up, then it would be like, whoever has the biggest vacuum cleaner to vacuum it all up will end up vacuuming it all up, which is what people worry about with, you know, data processing and AI, effectively.
Speaker 1
48:42 – 53:59
Yes. I think that's right, Matt. Let me just add one more provocation. But I think, yeah, in the world, if I think about people who are coming and going from Silicon Valley, I think there's an appreciation for these different modes in a very general sense. If you're somebody who is engaged with the debate about Creative Commons, you know what I'm talking about with share-and-share-alike licenses. If you're somebody who uses the R repository, the CRAN repository, then you're used to finding code which is held like a commons, managed like a commons, and is often, you know, rationalized in a way in which Python isn't, because of the CRAN repository. One of the things that we're not really good at thinking about right now is the question raised by Ladonya, Kian Mepaga: who's gonna pay me? Who is gonna pay me for doing this work? In the world of open source software, and even open source maps or Wikipedia, it's presumed that you have a world of voluntary effort. Now think again of the interstate highway system. Those civil engineers who were improving the roads are not doing that out of the goodness of their hearts, or because they're video gamers and they're taking this on as a quest for knowledge, or it's their private art practice, or it's a hobby, or anything else. They're doing it for pay. And they're doing it for pay essentially because civil engineers were very clever and they unionized early on. They didn't call it a union. They called it the Royal Society of Civil Engineers. They did it old school. They asked the queen to bless it. So it's not strictly a union. It's a little bit more hoity-toity, and that means that they have a journal and a conference. That's the only difference between unions and royal societies, as far as I can tell, and the queen blesses you. But, otherwise, it's a union, because you all stick together and you say we have to get paid.
You have to do certain things to be a member of us, and then we get paid. So one of the reasons why this question about the index is so important, and how the index is imagined, you know, is it sunset, is it severed, is it a commons, is that you wanna imagine a world in which the people who are improving the index, or improving Wikipedia or OSM, are somehow getting compensated for the work. And at the moment, the only way for anybody to get compensated is by working for Google. So, you know, if I step back for a moment, and I'm open to other interpretations, but I really step back and I think, well, what's gonna happen? Well, the reason why you want the state involved with roads, and the state wasn't involved for, you know, a hundred years before the state starts improving the post office, the reason why you want the state involved is that the state is very good at saying, okay, fine. We'll hire people and we'll pay them. We'll pay them what they tell us to pay them. So it's the state that hires the civil engineers, after years and years of private entities hiring whoever they hire. So if you wanna imagine the next iteration of what happens if an index is a public vehicle for anybody to build a search engine on, then you need a state or a state-like entity, which could be like a mega nonprofit, a mega Wikipedia corporation, to say, okay, we really care about making sure that good data is available, and we're gonna do this. We're gonna hire people who are really good at this job, and we're gonna hire them at the right rates, and we're gonna continue to build and sustain and maintain and even improve upon this public infrastructure. Now what would make them do this? Right now, I mean, you'd have to be a crazy zealot on behalf of the commons. I mean, like me, but I don't have any money. You'd have to be a crazy zealot in order to make that happen.
Historically, the only reason why you've gotten the birth of civil engineering is that you have some sector of underserved people, who are not being served very well by the way the infrastructure was set up, who say, you know, it's nice that you have roads over there in that half of the country, but over here in Wales, we can't get to the market. Over here in Ireland, we're stuck here for good. You know what would be awesome? If you connected Ireland to the rest of Britain, connected Scotland to the rest of Britain. How about people like us? And remember, in the eighteenth century, these are racial categories. I don't know, guys. We're all well educated here. Is there anybody in America who is not being served very well by our broadband infrastructure? Have we heard any complaints of that kind over the last decade? I don't know. Any, you know, initiatives to collect data about Black Lives Matter or women's experiences, any complaints whatsoever about Silicon Valley companies? Probably not. Probably everybody's super happy with the way things are paying off. So there's no reason why people would have come together and started to imagine a larger overarching entity, either a state entity or an independent nonprofit, that would pay people to build an infrastructure that really did serve everyone, and make the legal case that the index should probably be a public infrastructure, because states have always been very comfortable taking toll roads, taking railroads, and taking any maps and any other form of infrastructure and commandeering it for the public good when there were underserved communities who could not be connected to the economy on their own terms.
Speaker 2
53:59 – 56:22
So, Jo, my head is exploding. That was amazing. Thank you. So one thing I wanna say, for anyone who's listening: I think what Jo did there was a fantastic example of something that I get to benefit from through my interdisciplinary approach to computer science, which is easily understandable, amazing frameworks and knowledge that can help me orient, that provide infrastructure, or a reference system, for some of the challenges that we are facing, and provide an amazing set of knowledge relationships that can be, to again overuse this concept of infrastructure, infrastructure for further and deeper collaboration. So that was wonderful. You know, what you just said about infrastructure, what you just said right now in terms of payment infrastructure, the Wikimedia Foundation is exactly this. Right? So I cut my teeth in computer science doing one part of computer science's first sort of deep look into what we now understand as algorithmic bias. I was trying to understand, effectively, given that Wikipedia is so important at teaching, you know, it is a way, or the way, that so many AI systems learn about the world, what are the long-term implications of that? Where is it gonna be effective? Where's it gonna be less effective? And that led to a lot of content bias analysis. And we know, actually, if you just look at the different language editions of Wikipedia, you see dramatic differences in what gets covered. So you can only imagine what other differences exist that don't cut across language boundaries, right? So, you know, Wikipedia is predominantly edited by men, for instance. So the Wikimedia Foundation is in many ways trying to do this, right?
They take my donation dollars, they take everyone's donation dollars, and try to get initiatives going, like hackathons to focus on women scientists and these types of things. That is exactly the dynamic that you're talking about. One concern for us, and one reason we started this work, and again, we thought about it mostly in terms of power structures, but it fits exactly what you're discussing, which gets me very excited, is that the infrastructure is suffering because Wikipedia gets such a tiny, tiny, tiny, tiny, tiny share of the benefits of the infrastructure that it's building. You know, I think there's an analogy to taxation or something like this, right? Where it's like, if you spend no money on the road, you know, it's gonna be a problem. So that was just fascinating. It's just fascinating.
Speaker 1
56:23 – 58:59
I mean, just to follow up on that last thought, it's true. The civil engineers could have formed a civil engineering foundation, analogous to the Wikimedia Foundation, when they got started in 1820, and started building roads on their own. The only problem is that all the civil engineers can do on their own is show up with their hat in their hand and say, do you want to pay me to be open? And often the answer is, not right now, I don't see how, I don't know why. But the state can decide, well, we're gonna pay it to be awesome, because we can tax people. If we build something in the name of the common good, and it's gonna connect all of these populations who weren't previously connected, it's gonna satisfy their interests. They're gonna become active in the economy in a new way, because there's gonna be a search engine which isn't implementing racial profiling or gender bias or, you know, showing Black 10-year-old girls who Google the phrase "black girls" pornographic images. I mean, that's a terrible thing. Right? Once you see that, you can't unsee that, poor 10-year-old girl. We don't want that in a good society. Who wants that for our nation or our world? The state can say, I'm gonna tax everybody. I'm not gonna tax you very much, but I'm gonna need a dollar from everybody so that we can build something where, when 10-year-old Black children Google "black girls", they don't see pornography. Great. Okay. So then, you know, the Wikimedia Foundation becomes a really important chrysalis for something that could come out of it that would be bigger. I mean, an index of world knowledge, an index of culture, something that's not just built on AI, but is built on the actual contributions of human beings, including some of our values. Right?
And this is, you know, part of the libertarian fantasy about AI, the way that I take it, and maybe, Brent, you can draw this out for me. But part of the fantasy is that the AI is just gonna figure that out. It's gonna figure out how to think about getting more that our economy is gonna be the right people, but, you know, it turns out that the AI sometimes reproduces biases, for example, in the Google search engine. So, yeah. So, the Wikimedia Foundation: I think you've done a beautiful job of connecting for us how important Wikipedia is as a player in the space. What would you imagine if there was, like, you know, a son of Wikipedia, a daughter of Wikipedia, the Wikimedia Foundation that got really big, like a UN, the UN... Yes. Larry's Wikipedia. What would that look like?
Speaker 2
58:59 – 62:56
You know, that's such an interesting question, Jo. And, you know, I'd have to form connections to existing ideas and new ideas on the fly here. So I'll try to reflect a bit. I'm laughing. You found the wrong computer scientist to defend a libertarian view of, you know, artificial intelligence. That's not gonna happen. You know, the views of AI that you encounter when you hear that perspective are views that are heavily rooted in abstract and failed abstractions. I'll put it that way, to bring us back to the beginning of the conversation. Let me make one point and then, you know, reflect more broadly on your question. Computer science is so dependent on, you know, this is something I was thinking about, your holding-things-in-common model. And a lot of what I'm talking about, I mean, we do reflect on our research through this lens, is thinking about what's happening through the dynamic of the tragedy of the commons. So let me just name some of the, sort of, holy, like, things that are in the commons, and there are some complications with some of these, that are just essential to everything we do on computers. So, well, one, of course, everything that happened during World War II with the United States and computation, but we can zoom forward a bit. So the internet was a commons project. GPS was something that the United States developed for military purposes and then made available, in a very, like, discreet way, to the rest of the entire world, not just the United States, as infrastructure to support any application that involves spatial referencing. So Google Maps, everything, is hugely dependent on that investment by United States taxpayers. Now increasingly also taxpayers from other countries too, who are setting up their own systems.
Wikipedia, we already mentioned, tons and tons of open source code, all these things kept down in the comments. And I think it's important that we in computing recognize how tied we are to that model, if only to provide an altruistic incentive, and other related incentives to encourage us not to behave in, ways that will will damage the comments. You know, with respect to what what a Wikimedia foundation might look like that takes on these types of this this type of broad role. But again, I'd say it actually is is serving this capacity is just under under resourced. You know, something that does come to mind and this is this is all sorts of interesting downsides, but, you know, we literally have a research paper that, you know, with respect to the road example, showed that Wikipedia has better quality articles about places in urban areas than in rural areas. And, if we're thinking about that data as some kind of infrastructure for, you know, that that what that means is all these AI systems have better quality understanding of, of urban areas and then rural areas. So one might think about what are ways that we have served rural areas in the past and, try to replicate that. It, you know, that is, you know, very nation based, but, you know, I'm again, sort of thinking out loud here. One can imagine a, you know, department of data infrastructure. I mean, this is it build it off the census because the census kind of already, you know, has been a world leader in many cases, you know, in providing other types of data infrastructure that does try to create, actually come to think of it. The census does make up the bulk of many rural Wikipedia articles. It's a template generated from census data. So maybe they can write a little about the history and these types of things as well. Thinking out loud, I don't see too many differences in that relative to doing the public works projects that happened in the thirties. 
And arts funding, you know, and and and these types of things. So that's the you'd imagine sort of a state actor that is participating in these these sort of knowledge infrastructure communities, OpenStreetMap, Wikipedia, and many others you can imagine as well. And that would fuel, you know, a ton of innovation and and making sure that that innovation is then
Speaker 1
62:57 – 64:23
taxed, or other resources flow back into the state, to make sure that everyone's benefiting from it. I'll just throw out three more quick references, just to bring things around to the history of reaching out to rural communities. Absolutely, Rural Free Delivery for the US Post Office was a big deal. Rural electrification was of course part of the New Deal. It's amazing to think about it. You know, it seems like everybody has electricity, but if you were hanging out in rural Alabama in 1930, it did not seem like everybody had electricity. It seemed like that innovation was never, ever, ever going to get to rural people like you until the government stepped in. For studies of how the algorithms and indexes at the moment jeopardize Black communities, that "googling Black girls" anecdote that I pulled out is from Safiya Noble. Her book Algorithms of Oppression is a must-read on Google and so on. Data could be structured otherwise. It hasn't been structured otherwise, in part because the institutions have not been incentivized to serve all communities. I mean, just think about those rural communities and communities of color as being on the same page. They're underserved. And then I would add to that Catherine D'Ignazio and Lauren Klein's book Data Feminism, as bringing in the question of data literacy and women's participation in these knowledge infrastructures, as another piece of required reading.
Speaker 3
64:23 – 66:44
Matt, what else do you have for us? What else should we talk about? Well, I'd like to suggest a thought complementary to what you were just saying, Jo. I'm going to risk, you know, bringing in a more sort of legal or political lens here. [Go for that.] I think that sometimes when we think about institutions that are in the position of governance of infrastructure, we have this tendency to talk a lot about how they're incentivized. In other words, are they incentivized to serve all the right people? There's another word that kinda gets at the same thing, which is, how are they legitimated? I think this is sort of an under-examined angle. So in other words, when we think about the source of the legitimacy of a government's power, a government's power is legitimated through the sort of social fabric. Right? In other words, it's bubbling up through the social fabric. And this is why the apparatus itself is, like, constituted by the social fabric, and so it's therefore legitimated in the right way. And then if we think about a private institution, right, the legitimacy of the power that a private institution wields is just posited. Right? It is just an abstraction. We just say that private institutions are able to do whatever they want with what they own because that's what it means to own things. And what you're suggesting, Jo, about the Wikimedia Foundation, or a lot of the work that we're doing at RadicalxChange, thinking about these kinds of intermediate civil-society-type organizations, relates to this.
It relates to this idea that there's a kind of public-flavored power that can sort of pervade society, which isn't necessarily state power, but it can be like the kind of power that unions exercise, or the kind of power that publicly responsible nonprofits like the Wikimedia Foundation might exercise. And that sort of power is legitimated in a different way than the power of private organizations. It's simply the other side of the same coin, but I think that that language and that analysis is a really important complement to the language of incentives.
Speaker 1
66:45 – 67:12
That's helpful, Matt. And could you just help to connect that for us? On legitimation, I take you to be saying that the legitimation behind the Wikimedia Foundation gives it a more recognizable moral purpose in our world. Is there a way in which you can connect that to taxation and the compensation of labor that would build a different kind of index? Can you help us to see that?
Speaker 3
67:13 – 68:16
Well, you know, the government has the power of taxation because... you know, similarly, organizations like unions exercise whatever power they exercise if and only if they are legitimate. And this inescapably is sort of a philosophical inquiry about the nature of whatever institution we're talking about. I think it's beyond the scope here to talk about whether the Wikimedia Foundation could levy taxes. I mean, you know, no. But you see what I mean. I think that we do need to be a little bit more creative about the kind of power that we want organizations like the Wikimedia Foundation, or like unions or whatever, to exercise, instead of getting stuck in this dichotomy of state power on the one hand and narrowly defined private power on the other.
Speaker 1
68:18 – 68:49
I mean, fair enough. The union says we've got to raise wages. The company has to be the one to raise the wages, and then the union officers are going to get a cut so that the union survives. So part of what Brent has been hinting at here is a wider role for the Wikimedia Foundation, perhaps one in which big Silicon Valley compensates Wikipedia and Wikipedians for their work. Is that something that people are actively imagining, Brent?
Speaker 2
68:49 – 71:27
Yes. Actually, there's news today from the foundation. And, you know, full disclosure, we've had conversations with them, and one of my former graduate students works at the Wikimedia Foundation. So take what I'm saying with that grain of salt. I have yet to learn about the final decisions that are associated with the announcement today, but they did launch a paid API for enterprises to consume the information. And I should stop there, because I need the facts before I form an opinion. But what I can say for sure is I'm glad they're thinking about ways to make sure that they have the resources they need to keep the infrastructure moving along and, you know, up to date and very, very high quality. You know, Matt, maybe a nice bow tie for us is that this term, legitimating, is one that perhaps describes my research pretty well. When I say we wanna measure and make people aware of the value that they're bringing to AI technologies through the data that they create, in many ways we're saying: you are legitimate stakeholders in this ecosystem. You have more power than you think. Right? There are certain actions that are legitimized when you understand you have that power that aren't legitimized when you don't. So, you know, this is old news for you, Matt, but we talk about these techniques called, like, conscious data contribution, where you can be very careful about where you decide to send the data that you generate. I don't like the framing of "your data" for a number of reasons, one of which is that some of the most valuable data that we all create for AI systems isn't about us at all. It's about the world, and Wikipedia is a great example of that.
But yeah, this is this notion of, effectively, legitimating some of these institutions, like what Jo is talking about, whether that is legitimating the state's involvement in these matters, or other institutions, like some of the intermediaries that you're talking about. I was actually thinking we should tell them all to launch a journal and talk to the queen, and we can call them all societies. And whether it's those types of organizations or the state itself, I think that's ultimately maybe what we're talking about: the people outside the tech companies have legitimate power over these infrastructures in ways that maybe aren't commonly known. And one thing we're trying to do is make sure that folks understand that, and give them the tools they need to take that legitimate action.
Speaker 1
71:28 – 71:46
Are there concrete steps? I mean, if I was a dissatisfied employee at a large search engine company, or someone who had volunteered my labor, are there steps that I could take to help my employer to see that I am part of the force that legitimates their success?
Speaker 2
71:47 – 74:33
So employees are a different set of stakeholders. I'm actually optimistic about what I'm seeing with regards to their understanding of their power in the ecosystem. And this aligns, you know, I think it was Matt Yglesias or someone who alerted me to the notion that tech employees are probably the ones who would benefit most from a union, because there's so much leverage there. But these are the paid employees. Actually, one thing I do in keynotes when I talk about this general subject is I printed out fake badges, and I just hand them out to people. It's like, you are actually employees of all these companies. We talked about Wikipedia. But when you search for something and you click on the seventh link instead of the first link, well, you're teaching the search engine a lot of what it knows about how to do relevance ranking. So everyone's in a... with regards to that, that's actually where, you know, Matt, Nick Vincent's work is headed, which is thinking about how we can build a language and a framework and a set of tools to help people take collective action. So you can imagine someone like that saying, you know, I'm fed up. I'm gonna stop reviewing things on Amazon. And I really want Amazon to recognize that this is a collective enterprise and that I have legitimate power in saying, Amazon, just change your privacy rights, you should pay these taxes, etcetera, etcetera. So Nick has identified sort of three types of collective action strategies. There's the data strike, where you organize a bunch of people to stop interacting with Amazon, or to stop rating and reviewing products. With that one, we sort of understand the statistical properties of machine learning and how effective it would be and how big it would need to be. That one would have to be pretty big, but it would be pretty damaging.
Another thing that you can do is engage in conscious data contribution, so you help stand up a competitor. We haven't talked about Common Crawl yet, which we probably should do when we're talking about web indexes, but Common Crawl is in some ways an effort to do that, in the open. You can also imagine giving your data to another private company. So for instance, you're frustrated with Google Maps or Bing Maps. You decide, I'm just gonna review for Yelp, or vice versa. And then the third option is pretty interesting. It's data poisoning. You could poison the commons, but the commons is much more complicated, so we should leave that aside. That one has a lot of discussion associated with it. But you can poison the data that you submit to these private repositories. So for instance, poisoning the submissions to Google Maps, in the way that I was describing earlier, such that people will go in loops forever or something like this. That's another way to take action and let folks know that your power is legitimate and that, you know, you should be listened to.
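[Editor's illustration, not part of the recorded conversation. The remark above that a data strike "would have to be pretty big" can be made concrete with a back-of-the-envelope sketch. It assumes, purely for illustration, that a system's error behaves like the standard error of a simple average of independent ratings, so it shrinks as 1/sqrt(n). Real machine learning systems follow different learning curves, and this is not the analysis from the research mentioned in the conversation. All function names are hypothetical.]

```python
import math


def error_growth_after_strike(strike_fraction: float) -> float:
    """Factor by which error grows when a fraction of contributors withhold data.

    Assumption (illustrative only): error scales as 1/sqrt(n), as it does for
    the standard error of a mean of n independent ratings.
    """
    if not 0.0 <= strike_fraction < 1.0:
        raise ValueError("strike_fraction must be in [0, 1)")
    return 1.0 / math.sqrt(1.0 - strike_fraction)


def strike_fraction_needed(error_factor: float) -> float:
    """Strike fraction required to inflate error by `error_factor` (>= 1)."""
    if error_factor < 1.0:
        raise ValueError("error_factor must be at least 1")
    return 1.0 - 1.0 / (error_factor ** 2)


def poisoned_average(true_mean: float, poison_value: float,
                     poison_fraction: float) -> float:
    """Average rating the system sees when a fraction of contributors
    submit `poison_value` instead of an honest rating."""
    if not 0.0 <= poison_fraction <= 1.0:
        raise ValueError("poison_fraction must be in [0, 1]")
    return (1.0 - poison_fraction) * true_mean + poison_fraction * poison_value


if __name__ == "__main__":
    for f in (0.1, 0.3, 0.5, 0.75):
        print(f"strike by {f:.0%} of contributors -> "
              f"error x{error_growth_after_strike(f):.2f}")
    print(f"poisoning by 10%: a 4.0-star item looks like "
          f"{poisoned_average(4.0, 1.0, 0.1):.2f} stars")
```

[Under this toy assumption, a strike by 10% of contributors raises error by only about 5%, while doubling the error takes a strike by 75% of contributors. Poisoning, by contrast, biases the result in proportion to the share of poisoned submissions, which is one way to see why the three strategies have such different thresholds.]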
Speaker 3
74:34 – 74:44
Well, thank you guys so much. This was an amazing conversation. We got a ton of good thinking and talking done. So, super grateful for your time.
Speaker 2
74:44 – 74:55
Matt, Jo, that was really fun. Jo, you taught me a bunch. I'm gonna be, you know, noodling on what you just said for a while, and I'm excited to be talking to my PhD students about it and these types of things. So thank you.
Speaker 1
74:56 – 75:18
Likewise. I'm gonna be scratching my head about data strikes and data poisoning for some time. This is great. And, you know, I feel like we could easily do another follow-up conversation about other themes in infrastructure sometime. Matt, let's sketch out another time to hang out with Brent. Brent, I know that you're super, super busy, but this was too much fun to not do it again.
Speaker 2
75:19 – 75:28
It was tons of fun. That went by super quickly. Thanks, Matt, for putting us together. Thank you so much.
Speaker 0
75:30 – 75:49
Thank you so much to Brent and Jo. That conversation was a huge pleasure for me. Thank you to everyone supporting the RadicalxChange Foundation. This wouldn't be possible without your support. You can continue to support us at radicalxchange.org. Thank you also to the producers of RadicalxChange(s), Jennifer Marone and Leon Erickson. Join us again soon, and have a great weekend.