Champion Metagov

Speaker 1 0:00 – 0:00

Awesome. PhD

Speaker 2 0:15 – 0:15

oh,

Speaker 1 0:30 – 0:30

Thanks, Nate. Are you the one recording it? Anyways, Kaylea is a really cool PhD candidate at the University of Washington and incidentally also in my research group. I'm super excited to have her here. I think she has a lot of really, really amazing work. So I'm just gonna let her kind of go into her talk now.

Speaker 2 0:45 – 0:45

Alright. Thank you for the intro. Let me see if I can get this Zoom thing working. Alright. How does that look? I can't see you, so an audible thumbs up, Seohyun?

Speaker 1 1:00 – 1:00

Yep. Looks good. I see it.

Speaker 2 1:15 – 1:15

Excellent. Alright. Hello, Metagov. I'm Kaylea Champion, PhD candidate at the University of Washington, Department of Communication, sort of pre-proposal phase, getting there. Benjamin Mako Hill is my advisor, and thanks so much for the invitation to present here. Many of your names are familiar to me through your work, so I'm just super honored to be here. What I wanna do today is offer you two governance challenges that have come up through some different angles of approach I've been taking, doing empirical work in the realm of information public goods, including work done by some colleagues. And what I hope to do is kinda give you that full overview, maybe fifteen minutes, and then open up to reflections and discussion. Alright. So, public goods, in a kinda classic two-by-two formulation here from Elinor Ostrom: nonexcludable, nonrivalrous goods like security, the environment, public health, and infrastructure. Information public goods, if we kinda turn our attention there: I'm particularly concerned about those that have come to act as infrastructure in our society. So those include things like Wikipedia as a knowledge base, and software like GNU/Linux. And Wikipedia and GNU/Linux were both created through commons-based peer production, an innovative process developed by production communities, as Yochai Benkler observed, kinda characterized by decentralization, a variety of governance models, diverse motivations from participants, and kind of orienting their interactions around creating a commons. And I mentioned the concept of infrastructure. Yeah. I laid out here some expectations that we might have of infrastructure: serving needs, available for use, reliable, maybe invisible. And I would argue that Wikipedia and Linux play this role. They're an information public good in the form of digital infrastructure. So what happens when peer production meets infrastructure, peer production of infrastructure?
What we observe in practice is a combination of these two bundles of ideas. Peer production is a powerful way to produce infrastructure, but there's some disconnect between these two bundles. And one such disconnect falls under the category of decision making and governance, broadly defined. And the on-the-ground reality of these projects is that although they incorporate some, like, kind of government-sponsored effort, or they may have kinda internal governing councils, they're built by a mix of devoted volunteers, firms, people passing by, entrepreneurs, and their governance is generally not subject to the public at large. This picture takes a moment to load, but it'll load. It's a rough analogy, to be sure, but imagine, if you will, that every morning, all across the world, every road construction worker wakes up and decides whether they feel like building or repairing or tearing up the road, with what tools, whether they'll cooperate with some kind of shared design, or if they'll fight, and so on. In physical space, that sounds like total chaos. But in cyberspace, totally normal. That's just Wednesday. And yet despite that, it seems to work. We have incredible technologies to communicate, transact, and learn about the world. So through the hard work of volunteers, self-assigning tasks, self-organizing, these communities have unlocked an incredible amount of creativity and energy in their participants, so much so that these information public goods have become essential to modern life. But there are some gaps between how these goods get made and how these goods get used. And I wanna focus on just a few pieces of this overall map that I've sketched out here in red and blue, exploring what happens when peer production creates information public goods that become infrastructure. And given this map, you might ask any number of questions about how this left side kind of relates to this right side, but I'm just gonna focus on a couple of spots.
On the left side, this work is kinda oriented to two aspects of peer production. On the sort of governance model side: the openness of the organization, reliance on process, not individuals. This refers to what sometimes gets called meritocracy or do-ocracy, where participation is being judged on its merits rather than, say, the identity of the contributor, or so we would hope. And from the kind of diverse motivations zone here, I wanna focus on self-selection of tasks. On the right side, two aspects of infrastructure: serving public needs and reliability. Alright. So in purple here, I've kinda dropped in some typical governance questions that are implicated as we take up some of these different pairs: who's involved, what gets made, who is served, how are decisions being made. Just a little bit more framing on the information public good side. There are two interesting qualities of information public goods that not all public goods have, and I'm pulling from Fulk et al. here. So they're connective, in that they materially connect people and actually increase in value when more people are connected to them. Therefore, leaving people out diminishes the value of the good, represented here in my little photo collage as kind of networks. Information public goods are also communal, so they're built through participation, exchange. The benefit for participating is, in some ways, indirect. So what you get out is not what you put in. What you get out is what someone else put in, and, therefore, there's some kind of uncertainty about motivations, benefits, risks, reliability. I've added in the kind of communal dimension here with a kind of humble potluck as well as a rowing team. Alright. So that's my framing: information public goods produced through peer production, serving as infrastructure, connective, communal, nonrivalrous, nonexcludable. Alright. So that's the framing overall, and here we are.
I mentioned two angles of approach I've been using, and that this effort has some implications for folks thinking about online governance, and that's where I'm gonna go next. So one line of work is with respect to the value of anonymity. Online environments have some powerful affordances to support privacy and surveillance both. Together with colleagues at Drexel and NYU, we've been taking this kind of value-of-anonymity angle on information public goods, and our work has revealed a couple of interesting observations. Certainly, minoritized and vulnerable individuals use privacy tools to protect themselves and engage online. But for some folks, being less identifiable is a vital way to protect themselves so they can simply live their lives, and for others, privacy is a personal choice. Less identifiable folks need not be people taking strong intentional measures to hide their identity. They might be new or very casual or in whatever way less engaged or less forthcoming with information. In general, we find that people using stronger forms of anonymity online participate in ways that are comparable to newcomers and casual participants. However, we also find some evidence for unique value that anonymity seekers bring, particularly with respect to engaging with marginalized or taboo topics, engaging in activism, and critique of the status quo. Service providers, platforms, communities, etcetera, often focus on avoiding negative interactions, and the needs and interests of the anonymity seekers are often not articulated in governance discussions. The sort of long tail characterizes online communities in many ways, but that's not where governance generally lives. Community governance often lives in the hands of the top contributors and the highly identifiable. I have some references here, and I'm sure folks can just kind of pull that off of the slides later. There's no need to kind of peer at the eye chart part.
So, the problem statement overall: anonymity seekers are valuable, yet anonymity seems to always be under attack from surveillance capitalism, from government regulation, from the march of technology to identify people, and from within communities seeking to protect themselves from bad actors. So I argue that it's vital that we understand what is lost when we push for greater identifiability, and that we seek solutions and approaches that respect anonymity as itself a public good. So I have kind of an overall question here. How can communities make decisions that attend to the needs of anonymity seekers? But just to make it a little bit more detailed, I wanna add back in a couple details from the framing that I sketched out. Let's consider that these are information public goods. So, excluding people harms everyone. The target of the empirical work has been articulating the value and experiences of anonymity seekers, but how might we do so even if their value is not quantifiable? As Thi Nguyen expressed in this forum previously, quantification can drive conversations in ways that ultimately run counter to our values, and the best response may not be better quantification, but rather to accept that justice may challenge us to embrace even those populations whose value we cannot quantify. So I hope that gives folks something to mull over, but as I mentioned, I have two angles to talk about. The other might sound really different, but the relationship, I think, will be clear by the end. So, let's talk about maintenance. Second angle on challenges posed by information public goods: digital infrastructure, and how to sustain it. So this line of work has been conducted together with colleagues as well as with marvelous support from development communities themselves. And my work here is built on a basic expectation of infrastructure. It seems very reasonable to expect that the most important pieces of infrastructure will also be the highest quality.
We hope that the bridges that carry the most traffic will also be the strongest and that they'll be regularly inspected and so on. So here's a model of that expectation. We have quality along the y axis and importance along the x. And in the upper left, where quality is high and importance is low, we've labeled that overproduction. Maybe it's wasted effort. In the lower right here, importance is high, but quality is low. And we've labeled that underproduction, because that's an area of risk for communities. Where we'd like to see the world, perhaps, is lined up through the middle: given kind of scarce resources, let's all be proportionate in how we allocate them, so that high importance and high quality go together. But what I found in testing this expectation, highest quality for the most important components, against a body of digital infrastructure is that it is violated to a very worrying extent. The project that I did here was focused on Debian Linux, which is the backbone of the web and the cloud. And of the about 22,000 components that comprise Debian, I found more than 4,000 that violate this standard. About one in six components were strikingly low quality given their relatively high importance, which means that they are underproduced. Low quality, high importance packages. This is a source of risk to our shared digital infrastructure. This is a heat map of those packages. You can see the problem area here: highly important, low quality packages. But this same type of analysis could be applied to any information public good. I was inspired here by work from Morten Warncke-Wang on Wikipedia, where he and his colleagues found that such topics as mental health are underproduced. Again, the top line result is that we see substantial cause for concern in terms of risk to digital infrastructure. And in the conceptual map slide, I kind of emphasized self-selection of tasks.
It's not quite as bad as the construction chaos photo, but, again, we see a peer production process: volunteers doing their best, corporations pursuing self-interest, governments maybe oriented to security, scientific productivity, NGOs trying to find and fill those gaps. There's no central planning office or shared decision making mechanism for importance, although the Linux Foundation is kinda seeking to build some consensus there. So how is quality measured? There is a movement called evidence-driven software engineering, but it turns out it's far from reaching any consensus about how to measure quality in the first place, despite two decades of work at least. And those measures that are in use, such as in analytical tools and metrics platforms, are often not validated empirically or give contradictory results. And for this finding here, I conducted a tertiary literature review of software quality measurement studies. That's currently under review, but you can pick it up on arXiv. Alright. So, my target in future work is to identify how information public goods come to be neglected and how we can mitigate and prevent neglect. But the open question from a Metagov perspective is: how can online production communities maintain digital infrastructure in ways that better engage with the needs of the public? And, as before, I'm gonna make the picture just a little bit worse by adding in some more of the framing from before. Information public goods are communal. There's this beautiful, complex social process. Creative freedom has really helped these projects to grow and flourish. So how might we engage with the needs of the public without threatening the kind of self-determination of creators that enabled these goods to be produced in the first place? So in conclusion, information public goods pose vital governance challenges with respect to what gets created, for whom, and by whom.
And although I described two really different angles on information public goods, one oriented to anonymity seeking, the other oriented to digital infrastructure, I think that there's a kind of underlying question here that stretches between the two cases. How can online production communities engage with a public good? Oops. Not that. Alright. Thank you so much for your attention. I'm really eager for questions and discussion about these kinda open questions that I posed, or with respect to the empirical findings. I know I flipped through things really quickly. It's like four years' work in fifteen minutes. So thank you so much for your attention. And this is me.
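The quality-versus-importance model from the maintenance part of the talk can be illustrated with a small sketch. This is a minimal example under stated assumptions, not the measure used in the Debian study: it assumes each package has already been reduced to one quality score and one importance score, and it uses a simple rank gap as a stand-in for underproduction. The package names and scores are hypothetical.

```python
def rank(scores):
    """Map each name to its rank position (0 = lowest score)."""
    ordered = sorted(scores, key=scores.get)
    return {name: position for position, name in enumerate(ordered)}


def underproduction_gap(quality, importance):
    """Importance rank minus quality rank, per package.

    Positive values: more important than its quality suggests (underproduced).
    Negative values: higher quality than its importance suggests (overproduced).
    Zero: quality and importance line up "through the middle".
    """
    quality_rank = rank(quality)
    importance_rank = rank(importance)
    return {name: importance_rank[name] - quality_rank[name] for name in quality}


# Hypothetical scores, for illustration only.
quality = {"pkg-a": 0.9, "pkg-b": 0.2, "pkg-c": 0.5, "pkg-d": 0.8}
importance = {"pkg-a": 0.1, "pkg-b": 0.95, "pkg-c": 0.5, "pkg-d": 0.85}

gaps = underproduction_gap(quality, importance)
most_underproduced = max(gaps, key=gaps.get)
print(gaps)
print(most_underproduced)
```

In this toy data, pkg-b sits in the lower right of the model (high importance, low quality) and pkg-a in the upper left, while pkg-c and pkg-d fall on the diagonal. How quality and importance should actually be measured is exactly the open question raised in the talk.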

Speaker 1 1:30 – 1:30

Hey. Thank you so much, Kaylea. It's really special seeing everything kinda come together, because I have seen these, like, in separate pieces over the years, so it's really nice to see it all put into one. Can we do, like, a round of applause? Like, unmute and do a round of applause? Okay. Cool. There are some questions in the chat, Kaylea, if we can just, like, kinda go through them. I think some of them might correspond with some of the figures you have or, like, the slides you had. Sure.

Speaker 2 1:45 – 1:45

Alright. Let me get the chat going here, and, yeah, I can kinda reshare things.

Speaker 1 2:00 – 2:00

Yeah. I think, Zargham, your question was the first one in the chat, about, like, the network science framing. Did you wanna

Speaker 3 2:15 – 2:15

Oh, sure. I don't think it's a huge deal. I was more just checking the sort of interpretation that with these information public goods, the value scales with the edges as opposed to the nodes. So there's a couple different ways to tether the concepts of node and edge, but that's been the framing I've seen and used. And I was just checking if that kind of jibed with the interpretation of the value creation through connectivity.

Speaker 2 2:30 – 2:30

So that's interesting. I think that the framing of this in kind of terms of networks dates back a little bit before network science discussion was quite as, like, focused as what you articulate. But if you think about something like a phone network, there's value in everyone having a phone and being callable even if you never call them. So there's value in the edges, but there's also value in the nodes. There's value in letters being delivered to everyone's home even if certain people rarely get letters.

Speaker 3 2:45 – 2:45

I agree. But I would say that the framing in terms of the incidence of the events isn't the graph. The graph that I would be interested in is the same thing that you're describing. It's the abstract graph that produces all possible interactions, which scales in the edges, which is why it explodes. You start adding people, say, three people, four people, and it's growing superlinearly: not on the scale of the count of people, but actually in all of the potential interactions that have been made possible.
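The superlinear growth described here can be made concrete by counting potential pairwise connections. As a minimal sketch, assuming the "abstract graph" is a simple undirected graph in which any two participants could interact, the number of possible edges among n people is n(n - 1)/2. The function name is illustrative.

```python
def potential_pairs(n: int) -> int:
    """Number of distinct pairs among n participants (edges in a complete graph)."""
    return n * (n - 1) // 2


# People grow linearly; potential interactions grow quadratically.
for people in (3, 4, 10, 100):
    print(people, potential_pairs(people))
```

Going from three to four people adds one node but three potential edges, which is the sense in which leaving people out removes more value than a simple head count suggests.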

Speaker 2 3:00 – 3:00

Great. Yeah. That's a great observation. Alright. I see Parker asking about anonymity and privacy. So I didn't dig as deeply into kind of the concepts there. I think that privacy refers to multiple states that people can be in. They can be in a private state even when they're with someone else. There are different kinds of privacy, or privacy states. On the other hand, anonymity refers, I think, to a certain state of identifiability in a given context. So there are many dimensions of identifiability, and the sort of diminishing of any of those dimensions does make someone sort of less identifiable. But I think of it more as a continuum than a binary here. I don't know if that's helpful, except to say that some of the framing that we pull here is from Marx, and that the multiple dimensions of identifiability are really significant online, when you think about speaking of someone as anonymous because they're identified only by their IP address. Sadly, I would like to inform you that your IP address does not typically make you anonymous. But in some communities, referring to someone by their IP makes them anon; Wikipedia, for example. On the other hand, you could give your real name and still be quite unidentifiable if there are many John Smiths in the world. Elizabeth asks, are there different online governance cultures that are more friendly to anonymity seekers? So, here, I would say that, in some environments, anonymity seekers built that community, and they're extremely friendly. And anonymity is baked in; it's in the bones of the community itself. In other places, especially as a shared sort of store of value accumulates and is sometimes under attack, either by hackers or crackers or vandals or, you know, misinformation people, misinformation like bots and agents, people can start to say, hey.
We wanna know who these folks are who are tampering with our software supply chain, or who are shaping the articles that we write and that we put out there for the world. So I think that, definitely, cultures differ, and you might find also, like, when people are seeking psychological support or social support, their anonymity seeking is also, like, baked into the culture of the community from the very beginning. Let's see. Saffron comments on crypto communities. Yes. Definitely. Elizabeth is sort of throwing things in here. I'm sorry. I didn't follow the chat as I was speaking because I have to do one.

Speaker 1 3:15 – 3:15

I saw Elizabeth and Parker, both of you unmuted a little, though. Did you want to say anything? Or

Speaker 4 3:30 – 3:30

No. That was just in case I could thank you for making your comments. That's just courtesy.

Speaker 1 3:45 – 3:45

Alright. I've got my eyes open. So

Speaker 4 4:00 – 4:00

But I appreciate that.

Speaker 2 4:15 – 4:15

Oh, so, Tianna, you were asking whether anonymity seekers desire to be brought into the fold of community governance decisions, or whether they take it as a result of their anonymity and not care. So, I think that especially when governance decisions affect them, there's absolutely some advocacy, some protests, some pushback from anonymity seekers at their exclusion. But, also, I think that, you know, folks understand that there's sort of maybe some kind of limit. But, in general, if you're seeking anonymity for human rights purposes, for example, you might indeed want to be quite engaged and wanna be a deep participant. You just simply can't disclose things in the way that that community desires or expects. So, absolutely, I think that folks wanna be brought into the fold. There's many, I'm sure, who don't, but then again, that's the case with any community. And I think in general, our work finds so many places where anonymity seekers are so similar to so many other kinds of users that this idea of treating them as sort of exotic, strange, bad actors, you know, any of these, like, kind of ways they get characterized, turns out to just be not sustained by the data. It may be sustained in our feelings about them, but it's not sustained in the data that we see. Alright. Kinda scanning through here.

Speaker 1 4:30 – 4:30

I think Nick slash Marcus's question is next, but Josh also has his hand up. Josh, is your hand up for the question that's following? Or

Speaker 5 4:45 – 4:45

Why don't you Marcus, why don't you go next?

Speaker 6 5:00 – 5:00

Sure. Should I rephrase?

Speaker 1 5:15 – 5:15

Feel free.

Speaker 6 5:30 – 5:30

Well, first of all, thank you. I really liked your talk, and I appreciated the visuals that helped me see your points quickly and clearly. With the heat map, as I tried to reason about what it would feel like to be in a community that contributes that way, I found myself wondering if in fact that contribution pattern is natural and even desirable. Like, I was wondering how you feel about pushback to the story that there's a problem here. What about the possibility that discerning what's important comes first? And then once people have learned to see what they want to have improved in the community, they begin to climb the quality hierarchy of contributions, and they make their way into that other cluster in the heat map of high quality, high importance.

Speaker 2 5:45 – 5:45

So I can share that back up just real quick here. Yeah. So this refers in particular to software packages. So this is not just contribution quality, which is sort of a separate line of analysis, but actually package quality within Debian itself. So the situation here is not just a matter of, like, finding bugs that are good for newcomers, but rather patterns of systemic neglect of certain packages. The same with Wikipedia: patterns of systemic neglect of certain topics, articles, etcetera. There's definitely opportunities for people to say, oh, well, this is really highly needed by the community. Let me jump in there and do that. There are many mechanisms in Wikipedia or in source forges that people use to try to direct attention, but naturally sort of expecting people to, on their own, find the most important things and find the most neglected pieces does not seem to happen kind of spontaneously. And there might be really good reasons why these items are low quality. They might be packages that are simply very hard to maintain, where a high level of skill is needed. They might be ugly and painful and difficult and problematic in various ways. Or with respect to articles, it may simply be that high quality information is not readily available, or the kind that's available is not in line with what Wikipedia, say, expects from people when they contribute to that topic. So it may be that this is a natural behavior of these systems, but my guess is that in terms of a kind of broad public good perspective, in terms of the good of the public, it's probably not a desired behavior in most cases, at least when the system is relatively mature. Early on, sure, people might, you know, be pretty excited to find neglected corners and jump right in there. But as a system matures in particular, I think that these neglected areas become real causes for concern.

Speaker 6 6:00 – 6:00

Yeah. That clarification was very helpful. I see what you were illustrating better now. Thank you.

Speaker 2 6:15 – 6:15

Thank you for the question. Alright. And some water.

Speaker 1 6:30 – 6:30

Josh, I think your question is next.

Speaker 5 6:45 – 6:45

Sorry. Great. Great talk. Question on the graph that you just showed. Do you have a sense of how the approach in Debian compares to, you know, what happens in, like, a corporate version? Suppose, like, Facebook or Microsoft is trying to produce Debian, or, I mean, they did produce one or several operating systems. Mhmm. Do you have a sense of how that sort of underproduction, sort of, like, distribution works in the corporate setting, which is, as I understand, the main alternative to peer production, right, for software?

Speaker 2 7:00 – 7:00

Well, corporations are often heavily engaged in peer production. That's what, like, most of the web and cloud is produced on. So they're also engaged in this process, but I take your point that in house, there may be a little bit of a different picture going. That said, I only have kind of anecdotal evidence. I have not analyzed a commercial operating system. But I would say that if you talk to any system administrator or any IT department anywhere about neglected corners, about the dirty things they don't wanna talk about, they will say, oh, yeah. Let me tell you about this one server. Everything relies on it. It runs our licensing. It runs our timing. You know, it runs this. It runs this. It runs that. Don't touch it. Never maintain it. Don't reboot it. The guy that set it up left ten years ago. When it crashes, we're all here for the weekend. You know, everyone has those neglected corners that often become pain points because they're not maintained. And we see in some of the vulnerabilities that have emerged in software that these neglected components come back to bite us again and again. So I think that neglect of key components probably goes far beyond peer production and extends to the corporate world as well, but it's true that the analysis is really scoped just to Debian in this case. Oh, and then Mako says he has an answer, so maybe, Mako probably has a different answer.

Speaker 4 7:15 – 7:15

Well, I mean, this is much more, like, just a direct sort of personal experience. So, full disclosure, I worked on that particular paper, so I know a little bit about that one. But I think more relevant to this point is that I, before becoming an academic, helped start a company which was building an operating system built on Debian. It was Ubuntu.

Speaker 4 7:45 – 7:45

And I can tell you what we did then, which is that we literally made a list of packages that we thought were important, and we paid people to work on them. So we used money, in this case, resources from some eccentric South African billionaire, to, like, you know, pay attention to the things that we cared about. So we just divided Debian into two pieces, and we allocated resources and attention towards the stuff that we thought was important. Now, to what extent are the things that we think are important the things that are underproduced in this analysis? I mean, we could look at this, and I don't know. Like, I don't think we have. We did pay a lot of people to work on things that are sorta towards the top of the list of underproduced packages in Debian, stuff like, you know, integrative sort of, like, key desktop components that are used by lots of people and are hard to maintain because, in some sense, it's a different governance challenge. Like, so, I mean, I don't know. Gnome Power Manager is, like, the worst one. Right, Kaylea?

Speaker 2 8:00 – 8:00

That's true. In the more detailed, gory detailed version of this talk, I highlight some of the neglected components. And Gnome Power Manager, the poor maintainer. That one's at the very top of my list of the least maintained. And every time my laptop shuts down without warning me, I shake my fist at Gnome Power Manager. But, you know, that is indeed the worst and most neglected package in Debian.

Speaker 4 8:15 – 8:15

A lot of the packages are kind of like Gnome Power Manager in the sense that they are these really kind of, like, complex sort of integrative things. So Gnome Power Manager is, you know, it's your power manager applet. Right? And there's a set of specific governance challenges related to modularity in the context of something like Debian, because Gnome Power Manager involves, like, you know, you have to reach all the way down into the kernel and, like, the most sort of, like, fundamental aspects of the system, and you've got a desktop application running on the person's, like, you know, graphical user interface, and translation and localization problems. You have, like, the full range of problems, and fixing problems in Gnome Power Manager often involves reaching into lots of different other parts of the distribution, which generally speaking are maintained by different people. So, I think that in Ubuntu, we solved that problem by essentially getting rid of individual package ownership. We pay everyone. These are the thousand packages that we're gonna take care of, and everyone has 100% access to everything. There's no name on anybody's package. And so there's, like, a different governance arrangement and a different sort of, like, funding arrangement, which allows people to address problems in a really different way.

Speaker 4 8:45 – 8:45

So I think that, like, my sense is that's also what other organizations that were working in this space would do as well. But, I mean, my knowledge is more limited to Ubuntu and, really, Ubuntu in, like, the first eighteen months. I left pretty early.

Speaker 3 9:00 – 9:00

Can I comment on this? I think one thing that you highlighted here, though, is that the dependency graph is super important. So, like, the topology of the underlying dependency graph is probably highly correlated with these scores. So, you know, without the context of the dependency graph, you probably can't totally understand or interpret these relative levels of maintenance and maturity, because of the effective operational cost, and governance cost, of actually mutating these things. And what I was actually gonna ask before, when I first raised my hand, was whether you have experience with this type of analysis on things that aren't operating systems, because my experience is mostly in the Python open source community. And things are a lot more modular, so we don't have the same kind of deep dependency graphs that would cause some of the holdups that you were just describing. Though there are obviously heavily used, undermaintained things; that's not gone. But there is a higher degree of mixing, I think, between the pure peer production, where people show up and develop a package, and the corporate-funded, hey, we think this is important work stream. But as you went through this discussion of the difficulties with, say, the power manager, I realized that, you know, the denseness of the connectivity in the dependency graph is much lower, which means that you won't have quite as harsh spikes of this is fucking impossible and expensive to change. Pardon my swearing.
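One way to picture the point about dependency-graph topology is to count how many packages transitively depend on a given component, since that is roughly how far a change can ripple. The following is a minimal sketch with a hypothetical toy graph; the package names and the `DEPENDS_ON` mapping are invented for illustration and do not come from Debian or Python package metadata.

```python
from collections import deque

# Hypothetical toy graph: package -> packages it depends on.
DEPENDS_ON = {
    "desktop-applet": ["power-daemon", "ui-toolkit"],
    "power-daemon": ["kernel-interface"],
    "ui-toolkit": ["kernel-interface"],
    "kernel-interface": [],
    "standalone-tool": [],
}


def reverse_graph(depends_on):
    """Invert the mapping: package -> set of packages that depend on it directly."""
    dependents = {pkg: set() for pkg in depends_on}
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(pkg)
    return dependents


def transitive_dependents(pkg, depends_on):
    """All packages that depend on pkg, directly or indirectly (breadth-first)."""
    dependents = reverse_graph(depends_on)
    seen = set()
    queue = deque([pkg])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


for name in DEPENDS_ON:
    print(name, len(transitive_dependents(name, DEPENDS_ON)))
```

In a densely connected graph, low-level components accumulate many transitive dependents, which is one plausible reason they are costly to change; in a flatter, more modular ecosystem those counts stay small.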

Speaker 2 9:15 – 9:15

Yeah. I would be super interested to see what the results look like with respect to Python. So we should talk if you are also super interested in what a production analysis of Python might look like. My first suspicion is to think about, okay, so what about when Python libraries reach into kinda undermaintained C libraries? And I'm speaking in part from my own, like, compilation hell from trying to get Python modules installed that rely on out of date C libraries. But even what seems like modularity or may seem like independence can ultimately come to run into these undermaintained components.

Speaker 3 9:30 – 9:30

If you ping me, I'll introduce you to one or two of my engineers, because we recently were trying to refactor a Python library, an open source library that we built, to use C libraries for acceleration, and it was an absolute mess. And I'm pretty sure they might have abandoned it and just stuck to pure Python, and they're building pure Julia for performance and, like, abandoned the C integrations because of that problem that you're describing.

Speaker 1 9:45 – 9:45

Josh, you wanna ask your follow-up question?

Speaker 5 10:00 – 10:00

Oh, sure. There were kinda, like, two follow ups. Well, maybe the gerrymandering one is just more for fun. It's like, if I was some conniving, you know, architect of this community, and I had control over the package dependencies, or I don't know, maybe I don't have control. Well, obviously. But, like, could you actually sort of stick certain packages together in order to basically get the community to refocus on certain things? Like, I'm gonna take this really unsexy one that's really annoying and put it in with the really hot one that, like, all the people are really, you know, like, on.

Speaker 2 10:15 – 10:15

Sort of an eat your vegetables before dessert kind of approach?

Speaker 5 10:30 – 10:30

Mhmm.

Speaker 2 10:45 – 10:45

I have a feeling that most software devs would see right through that. But it's true that if you have a conversation in the community around, like, hey, we know this is unfun, but we need to do it, and here's how we're gonna all together, as a group, kind of make it a little bit more fun for the rest of us. We're gonna do it at a hackathon at our next, you know, big meetup or what have you. You can gerrymander the community in ways that attend to these types of neglect, instead of making it just a matter of, like, guilt and punishment. When I presented this work back to the Debian community, most folks were excited, but some folks were like, oh, great, someone else reminding me about something that I didn't do. So I think that shaping the community around kind of releasing that kind of neglect guilt and tackling it in ways that are positive would indeed be really productive.

Speaker 5 11:00 – 11:00

And I guess a very similar question slash follow-up to that, that's slightly more serious, is, I guess, a question that's very relevant to me right now thinking about the design of Metagov itself, like, the package that some of the members of this group are kinda working on, which is that, okay, so if our expectation, and currently it's not, but if the expectation is that eventually, you know, this is a project that will be sort of maintained through peer production, does that have implications for how we should choose the design of our architecture? So should it have, like, a very small, short dependency graph? Because that's just what peer production is good at. Otherwise, you introduce blockers. You know, did your research kind of have implications for that question?

Speaker 2 11:15 – 11:15

So I think that the history of peer production definitely pushes toward that notion of modularity and trying to make pieces, at least to some extent, independent. I also think that the ability to refactor, rework, throw away pieces without throwing other things away can be pretty powerful, as well as kinda ease of documentation when things are turned over. So it seems like you're on a good track there. Let's see if there's other kinds of implications I can think of, kind of vamping here. I would say having conversations about what's valued, about importance, about neglect in an overt way, in terms of surfacing those needs, making sure that that information is in front of people, and thinking about how the voice of the public, how the voice of usage can be articulated back into the community in signals that folks will attend to, would also be really valuable. Something that I noticed in the Debian analysis, when we were analyzing the rate of bug fixing, was that the system for triaging bugs divides bugs into release critical and non release critical, and that's sort of an emic signal that comes from that community and from their priorities. And I'll tell you that release critical bugs got fixed much more quickly than non release critical, and this was borne out over a very long history. So the community is responding to its own signals of importance and prioritization, and I think building in the kinds of signals that you think make the most sense, that folks can then orient their cultural choices around, this work kinda speaks to that as well. And folks are pointing you to relevant work. That's great. There's definitely some of that.

Speaker 1 11:30 – 11:30

I think we're at the end of the questions here in the chat. Did people not put their question in the chat, but they have one? Or Max, is that a hand up?

Speaker 7 11:45 – 11:45

Yeah. Yeah. I was wondering if you've thought a little bit about the dimensions along which an information public good might vary. Like, what types of information public goods might there be? How do you develop a taxonomy of that that's useful?

Speaker 2 12:00 – 12:00

Types of information public goods. Well, I think that there's definitely enough of an analogy to public goods as we've identified them in the sort of physical realm that you can think by analogy there as a basis for ontology. Certainly, Yochai Benkler's exploration of peer production is full of examples and typologies and so on. So that, I guess, would be, like, if someone wanted to ask me to put a typology in a paper, that would be where I would reach, and I would read through The Wealth of Networks one more time and think about what categories things fall into. So that would, I think, be my first go to. What do you think, Mako?

Speaker 4 12:15 – 12:15

Sounds like a good answer to me.

Speaker 5 12:30 – 12:30

Yochai has all the answers.

Speaker 4 12:45 – 12:45

Well, I mean, I think that there's a lot of different, I mean, Kaylea and I have been talking about this, even just in terms of prepping for this talk, and I think that, you know, there's a range of kinds of information public goods that are not peer produced. Right? Like, I mean, we were talking about the, you know, food composition data from the USDA or whatever. Right? Like, an important example of an information public good, not peer produced. Right? Like, maybe it involves many contributions from various places. Maybe we could think of peer produced ways of making it, but I think that, yeah, there's a huge range, and I think for me, it's like, let's try to enumerate dimensions upon which we might categorize them, not just the way in which we would group them. So

Speaker 3 13:00 – 13:00

Does it make sense to reason not just about the modes of production of the information public goods, but also, I mean, what constitutes their maintenance? Because, as was highlighted here earlier, you know, at least from my experience, there are places where I've been involved in the production or maintenance of what you would consider information public goods. And, actually, you know, while the initial creation gets a lot of attention, the actual persistence of those things depends a lot on maintenance, and that maintenance tends to get nowhere near the funding or attention that initial production does. And so any sort of taxonomy of them seems like it needs to pay pretty significant attention to what time, energy, attention, expertise, etcetera is required for upkeep.

Speaker 2 13:15 – 13:15

Yeah. I think that grappling with maintenance is a key part and a neglected part of creating a project. I was reading a study that talked about how, from a corporate perspective, creation of a new software package or an app or what have you is maybe 20% of the cost, and that the remaining 80% of the cost is often in the maintenance and support. But the emphasis, as always, kind of goes to the initial creation. Anybody who's been involved in budgeting around, say, a university, I think, knows the capital cost of a building versus the ongoing maintenance cost of the building. It's easy to get funding for the capital building. It's very hard to get funding for the ongoing operations.

Speaker 3 13:30 – 13:30

How do you see peer production being possible as the ongoing maintenance, even in the case where the original information public good was centrally produced? So datasets are a good example of that, where you see networks being created and spun up once that public data is released, where people actually mirror it, store it, content address it, make it discoverable, and, like, index it, whatever. Because then all of the operations and maintenance is peer production even if the original information public good was produced by a government agency.

Speaker 1 13:45 – 13:45

Mhmm.

Speaker 2 14:00 – 14:00

So sometimes, if something is badly maintained enough, a community will swell up around it and decide to maintain it, and sometimes it will be allowed to die. I think that we perhaps don't always recognize when it's time to let something go. How do you kill off a piece of digital infrastructure? It is very hard. How do you get rid of a digital or information public good? How do you quiesce those things? And how do you recognize when it's time? I think that we're not as good about knowing when it's time. But some of the work that I was doing about software quality was based on public datasets, and those public datasets were written by NASA in the seventies and eighties, and they're used for modern machine learning today for assessing how to write good quality code. Do you want the NASA engineers of the eighties telling you about how to write great code in 2021? Probably not. And yet, that's where the public datasets come from that are used for building some of these tools. So maybe it's time to let the NASA datasets die.

Speaker 1 14:15 – 14:15

I mean,

Speaker 4 14:30 – 14:30

I think that's a good question. I also think it's an open area of research. Both of the links that I sort of put in the chat are about this in some sense. They both involve analyses of examples of pieces of software that were originally written in a proprietary form and then were released publicly afterwards.

Speaker 5 14:45 – 14:45

So let me think of, like,

Speaker 4 15:00 – 15:00

think of, like, Mozilla or something, which came out of Netscape. I think that's, like, one of the examples. But they have a bunch of these things. There's some other work by Abhishek Nagaraj on seeding in the context of OpenStreetMap. So they look at TIGER data, for example, and how that is released into OpenStreetMap. And the finding is pretty interesting. Generally speaking, the places that get the better TIGER data end up with less active community contributions around them. Where the data is worse, people come in and they are engaged in maintenance in ways that cause them, in the long term, to participate more. It's a pretty cool paper. I'll try to find that one and link that one in there too.

Speaker 1 15:15 – 15:15

Seth, you threw a question in the chat. Yeah, I can ask it. Is a centralized coordinated approach important, or do you anticipate this kind of decentralized solution could bring resources to maintenance?

Speaker 2 15:30 – 15:30

At Kaylea? So I think that the success of peer production to date has been with respect to decentralization. But I do think that the question is what do you centralize and what do you decentralize? And I think shared values, shared data, shared understandings of importance are much more powerful than, like, any kind of command and control. We've seen just a flourishing of creativity when people were kinda united around some shared vision, or at least a loosely shared vision, around what they're creating. So I think that distributed maintenance is absolutely possible, but the building of kinda shared understanding and shared prioritization is critical to making the maintenance then succeed in that mode, if that answers your question. Oh, there's a question in here too from Christina. Regenerative organization.

Speaker 8 15:45 – 15:45

Yeah. I'm an ecologist, and I'm familiar with organizations and approaches to teaming, teamwork, and organizational design that are from communities very much not technology based, who are really leaning into and trying to apply, in a very, you know, scientifically grounded kind of way, the principles of living systems design, and making health the metric by which they look at that. I know that, like you just said, what do you centralize, what do you decentralize, the structural approach is much more common in the software world, and I'm wondering if there's anybody doing interesting work on looking at indicators of community health in software.

Speaker 2 16:00 – 16:00

There has been a lot of work on measuring what people take to be a metric of community health in terms of participation, engagement, diversity, inclusion, etcetera. So there's a lot of work on community health from that perspective, but it's typically oriented to, I guess, observed ideas of what we think a healthy community looks like rather than being outcome oriented in terms of, if a community is healthy, then it will look like this. Do these things actually indicate health in that respect? So it's folks saying, okay, we think that a really healthy community will have people who are using really cooperative language rather than contentious language, and we're gonna run a sentiment analyzer against the way that they're articulating in their commit messages or what have you. But then does that actually lead to better software, say? Does it lead to happier engineers? I think that the pieces have not been all plugged together there in terms of how that analysis has been done. I'm sure many of those factors do matter. But how do they matter, and what do they drive versus what do they not drive, I think is less clear. Alright.

Speaker 1 16:15 – 16:15

Okay. I think we have about a minute left, so maybe we can, like, wrap up here. And if people have follow-up questions, I think Kaylea had her email shared on the slides earlier and whatnot. But, yeah, thanks again for coming out, Kaylea. Oh, she put her email in the chat just now. It was really great, and I had a lot of fun listening to the session. Can you stop the recording, or can I just hit stop?

Speaker 8 16:30 – 16:30

I'll just

Speaker 1 16:45 – 16:45

hit stop.
