Chan Metagov 20231115

Speaker 1 0:00 – 0:00

Hi. Hello. And welcome to another Metagov seminar. Today is 11/15/2023. I'm Seth Poston, the community lead here at Metagov. And I'm very pleased to welcome Joel Chan today, who will be talking to us about sociotechnical infrastructures for interdisciplinary scholarly synthesis. Joel's work was shared with the community by, I believe, B. Cabela, and is a really great example of someone finding some really interesting work happening out in the environment, and then that work finding its way into the context of Metagov. Joel is a professor at the University of Maryland in the Information School, and was previously a postdoc at CMU's HCII, working on topics like finding analogs between research papers in order to draw insights from one field to make discoveries in another. And he's broadly working now in the space of creative knowledge work and creativity support systems, zooming in to individual cognition and zooming out to organizational structures, particularly in areas like design and scientific discovery. Joel will present for twenty minutes, and then we will follow that presentation with thirty minutes of moderated discussion. I'll be moderating the discussion today. The way that we conduct ourselves during the discussion is: if you have a point that you'd like to raise, you're welcome to post it in the chat during Joel's presentation or after. Comments, questions, etcetera are all welcome. If you would prefer to contribute to the conversation with your voice, you can simply type the word stack, s t a c k, in the chat, or you can raise your hand, and I'll keep track of who is next in order to speak. Super. Okay. With that, I'll pass it over to Joel, and I'm looking forward to your presentation.

Speaker 2 0:15 – 0:15

Wow. That's such a great intro. I love that intro, and I love the governance in the discussion. I've never heard that before. It's pretty cool. Yeah. Thanks for inviting me. I'm really excited to be here. I thought I would recognize nobody, but it turns out I know Ronan really well. So how's it going, Ronan? And I'll be honest. When I first got the invitation to talk, I kind of skimmed through the list of talks, and I was like, this seems very different from what I do. But as I prepared the talk, I was surprised by how easy it was to frame the connection. I see a pretty tight connection, but I'm curious to hear from you. So what do I want to do? A big part of my work focuses on removing barriers to effective synthesis, as a creativity support system. Right? Because I want to enable scientists to ask better questions faster, but, perhaps of more direct interest to this community, also enable better decision making and innovation over tough policy problems, governance problems, and so on, where we may need to innovate as opposed to just decide what to do. So, some terms in the beginning. What I mean by synthesis is this idea of putting things together into a new whole and getting something that's more than the sum of its parts. So we have this idea from Strike and Posner of a new coherent intellectual whole that has a degree of conceptual innovation, and it clarifies and resolves. It doesn't hide the warts in the stuff that you're synthesizing, and it enables progressive problem shifts. Why do we need synthesis for innovation? I'll give you a quote from one of my participants. This is an applied science VC firm, a venture capital firm, and they wanted to know: what are the known constraints, both technical and commercial, on the things that we care about?
What specific pieces of knowledge exist that act as constraints? How might we uncover counterintuitive or non-obvious intervention points, places where they can intervene in that problem space? They really want to understand the opportunity landscape, and for that, they need a synthesis. What are some examples of synthesis? Technology roadmapping might be one. You may have seen this kind of thing, mapping out a landscape of constraints on brain activity mapping and so on. It's really tying that to a synthesis of what we currently know about this landscape. Another example is something that was actually used in the Department of State for quite a while. It's a policy-oriented synthesis of what we know about how to respond to violent extremist organizations. This was amazingly useful, but it was so difficult to make that the group that made it told me they changed their policy for accepting contracts so that they would never do this again. But it's immensely useful to have this synthesis in terms of how credible the evidence is and where the warts are, as we saw. As I alluded to, this is really hard. Just one data point: systematic reviews, one subtype of synthesis, take a really long time, and often they're just never updated because it's so difficult to do. And I think of systematic reviews and meta-analyses as just a lower bound on the cost of doing synthesis, because they're super narrow. You're addressing a single question, typically population, intervention, comparison, outcome. Very narrow. Whereas typically you'd need something much more complex, like a landscape. So the big question is how to accelerate scientific discovery and collective governance by lowering barriers to synthesis. So today, I'm going to talk about three ideas. Right?
The first is to describe the promise of discourse graphs as a protocol for accelerating synthesis. Then I'm going to talk about the problem of the missing social dimension of infrastructure, if you want to actually have an infrastructure for synthesis. And I'll introduce you to the idea of integrated crowdsourcing as a building block for growing the infrastructure that we might want. K? So first, let's talk about discourse graphs. What are discourse graphs? They're a kind of graph structure where the nodes are discourse moves, like claims, evidence, and questions, and the edges are discourse relations between them, like supports, opposes, and informs. So for example, you might ask: are bans an effective way to mitigate antisocial behavior in online forums? That's a question. There might be various claims on that. Like, no, it can't scale. Or, yes, it's an effective response. And there are sub-claims that support or oppose them. And then you want to tie that to evidence. Right? Specific results or empirical evidence from papers. Like, users of subreddits banned for hate speech did not engage in hate speech in the new forums they joined. Right? It's contextualized to a specific data point so it can inform various claims and questions. So, pretty basic. You can think of it as an argument map. Yeah. It's not new.
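
A minimal sketch of the structure described above, written in Python for illustration. The class and method names (`DiscourseGraph`, `add_node`, `add_edge`, `incoming`) and the node ids are assumptions made for this example, not the data model of the tool discussed in the talk; only the node types, relation types, and the ban example come from the talk itself.

```python
from dataclasses import dataclass, field

# Node and relation vocabularies named in the talk.
NODE_TYPES = {"question", "claim", "evidence"}
RELATIONS = {"supports", "opposes", "informs"}


@dataclass(frozen=True)
class Node:
    id: str
    kind: str
    text: str


@dataclass
class DiscourseGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add_node(self, node_id, kind, text):
        if kind not in NODE_TYPES:
            raise ValueError(f"unknown node type: {kind}")
        self.nodes[node_id] = Node(node_id, kind, text)

    def add_edge(self, source_id, relation, target_id):
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must already be in the graph")
        self.edges.append((source_id, relation, target_id))

    def incoming(self, target_id, relation):
        """Nodes that point at target_id through the given relation."""
        return [self.nodes[s] for s, r, t in self.edges
                if t == target_id and r == relation]


# The ban example from the talk; ids such as "q1" are hypothetical.
g = DiscourseGraph()
g.add_node("q1", "question",
           "Are bans an effective way to mitigate antisocial behavior in online forums?")
g.add_node("c1", "claim", "Yes, bans are an effective response.")
g.add_node("c2", "claim", "No, bans cannot scale.")
g.add_node("e1", "evidence",
           "Users of subreddits banned for hate speech did not engage in "
           "hate speech in the new forums they joined.")
g.add_edge("c1", "informs", "q1")
g.add_edge("c2", "informs", "q1")
g.add_edge("e1", "supports", "c1")

supporting = g.incoming("c1", "supports")
```

Here `supporting` holds the evidence nodes attached to the "yes" claim, which is the kind of lookup the rest of the talk builds on.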

I'm not the first to come up with this. There's a long line of information models: ScholOnto back in the 2000s, micropublications, 2012, 2016. And these all have roots in theories of argumentation and discourse, like Stephen Toulmin's The Uses of Argument. So there's a pretty mature line of thinking about how to model discourse, and about scientific argumentation as a kind of discourse where it'd be useful to model that. Why do I think this is interesting? Why would you care? Two things, I'll say. One is that it actually matches the information needs that you have for synthesis, and the second is that it's a potentially more expressive grammar for structuring evidence-based deliberation. So, the first point. If you reflect on the information needs you have when you're trying to do synthesis or use synthesis, you're asking questions like: what empirical evidence supports or opposes claim A? Or what are the respective evidence bases that support theory X versus theory Y, alternative claims, consistent results, and so on? These are not questions about papers, which is the predominant data structure and unit of analysis. I think of that as iTunes or Spotify for papers, where you're really just trying to manipulate documents, but it doesn't match what you actually want. Right? You actually want to be able to query and compose discourse nodes and relations. An example of this comes again from systematic reviews, if you actually estimate and break down the work of doing one. Right? You've got a bunch of administration and planning. You've got searching, and then you've got, basically, fighting the data structure. Right? Where are the claims and evidence I care about in the papers? What are the claims and evidence to synthesize? All this work here is essentially fighting the infrastructure's unit of analysis, which is papers. This helps explain why effective synthesis is so hard.
It's not an infrastructure for synthesis. It's infrastructure for evaluating people. It's an infrastructure for disseminating papers, but it's not infrastructure for synthesis. More on this later. The other point: a more expressive grammar for structuring evidence-based deliberation. You can use the ontology to dig not just into supports, but also warrants. Like, this evidence supports this claim if you believe that this measure is appropriate, and here's some evidence for that. You can dig into that to really structure your argumentation. Right? You can also think about confounds, alternative explanations, replication, novelty, significance, all these kinds of things that are implicit in lots of discussions of literature or otherwise, but are basically hidden. They're not available for computation. K? So I think they're great. I think if we had discourse graphs as a sub-infrastructure, we would do synthesis a lot better. The problem I'll discuss now is the missing social dimension. And let me say what I mean by that. Again, my assumption is that we actually want an infrastructure for synthesis. I mean this specifically by analogy to things like roads and so on. They help us get things done reliably and sustainably. You don't have to fight with the system. It actually enables us to do what we want to do. We actually have parts of this already. Again, I'm not the first person to work on this. There's lots of great technical work building technical standards, warehouses, and platforms. So much of this is great. It's part of the answer. What is missing is that they're mostly empty. Right? You want an ocean of nanopublications. At the moment, there's no more than a puddle. Turns out if you build it, they won't come. You have to help them come.
So, from a social science of infrastructure perspective, this makes a lot of sense, because infrastructure is sociotechnical, not just technical. Right? There are all these technical parts, the standards and so on. But there's also the social part, the routines, the norms, the people. All the history of infrastructure demonstrates that you really need to have both. This is probably not news to you as the Metagov community. Right? So one way to ask this question is: who does the work? Where and why? And can this be done sustainably? We need to integrate to build infrastructure. I think lacking this model makes you come up with solutions like crowdsourcing in paid models, or specialized curation models, which require a lot of money and a lot of training, aren't integrated into any routines or social organizations, and often tend to fizzle out. One example is Mark2Cure, which was doing biocuration of genes, diseases, and drugs. It did a great job until they ran out of funding. This is a pretty standard story. You may say, what about AI? My position has changed a lot, but roughly, I still don't think we are ready to do it fully automated. I don't think we ever want to. Just some basic recent checks: the subtask of extraction and summarization from research papers is still really hard. We have roughly a 30% upper bound on end-to-end accuracy, done very carefully. It gets a lot easier when humans give you parts of the gold standard. Right? So I'm very interested in human-in-the-loop style things. It's not clear to me that scaling helps. It may actually hurt; it may make things less truthful. Although I'm very interested in discussing more mixed-initiative models. So hold that thought if you're thinking, what about ChatGPT or LLaMA or Galactica or whatever? Let's talk at the end about how this integrates. Okay. So we need the social dimension of infrastructure.
So now let's talk about one path to get there. I'm using this term integrated crowdsourcing for the idea of integrating this collectively valuable work into individual and collaborative synthesis practices. The term integrated crowdsourcing comes from my previous work in the crowdsourcing space with collaborators at Harvard. We tested this idea: if you want to structure ideas in collective innovation and you have individual whiteboards, it turns out people do semantic judgments for free by arranging things on their own whiteboards, which gives you information about the relationships between ideas. And that is a kind of crowdsourcing that you can feed into semantic models that give you idea maps and other ways to give recommendations. So this idea of integrating usually tedious semantic judgment work into an intrinsically motivating activity for the common good is a design pattern that I love, and I'm trying to apply it here. So what is the intrinsically valuable activity? Well, we all read papers all the time. So much. Right? On a scale that is comparable to the number of papers that we actually have. We often hear about the explosion of papers, the number of publications and so on. It's actually not as much of a mismatch with how much we actually read. So we're doing this work already. We read papers. We try to make sense of their takeaways, their claims, evidence, and so on. It's in our informal notes. Right? So this is an intrinsically valuable activity for us. The problem is, is it even possible to bridge from this kind of local, personal, contextual work to something that's more shareable, more general, more transferable, more interoperable? So is it sociotechnically possible to integrate authoring of shareable discourse graphs into scholarly practices and workflows?
My sense of the answer is yes, and I'll show you how. In a previous version of this talk, I gave a live demo, but we don't have as much time, so I'll give you snapshots of what the tool looks like. We've built this prototype lab notebook extension that enables research groups and researchers to integrate discourse graphs into their everyday work. K. It works like this. We create discourse nodes as a kind of annotation of our existing notes, when ready or needed. So here you have a snapshot of what you might see in everyday notes, taking notes on a paper, and you say, there's a main result here. Right? And then what do you do? Instead of having a super structured thing where you can only make discourse nodes, you simply highlight it and tag it as a discourse node of type evidence. And then it becomes an evidence node, which you can then integrate into your writing and outlines. If, say, you're trying to understand how susceptible people are to COVID-19 given an exposure, you can integrate all these claim and evidence nodes and write. Right? You can tag supports and opposes and outline your thinking, which is useful for you. But in the background, we are parsing it into a reusable, shareable, explicit discourse graph where you have nodes and relations. Right? How do we do this? We have a configurable grammar under the hood, and we take advantage of the fact that we integrate into hypertext notebooks that have a Datomic graph database as their underlying structure, such that these kinds of patterns of indentation correspond to Datalog query patterns. So you can say, if it looks like this, then we have a claim that's supported by evidence, and we draw that edge. K? This allows you then to get benefits from your in-lab discourse graph.
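
As a rough illustration of how an indentation pattern can be read as an edge, here is a small sketch. It uses plain Python instead of Datalog, and the note syntax is an assumption made for this example: the `[[CLM]]` and `[[EVD]]` tags, the "Supported By" and "Opposed By" labels, and the two-space indent are hypothetical stand-ins for whatever grammar a lab configures. The second evidence line is a placeholder, not a real result.

```python
import re

# Hypothetical tagging convention: "[[CLM]] - text" for claims, "[[EVD]] - text" for evidence.
TAG = re.compile(r"^\[\[(QUE|CLM|EVD)\]\] - (.+)$")
# Hypothetical relation labels that sit between a parent node and its child nodes.
RELATION_LABELS = {"Supported By": "supports", "Opposed By": "opposes"}


def parse_outline(text, indent=2):
    """Return (source_text, relation, target_text) edges found in an indented outline.

    The pattern matched is: node > relation label > node, each one level deeper.
    """
    edges = []
    stack = []  # entries are (depth, kind, value) for the current ancestor chain
    for raw in text.splitlines():
        if not raw.strip():
            continue
        stripped = raw.lstrip(" ")
        depth = (len(raw) - len(stripped)) // indent
        body = stripped.strip()
        # Drop anything that is not an ancestor of the current line.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        match = TAG.match(body)
        if match:
            node_text = match.group(2)
            if (len(stack) >= 2
                    and stack[-1][1] == "relation" and stack[-1][0] == depth - 1
                    and stack[-2][1] == "node" and stack[-2][0] == depth - 2):
                edges.append((node_text, stack[-1][2], stack[-2][2]))
            stack.append((depth, "node", node_text))
        elif body in RELATION_LABELS:
            stack.append((depth, "relation", RELATION_LABELS[body]))
        else:
            stack.append((depth, "other", body))
    return edges


notes = """\
[[CLM]] - Bans are an effective response
  Supported By
    [[EVD]] - Users of banned subreddits did not engage in hate speech elsewhere
  Opposed By
    [[EVD]] - Placeholder opposing result
"""

edges = parse_outline(notes)
```

The point of the sketch is only that the author writes an ordinary outline, and the edges fall out of its shape; a real implementation would express the same pattern as a query over the notebook's database.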
You can view the discourse context of a piece of evidence and see other pieces of evidence that are consistent with it, and other claims it supports and opposes. You can retrieve that information for yourself while you're making sense of things. You can do structured queries to say, I want to find evidence for lower susceptibility by location and testing regime, for example. These kinds of rich queries are enabled by having that discourse graph in your own notebook. And that enables you to do downstream, technically interoperable sharing. Here's a NetworkX visualization of a discourse graph from one of our participants. And here's mine. This is our lab notebook. This is in a Markdown notebook in Obsidian, and it's also published over here, so you can look it up if you're curious. We've tested this over the last two, two and a half years and counting. We've got a lowball of about 30 average daily active users. There's churn. The upper bound is probably a couple hundred. About 1,000 nodes created per user, 1,000 pages created per user, on average. People have written dissertations, projects, research monographs, grant proposals. It's seeding lab culture and opening a path to discourse-graph-native micropublishing. Just to give you some snapshots: one of our main users is a cell biology lab. One PI, six students, grad and undergrad, and a lab manager. They're trying to understand where life comes from. How do you transition from nonliving molecules to living cells? They do a mix of wet lab work and simulations to synthesize cellular data into models. What do they use discourse graphs for? They synthesize literature into claims and evidence to guide computational model parameter choices. Like, what is the binding rate? Or what's the length of this myosin? Or what's the value of the binding radius? Turns out these can be grounded in claims and evidence from the literature. And the ones that you can't ground, those are free parameters.
Right? So they appreciate the ability to think more carefully about their modeling work and to bridge the lab work with the modeling work. They use it to structure their journal club discussions, so you actually have concrete takeaways, claims and evidence and new questions, coming out of their discussions. They use it to structure their ongoing research. So here's a map of the claims and evidence and questions for an ongoing research project. They've extended the grammar, so they have conclusion as a variant of claim and result as a lab-specific variant of evidence. Same function. It's just that it's their stuff. And then they have issues as well, as a kind of request for experiments. To quickly fly through additional snapshots: we have a medical imaging PhD student using it. We have a criminology PhD student. And to come full circle, the applied science VC firm also used it to structure their thinking about constraints and solutions, connect it to evidence, and then build out their map of the constraint and solution space. I'm not going to walk through this in detail, just give you a flavor of it. We're also seeing, in the field study over the last two and a half years, organic spread to other tools. Tinderbox has an implementation. Logseq has a currently still open bounty of 4k and some development. There's a web publishing use as well, so we've got evidence of the kind of interoperability. K? So I know we've got tons of time for discussion, so I'll just throw up two things that I hope we'll get to discuss, but you're also welcome to give any feedback whatsoever. One is, I'm curious how you think about this discourse graph protocol. If you had something like that, how might it integrate with collective systems of deliberation and governance?
My sense is if we had a bunch of this, we could enable more evidence-informed deliberation, where if you think about opinions and positions, they often connect very well to claims you're making about the world. And it would be great if that was informed by evidence and we could deliberate over that. So I would love to see more of that. And then back to the question of what about AI. In general, I'm excited about AI-assisted authoring and mixed-initiative workflows, but also the arguments I'm making about the match with synthesis information needs apply to RAG workflows, for example. Right? One current system, Consensus, has what I see as a desire path for discourse. They summarize papers into claims that they then synthesize into a summary. So my sense is that if we had discourse graphs, it would improve these kinds of systems. K? So that's the end of my presentation. I don't know if we did twenty minutes. I'm not keeping track of time. But this is a summary of the talk. I'm very grateful for all my collaborators and core team and my funders. Yeah. Here's my contact info. I'm curious to hear more from you. Thanks.

Speaker 1 1:00 – 1:00

Wonderful. Thank you so much, Joel. And, yes, twenty minutes on the mark. Perfect.

Speaker 2 1:15 – 1:15

On the mark.

Speaker 1 1:30 – 1:30

Yay. Excellent. Incredible. Very nicely done. Great. We already have some comments coming in. There's a question here from Steve about integrated confidence percentage. Maybe we can turn to Steve for that. And then I think the next question possibly aimed at you is from Eugene. So why don't we go Steve, and then Eugene?

Speaker 3 1:45 – 1:45

Would not unmute. Alright. So, I mean, I just didn't see any numerical parameters in there. Yeah. No sort of trust network integrations or, you know, just priors, you know? I mean, just

Speaker 2 2:00 – 2:00

Yeah. Yeah. Yeah.

Speaker 3 2:15 – 2:15

All sorts of stuff like that that you could throw into arguments that would help kind of tweak it one way or another, just to give context for the whole discussion, you know, to have sort of numerical context.

Speaker 2 2:30 – 2:30

That's a great question. I think about this a lot. I wonder if my shared discourse graph actually has what is it?

Speaker 3 2:45 – 2:45

I have also thought about this a lot if that wasn't obvious from my question.

Speaker 2 3:00 – 3:00

Yeah. Yeah. No. It is it is very obvious.

Speaker 3 3:15 – 3:15

Okay.

Speaker 2 3:30 – 3:30

Contention and evidence. Yeah. So I've got some thoughts on this here. I think it's fraught. It's a very complicated question. Right? Because the numbers are so useful for making things faster, giving you more nuanced ways to visualize things, and so on. But I'm very nervous about ungrounded numbers. So here's something that we thought about, and we prototyped some of this. I didn't get to show this. If you have a discourse graph and you have claims and evidence of different types, you can distinguish the levels of support for a claim. For example, just based on counting: this one has 15 pieces of evidence that support it. You could also define a grammar for, let's say, this evidence mildly supports the claim. Maybe you say zero, one, two. Then you can do some math over it, some soft math, to give you some ranking information that's grounded. If it's got a source, I'm like, yeah, it's kind of supported. If it's got empirical evidence, maybe I care a bit more. And then maybe you have more nuanced descriptions of levels.
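
The counting and "zero, one, two" idea above could be sketched like this. It is only an illustration under assumptions: the function name `rank_claims`, the edge format, and the tie-breaking order are invented here, and the example ids and strengths are placeholders, not data from the prototype mentioned in the talk.

```python
from collections import defaultdict


def rank_claims(support_edges):
    """Rank claims by grounded support.

    support_edges: iterable of (evidence_id, claim_id, strength), where strength
    is 0, 1, or 2 as judged by whoever linked the evidence to the claim.
    Returns a list of (claim_id, {"count": ..., "score": ...}), strongest first.
    """
    totals = defaultdict(lambda: {"count": 0, "score": 0})
    for _evidence_id, claim_id, strength in support_edges:
        if strength not in (0, 1, 2):
            raise ValueError("strength must be 0, 1, or 2")
        totals[claim_id]["count"] += 1        # plain counting of supporting evidence
        totals[claim_id]["score"] += strength  # the "soft math" over strength levels
    # Sort by score, then by count, then by id so the order is deterministic.
    return sorted(totals.items(),
                  key=lambda kv: (-kv[1]["score"], -kv[1]["count"], kv[0]))


# Placeholder edges for illustration only.
example_edges = [
    ("e1", "c1", 2),
    ("e2", "c1", 1),
    ("e3", "c2", 1),
]
ranking = rank_claims(example_edges)
```

Every number in the ranking traces back to a specific evidence link, which is the sense in which the score stays grounded.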

Speaker 3 3:45 – 3:45

Then you can have another layer down where there's the debate about whether or not that evidence is actually valid.

Speaker 2 4:00 – 4:00

That's what it's saying. Exactly. Yeah. So you can do the warrants. Right? The whole hierarchy. Yeah. Yeah.

Speaker 3 4:30 – 4:30

So, are you aware of Eliezer Yudkowsky's Arbital at all? No. Okay. Well, that's a deep... It is currently defunct because he's running around saying how AI is gonna kill us all. Uh-huh. Pieces I don't disagree with.

Speaker 2 4:45 – 4:45

Yeah. But,

Speaker 3 5:00 – 5:00

you know, he's not really actively working on it at the moment. But yeah. And, anyway, so, you know, we can have a discussion on this.

Speaker 2 5:15 – 5:15

Yeah. So, yes, about priors. You can operationalize priors maybe as, well, I don't know. One way to think about it is, in the superforecasting community, for example, the forecasting community, they require synthesis to think about how they're going to operationalize their predictions. So they translate their sense of what they can see from the inside view and the outside view into some numerical estimate. And that can be a way to ground their priors, for example, to show their work, and superforecasters do this. I expect that having a better synthesized discourse graph makes you better able to think through what your prior is. So it's not a one-to-one mapping between the discourse graph and the numbers, but I expect that it would help. This would be a hypothesis to test. Right? If you've got one group of forecasters that have a discourse graph and one group of forecasters that don't, how does their forecasting accuracy, what's the number, the Brier score, how does that Brier score change between those two groups? That'll be really interesting, I think, to test. Yeah.
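
For reference, the Brier score mentioned above is the mean squared difference between forecast probabilities and binary outcomes, so lower is better. The sketch below shows how the two-group comparison could be computed; the group names and all numbers are made-up placeholders for illustration, not results from any study.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally sized, non-empty forecasts and outcomes")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# Placeholder data: four resolved yes/no questions and two hypothetical groups.
outcomes = [1, 0, 1, 1]
group_a_forecasts = [0.8, 0.2, 0.7, 0.9]  # e.g. forecasters with a discourse graph
group_b_forecasts = [0.6, 0.4, 0.5, 0.7]  # e.g. forecasters without one

score_a = brier_score(group_a_forecasts, outcomes)
score_b = brier_score(group_b_forecasts, outcomes)
difference = score_b - score_a  # positive means group A was more accurate
```

In an actual test of the hypothesis, the comparison would need many questions and forecasters per group and an appropriate significance test, which this sketch does not attempt.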

Speaker 1 5:30 – 5:30

Cool. Thank you, Steven. Let's turn to you, Eugene.

Speaker 5 5:45 – 5:45

Yeah. Thank you so much for presenting, and I'll definitely be following up with you after this. Just for a quick note of context, I've been very interested in the social systems needed to be able to consistently generate discourse graphs and integrate knowledge across them, and more about how to connect and raise up the most relevant research problems for research funders to fund. How can we help direct them to the questions that scientists have bubbled up as most important, so they can kind of pick the nodes on a discourse graph to fund? Yeah. But I guess as part of that, I had a two-part question for you. One is, with the labs and the individual PIs that you're working with who are using this, how much of this is driven by them just seeing the value in discourse-graph-based breakdowns, so they get super excited about a new approach and potentially integrating it, versus do you see any other kinds of incentives driving people to want to use this? And the second part is based on how those interactions have been going for you. What do you see as potentially accelerating adoption? Is it just building the right communities around research questions or research topics or domains? In which case, could Metagov be a test place to do this around governance? Or is it important to link it to something like funding? Obviously, I can't imagine we'll get the NSF off the bat. But if some major funding agency requires this, then that actually builds the critical mass to use this. And if that is relevant, we are also going to be applying for an NSF grant around governance research networks. So happy to potentially integrate some of this into that.

Speaker 2 6:00 – 6:00

Yeah. You can still see my screen. Right? These are great questions. So I think your first question is, why are people using it? And the answer is, because it helps them. That was my goal, because I care about being able to apply this to contexts that don't have money, where there may be less fashionable topics, or they're being researched by people with fewer resources. So the ability to do this in an intrinsically motivated way is really important to me. That was my main hypothesis test. I don't think money will hurt. So I think there are easy integrations with funding. Up on my screen here is Polyplexus, one of the only other current discourse graphs. They use the micropublication model as well. This is from DARPA. Is it DARPA or ARPA? One of the two. And what they do is they've got funders and researchers in the same community, and they make these micropublications as an early way to scope out hypotheses to test and so on. So that works really well. Right? Because the people who come up with really interesting hypotheses that are grounded in evidence immediately get early access to funding because of the way their DARPA models work. So I think, yeah, absolutely, integrating with funding is a really good idea. I think it's a natural fit. Maybe a potentially more equitable way to get access to a sense of the questions and evidence beyond just asking people. We should still ask people, but I think it'd be good to have an additional way to assess the landscape. And this connects really well with the information needs of our applied science VC firm, where they have similar goals.
They want to know where in the landscape is ripe for investment. The flip side is they want things that are a little bit more derisked, but scientists may be interested in super interesting questions that have lots of contention and debate, and that's a ripe opportunity for additional investment. You can operationalize that as a discourse graph query. Right? What are some sets of questions that connect to claims that are of high significance but have lots of debate in terms of the claims and evidence? Lots of controversy. So, yeah, if you want to play with this. So the second question is, what's needed to accelerate this? You know, this is the current implementation. This is in Roam Research. It is early adopter territory. Right? This is not user-friendly, Notion or Google Docs level. So we need to get to that level for it to have much broader adoption, but we also need communities. Right? It really comes alive in a community, but it turns out in the community, you then have to solve the collaboration problem. So that's what we're working on right now. We're trying to get some funding to work on a smoother experience to get people up and running really fast. Roam has a learning curve behind it. But yeah, that's what we're focusing on now. It'd be amazing if we could have a discourse graph plugin for Notion, for example. Notion is not particularly open to extensibility, so maybe we'll do Google Docs or something. So we're kind of walking away from the niche tools toward more mainstream tools. That's currently what we're doing. And having public examples of discourse graphs, I think, will help as well. So, got lots of stuff in the chat. Cool.
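
The "high significance but lots of debate" query described above could be sketched as follows. This is one possible operationalization under stated assumptions: "debated" is taken to mean a claim has both supporting and opposing evidence, and the numeric `significance` field, the function name `contested_questions`, and all ids are hypothetical.

```python
def contested_questions(claims, edges, min_significance=2):
    """Find questions tied to significant claims that have evidence on both sides.

    claims: {claim_id: {"question": question_id, "significance": int}}
    edges:  iterable of (evidence_id, relation, claim_id), relation being
            "supports" or "opposes".
    """
    supported = {claim for _, relation, claim in edges if relation == "supports"}
    opposed = {claim for _, relation, claim in edges if relation == "opposes"}
    debated = supported & opposed  # claims with evidence pulling both ways
    return sorted({
        info["question"]
        for claim_id, info in claims.items()
        if claim_id in debated and info["significance"] >= min_significance
    })


# Placeholder graph for illustration only.
claims = {
    "c1": {"question": "q1", "significance": 3},
    "c2": {"question": "q2", "significance": 3},
    "c3": {"question": "q3", "significance": 1},
}
edges = [
    ("e1", "supports", "c1"),
    ("e2", "opposes", "c1"),
    ("e3", "supports", "c2"),
    ("e4", "supports", "c3"),
    ("e5", "opposes", "c3"),
]
hot_questions = contested_questions(claims, edges)
```

Only `q1` qualifies here: `c2` is significant but undisputed, and `c3` is disputed but falls below the assumed significance threshold.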

Speaker 1 6:15 – 6:15

Thank you, Eugene and Joel. I think next up on stack is Ronan.

Speaker 6 6:30 – 6:30

Yeah. Thanks. I really like this, as you know. My one question here was that it seems like the focus was kind of around teams, and smaller teams, maybe academic labs and things like that. I wonder what your thoughts are on larger, sort of distributed efforts creating discourse graphs, for example on social media. Because, you know, we see researchers are so active on Twitter and Mastodon. So they're already kind of doing some of these things, but they don't really know they're doing it. And are there ways we can get them integrated into discourse graphs through social media, maybe?

Speaker 2 6:45 – 6:45

Yeah. Yeah. I think so. I think that's a variant of the design pattern, right? They're already doing this on social media. If you're interested in collaborating, I would love to explore it. Because, for example, I played with this. Where is it? Let me see. Actually, I was going to message you, Ronan, about this. So there's a group of people that do open annotation on Hypothesis for preprints. Arcadia Science is one of them. I can't find it right now, actually, in my notes. I'll just show you, since this is a nice open thing. Right. These annotations are Arcadia Science's. So I parsed it, right? Because it's on Hypothesis. I just wrote a little script to parse it out, and you get all the little annotations. And you can sort of parse these into discourse moves. And so that is an additional source of annotation on top of the literature. What's the scale here? Where is it? I think it was a lot, right? There's a lot. Arcadia does this, like, weekly, I guess. And there are other orgs. I think PLOS does it. There are other research groups that do it. So I think integrating into that is probably amazing. And, you know, the computational neuroscience community, they do preprint review. Neuromatch, right? So that's where you read bioRxiv papers, and they do preprint reviews. These practices we can integrate with as well. Because they want it to be open. They want the scientific record to evolve. This is low-hanging fruit of just adding a little bit of structure, right? Or post hoc structure. Yeah.
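
The "little script" is not shown in the seminar, so the following is only a guess at the general shape of such a step: taking already-fetched annotations and sorting them into discourse moves. The input is assumed to be a list of dictionaries with `uri`, `text`, and `tags` keys, loosely modeled on annotation records; the tag names `question`, `claim`, and `evidence` are a hypothetical tagging convention, not something Arcadia Science or Hypothesis is known to use.

```python
from collections import defaultdict

# Hypothetical convention: annotators tag each annotation with its move type.
# Order matters only when an annotation carries more than one of these tags.
MOVE_TAGS = ("question", "claim", "evidence")


def to_discourse_moves(annotations):
    """Group annotations by annotated document and label each with a move type."""
    moves_by_document = defaultdict(list)
    for annotation in annotations:
        tags = {tag.lower() for tag in annotation.get("tags", [])}
        # Fall back to a plain "note" when no move tag is present.
        move_type = next((t for t in MOVE_TAGS if t in tags), "note")
        moves_by_document[annotation["uri"]].append(
            {"type": move_type, "text": annotation.get("text", "").strip()}
        )
    return dict(moves_by_document)
```

Untagged annotations are kept as notes rather than dropped, so adding structure after the fact stays possible.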

Speaker 6 7:00 – 7:00

So I guess we should definitely talk. I'm thinking about, like, ActivityPub as well. Something that you can add structure to, you know; it's already kind of open and extensible also.

Speaker 2 7:15 – 7:15

Yeah. Totally welcome all these experiments. I think that design pattern of, like, where's the work happening? Can we integrate the discourse graph? I think the discourse graph is way easier to integrate than, like, a knowledge graph, right? Annotating the entities and relations at that super formal level, that's a way higher bar. The state of the art is way lower there. So yeah. Excited to

Speaker 7 7:30 – 7:30

to talk about this

Speaker 6 7:45 – 7:45

more. Thanks.

Speaker 1 8:00 – 8:00

Yeah. Thank you, Ronan. Nimbus, if you're around and wanting to chat a little bit, I know you had made reference to a post that you made in Slack.

Speaker 8 8:15 – 8:15

I just want to add some context, and this is really exciting. I think the upcoming PolicyKit and Open Collective integration could need something like this in their decision making on how to do policy and how to spend resources. So I think this is exactly what was missing, in my opinion, and now you are here, and this is great. That's all. No question. Excellent. Yeah. That's the response to my question, right? Good.

Speaker 1 8:30 – 8:30

Cool. Yeah. Thank you, Nipis. So we have some more time, and there aren't any current questions in the stack. Nicholas actually is just coming in with a question.

Speaker 3 8:45 – 8:45

Oh, yeah.

Speaker 1 9:00 – 9:00

If we have some more time, maybe at some point we can also get a demo, since there wasn't time during the presentation. But let's go to Nick.

Speaker 4 9:15 – 9:15

Yeah. I was just gonna pop in a quick comment on kind of connecting to PolicyKit and that side of things.

Speaker 1 9:30 – 9:30

Mhmm.

Speaker 2 9:45 – 9:45

I

Speaker 4 10:00 – 10:00

would I'm super sorry. I've I've I'm working on that project kind of.

Speaker 2 10:15 – 10:15

Oh, cool.

Speaker 4 10:30 – 10:30

Or I I have been and and need to have a bunch of to dos to do.

Speaker 2 10:45 – 10:45

Yeah.

Speaker 4 11:00 – 11:00

But I I'm personally so a thing that we found in a lot of the early experimentation so what policy kit is do have you have you heard about Yeah.

Speaker 2 11:15 – 11:15

I know PolicyKit. I'm excited. Yeah.

Speaker 4 11:30 – 11:30

Yeah. Oh, yeah. You know Amy pretty well. I saw she was advertising the seminar in her Slack, so that was a reminder for me to come here. So part of the challenge there is we've been building this kind of app in PolicyKit called Collective Voice, which is specifically focused just on doing sort of retroactive funding using Open Collective for, like, mutual aid groups or open source software groups. So it's a very hyper-focused thing. It's like picking voting procedures to pay for coffee and groceries and expenses and stuff like that. You could imagine it being more broad. And even in that context, the thing that we've really struggled with is making it super duper simple, like button-click simple, or simple enough that it could be integrated so that people could use this in, like, WhatsApp, because even trying to get them to go to Slack is kind of a bridge too far.

Speaker 2 11:45 – 11:45

Oh. So

Speaker 4 12:00 – 12:00

That's, like, one point that I have in my head. And then another point is that I personally have a problem with trying out these Roam and Obsidian and Notion plugins and integrating stuff and writing scripts so that I have a markdown file on my local machine.

Speaker 2 12:15 – 12:15

Wait. You have a problem in a sense you do it too much?

Speaker 4 12:30 – 12:30

I spend too much time on this. Like, more than more than I actually get benefit. Right?

Speaker 2 12:45 – 12:45

Yeah. Yeah.

Speaker 4 13:00 – 13:00

Or, sorry, bordering on a problem. And I would consider myself a person who spends a lot of time thinking about this, right? And still, I haven't even been able to design an interface that works well for me. So I really care... like, this just seems like a really exciting challenge. Is it possible to try to bridge not even necessarily the sociotechnical practices, but just the sort of individual ritual practices

Speaker 2 13:15 – 13:15

of, like, how do I

Speaker 4 13:30 – 13:30

Obsidian vault? You know, how do I make it coherent and have a consistent sort of, like, tagging and and grammatical pattern and stuff like that? And then extending that all the way to, like, can now someone participate in a discourse about about funding, which is very different than than, like, science. Right? But I I think that there's some parallels. So I don't know. I just wanted to comment that this seems like a super well, I don't know. This seems like a ten year research agenda or something like that.

Speaker 2 13:45 – 13:45

Yeah. Yeah. I think that's right. So if I understand your commentary correctly, there's a core connection here to the question of how usable and intuitive this can be. Number one, the choice of the discourse graph as the model was very intentional, right? You can imagine other ways to structure your ontology that would be way more complicated. Discourse graphs are, at the base, just three things, right? Questions, claims, and evidence. And I would say I'm pretty confident this is solved in Roam. It is solvable in any hypertext notebook. I feel confident: give me an uninterrupted two weeks, and I'll have a template ready to go for Obsidian, for example. I'm actually using it for my own data analysis right now in Obsidian, and my student is as well. So in my lab, we have a shared Roam graph. That's what I just showed. And we have the discourse graph in it. It's just in the background. That's life. We just use it now. Same with Matt, the cell biologist, who is more ambitious. He wants to transform more of the practice. For us, just taking notes on literature using discourse graphs is already done. It's solved. So I think it's very solvable in the hypertext world. My other student uses Notion. She used it to structure her literature for her dissertation proposal. Very simple, right? All you have to do is: there are some things that are claims, there are some things that are evidence. I can have whatever other notes I care about, but I can retrieve those high-value units. So I think there's just work to be done, right? To zero in on those usable templates. You know, in the community, we have existing practices already, right? Like Zettelkasten or whatever.
Those are things that people do already, and this kind of maps nicely. And the slide that I kind of breezed through, right? There are some other tools that are basically discourse-graph native already, where you don't have to extend the tool. It just does it already. How to get to WhatsApp and Slack and, you know, Google Docs, I think, is more of a... for me, the technical uncertainty is restricted to "how do we do it" as opposed to "can we do it," is my current sense. I just need money and developers to throw at the problem. Yeah. And my confidence level is high that it is possible. It's just that I have to do it. Which makes me think, going back to Ronan's question about integration with social media and federation: I want to be careful not to suggest that the stuff that people make in their individual graphs is already usable in the collective sense. Because often you have shorthand or, you know, a bit of messiness in your personal graph, where it's still useful to you. So I think of it more like... there's a concept from social science of boundary objects. They're units that can be transformed to be usable in social discourse. It just requires some additional work, but it's more scoped work. It's more like, all right, let me clean this up to submit as data or notes for collective discourse. And then you can imagine workflows for that. But that's, to me, way better than starting with a discourse, asking, all right, what do we know? and then going back to try to wrangle all your stuff that's not already structured in the discourse graph. I think that becomes a lot harder. So I just want to make that clear. It's not like I publish directly from my discourse graph.
It makes it easier, but there's additional work to be done, and there are some additional interesting problems to solve that I wouldn't have thought of if I hadn't done this.
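
The "retrieve those high-value units" idea can be illustrated with a plain-text convention. The sketch below assumes a hypothetical note format in which questions, claims, and evidence are marked with the line prefixes `QUE:`, `CLM:`, and `EVD:`; neither the prefixes nor the function name come from the talk, and a real Obsidian or Notion template would likely differ.

```python
import re

# Hypothetical prefixes marking the three discourse graph node types.
PREFIXES = {"QUE": "question", "CLM": "claim", "EVD": "evidence"}

# Matches an optional markdown bullet, then a prefix, a colon, and the node text.
NODE_LINE = re.compile(r"^\s*(?:[-*]\s+)?(QUE|CLM|EVD):\s*(.+?)\s*$")


def extract_nodes(note_text):
    """Pull (node_type, text) pairs out of free-form notes, ignoring other lines."""
    nodes = []
    for line in note_text.splitlines():
        match = NODE_LINE.match(line)
        if match:
            nodes.append((PREFIXES[match.group(1)], match.group(2)))
    return nodes
```

Everything that is not prefixed stays in the note as ordinary scratch text, which matches the point that personal notes can remain messy while the structured units stay retrievable.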

Speaker 1 14:00 – 14:00

Yeah. Okay. Thank you, Nick, and thank you, Joel. We have another question coming in from Eugene.

Speaker 5 14:15 – 14:15

Yeah. So another question that comes to mind is whether you think events or anything more live can actually be useful for periodic creation of these. Because what gets me really excited, at least from what I'm hearing of your approach, is integrating this with researchers' workflows. That way, they can continuously update these, and that is amazing. I'm also wondering, at a slightly higher order, does it make sense, every three months, six months, some arbitrary point in time that seems reasonable, to actually gather folks to create, say, a governance research road map focused on an even narrower set? Because I think governance itself can be such a huge domain. And then, what is the right scope of that domain? But to generate it and use that as a way to surface those problems, build connections among people, and have more social pressure to start using the more day-to-day application of it and integrating it into their actual research workflow. But, yeah, I don't know if that's something that you've personally thought about or that gets you excited. That was just something else that came to mind.

Speaker 2 14:30 – 14:30

Yeah. I think that's really cool. I think there's a lot of energy. There's a kind of lab-level analog to that in journal clubs. And we tried experimenting with doing discourse graph-enabled journal clubs, and you saw, like, a snapshot. Everything we're seeing there is cleaned up. The pace of discussion, even in our discussion right now, we just can't capture all that. So my thinking is, yes, there's a lot of energy, and I wouldn't want to force people to be super thorough in their discourse graphing. But you can imagine it's useful to always have a facilitator or some kind of scribe. And if you do a discourse graph-lite structuring of it, you can probably hand it off to... you know, there's always somebody who cleans up the notes, right? That could be super useful, to have a discourse graph summary of a meeting. So I think that'll be cool. And I think it's doable. Some of my collaborators do discourse graphs in Miro, right? Because you just have different note types, and it's super fast. You want something that can approximate the speed and energy of live discussion without slowing it down too much. And that actually has a nice history. I think one of the early systems was... oh, shoot. Conklin's was, like, a hypertext tool. We have IBIS, issue-based information systems. We've got all these group decision-making support systems where you can have a representation parallel to the discussion. I think that's probably something worth exploring. Miro does have an API. So you could, you know, export the Miro board and then parse it into discourse structures if you have a convention.
Like, our reds are going to be questions, our greens are going to be claims, and our evidence is going to be pink. You can probably parse that later, and use AI systems to connect them to literature or whatever. Yeah. That would be fun to explore.
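
The color convention mentioned here (red for questions, green for claims, pink for evidence) could be applied to an exported board along these lines. This is a sketch under assumptions: the input is a simplified list of sticky notes, each a dictionary with `color` and `text` keys, which is a hypothetical export shape and not the actual Miro API response format.

```python
# Color convention as stated in the discussion.
COLOR_TO_TYPE = {"red": "question", "green": "claim", "pink": "evidence"}


def parse_board(stickies):
    """Sort sticky notes into discourse graph node lists by their color."""
    graph = {"question": [], "claim": [], "evidence": [], "unsorted": []}
    for sticky in stickies:
        color = sticky.get("color", "").lower()
        # Notes in any other color are kept aside for the scribe to review.
        node_type = COLOR_TO_TYPE.get(color, "unsorted")
        graph[node_type].append(sticky["text"])
    return graph
```

Linking claims to evidence would still need a second pass, for example from connector lines on the board or from a cleanup step after the meeting.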

Speaker 5 14:45 – 14:45

Well, I'll definitely reach out to connect after, and

Speaker 1 15:00 – 15:00

Yeah. So it seems like we've reached the end of the questions from the chat. I'll take this time to share that I've created a thread in our Slack for any kind of follow-up discussion. I'll also get a link for anyone who would like to join our Slack. They can follow this and go to the seminar discussion channel in order to follow up. And then we do have one more question here. Maybe we can squeeze it in. Let's try, and then we'll give our speaker, as is customary, a round of applause for such a great presentation. So, Ryan.

Speaker 7 15:15 – 15:15

Yeah. So just a small question. It seems like there are kind of two processes that we can distinguish, right? One is the production of the discourse objects themselves, like pieces of evidence and claims and so on. And the other is doing the actual organizational work of putting these things into a graph. And I was just wondering if you had thoughts on what that separation looks like. The thing that prompted me was the example of, can we build discourse graphs around meetings? Where in that example, you could imagine that the back and forth is happening way too fast to produce a well-structured graph,

Speaker 1 15:30 – 15:30

but

Speaker 7 15:45 – 15:45

you can still be producing the discourse objects and then having this kind of, kind of post hoc or, like, you know, like, this process afterwards of of structuring it. And I just wondered what you thought of that kind of Yeah. Separation.

Speaker 2 16:00 – 16:00

Yeah. I think that's a great insight. I 100% agree, right? Just thinking carefully about the kind of work is really important. I'm actually curious. I think at CSCW and CHI, the conferences, I saw a bunch of meeting summarization projects. We've got this kind of AI system that does transcripts and so on, live parsing of what's going on. I think something like that could be really interesting. But, yeah, generally, thinking of it as a separate task is, number one, really insightful. And then that's where I think mixed-initiative stuff could be really useful, because it is tedious, right? After the thing, the excitement's over, and you try to structure this, and that's where I think you want some AI assistance to make the bitter pill less difficult to swallow. So I'm excited about something like that. I think the time is right. This is good enough to speed up the work of somebody doing that work. It's a power move, though. So I wouldn't automate summaries of meetings, because that is a power-laden role where you get to decide what the agenda was. So I wouldn't cede that to an AI.

Speaker 1 16:15 – 16:15

Yeah. Great. Thank you, Ryan. Thank you, Joel. And thank you very much, Joel, for such a lively presentation and discussion. It's been really, really nice to have you here and get to have you participate in the community. And it seems like there are a lot of potential future collaborations and discussions, so I hope that we'll be able to continue the discussions in our online Slack. And then we'll see you at another seminar in the future. As is customary, we always like to thank our speaker for presenting. So for everyone

Speaker 2 16:30 – 16:30

The company's clearly unmuted at the airport. Yeah. Yeah. We go to the airport. So
