97 Pioneering Progress: Advocating for Experimentation in Government with Nava Labs

Speaker 0 0:00 – 0:50

Hello. I'm Ryan Cook, and this is Civic Tech Chat, a show that looks at the way technology, politics, and policy impact the world around us. The tools we use, the way services are delivered, and how we talk about and set policy all shape our society. We'll gather around and have a chat about these things together and more. This will be a sort of special episode that we're about to hop into. I'm grateful that Nava PBC has partnered with us to do a sponsored episode. So I'm excited both for the partnership and for the very interesting conversation we'll have ahead about their innovative work and research. Martel, Genevieve, thank you both for coming on to Civic Tech Chat. Could you introduce yourselves and tell us all a bit about what you do?

Speaker 1 0:51 – 1:18

Yeah. I can start. My name is Genevieve Godet. I'm director of Nava Labs at Nava Public Benefit Corporation, and I started and co-lead the lab with Martel. In general, the lab is a philanthropically funded research and innovation arm where we prototype policy and systems changes and then advocate for further adoption into government services.

Speaker 2 1:19 – 1:44

And I'm Martel Esposito. I'm the partnerships and evaluation manager and cofounder and co-lead of Nava Labs at Nava Public Benefit Corporation with Genevieve, as she shared. And I think what's really cool about our work together is that we bring together technology delivery and advocacy and communications expertise to create a really great innovation team.

Speaker 0 1:45 – 1:53

And for each of you, what would you say is your personal why? The thing that drives you to get out of bed each morning and do all those things.

Speaker 2 1:54 – 2:16

My why is: it doesn't have to be this way. The technology exists to make it easy for people to get the benefits that they are eligible for, and we just need to be working on implementing that technology effectively. It doesn't have to be this way. So I know I can play a role in helping make the change. And so that's my why.

Speaker 0 2:16 – 2:25

I feel like with that phrase you have a ready-made thing for one of those laptop stickers at one of the conferences. Like, it doesn't have to be this way. Yeah.

Speaker 1 2:26 – 3:03

Yeah. Mine is very similar: technology, in the right hands and with the right design, can empower people. And I think especially in public services, it can empower people to get the things they need for themselves and for their families and enable them to have all of the choices and options and the autonomy to live a full and dignified and comfortable life. So if we can do that even a little bit, that would be totally enough.

Speaker 0 3:05 – 3:19

So I think we can go ahead and hop into our main topic today, which we're here to cover, Nava Labs. So I'm gonna ask you a fun compound question about that, which is, well, what is Nava Labs? What are you trying to accomplish, and why is it valuable?

Speaker 1 3:20 – 4:39

Yeah. Thank you for that. And I think to really explain what Nava Labs is, it's important to start with, well, what is Nava, the organization that we sit within? So Nava Public Benefit Corporation is a consultancy that helps government agencies make their digital services more usable, more effective, and more efficient. And Nava Labs, which is a newer branch within the company that Martel and I started, really focuses on incubating new scalable products, practices, and policy ideas and building the evidence and partnerships that we need to do those projects and advocate for those changes. So within the lab, we're really able to bring together the things that Nava is excellent at, like service design, human-centered design, and agile development, with some of the research and advocacy specialties, all together on one team. And when we zoom way out from just the products and the projects that we're working on in the lab, we're able to look at what kinds of different levers might need to be pushed to realize something very, very different in that space. So is it a policy change that might need to happen? Do we need to be advocating for a different kind of strategy at the federal level or at the state level?

Speaker 2 4:40 – 5:17

And, ultimately, what we're trying to do is close the participation gap for programs. We know that there are currently $228 billion in public benefits that go unclaimed annually. And so by understanding which products, which practices, and which policy ideas can help close that implementation gap in the context of the lab, we can then advocate for those products, practices, and policies in the broader program areas that we work in, which can range from programs like WIC and SNAP to Medicaid, TANF, and many more.

Speaker 0 5:19 – 5:45

And I gotta make sure I heard you right. That was 228 billion, with a b, that you mentioned. Right? Yes. Billion with a b. That is a massive amount of money to have out there not doing its intended purpose. I mean, that's more than the GDP of some nations out there in the world. And for that to be unclaimed annually is interesting. But what's the human story behind a massive number like that? And how does AI work change that reality?

Speaker 1 5:46 – 6:57

The Urban Institute was actually the place that had this finding, in some of their research about the participation gap in major programs like Medicaid, different tax credits for folks who are below certain income thresholds, SNAP, TANF, programs like that. And when we begin to put that in the context of what people experience, that is millions of families going without the food budget for the month that could really help them feed their families. That is missed doctor's appointments. That is conditions that go from acute to much more chronic because folks are not able to get the health care that they need. That is millions of dollars in tax refunds that don't make their way to families with young children, who could put that into child care, into education, into any number of things. And there are a lot of reasons why that happens. And, Martel, I think I'll pass it over to you to talk about some of the challenges people have when they try to access those programs.

Speaker 2 6:58 – 8:36

Yeah. So the why behind the numbers is that it's really hard to participate in many public benefit programs. You have to apply. You have to verify data. Sometimes you have to go in for an interview. Sometimes they're collecting health data. There are a range of steps that, on the surface, maybe don't seem that difficult. But when you get into the details (oh, do you have this particular document that showcases this particular piece of information during this particular time period?), it can actually be quite challenging. And then even just navigating and understanding which programs you're eligible for can be difficult. Many organizations actually have staff who help people understand and navigate the programs that they might be eligible for and help them enroll in those programs. We've called these staff collectively benefits navigators, but these might be call center staff or outreach specialists, and sometimes community health workers play this role. So, yeah, there is a whole story behind the number. There's a reason why there are so many unclaimed benefits, but we're trying to change that with some of our AI work. How might we leverage AI tools, including a chatbot, to help reduce some of the burdens that both the beneficiaries and staff face when helping families navigate and enroll in benefit programs?

Speaker 0 8:37 – 9:22

Yeah. And I think it's really easy for folks to hear about a number like that and lose a sense for what it actually means. Because it's not even really about just one person missing a benefit. There's a ripple effect. If it's someone in the family or someone in the community, things happen and affect your neighborhood, right, and the folks around you. Happy to hear that y'all are thinking about the human impact of these things along with the challenge of how to get that allocated properly. I also find your use of the word navigator interesting based on how you're describing the process. I'm curious, though, as those navigators are in a real caseworker setting, which I believe I heard is where your chatbot ended up being piloted. Can you walk us through what a day is like for one of those navigators that's working with this?

Speaker 2 9:24 – 11:55

Oh, man. It could be highly variable. Some navigators specialize in very specific instances or very specific parts of that navigation and enrollment experience. So some navigators, like call center specialists, might be helping someone troubleshoot their application, while others might be more like a caseworker, where they're working with a family to understand the family's needs, learn about what's difficult for them and what they could use help with, and then help match them with programs that they can sign up for and enroll in to get the help that they need. In the context of our chatbot project, we have been working with a nonprofit organization in LA County called Imagine LA. They have an existing benefit navigator tool, which is a little confusing because it's also called navigator. It is a software tool, a screener that caseworkers use to help people understand what benefits they might be eligible for. And so we've been working to add a layer of AI on top of that to make that navigation easier for caseworkers. Essentially, before, they might be going through and searching Google or searching internal databases or doing a lot of information retrieval that takes a little bit more time, a little bit more of their own discernment, their own navigation, to find information about different programs for the families that they serve. With the chatbot, the goal was, let's make this really easy. Generative AI is really good at retrieving information. It's one of the strengths of the technology at the moment. So let's leverage that and make it really easy to pull information and resources from credible sources with citations, just like they would do if they were doing it on their own, but do it instantly instead of them having to go and search a bunch of different sources. That's a general look at the before and after of the tool. We've currently just wrapped up our pilot, so we'll have more details about the before and after and the experience that the caseworkers had using the tool.

Speaker 0 11:57 – 12:06

I think I heard you describe a retrieval-augmented generation style chatbot. Am I hearing that accurately? Is that kinda what y'all strove to build?

Speaker 1 12:07 – 14:14

That's right. So the chatbot that we were piloting is grounded in sources of truth that range from publicly available government websites, which have the rules and policies that caseworkers need to have in mind, to the internal knowledge management tools that Imagine LA uses to help navigators in their work to get people into programs. So it is trying to mitigate some of those risks around hallucinations. We don't want a tool making up a program or anything like that. But we do want it to do the thing that computers are really good at, which is search a bunch of files, find the appropriate citation, and get that information into the hands of the navigator so that they can continue the conversation that they're having with their client at that time. And I think that within that, there's a really important insight behind this project around just how hard it is to be a navigator, to be a frontline worker for a government program. Before the lab even existed, this was work that folks at Nava were doing to make government services significantly better for the people who use them, but also for the people who have to manage them and administer them on those front lines. And so coming into the project, Martel and I were really focused on how we can lower the burden that's on navigators, that administrative and operational burden of managing these programs, and by doing so, hopefully create more longevity in that role. It's a high-turnover, high-burnout kind of job. And we want to really create more opportunities for human connection within that public services delivery.
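A minimal sketch of the grounding idea described in this answer: answer only from a known set of source passages, return the citation with the answer, and decline when nothing matches. The conversation does not say how Nava's retrieval works, so everything here is an assumption for illustration: the `Passage` type, the keyword-overlap scoring (a stand-in for whatever search the real tool uses), the `min_overlap` threshold, and the example passages, which are invented placeholder text and not real policy.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    source: str  # hypothetical document title used as the citation
    text: str    # the passage the answer is grounded in


# Invented placeholder passages, not real policy text.
EXAMPLE_PASSAGES = [
    Passage("Example policy manual, section 1",
            "Students enrolled part time may qualify for food assistance."),
    Passage("Example program FAQ",
            "Interviews can be completed by phone."),
]

STOPWORDS = {"the", "a", "an", "is", "are", "for", "of", "to", "in",
             "and", "what", "do", "i", "can", "be", "by"}


def keywords(text: str) -> set:
    """Lowercased words with common filler words removed."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(question: str, passages, min_overlap: int = 2):
    """Return the passage sharing the most keywords, or None if too few match."""
    q = keywords(question)
    scored = [(len(q & keywords(p.text)), p) for p in passages]
    best_score, best = max(scored, key=lambda pair: pair[0], default=(0, None))
    return best if best_score >= min_overlap else None


def answer_with_citation(question: str, passages) -> str:
    """Quote the matching passage with its source, or decline to answer."""
    hit = retrieve(question, passages)
    if hit is None:
        return "I don't have a source for that, so I can't answer."
    return f'"{hit.text}" (source: {hit.source})'
```

In a real retrieval-augmented setup the quoted passage would be handed to a language model to rephrase; the point of the sketch is only that the answer path starts from a cited source and has an explicit "no source" branch.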

Speaker 2 14:14 – 14:47

Yeah. And I think a dream of ours is that ultimately, this can help remove the barrier to entry for some of this work so that you could go to your local library and your librarian or your volunteers can use tools like these to help families navigate and enroll in benefits. And then the highly skilled and highly trained caseworkers can focus on that more one-on-one, in-person time to assess and understand more detailed needs and supports that a family might need.

Speaker 0 14:48 – 15:17

Genevieve, it was interesting to hear you mention the challenge of avoiding hallucinations, having guardrails, making sure the app doesn't just legislate its own program for how things work. Because, in my experience doing retrieval-augmented generation, the easy part is just getting a pipeline that does something. The hard part is getting it to be accurate, to reliably give you citations, and to not hallucinate so much. Is that a challenge y'all ran into along the way there?

Speaker 1 15:17 – 16:21

In early prototyping, yes. We saw things like that come up. What our team was able to do was give the tool a lot of instruction about what it doesn't know. So, for example, we were piloting in LA County, and we built in guardrails around what happens if you ask about SNAP in Arizona. So this is help buying food outside of the jurisdiction where those caseworkers are operating, and the chatbot will say, I actually can't answer that question, but you might wanna go look at the official Arizona SNAP website, and it can show you that. So getting really clear on, here's your role as a chatbot, here's what you're supposed to do, and here's what you actually can't answer questions about, we found was really critical to mitigating that risk that it was going to be asked a question it didn't know about and do what large language models do, which is guess. We actually wanna reduce the guessing as much as possible.
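A small sketch of the out-of-jurisdiction behavior described here. The answer above describes doing this through instructions to the model; the sketch below swaps in a simple deterministic pre-check instead, purely to make the routing logic concrete. The state list, the function name `route_question`, and the wording of the refusal are all hypothetical.

```python
import re

# Illustrative subset of states outside the pilot's jurisdiction (assumption).
OUT_OF_SCOPE_STATES = ("Arizona", "Nevada", "Oregon")


def route_question(question: str) -> dict:
    """Decline questions that name an out-of-scope state; pass the rest on."""
    for state in OUT_OF_SCOPE_STATES:
        if re.search(rf"\b{re.escape(state)}\b", question, flags=re.IGNORECASE):
            return {
                "in_scope": False,
                "answer": (
                    f"I can't answer questions about programs in {state}. "
                    f"You may want to check {state}'s official benefits website."
                ),
            }
    # In scope: a real tool would continue on to retrieval and generation here.
    return {"in_scope": True, "answer": None}
```

A keyword check like this is much cruder than instructing a model about its role, and it would miss questions that imply another state without naming it; it only shows the shape of "know what you can't answer, say so, and point to the official source."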

Speaker 2 16:22 – 17:03

We also made a point to experiment with direct quotations. So the tool outputs a plain language response and then links to a direct source so that caseworkers can actually just go to the direct source, which is what they would have probably done, with more effort, before the tool. We've gotten some early signal that it has been really valuable to be able to see that direct source. Like, oh, this comes from this particular policy manual. It makes a lot of sense that that's where that's coming from.

Speaker 0 17:04 – 17:16

Oh, and you also mentioned that you try to have it give a plain language response. Is that something y'all had to spend time fine-tuning it to do, I guess, to write in that style for folks? Or what was that process like?

Speaker 1 17:17 – 18:05

I actually don't think we had to do much proper fine-tuning there. The team instead built in some monitoring around reading level. So just going back to that Flesch-Kincaid reading score, which, if folks are doing reading level assessments, they'll know has been around for a really long time. You don't need an AI tool to do that at all. But, no. We found that given the right instructions around readability and reading level, large language models are pretty good at reducing things to plain language. And we found so far, as initial indications, since we haven't finished our evaluations yet, that it seemed like even in other commonly spoken languages, that was also true.
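For readers who want to see what a reading-level check like the one mentioned here can look like, the following is a rough sketch of the Flesch-Kincaid grade level, which is 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The syllable counter is a crude vowel-group heuristic, and the `max_grade` threshold of 8 is an assumption for illustration; neither is described in the conversation.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate U.S. grade level needed to read the text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


def needs_simplifying(text: str, max_grade: float = 8.0) -> bool:
    """Flag responses that score above the target grade (threshold is assumed)."""
    return flesch_kincaid_grade(text) > max_grade
```

A monitor built this way could score each chatbot response and flag the ones above the target grade for review. Because the syllable count is approximate, scores are best used for comparison and trend tracking, not as exact grade levels.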

Speaker 2 18:06 – 19:01

And one of the things that we're doing with this particular chatbot pilot is quite an intense study of accuracy. We're working with the Georgetown Better Government Lab. They have a National Science Foundation grant to study accuracy, administrative burden, and bias in AI tools applied to public benefits contexts, and we're very fortunate to be their first partner on this work. And so we've been pleasantly surprised by that study so far. We'll be able to share results shortly, but this is something that we have been focusing on because we know it is of interest for government agencies. We don't wanna be giving incorrect information and hallucinations, making things up, that kind of thing. So we'll have our data on that soon as well, which is exciting.

Speaker 0 19:02 – 19:23

In Martel's answer before, I believe you brought up the term burden as something y'all have a goal to reduce for folks that are involved in this process, which really brought my brain to the phrase administrative burden, which is a thing that's referenced a bunch in your written materials about this work. For folks that are maybe not familiar with the phrase or the concept, what is it, and why is it an important thing to work on?

Speaker 2 19:23 – 20:59

So administrative burden is a concept developed by the Better Government Lab about the burden of engaging with government applications and processes. This can be paperwork, the information that you need to understand to move forward in a process, the documents that you have to gather and share, all of the different steps in applying and engaging with government programs. And this takes time and mental effort and can be a barrier to participating in programs. So it impacts that coverage gap that we talked about, that $228 billion in unclaimed benefits. And so for us, administrative burden is really important to study and look at as part of our work, because we want to target tools, technology tools, and interventions that ultimately reduce burden. We wanna be able to measure that it's reduced burden, and we wanna be able to show that that burden reduction has actually resulted in increased participation in programs. So if we think about your experience with government, if it's a good experience, if it's an easy experience, you're more likely to participate and get those benefits. If it's a burdensome experience, you're less likely to. And so we're trying to go from a burdensome experience to a nonburdensome experience, and that's something that we can measure in the context of our project.

Speaker 0 21:00 – 21:32

Something that we're gonna continue to talk about, I think, that relates to what you're describing there about trying to take action, is the idea of using experimentation to test ideas and see what you can do about these problems. I believe in a lot of your writing, you use the phrase early experimentation, and you're referring to things like prototyping, which we've been talking about here, piloting, and eventually trying to take that and scale it to a broader problem. But we start at that beginning place, that early experimentation. Why is it important to engage in that for programmatic work?

Speaker 1 21:33 – 23:23

You know, for folks who are listening who work on software development and technology tools, this is a really established kind of practice, especially in agile methodologies where we're working in a loop. Right? We're trying things out. We're measuring them and iterating from there. And in the context of the lab, we're applying that really so that we can fail early. Right? We can fail without investing a lot of our team's time or a lot of our partners' money into understanding what works. And when we're thinking about emerging technologies like AI, we wanna get to some of those initial signals on our hypotheses as quickly as possible. So, for example, in the context of the chatbot, some of our early questions and experiments were around, will caseworkers use this? Is this an appropriate way to reduce their workload? Can the models actually support some of the questions that we know folks would need to answer? And by creating a container in the lab where we can fail, where it's safe and we can iterate from there, it enables our teams to move faster. It gives us a stronger point of view on what actually is going to work, and it allows us to share back really quickly with the field. Here's the experiments that we did. Here's what we learned. You don't have to do them yourself. You can build on all that we've learned and use that to improve the programs or services that you're working on.

Speaker 0 23:23 – 23:27

It sounds a bit like you're describing a change slash risk management kind of principle, where maybe your agency or org wants to make a bet on something. So instead of doing it all at once, you test it at a small scale. So you kinda validate the idea before you put the big investment in. Am I hearing that correctly?

Speaker 1 23:28 – 23:59

Yeah. That's right. And that was a big impetus for us to create the lab in the first place. I think it's important to keep in mind the context of the broader Nava team, where we've got folks working on mission-critical systems at the federal level, at the state level, and in some cases at the local level. And for all of those projects, it's high risk to be doing experimentation in that context. Right?

Speaker 2 24:00 – 24:04

It doesn't mean we're not iterating and testing, but we may not be taking as high of risks as Genevieve was saying.

Speaker 1 24:05 – 24:53

Right. Working with a totally different strategy or a new technology or something like that, we really needed a space where that could be done responsibly, in a way that stewards goals around public benefit and public money. The lab is philanthropically funded, so we don't use government funds for those experiments. In order to do that responsibly, but still have some momentum of innovation, we really needed a different space where we could be taking all of those insights that we were getting about what it's like to deliver a service as a benefit navigator and identify opportunities to experiment with other ways to make those things a lot easier.

Speaker 2 24:53 – 25:17

In the lab, we are focused first on learning and second on building software. And I think in most other government contexts, the first goal is to build working software. So we have the privilege of being able to essentially fail for the purposes of learning so that we can share that out, and other people can learn from our experiences.

Speaker 0 25:18 – 26:03

As you talk about these methods for learning and risk management, it makes me think of the materials that I saw that Nava Labs has, talking about a multi-phased approach for evaluating the use of ethical AI tools. You mention user research, early experimentation, which we've been talking about a bit, prototyping, piloting, and eventually scaling. If we start by talking about user research and early experimentation first, it sounds like you're trying to get evidence together through quick actions rather than trying to set up large, long experiments. Why is that kind of approach crucial when we talk about things like policy change? And have you ever come across surprises as you've gone about those phases with your work?

Speaker 2 26:04 – 27:20

Yeah. I think in the early experimentation, we were able to get signals about what uses of AI are appropriate and what uses of AI are acceptable to our end users. And then from there, finding that right problem-solution fit. So identifying the problems within that caseworker experience, within that benefit navigator experience, and where generative AI actually is strong and can solve those problems. So that matchmaking, coupled with understanding interest and concerns and/or excitement from prospective beneficiaries and caseworkers and benefit navigators about using AI, was extremely helpful for how we then thought about which tools we wanted to take further and which tools could potentially have the most impact. So there's an element of feasibility, so understanding, is this feasibly gonna be accepted, and is it also potentially gonna have impact and be the appropriate solution to the problem? I don't know, Genevieve, if you have some other thoughts here.

Speaker 1 27:21 – 28:35

Yeah. Just to ground it a little bit more, I'll share that we did not set out to build a chatbot. We set out with our funders at the Gates Foundation, who really jump-started this work for us, to figure out if there were any responsible, ethical applications of generative AI in helping benefit navigators connect people to programs. You know, that broad of a question. So that early experimentation really was a combo of user research, but also our team just looking at, at the time, and I say at the time because these tools change, like, every couple of weeks at this point, what large language models could actually do. So, for example, we were looking at, given a bunch of documents, like policy rules and things like that, could it reliably grab and cite back answers to policy questions? Answers to questions like, if my client is a part-time student and lives at home with their parents, are they eligible for food assistance, and more and more complicated questions like that.

Speaker 2 28:38 – 28:53

In that early experimentation, the answer was no. We were getting a lot of incorrect answers back from the general LLMs, which then prompted us to pursue additional guardrails within our tool.

Speaker 1 28:54 – 30:49

Yeah. But we also looked at other use cases. When you apply for a benefit program, particularly one where you are getting money from the government, you have to provide a lot of documents around who you are and how much money you have and whether you have any assets and the shape of your household, the structure of your household, I mean. And so we were also doing things like, out of the box, can any of these frontier models take a bunch of documents and then tell us what the documents say and prove about the person? And we had a really funny case where, with what we were calling the document analyzer, we were trying to get the document analyzer to tell us whether or not something was a photo ID of a person. And so we had a mock driver's license. We had the same mock driver's license, really blurry. And we had a piece of notebook paper where someone had written, this is a photo ID. Right. And two of those counted. And it wasn't the two, you know, there shouldn't have been two of those things that really counted for that kind of use case. And so we were really looking at, what are the limits of the technology? So experimenting in that case, and doing concept testing with the navigators that we're working with to understand, what is the appropriateness to the task at hand? Would they want to use a tool for this? How do they feel about it? Where does it make sense to fit into their workflow? So there was experimentation in all of those domains while we were looking for whether there was any right opportunity to pilot something, and the chatbot kind of emerged at the forefront of those tools. And we're gearing up to pilot another one of them.

Speaker 2 30:49 – 31:17

I'd also say it emerged because our partners also were really interested in the chatbot at the time. So we were actually very open and excited to explore a range of tools, but this was a particular tool that they could see extra value in for them and that they wanted to pursue. So part of it was also what our particular nonprofit partner's interests and needs were as well.

Speaker 0 31:17 – 31:31

So it sounds like this process led you to this solution idea, which then sounds like it passed through that kind of feasibility filter y'all were describing, which then led it into, hey, like...

Speaker 2 31:31 – 31:32

We actually have an appropriateness checklist.

Speaker 1 31:32 – 31:33

Oh, an appropriateness checklist.

Speaker 0 31:33 – 32:04

A checklist for anyone that wants to follow, essentially, our feasibility assessment. Maybe we can put it in the show notes. Oh, nice. Okay. For our listeners, I'll make sure we get that linked in there, so you'll be able to click on it and see what they're talking about. But it sounds like it got enough checks on the checklist to go to your next phase, which is getting into a prototype and into a pilot. What did you all learn through the process with this chatbot project going through prototyping and piloting?

Speaker 1 32:05 – 33:47

So our core partner in that pilot, as Martel mentioned earlier, is Imagine LA, and they have a software product called the Benefit Navigator that folks at different nonprofits across LA County use in their work as professional navigators. So the Benefit Navigator as a software product can help you do the intake for a client and understand what the plan is, what we think they are probably eligible for, and then can lead you off into those different applications. And so for us, moving into that pilot phase, before we were able to pilot, we needed to understand, where should this chatbot show up in the experience for a navigator? In our case, we actually made our chatbot available via API so that navigators weren't having to add another tool to their already burdened workflow, like we talked about. So that meant making sure that there was the right context in terms of the workflow for the chatbot to show up in, doing early testing to make sure that it was just working, generally, and that they were prepared to do that integration, and making sure that we recruited benefit navigators, who ended up being from actually five different organizations in LA County. I think at final count, we had 60 navigators who used the chatbot during the pilot. And it meant making sure that they were ready and trained and able to really use it once we went live with the pilot.

Speaker 2 33:48 – 35:15

So related to the training, we were like, okay, we're gonna train everyone. They're gonna use it. It's gonna be great. Adoption wasn't as high as we were hoping when we first started. So I think that was an interesting early lesson for us: we need to invest not only in training, but also in additional touch points to understand what the barriers and challenges are for people and why they're not using the chatbot. And so our team did some sessions with some of the navigators to understand why they maybe weren't using it as frequently as we were hoping they would, and were able to share additional information and answer many questions. And after that, we ended up getting more adoption. So I think that was a key lesson learned, and a big value of testing this in an actual real-world setting is that you don't actually know how people will use it until it is in the real world. You won't understand why they're using it or why they're not using it until you talk to them. And, yeah, that was a really interesting aspect of the pilot. In addition to some early anecdotes of navigator experiences, we've gotten some pretty positive feedback so far, and we've seen the usage increase over time, which was really, really exciting. And we will be reporting on the full pilot results very soon.

Speaker 0 35:15 – 35:48

We've been talking about the context of AI tools in the navigator sense, like folks that are helping others out as they engage with the service. But as we were preparing, y'all also mentioned to me that you have work on kind of another side of it, which is the folks that are filling out the applications themselves, where you have a pilot for an agentic AI that can autonomously fill out benefit applications. What does something like that mean for the future of government service and how folks interact with it? I think it's very exciting.

Speaker 2 35:49 – 35:49

Yeah.

Speaker 1 35:50 – 37:59

I'll give a little context to that. So, it's really exciting. We are expanding our AI tools for caseworkers program with a grant from Google.org, which we will be doing in partnership with Imagine LA again and Riverside County, which is a county in Southern California next to Los Angeles County. So I'm based in Los Angeles. This is all very close to me and the community that I'm in. For this pilot and for this experimentation, we are still caseworker facing. But we are looking at, you know, once someone has gone through that intake process with their caseworker, once they've identified what they're eligible for, can we use information that already exists, either in the Benefit Navigator system or the case notes or the conversation between those folks, to begin filling out those applications for people? And this targets another of the things we talked about earlier in terms of what is really burdensome about getting into these programs. So if you can imagine, if you are eligible for CalFresh, which is the California version of SNAP, so food assistance, you have to provide documentation of your income, who you are, your household, all those things we discussed earlier. But let's say you're also eligible for cash assistance or another program, maybe childcare because you have young children in your household. You will provide that same information over and over again across these different applications. And so this project is really looking at how we can dramatically reduce the time and effort that is spent on paperwork and increase some of that program efficiency, while looking at the potential for these more emerging AI paradigms around agents to begin to shift the way that service gets delivered.

Speaker 2 38:00 – 39:17

I think it has the potential to completely transform how government serves people. We were at the kickoff for this particular grant opportunity last week, and I know Genevieve and I were talking about how this could completely disrupt digital application flows as we know them. The digital flows could become a thing of the past. So imagine a future where a caseworker and a client are just having a conversation, and the AI agent is working in the background to populate the appropriate information into application forms. And so you're having this very human-to-human interaction instead of focusing on extracting data for the purposes of just filling out a form. The data is coming up in the context of a conversation. And then, you know, we, of course, will promote caseworker oversight to review the applications. We wouldn't want just information being sent without consent or without oversight. But it has the ability to allow the caseworker to focus on the humanity of the interaction rather than the bureaucracy of the interaction, which is really exciting.

Speaker 0 39:18 – 39:48

So we've been talking about some fun and interesting projects here, which, for some reason, makes me think of this quote that, I can't remember where it comes from, but I'll attempt to paraphrase it: the difference between playing or messing around and science is writing down the results. So as we think about that in this context, you know, I imagine you're collecting a bunch of data. You'd be kind of figuring out criteria to evaluate these experiments we've been talking about, these pilots, these prototypes. What does that look like in the context of this sort of work?

Speaker 2 39:48 – 41:37

Yeah. So we're collecting a lot of data. As I said, our goal is to learn and share out what we're learning for purposes of policy change, for purposes of scaling ideas, practices, products. So we're collecting a lot of data to share, both qualitative and quantitative. And we are looking at several different outcome areas, some of which we've already shared. So we're looking at the appropriateness of the problem-solution fit for the AI and the problem. We're looking at the acceptability of the tools by both the benefits navigators, the caseworkers, and the beneficiaries, the families. We're looking at accuracy. That's a really interesting one. I'll get back to that in a moment. We're looking at administrative burden, and we're looking at bias, so that we're not inadvertently providing good service to one population and bad service to another population. We wanna know where our weaknesses are so that we can correct for that in the broader service experience. But going back to accuracy, I do wanna highlight that this one is a really important one for us because I feel like it's often misunderstood in the AI context. We're often like, oh, we need really, really accurate AI models. And, yes, that's true. But, really, what we need is accurate results. Like, we need accurate enrollments. We need accurate referrals, and humans aren't 100% accurate. The AI models may not be 100% accurate. But maybe together, they're gonna be more accurate. And that's what we're trying to test: with the human that engages with the AI tool, can we get to better outcomes? Can we get more people enrolled? Can we close that coverage gap?

Speaker 0 41:38 – 42:09

I think when we talk about, whether it's new tools, new processes, new ways of working, I think we inevitably run into the question, especially if you've just determined that they're working at the scale we're talking about and you wanna make it bigger. Right? I think you inevitably hit the thing where it's like, well, you gotta convince other people to be down to implement it. Right? So you have to get buy-in from stakeholders, which I guess is maybe the more business-y way to say it. What promising kinds of practices have y'all found so far for trying to get that buy-in from folks?

Speaker 2 42:09 – 44:12

We are really focused on doing all of our work in the open. I know it's a general civic tech practice to do work in the open, but everything of ours is open. Open source code, open practices, open processes, open results sharing. And we see that our value at Nava Labs to these broader advocacy conversations for programs is in sharing our learnings. We are technology researchers and implementers, and so, you know, we're not the lobbyists, but we have a lot of valuable information that could be contributed to these broader conversations, to decision makers and influencers of decision makers. They should have the information they need to make better decisions. So what we've been doing is we've been intentional about sharing what we've learned with organizations who have relationships with program administrators and policy makers. We've hosted public demo days and invited decision makers and influencers. We've taken meetings with government agencies and partners who wanna learn more. We are contributing to requests for comment, requests for information on AI topics. And we've also created a lot of resources and tools with our communications team. Nava has a wonderful communications team. Shout out to the comms team. We've created a specific web page that houses all of our process so far, so you can join our journey from research to experimentation to prototyping to piloting. We have it all documented in case studies. We have toolkits like the appropriateness toolkit we mentioned. We host events like the demo days. Yeah. We're continuously learning things that we think could be valuable to broader conversations around AI implementation in government programs, that policymakers and program implementation decision makers could find valuable.

Speaker 0 44:14 – 44:44

Something I've seen, even in my own work, when it comes to, like, measuring things and coming up with metrics, that sort of thing, particularly ones that lead to decision making, is that a problem can come up, especially if you have some sort of reward attached to the outcome of a metric, where the metric itself becomes, like, the goal. And when that happens, it can lead to some unintended incentives and unintended consequences. How can you seek to avoid a situation like that?

Speaker 2 44:45 – 46:07

Yeah. This makes me think about the accuracy measure that I talked about a little bit earlier. I think there can be overemphasis on model accuracy and not enough emphasis on outcomes. What are the outcomes we're trying to achieve, and the accuracy of those outcomes? So if we're overemphasizing model accuracy, that's great. But if we're not actually measuring impact on outcomes, we could have a great model and actually not have great impact on that coverage gap that we're trying to close. So that's one example that comes to mind. And my advice to folks, when you're thinking about what to measure, is to think not just about the product metrics, so not just the model accuracy. Think about the program outcomes you want to achieve, and try to see where the linkage is between the product metrics that you can measure and the program outcomes, and see if you can measure using program data as well. So that would be my recommendation: don't necessarily just go all in on product metrics, because if we're not looking at program outcomes, it might not actually be doing any good.

Speaker 0 46:08 – 46:23

Let's take a moment and step back from, I guess, numbers and data. And let's think about, like, success, which leads to a question I have, which is, what does success look like for a family that benefits from your work, beyond the kind of data and number stuff we've been talking about?

Speaker 1 46:24 – 48:18

To me, I think that success really looks like just access to the benefits they wouldn't have otherwise gotten. Right? And so kind of going back to where we started, that translates into no friction around buying the food you need for your family that week or that month, easy access to health care when you need it or even for preventative services. And over time, you know, we really see those benefits compound into people having more options, people having more room to make the decisions that work for them and their families in terms of education, the kind of jobs they wanna go after, wealth building, things like that. And, you know, it's been shown that as those things compound, people live longer. They have healthier outcomes. They have better socioeconomic outcomes. And I think that we're just a tiny piece of that in, like, making this one part of the safety net stronger. But I think the potential is really great for helping connect families to those benefits. And then I think within that, one of the things that this project is looking at is what the paradigm shift is that can, or will, come about as a result of introducing more AI-enabled tools into government services. And how can we influence that to be a good thing, to have a positive impact on American families who are trying to connect with programs that are gonna help them? How can we make sure that we steward these technologies in a responsible and ethical way toward those goals?

Speaker 0 48:19 – 48:35

And, as we get to, I think, the cap of our main topic before we get into the kind of closing sort of questions, there's a fun magic wand kind of thing for us to cover first, which is if you could wave a magic wand and change one policy tomorrow, what would it be and why?

Speaker 2 48:35 – 49:21

I think that we should establish more experimentation labs within the context of government agencies to allow people to experiment with emerging technologies like AI, but also just other technologies: to have a place where people can take risks and do experiments and pilots and prototypes, and not have the delivery pressure that usually accompanies technology work in government. There is precedent for this. We do a lot of piloting work on all sorts of interventions and programs across government. So it's not like this is a new thing, but I think technology-specific experimentation labs would be awesome.

Speaker 1 49:22 – 50:19

Yeah. I think for me, I'm also realizing, I think this might be the same answer I gave you when I did the podcast in, like, 2018 or 2019. But more room to fail. Ever more room to fail. You know, I think that's sort of the biggest value of the lab, to my mind: we have the space to do these experiments, to really set some bold, ambitious goals and just, like, run at them. I think that, you know, government services and the safety net are some of the most tangible ways we make good on our promises to each other. And, like, why wouldn't we try some really bold, ambitious ways of carrying that off? So I would love to see more safe, contained failure within the space.

Speaker 0 50:19 – 50:24

Oh, I might have to go back into the archive and see how consistent that answer is now.

Speaker 1 50:25 – 50:30

I think it might be the same answer. It sounds familiar. I think you might be right.

Speaker 0 50:30 – 50:47

Yeah. It's a good answer, though. Thanks. And as we get kind of to our closing section, I imagine there's folks out there listening, maybe discovering Nava for the first time as they kinda go through here. Maybe they're interested in the work y'all are doing. Is there anything you'd like to share with them about the work y'all have coming up?

Speaker 1 50:47 – 51:53

Yes. So I think there's a few plugs I would love to make. We are always hiring at Nava for technologists, so designers, engineers, product managers, project managers, etcetera. You can check that out at navapbc.com. We also are sharing very frequently what is going on in the lab. As Martel said, we are publishing our pilot results soon, and we're also kicking off two exciting additional pilots, one of them being the agentic submissions tool with Imagine LA and Riverside County. So folks can follow along with that at the Nava Labs section of the Nava PBC website. We have lots of demo days and love to engage with folks. So if you're interested in following along with what we're doing or you just want to get in touch, please go to the website, sign up for our newsletters. You can even sign up for a lab-specific one if you just wanna keep up with what we're doing there. Am I missing anything?

Speaker 2 51:54 – 52:02

You can subscribe to our newsletter. We just started a newsletter too. So we can maybe share the link, in

Speaker 1 52:02 – 52:04

Oh, yeah. Can we put that somewhere?

Speaker 0 52:05 – 52:16

Yeah. We will make sure that these links are in the show notes because, you know, I'm hearing somewhere that there's a newsletter folks can subscribe to, so we can make sure it's there and ready for them. Yes.

Speaker 1 52:17 – 52:23

Yes. Yeah. You can hear from Martel and I every other week about the lab and what's going on.

Speaker 0 52:23 – 52:35

Awesome. Well, Martel, Ginny, thank you both for coming on to Civic Tech Chat. I have no doubt that, with what we covered today, that folks will have some interesting tidbits they can bring into their day or into their work. So, again, thank you so much.

Speaker 1 52:36 – 52:50

Thank you for having us. Oh, I'll do one last plug. We're always looking for more partners in the lab. So get in touch with us. We would love to figure out if there's a pilot or experiment or project we can do together. So let us know.

Speaker 0 52:52 – 52:59

Visit us on the web at civictech.chat, or subscribe to us for content updates wherever it is you download your podcasts.
