18 Jul 2025, 1:07 p.m.
Generative AI Abstinence and Harm Reduction
- A tour of the tools...
- A crash course on how to write an effective prompt.
- How I create video chapters...
- Tips on how to approach more ambitious projects that rely on LLMs.
I am tremendously excited about how AI can make government data more accessible and transparent.
Tools like this are valuable public goods. I'd like to see cities fund them in the way they do libraries.
Cities should do things like this instead of releasing chatbots!
Additional context (added for blog post)
New York City's Charter Revision Commission has the responsibility to suggest major changes to how the city runs. The current Commission released its interim report on July 1st, 2025. It surprised many people by saying it was considering putting a measure on the November 2025 election ballot to make a major change to NYC primary elections. The Commission said that July 15th was the last day for the public to submit written comment, and that, on July 21st, it would vote to decide which questions to put on the ballot.
On July 7th, they held their first public hearing since the release of the interim report. Over 4.5 hours, 57 people gave spoken testimony, many representing advocacy organizations. The video went up on YouTube on July 8th, and YouTube, as usual, provided its own AI-generated transcript/captions. The Commission published a brief summary of their testimony on (I think) July 14th, but still has not released a full transcript; it often takes many business days, or even weeks, to get an official transcript of these sorts of city hearings.
citymeetings.nyc likely had transcripts and summaries up within 24 hours of the July 7th hearing, per its paid contract with the Commission.
On July 15th, Aditya Mukerjee posted to ask New Yorkers to submit comment before midnight. I did so. I also copied his post from Bluesky to Mastodon. The ensuing conversation included questions and confusion about what specific model of open primary elections the Commission wanted to propose. While researching this, I didn't think to check citymeetings.nyc. I scrubbed through the YouTube auto-transcript to find and skim testimony, somewhat effectively.
Testimony at the July 7th hearing included people discussing that same confusion, and suggesting the Commission allow more time to resolve it and clarify public messaging before putting a question on the ballot. On July 16th, the Commission announced it would wait longer to ask voters to open the primaries.
In summary: if I had used citymeetings.nyc to research this issue, I would have better understood it, written better testimony, and participated more usefully in the online conversation.
My abstinence
"if you want to use LLMs effectively and responsibly you must acknowledge that they will fabricate things."
Oberoi specifically describes the perils of false/hallucinated output and ways he has mitigated that problem. He doesn't discuss systemic bias, energy usage, or other ethical issues with using LLMs, and how he mitigates those concerns. (I've sent him a note saying I'd be interested in his thoughts on that.)
Nevertheless, I emerge with mixed feelings about my own abstinence.
In 2022 I wrote about how I was thinking about the ethics of using Whisper, an AI speech-to-text model I use to transcribe audio. I did some evaluation, in the absence of ethical guidance from trusted assessors. I continue to use it frequently.
I've tried to abstain from using AI/LLM-type tools that I haven't evaluated this same way. As far as I know, we still don't have any guides like "these LLMs are LESS unethically trained" - https://github.com/mozilla-ai/lumigator/issues/1338 is where I suggest Lumigator do that.
Understanding the cost-benefit of using, e.g., the chatbot-type tools would get easier if I got hands-on experience using them, so I could concretely say: in these domains, with this amount of effort, I have these new capabilities that allow me to do these new things/to do these things faster/better/more delightfully.
But I am unaware of any chatbots that are trained only on ethically sourced data, and which offer a way to mitigate the climate impact of their training and the user's usage.
As Chelsea Troy discusses in "Does AI benefit the world?" (her writing on ML/AI/LLMs has been invaluable to me as I think about this),
Our ethical struggle with generative models derives in part from the fact that we…sort of can’t have them ethically, right now, to be honest.... we did not have the necessary volume of parseable data available until recently—and even then, to get it, companies have to plunder the internet.
And her perspective is: it really is not feasible to get enough people to genuinely consent to sharing their data to train these models sufficiently for usability. And I trust her on that far more than I trust the AI company founders and employees and their apologists.
Someone else noted: As the utility, availability & cost of the unethically trained models get better & better, the incentives to gather & use necessarily smaller ethically-gathered datasets, + train models on them, go down.
Risks, benefits, and capabilities I want
Troy compares the impacts of three technological changes (cars, the consumer internet, and generative text and image models), discusses the disparate impacts on different populations, and says:
Could we theoretically improve the net benefit of any given technical development today if we make efforts to maximize its positive outcomes and mitigate its negative ones? I believe so, and I believe that’s basically the option available to us.
And citymeetings.nyc really brings that home for me.