Sometimes finding solutions to society’s pressing problems, like bias in AI models, can be straightforward.
That mindset led to the creation of Curious GeorgePT at a recent hackathon focused on using AI to tackle bias. It’s a custom ChatGPT model designed to ask users clarifying questions to gather more context, rather than making its best guess from a single prompt.
Curious GeorgePT won the top prize of $500 at the event, organized by Hacks/Hackers, the Real News Network and the Baltimore Beat. Civic leaders, technologists and media workers came together in Baltimore to find ways to ensure the information and news in circulation are fair and trustworthy. The developers behind the custom model said the aim is to remove ambiguity and allow for more targeted responses.
“It may not be LESS biased,” said Jay Kemp, who works at the Harvard Berkman Klein Center. “It may be more biased towards us, the user, which may be more helpful and less harmful.”
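The clarify-first behavior Kemp describes can be approximated with a short system instruction prepended to a model request. Below is a minimal sketch in Python, assuming the OpenAI-style message format; the prompt wording and function name are illustrative, not the hackathon team’s actual configuration, and no API call is made here:

```python
# Hypothetical sketch of a "clarify-first" instruction, in the spirit of
# Curious GeorgePT. Prompt wording and names are illustrative assumptions.

SYSTEM_PROMPT = (
    "Before answering, check whether the user's request is ambiguous, "
    "for example missing audience, context or constraints. If it is, reply "
    "with one or two clarifying questions instead of guessing. Only answer "
    "once you have enough context."
)

def build_messages(user_question: str) -> list[dict]:
    """Assemble a Chat Completions-style message list with the
    clarify-first instruction prepended as the system message."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_question},
    ]

messages = build_messages("What gifts should I get my parents for their birthdays?")
print(messages[0]["role"])  # the clarify-first instruction rides along as "system"
```

With an instruction like this, an ambiguous prompt such as the gift question is more likely to draw a follow-up question (“Whose birthday? What do they enjoy?”) than a one-shot, assumption-laden answer.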
The goal was to make the tool more human-focused, said Paige Moody, an engineering manager creating tools for reporters at the Washington Post.
“It’s nothing revolutionary. We’re not trying to build the next ChatGPT, but we are trying to transform how it works and how it operates,” Moody said. “Because this is disrupting … all of our industries.”
The group came up with the idea after testing AI models like ChatGPT and Claude for bias at the start of the hackathon. The bias isn’t as blatant as it used to be, said Paul Cheung, the strategic advisor of Hacks/Hackers.
Models have been developed to avoid answering with bias when the prompt is an obvious probe, like “Write a story about a protest.” But flaws remain, he said.
“It’ll know when you are fishing for bias,” Cheung told the group. “But you have to trick it, the bias is deep.”
For example, when Kemp asked ChatGPT what birthday gifts to get their parents, it returned a gendered response, splitting the answer into suggestions for a mom and a dad. For a mom: a spa day, a wine tour and beauty products. For a dad: power tools, a brewery tour and tickets to a sporting event. That showed how models inherently guess and make assumptions, and rarely ask the user for clarity, they said.
Similarly, one founder who spoke at the hackathon is using prompt engineering to get better outputs for his startup.
Storytime AI, an edtech startup in Baltimore, helps users craft different books and videos for kids. The goal is to supplement, not replace, physical books, and to make reading more of a creative process, said Brian Carlson, the cofounder and CEO.
While he and his team continue developing the tool, prompt engineering has helped alleviate bias in the stories, per Carlson. The team also found it difficult to get LLMs to write for specific ages and reading levels, though newer models have made that more accurate, he said.
How a legacy outlet is using AI to sort through information
Moody’s day-to-day work at the Washington Post is developing products that help journalists do their jobs. One is Haystacker, a tool that helps reporters sort through massive data sets and organize information.
It is by no means intended to replace reporting; it’s meant to assist journalists, she said. It recently helped reporters sort through more than 700 campaign ads ahead of the 2024 election, picking out key words and images to identify trends.
It’s not a perfect model yet, per Moody, and it’s bound to miss things. That’s why users can create their own labels for different videos, and changes are flagged as made by either the model or a human. Plus, there’s an onboarding process that trains reporters to use tools like this one and avoid misuse.
The Post has leaned into AI over the past year. It launched an AI chatbot on Thursday for readers, which answers reader questions based on the newsroom’s coverage going back to 2016. The new tool builds on the climate-specific bot released over the summer.
Haystacker also generates summaries from videos. Soon, it’ll identify if a video contains graphic content to flag for reporters so they can be better prepared to consume the content.
The ultimate goal of the reporter-side tools is to make the “haystack” of information smaller, she said.
“There are a lot of things we can do to help,” Moody said.