The objective:

To determine if AI can transform our vetted, verified Datebook entries into high-quality news briefs.

Our reporters and editors need time more than almost anything else. So if there is a way to efficiently and accurately find facts, eliminate redundant work and simplify research, we’re interested. That was the premise behind this experiment: could we convert human-verified calendar entries about upcoming events into short news briefs, using the information at hand plus additional context? Our goal was to take vetted, verified input from the Datebook calendar product we already produce and build an AI workflow that would transform highly formatted event descriptions into short-form news briefs telling readers about newsworthy happenings throughout the Bay Area.


The approach:

Train Gemini by providing it with information about its role, the input we will provide and the desired output, including format, content and voice.

We tasked our overnight editor, Gabe Agcaoli, with running this experiment: have Google’s Gemini AI assistant turn our Datebook entries into news briefs, then analyze the results in terms of the time required to craft the briefs, the effort needed to verify their accuracy and whether the outputs met our high standards. We did not publish the briefs; this was just a test to see how the tools might work.

The results:

The learning outcomes from this experiment were well worth it, but the technology has a ways to go before it can meet the high standards of the journalism we aim to produce.

After a three-month period, we summarized our findings and are sharing them here. It turns out that the level of accuracy we require with fact-checking and verification led to little or no time saved. But the takeaways — and the tools as they get better — are worth returning to. The relationship we have with our readers, peers and clients is hard-earned, so we will keep testing new products and workflows that might help us save time – but not at the expense of our reputation and the trust our readers put in us. Here’s what Gabe had to say about this experiment:


From August to October 2025, I turned Datebook entries and media advisories into news briefs as a potential new form of consistent content for Bay City News. I did not write them; artificial intelligence did, specifically the Google Gemini AI assistant.
 
In Gemini, I used what is called a “Gem,” a customized AI assistant you can tailor to a specific task. The assistant, called the “Datebook Brief Writer,” was told to follow these instructions:

“You are a journalist for Bay City News. Your job is to take datebook entries and write news briefs that can be used for radio and television broadcast. I have included a document called ‘News Brief Examples’ that includes 100 example briefs written by our lead writers. Use these 100 news brief examples to understand and adopt our agency’s writing style. Pay close attention to the variety in the examples and try to emulate those different approaches in your own writing. We want to avoid a formulaic feel in our briefs. Be sure to strictly adhere to the following rules:  

Follow the AP writing style guidelines. The first paragraph should be a sentence that is at most 30 words. The style should be factual and informative. The briefs should be under 250 words. In addition to the information from the datebook entries, also use the internet to provide extra context and briefly explain why the event or news is significant. Include concise, relevant background about organizations or people involved to provide context, add value, and inform the reader more fully. Ensure this context does not cause the brief to exceed the word limit.” 
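Two of the Gem’s rules are mechanically checkable: a lead sentence of at most 30 words and a brief under 250 words. As a minimal sketch of how a newsroom might verify those limits before an editor even reads a draft, here is a hypothetical checker. The function name, thresholds and sample draft are illustrative only, drawn from the instructions above, and are not part of any real Bay City News tooling.

```python
# Hypothetical sketch: automated checks for the Gem's two hard length
# rules (30-word lead sentence, under 250 words total).
import re

MAX_LEAD_WORDS = 30
MAX_BRIEF_WORDS = 250

def check_brief(text: str) -> dict:
    """Report which of the Gem's length rules a draft brief satisfies."""
    words = text.split()
    # Treat the first sentence as the lead (naive split on sentence-ending punctuation).
    first_sentence = re.split(r"(?<=[.!?])\s", text.strip(), maxsplit=1)[0]
    return {
        "lead_word_count": len(first_sentence.split()),
        "lead_ok": len(first_sentence.split()) <= MAX_LEAD_WORDS,
        "total_word_count": len(words),
        "total_ok": len(words) < MAX_BRIEF_WORDS,
    }

# Illustrative draft, not a real Bay City News brief.
draft = ("The Golden State Warriors open their season Friday at Chase Center. "
         "The team said tickets remain available for the home opener.")
result = check_brief(draft)
```

A check like this catches only length, of course; as the findings below make clear, the expensive part of review is factual, not mechanical.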

The results showed a clear split. The AI is great at routine, low-stakes items – like sports and community events – and can draft them very quickly. But when it comes to legal or political stories, it can make serious mistakes, sometimes inventing details that weren’t in the source material.  


My findings:

1. The Hallucination Problem: High Stakes, High Risk  

The biggest issue I found was the AI making up believable but false information—especially in legal or crime stories.  


Example 1:

(L-R) San Mateo County Undersheriff Dan Perea, San Mateo County Sheriff Christina Corpus, and attorney Thomas Mazzucco exit court on Friday, Aug. 22, 2025 in Redwood City, Calif. (Alise Maripuu/Bay City News)

August 22, 2025 – Sheriff Christina Corpus

The AI received a simple advisory about a procedural hearing. But it invented an entire backstory involving “confidential records” and inmate deaths and even added mentions of a grand jury – none of which were true.  
Another editor had to spend more than eight minutes removing the false allegations, which could have been legally dangerous.


Example 2:

A row of taxis waits for fares in downtown San Francisco on Feb. 28, 2005. The San Francisco Taxi Workers Alliance says the spread of autonomous vehicles has threatened the livelihoods of traditional cab companies. (Thomas Hawk/Flickr, CC BY-NC)

August 27, 2025 – Taxi Medallion Debt

The AI described taxi medallions as “now-worthless.” That wording is legally risky, so I had to change it to “devalued.” A small difference in tone, but an important one.  


 
2. Where the AI Works Best: Sports & Community Events  

For straightforward items with clear details, the AI performed very well.  
Drafts for news briefs about the Golden State Warriors (Oct. 17) and Willie Mays (Sept. 22) were mostly accurate. I noted that the Willie Mays draft was the first time the AI’s output was “all-around accurate.” Routine announcements – like Clean Air Day or Mass Casualty Training – were also easy for the AI. I mostly spent less than five minutes cleaning up these stories.  


3. Cutting the “Fluff”: Fixing Tone and Over-Explanation  

I often had to remove extra promotional or unnecessary information that the AI pulled from advisories.  


Example 1:

A Mexican woman named Raquel gives tearful testimony to the Marin Board of Supervisors about the fear of ICE felt by her children on Tuesday, Oct. 21, 2025, in San Rafael, Calif. (County of Marin via Bay City News)

August 26, 2025 – ICE Protest

The AI included PR-style language from the Interfaith Movement and ICE. I removed it to keep the brief neutral.


Example 2:

A contractor performs track replacement work on the Sonoma-Marin Area Rail Transit (SMART) line crossing Highway 37 near Petaluma on Friday, April 19, 2024.(Caltrans via Bay City News)

September 3, 2025 – Caltrans/Subsidence Science

The AI tried to explain pavement buckling in scientific detail. I needed about 10 minutes to verify the science and make sure nothing was incorrect.  


4. Performance Breakdown by Story Type  

Briefs about sports and community events were fast and easy, taking under five minutes to write and fact-check. I mostly removed generic phrases or hype.
 
For briefs about infrastructure or science, I spent about 10 minutes checking technical claims for accuracy.
 
For briefs about politics or legal cases, however, using AI was genuinely risky. These drafts needed the most scrutiny and took at least 10 minutes to polish. The AI often added false context, invented narrative details or misinterpreted legal terms.
 
The AI worked well for more than half of Datebook items, which were mainly sports and community content. But the rest – which include the most consequential stories – require strict human review, because the AI tends to “get creative,” which can lead to inaccuracies or even legal risk.
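The breakdown above amounts to a triage rule: route each Datebook item by category to an expected level of human review. The sketch below is purely illustrative; the category names and minute estimates are taken from my findings, not from any production system.

```python
# Hypothetical triage table: story category -> (review level, expected minutes).
# Values reflect the breakdown above and are illustrative only.
REVIEW_POLICY = {
    "sports": ("light edit", 5),              # under 5 minutes; strip hype
    "community": ("light edit", 5),
    "infrastructure": ("verify claims", 10),  # check technical details
    "science": ("verify claims", 10),
    "politics": ("full scrutiny", 10),        # at least 10 minutes; highest risk
    "legal": ("full scrutiny", 10),
}

def review_plan(category: str) -> tuple:
    """Return (review level, expected minutes); unknown categories default to full scrutiny."""
    return REVIEW_POLICY.get(category, ("full scrutiny", 10))
```

Defaulting unknown categories to full scrutiny reflects the core lesson here: when in doubt, assume the draft needs a human.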


So what’s my conclusion?  

I did not feel like a writer. I felt like an auditor. 

AI doesn’t replace reporters. It changes their job. What AI writes, reporters or editors will have to fact-check almost every time. 
 
The time AI saves by drafting a story is largely spent elsewhere: verifying facts, removing invented details, fixing the story structure and adjusting the tone to stay objective and reduce libel risk.
 
This raises the question: what is the real cost of saving a few keystrokes, if the price is the public’s confidence?

Until the machine can credibly flag its own risk, the editor remains the essential firewall – a person armed with skepticism, institutional memory, and judgment no algorithm can replicate. The future of AI in the newsroom isn’t about replacing the writer, it’s about the machine learning to respect the writer. The goal is building a perfect, trustworthy AI assistant that knows exactly when to lean back, hand the controls to the human, and simply admit: “This one needs your judgment.”