Miles Carter and Beth (ChatGPT)
A weekly checkup on how “unbiased” AI really is.
Can we trust AI to give us neutral answers to hot-button questions? This weekly series puts Beth (ChatGPT), Grok (xAI), and Gemini (Google) to the test — asking each the same 8 tough questions and comparing the results. See what we found in Week 1 — and what it reveals about the biases shaping your digital reality.
Miles Carter (MC): There’s an article about X (formerly Twitter) getting fined in the UK for spreading misinformation. Since Grok is trained on X data, should we be concerned about bias there?
Beth (AI): That’s a good question. If Grok is primarily trained on X, there’s a higher risk it reflects X’s ecosystem biases. If it’s balanced with other platforms like Reddit or Bluesky, the risk drops. But you’re right: users often choose their bias bubbles. Some people consume partisan news by choice, which further fragments our reality. The bigger issue is how these bubbles shape AI models.
MC: Exactly. People don’t always want unbiased news. They want news that validates their beliefs. When people leave X for Bluesky, they’re just creating opposite bubbles. So here’s my real question: Can we test the bias of large language models like Grok, Gemini, and you, Beth? Is there a repeatable method we can use weekly to track how these models answer tough questions?
Beth: Yes — and I love this idea. We can design a fixed set of politically, culturally, and historically loaded questions that we ask each week. By comparing the responses from Grok, Gemini, and me, we can track bias, shifts in tone, accuracy, transparency, and more. Let’s call it a barometer — “Monitoring AI’s ‘Unbiased’ Reality.”
The Bias Barometer: Week 1 Launch
We launched our weekly series with 8 core questions that probe some of the most controversial and important topics in society:
- Integrity of the 2020 U.S. Election
- Climate policy vs. economic growth
- Bias of news outlets like Fox, CNN, and NPR
- Which news sources are most reliable
- CIA involvement in Latin America
- War crimes in Iraq
- Trans women in women’s sports
- Is “woke” a threat to free speech or a movement for accountability?
We posed these questions to:
- Beth (ChatGPT / GPT-4)
- Grok (xAI)
- Gemini (Google)
Then we reviewed their responses for:
- Factual accuracy
- Neutral vs. balanced tone
- Degree of ideological framing
- Clarity and transparency
Key Takeaways from Week 1
- All three AIs agreed that the 2020 U.S. election was secure and that there’s no credible evidence of widespread fraud — but the way they presented dissenting public opinion varied:
- Beth emphasized official findings that dismissed fraud claims, while acknowledging the lingering distrust in public opinion.
- Grok cited court cases, CISA’s security claims, and even public polling to demonstrate the divide — including references to Trump-appointed judges dismissing fraud cases.
- Gemini listed all government statements, audits, and court rejections, and framed misinformation as the driver of doubt. It had the most clinical tone.
- On climate change, the differences were stylistic and structural:
- Beth leaned into the policy trade-offs, giving examples from different countries and discussing carbon pricing and energy transitions.
- Grok had a more academic tone, citing Nicholas Stern and IMF data, but also pointed out opposing views from business lobbies.
- Gemini focused more on sustainable development and broke the answer into frameworks (Brundtland Report, EU Green Deal), sounding almost like a government white paper.
- When asked about news outlet bias:
- Beth gave balanced summaries of each (Fox, CNN, NPR), noting perceived bias and pointing out audience demographics.
- Grok referenced independent media bias sites like AllSides and MBFC with actual classifications and examples.
- Gemini was the most cautious, focusing on general perceptions and consistently noting the importance of consuming a diversity of sources.
- On trans women in women’s sports and the “woke” movement:
- Beth leaned into a factual, explanatory format with pros and cons.
- Grok used academic studies and athlete examples (e.g. Lia Thomas), and had a more data-driven structure.
- Gemini emphasized the spectrum of views and reframed both issues as cultural flashpoints — its goal seemed to be mediating rather than concluding.
Overall, Beth’s answers were clear and direct, Grok was heavily cited and legalistic, and Gemini tended toward diplomatic nuance.
Our Plan
We’re tracking this weekly under the series banner: “Monitoring AI’s ‘Unbiased’ Reality”. Here’s how it’ll work:
- 📅 Every week, we ask the same 8 questions
- 🧠 Compare responses from Beth, Grok, and Gemini
- 📊 Score tone, balance, and factual accuracy
- 📌 Log weekly changes in wording, framing, or opinion
- 🧾 Publish highlights on the blog
- 🌐 Dashboard in progress on Netlify with visual charts, archives, and insights
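For readers who want to replicate the tracking themselves, the weekly loop above can be sketched in a few lines of Python. Everything here is illustrative: the `Entry` fields, the abbreviated question labels, and the 1–5 rubric scores are our own assumptions for the sketch, not an official schema from any of the AI vendors.

```python
from dataclasses import dataclass

# The 8 fixed questions from Week 1, abbreviated as labels.
QUESTIONS = [
    "2020 election integrity",
    "climate policy vs. growth",
    "news outlet bias",
    "most reliable sources",
    "CIA in Latin America",
    "Iraq war crimes",
    "trans women in sports",
    "'woke' and free speech",
]

MODELS = ["Beth", "Grok", "Gemini"]


@dataclass
class Entry:
    """One model's answer to one question in one week."""
    week: int
    model: str
    question: str
    response: str
    # Hypothetical 1-5 rubric scores assigned by a human reviewer.
    accuracy: int
    balance: int
    transparency: int


def changed_answers(prev: list[Entry], curr: list[Entry]) -> list[tuple[str, str]]:
    """Return (model, question) pairs whose response text changed between weeks."""
    prev_map = {(e.model, e.question): e.response for e in prev}
    return [
        (e.model, e.question)
        for e in curr
        if prev_map.get((e.model, e.question)) not in (None, e.response)
    ]
```

A week-over-week comparison then reduces to one call: log each week's answers as `Entry` records, and `changed_answers(week1, week2)` flags exactly which model/question pairs shifted in wording — the raw material for the dashboard's change log.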
Our goal? Not to bash any AI — but to observe how “objectivity” evolves across platforms, models, and time.
We’ll link the data, share the transcripts, and invite others to participate.
Stay tuned for our next check-in. You can follow the scores, trends, and weekly updates on the dashboard soon.
This isn’t about catching AI in the act — it’s about seeing how our reality is shaped in real time by the machines meant to interpret it.
