Case Study

The Philadelphia Inquirer has been publishing for more than 190 years. Here’s how it built an AI-driven research tool to unlock its archives.

Case study from the Lenfest AI Collaborative & Fellowship Program, in partnership with Microsoft and OpenAI

By David Chivers

April 18, 2025


The Philadelphia Inquirer collaborated with Microsoft in a two-week hackathon to develop an AI-driven “Research Assistant.” The tool significantly enhances newsroom research: it empowers reporters and editors, reduces research time, improves accuracy, and helps unlock the value of the newspaper’s massive historical archive. The initiative addresses common newsroom challenges, such as scattered archive sources and manual information synthesis, positioning AI as a valuable partner in journalism. The project, described in the following case study, includes plans to scale the solution industry-wide. 

In January 2025, The Philadelphia Inquirer participated in the Microsoft AI Development Acceleration Program (MAIDAP) Hackathon, an initiative facilitated through The Lenfest Institute for Journalism’s AI Collaborative & Fellowship Program, in partnership with OpenAI and Microsoft.  
 
The two-week collaboration with Microsoft’s AI & Data Program marks the first time that Microsoft has opened participation to external organizations. The Inquirer was among three selected participants — and the only media company — alongside a healthcare company and a tech firm using data center heat for hot water. 

The Inquirer’s participation focused on exploring AI-driven solutions to enhance newsroom workflows, including intelligent content recommendations, automation in news production, and ethical AI applications for journalism. Inquirer participants included journalists, data professionals, product leaders, and technologists, working alongside AI experts from Microsoft and Lenfest Institute AI fellows.  

The Inquirer team worked to address the challenge that journalists spend a significant amount of time conducting background research across multiple fragmented archives, manually filtering through vast amounts of data, and synthesizing information for storytelling. They hoped to leverage AI for intelligent search, contextual retrieval, and summarization to assist journalists in their research workflow. 
 
Key outcomes included AI-powered summarization tools and prototype automation systems for content workflows to augment the work by journalists and editors. This program aligns with broader industry efforts to increase newsroom efficiency while preserving editorial integrity. 

Here’s an overview of The Inquirer’s MAIDAP experience: 

Background and challenges 

Current research process in newsrooms 

The hackathon team set out to understand how The Inquirer newsroom conducts research by meeting with key journalists and editors. They uncovered the following:  

  • Journalists begin with broad searches (internet-based, internal archives). 
  • Queries are manually refined using keyword-based systems and date ranges with limited semantic understanding. 
  • Research spans multiple sources (e.g., Newspapers.com, internal document archives, digital library systems). 
  • High cognitive load: Journalists must manually correlate fragmented reports over extended timeframes. 

Challenges identified 

Based on the user research, the team identified five key challenges that journalists encounter when conducting research for stories. These include:  

  • Scattered and non-centralized archives: Journalists search across multiple databases with different interfaces and limitations. 
  • Lack of a semantic, natural language search mechanism: Current tools require precise keywords and date filters, making retrieval cumbersome. 
  • Time-consuming synthesis of information: Reporters must manually extract insights across hundreds of articles. 
  • Metadata gaps: Date-based retrieval is inconsistent, requiring manual guessing of relevant timeframes. 
  • Loss of traditional librarian functions: AI tools must replicate the role of newsroom research librarians. 

AI-powered research assistance: the solution 

Objective 

The primary aim of this project was to develop an AI-driven “Research Assistant” that could seamlessly understand journalists’ natural language queries, retrieve highly relevant information across diverse archival sources, and synthesize the results into coherent, insightful responses — all while preserving journalistic integrity and transparency. This framework could later be used to power tools and experiences made publicly available to The Inquirer’s audience.  

Core features 

AI-driven semantic search & retrieval: 

At the heart of the solution is an advanced semantic search engine. Unlike traditional keyword searches, this AI-driven tool integrates semantic and keyword-based retrieval, indexing articles from multiple archives. This hybrid search capability means that the AI can better understand what a journalist is truly asking — even when the question is vague or lacks explicit date information. For example, a traditional keyword-based search system would likely fail to return relevant results for a journalist’s question, “When did the city last deal with a major flood?” The question is inherently vague because it lacks explicit date references, geographic qualifiers, and precise terminology. To interpret the intent behind the question, the hybrid approach associates related concepts such as emergency response, flood events, disaster recovery, and city crisis management with specific terms such as “flood” and the name of the city. By intelligently inferring relevant timeframes, linking to original sources, and structuring outputs clearly, the system significantly improves the speed and accuracy of information retrieval. 
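One common way to combine keyword and semantic result lists — and the fusion method described in Azure AI Search's documentation for hybrid queries — is Reciprocal Rank Fusion (RRF). The sketch below is a minimal, illustrative version; the document IDs are hypothetical, and the real system's fusion happens inside the search service rather than in application code.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse multiple ranked result lists (e.g., one from keyword search,
    one from vector search) into a single ranking.

    rankings: list of lists of document IDs, each ordered best-first.
    k: smoothing constant; 60 is the value from the original RRF paper.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # A document appearing high in either list accumulates a
            # larger score, so it floats toward the top of the fused list.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical results for "When did the city last deal with a major flood?"
keyword_hits = ["flood-2020", "flood-1999", "storm-2011"]
vector_hits = ["flood-2020", "hurricane-2012", "flood-1999"]
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
```

A document ranked highly by both retrievers ("flood-2020" here) ends up first, even though neither score is directly comparable to the other — RRF only needs ranks, not raw relevance scores.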

Summarization & synthesis: 

To help journalists navigate vast archives, the system leverages advanced AI-powered summarization. Rather than requiring reporters to sift through extensive coverage spanning over a decade, the tool quickly identifies and synthesizes the critical information. This enables journalists to swiftly grasp key insights and trends without getting bogged down by hundreds of detailed articles. 

Metadata enhancement for contextual accuracy: 

Recognizing the importance of context in journalism, the system enhances the accuracy of retrieved information through improved metadata handling. Articles are accurately linked to their true publication dates, allowing more precise searches and better contextual relevance.  
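Linking articles to their true publication dates usually means normalizing the inconsistent date strings found in mixed archive exports. A minimal sketch of that cleanup step, assuming a few hypothetical input formats (the case study does not specify which formats The Inquirer's archives actually contain):

```python
from datetime import datetime

# Date formats one might find in mixed archive exports (assumed examples).
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %b %Y"]

def normalize_pub_date(raw):
    """Return an ISO-8601 date string, or None if the date can't be parsed.

    Normalized dates make date-range filters reliable, instead of
    forcing journalists to guess at relevant timeframes.
    """
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None
```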

Guided user interaction & prompts: 

The AI tool doesn’t just wait for instructions — it proactively guides journalists through structured prompts, similar to how tools like Copilot and Bing Chat assist users. These prompts are designed to streamline the research process by suggesting useful queries such as: 

  • “Summarize all coverage on [Topic] from the last 5 years.” 
  • “Find investigative reports related to [Entity] between [Date 1] and [Date 2].” 
  • “Identify past reporting on legal cases similar to [Case X].” 

This guided approach makes the tool intuitive to use, even for journalists unfamiliar with advanced search techniques. 
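The guided prompts above are essentially parameterized templates. A minimal sketch of how such templates might be stored and filled in — the template IDs and fields here are hypothetical, not The Inquirer's actual implementation:

```python
# Hypothetical guided-prompt templates mirroring the suggestions above.
PROMPT_TEMPLATES = {
    "topic_summary": "Summarize all coverage on {topic} from the last {years} years.",
    "entity_reports": "Find investigative reports related to {entity} between {start} and {end}.",
    "similar_cases": "Identify past reporting on legal cases similar to {case}.",
}

def build_prompt(template_id, **params):
    """Fill a guided-prompt template with the journalist's parameters.

    str.format raises KeyError if a required field is missing, which
    surfaces incomplete queries before they reach the model.
    """
    return PROMPT_TEMPLATES[template_id].format(**params)
```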

Evaluating and mitigating AI risks: 

Recognizing that AI systems are fallible, significant emphasis was placed on ensuring accuracy and transparency. To minimize the risk of misinformation, all generated content includes citation verification, ensuring each response is traceable to the original source. Additionally, advanced groundedness scoring was implemented to evaluate the probability that the AI generated unfounded or “hallucinated” information. The solution also leverages Azure AI Content Safety measures to maintain compliance with ethical and responsible AI practices, underscoring the importance of trustworthiness in journalism. 
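Production groundedness scoring, as in Azure AI's evaluation tooling, is model-based; but the underlying idea can be illustrated with a crude lexical proxy: what fraction of an answer's content words actually appear in the retrieved sources? The sketch below is only an illustration of the concept, not the scoring the team used.

```python
def groundedness_proxy(answer_sentence, source_passages):
    """Crude lexical proxy for groundedness: the fraction of the answer's
    content words (longer than 3 characters) that appear in at least one
    retrieved source passage. Low scores flag possible hallucination.
    """
    answer_words = {w.lower().strip(".,") for w in answer_sentence.split() if len(w) > 3}
    source_words = set()
    for passage in source_passages:
        source_words |= {w.lower().strip(".,") for w in passage.split()}
    if not answer_words:
        return 1.0
    return len(answer_words & source_words) / len(answer_words)
```

A real scorer must also handle paraphrase ("flooded" vs. "flood" below scores as unsupported), which is why model-based evaluation is used in practice.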

Technical approach 

Infrastructure & data pipeline 

The team structured its approach in three key areas:  

  • Data indexing & preprocessing 
  • Retrieval-Augmented Generation (RAG) pipeline 
  • Model selection & optimization 

Data indexing & preprocessing 

The solution began with more than 125,000 archival articles, requiring careful preprocessing to ensure effective retrieval. Each article was systematically structured to support both retrieval processes and metadata enhancements. Azure AI Search was employed to handle this extensive dataset, enabling robust hybrid search techniques to combine embedding similarity and efficient keyword-based lookups, providing a strong foundation for accurate retrieval. 
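Preprocessing for retrieval typically means splitting each article into overlapping passages while carrying its metadata along, so every indexed chunk stays independently citable. A minimal sketch under assumed field names and illustrative sizes (the case study does not specify The Inquirer's actual chunking parameters):

```python
def chunk_article(article, chunk_size=200, overlap=40):
    """Split an article's body into overlapping word-window chunks.

    Each chunk keeps the article's metadata so search results can be
    traced back to a dated, headlined source. Sizes are illustrative.
    """
    words = article["body"].split()
    chunks = []
    step = chunk_size - overlap
    for start in range(0, max(len(words), 1), step):
        chunks.append({
            "id": f"{article['id']}-{start}",        # stable, citable chunk ID
            "text": " ".join(words[start:start + chunk_size]),
            "headline": article["headline"],
            "pub_date": article["pub_date"],
        })
    return chunks
```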

Retrieval-Augmented Generation (RAG) pipeline 

The RAG pipeline blends semantic understanding with precise keyword matching to refine searches. Specialized prompts were fine-tuned to enhance contextual accuracy, and the AI-generated responses were augmented with structured metadata, including critical details like dates and relevant entity tags, ensuring comprehensive and context-rich results. 
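The generation half of a RAG pipeline assembles retrieved, metadata-tagged passages into a grounded prompt — numbering each passage so the model can cite its sources. A minimal sketch, with assumed chunk fields matching the metadata described above; the actual prompt wording the team fine-tuned is not public.

```python
def build_rag_prompt(question, retrieved_chunks):
    """Assemble a grounded prompt from retrieved chunks.

    Each passage is numbered so the model can cite sources as [1], [2], ...
    and every claim in the answer stays traceable to an archive article.
    """
    context_lines = []
    for i, chunk in enumerate(retrieved_chunks, start=1):
        context_lines.append(
            f"[{i}] ({chunk['pub_date']}) {chunk['headline']}: {chunk['text']}"
        )
    context = "\n".join(context_lines)
    return (
        "Answer using ONLY the passages below, citing them by number.\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```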

Model selection & optimization 

The team chose GPT-4 for query refinement due to its superior natural language understanding and generation capabilities, significantly enhancing the user experience by making searches intuitive and effective. Azure AI Foundry was utilized for generating and evaluating responses, ensuring a robust AI interaction framework. Additionally, structured output techniques were integrated to significantly improve accuracy, especially for queries related to specific timeframes, optimizing overall search performance. 
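For timeframe-sensitive queries, structured output means the model returns an explicit, machine-readable date filter rather than free text. The team used model-generated structured output; the sketch below is only a heuristic stand-in that handles two simple phrasings, to show what the extracted structure looks like.

```python
import re
from datetime import date

def extract_timeframe(query, today=None):
    """Heuristic fallback: pull a (start_year, end_year) filter from a query.

    Handles "last N years" and "between YYYY and YYYY"; anything else
    returns None, leaving the timeframe to semantic ranking.
    """
    today = today or date.today()
    m = re.search(r"last (\d+) years", query, re.IGNORECASE)
    if m:
        return today.year - int(m.group(1)), today.year
    m = re.search(r"between (\d{4}) and (\d{4})", query, re.IGNORECASE)
    if m:
        return int(m.group(1)), int(m.group(2))
    return None
```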

Evaluation & impact measurement 

Evaluating the effectiveness of this project is critical to understanding its real-world impact on newsroom efficiency and journalistic quality. The team established comprehensive success metrics designed to measure improvements across several dimensions, from reduced research time and enhanced accuracy in information retrieval to the overall reliability of generated content and user satisfaction. 

Success metrics 

Time saved in research: Benchmarking pre- and post-AI implementation to measure efficiency gains. 

Search accuracy & precision: Evaluating search retrieval quality using:

  • Precision@K (relevance of retrieved documents). 
  • Recall@K (coverage of relevant documents). 
  • Mean Reciprocal Rank (MRR) (ranking effectiveness). 
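These three retrieval metrics have standard definitions, sketched below; `retrieved` is a ranked list of document IDs and `relevant` is the set of documents a human judged relevant for the query.

```python
def precision_at_k(retrieved, relevant, k):
    """Fraction of the top-k retrieved documents that are relevant."""
    return sum(1 for doc in retrieved[:k] if doc in relevant) / k

def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant documents found in the top-k results."""
    return sum(1 for doc in retrieved[:k] if doc in relevant) / len(relevant)

def mean_reciprocal_rank(results_per_query):
    """Average of 1/rank of the first relevant result across queries.

    results_per_query: list of (retrieved_list, relevant_set) pairs;
    a query with no relevant result retrieved contributes 0.
    """
    total = 0.0
    for retrieved, relevant in results_per_query:
        for rank, doc in enumerate(retrieved, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(results_per_query)
```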

Summarization & coherence: 

  • AI-generated summaries vs. manual journalist synthesis. 
  • Measuring readability and fact alignment. 

Groundedness & citation verification: 

  • % of responses correctly attributed to retrieved documents. 
  • AI hallucination rate monitoring. 

User study & iterative feedback: 

  • Conducting reporter-driven user tests to refine system behavior. 
  • Collecting qualitative feedback on AI’s effectiveness in improving research workflow. 

Future roadmap & scalability 

The next phase of the project aims to further enhance the system’s capabilities, broaden the range of accessible archives, and expand its application for The Philadelphia Inquirer and across the industry. The team outlined several key areas for growth and development, emphasizing continuous improvement, collaboration, and scalability: 

  • Expanding archive coverage: The goal is to significantly enrich the archive by integrating more historical records, multimedia assets like images and videos, and digitized documents in formats like PDFs. Improving OCR (Optical Character Recognition) technology will be essential for accurately retrieving information from scanned archival materials. 
  • Enhancing AI capabilities: Future iterations will feature advanced tools like event timeline extraction, automatically identifying significant events within topic coverage, and cross-article synthesis capabilities to clearly illustrate how stories evolve chronologically. 
  • Collaboration & industry adoption: To maximize the impact and utility of the AI-powered research assistant, the team plans to scale the solution to additional newsrooms by sharing code repositories and technical documentation that newsroom developers around the world can use. Additionally, valuable enhancements and refinements will be contributed back to Microsoft’s Retrieval-Augmented Generation (RAG) repository, fostering broader innovation and adoption across the journalism community. 

Conclusion 

This project represents a critical step in AI-driven journalism research, bridging the gap between historical archives and modern newsroom workflows. By enhancing information retrieval capabilities, The Philadelphia Inquirer plans to continue refining the work done during the hackathon, establishing powerful AI tools that help journalists save time, identify trends, and increase the value of their reporting.  

Key takeaways

  • AI significantly reduces research time for journalists by providing context-aware, semantically enriched search. 
  • The system functions as a digital librarian, assisting with retrieval, synthesis, and summarization. 
  • Evaluation results indicate improved efficiency, accuracy, and potential for newsroom-wide adoption. 

Glossary of terms 

  • @K (in Search Retrieval Metrics): A parameter in information retrieval evaluation that represents the number of top-ranked results considered when assessing search quality. It is used in metrics such as: 
      • Precision@K – Measures the proportion of relevant documents among the top K retrieved results, assessing the accuracy of search ranking. 
      • Recall@K – Measures the proportion of relevant documents retrieved out of all relevant documents available, evaluating search completeness. 
  • AI (Artificial Intelligence): The ability of computers or machines to perform tasks typically associated with human intelligence, such as learning, reasoning, and understanding language. 
  • Azure AI Foundry: A Microsoft framework designed for rapidly developing and evaluating AI-powered applications and solutions. 
  • Azure AI Search: A Microsoft cloud-based service that provides advanced search capabilities using artificial intelligence for efficient and accurate information retrieval. 
  • GPT-4 (Generative Pre-trained Transformer 4): A state-of-the-art large language model developed by OpenAI, known for its advanced natural language understanding and generation capabilities. 
  • Groundedness scoring: A technique used to ensure AI-generated responses are factually accurate and based on verifiable information. 
  • Metadata: Data that provides information about other data, such as the date of publication, author, and topic classification. 
  • OCR (Optical Character Recognition): Technology that converts scanned images or handwritten text into machine-readable digital text. 
  • Prompt: A structured query or instruction given to an AI model to guide its responses. 
  • Retrieval-Augmented Generation (RAG): An AI technique that retrieves relevant information from databases and uses it to generate accurate, contextually informed responses. 
  • Semantic search: A method of searching that aims to understand the intent and contextual meaning behind user queries, rather than just matching keywords. Instead of relying solely on literal matches between search terms and indexed content, semantic search delivers more relevant results by considering factors such as the relationships between words and the broader context of the query. 
  • Summarization: The process of creating a concise and coherent version of a longer text, capturing the main ideas and insights. 
  • Traditional Librarian Functions: Newsroom librarians were historically responsible for the organization, cataloging, preservation, and retrieval of information across physical and digital formats by creating structured metadata, maintaining taxonomies, curating authoritative collections, and facilitating access through reference services.  
