AI-Assisted Moderation: How AI is Helping Us Grow Foodnoms’ Food Database

February 28, 2025

The Foodnoms Community Database has been a critical component of Foodnoms’ success. By crowdsourcing data from users, the database has grown significantly, now containing over 660,000 unique foods, with over a third of those foods originating from user submissions.

But crowdsourcing data comes with many challenges. It’s a constant struggle to balance quality, completeness, cost, and efficiency. Each decision requires careful trade-offs.

To address these challenges, we’ve been working on a major upgrade to the Foodnoms Community Database—one that has been years in the making. Today, we’re excited to announce the next evolution of our moderation system. While this update doesn’t include any user-facing changes, it will have a major long-term impact—resulting in a more accurate, comprehensive, and well-moderated database.

To understand why this update is so important, let’s take a look at how moderation has worked in the past and the challenges that led us to design the new AI-assisted moderation system.

Outgrowing the Previous Moderation System

When we launched the database nearly five years ago, every user submission was manually reviewed to ensure quality. But as contributions grew, we quickly realized this strategy was unsustainable.

To keep up with the flood of new foods, we introduced autoapprovals, which instantly approved submissions that passed eight strict validation checks. This significantly reduced the moderation burden while keeping quality standards relatively high.

One of these checks involved detecting outliers in numerical data—one of the most common sources of errors. It worked by comparing submitted values to a percentile threshold based on the entire database.
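As a rough illustration, a percentile-based outlier check could look like the sketch below. The nearest-rank percentile method, the chosen percentile, the function names, and the sample values are all assumptions for illustration; the post does not say which percentile or method the real check used.

```typescript
// Nearest-rank percentile of a list of numbers (p in the range 0-100).
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("values must not be empty");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Flags a submitted value that exceeds the database-wide percentile threshold.
function isOutlier(submitted: number, databaseValues: number[], p = 99): boolean {
  return submitted > percentile(databaseValues, p);
}

// Hypothetical fat values (grams per 100 g) standing in for the whole database.
const fatPer100g = [0, 1, 2, 3, 5, 8, 10, 12, 15, 20];
const oliveOilFat = 100; // plausible for an oil, but far above the threshold
const flagged = isOutlier(oliveOilFat, fatPer100g, 90);
```

Because the threshold is computed across every food, a value that is normal for one category can still be flagged, which is the weakness of a context-free check.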

While these autoapproval checks caught many common mistakes, they weren’t a silver bullet. Other issues began to grow into larger problems:

Autoapproval False Positives: Certain foods, like oils, nut butters, and protein powders, naturally have extreme nutrient values, which frequently triggered false positives and prevented valid submissions from being autoapproved.
Manual Moderation Still Required: Autoapproval didn’t help in cases where multiple candidates existed for a single food, nor did it assist when users submitted corrections to an existing entry in the database.
Inefficient Tools: The moderation UI was slow, unintuitive, and frustrating to use, lacking even basic bulk operations, which made the process feel tedious and time-consuming.

Old Foodnoms moderation tool — Screenshot of the old moderation tool showing a single food with warnings at the bottom.

As Foodnoms continued to grow, so did the backlog. Corrections went unaddressed, and foods that didn’t pass autoapproval often never made it into the database. Things weren’t "on fire," so to speak, since the Foodnoms app allows users to save and prioritize their own contributions in search results, which helped mitigate the issue. However, the larger problem remained—we had an enormous amount of valuable crowdsourced data that wasn’t being fully utilized.

We needed a smarter, more scalable way to moderate food submissions without compromising on quality.

Designing a New System Leveraging AI

From our experience building Foodnoms AI, we were confident that AI could significantly improve moderation workflows. However, we still wanted the process to be driven by human oversight. While AI proved capable of detecting issues and suggesting corrections, it still made occasional mistakes.

Initially, we considered using AI in much the same way as autoapproval. Unlike crude rule-based checks, however, LLMs can leverage context, such as the food name and serving size, to distinguish between extreme but valid values and true errors.

We also recognized the need to improve other moderation tasks where autoapproval wasn’t effective. An autoapproval-like system wasn’t feasible for foods with multiple submissions, but LLMs have the capacity to handle these more complex cases.

Additionally, we needed to address the longstanding performance and UX issues that had been plaguing the moderation process.

To solve these challenges, we built an AI-powered moderation system designed to handle this complexity.

How It Works

The new system is powered by two AI assistants built on OpenAI’s gpt-4o model:

Validation Assistant: This assistant checks for accuracy and makes minor corrections. Unlike the original autoapproval system, it evaluates submissions in full context. A high protein ratio might be perfectly valid for a protein powder, while an incorrect brand name could indicate a submission error.
Synthesis Assistant: When multiple users submit data for the same food, we need to determine the most likely accurate version. This assistant synthesizes conflicting submissions into a single, high-confidence food record.
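One way to picture the split between the two assistants is a routing step that looks at how many submissions exist per food. This is only a sketch: the `foodKey` identifier, the `Submission` shape, and the rule that single submissions go to validation while multiple submissions go to synthesis are assumptions, not details confirmed by the post.

```typescript
interface Submission {
  foodKey: string; // hypothetical identifier, e.g. a barcode
  submittedBy: string;
  calories: number;
}

type Assistant = "validation" | "synthesis";

// Counts submissions per food, then picks an assistant for each food.
function routeSubmissions(submissions: Submission[]): Map<string, Assistant> {
  const counts = new Map<string, number>();
  for (const s of submissions) {
    counts.set(s.foodKey, (counts.get(s.foodKey) ?? 0) + 1);
  }
  const routes = new Map<string, Assistant>();
  counts.forEach((count, foodKey) => {
    // Assumed rule: conflicting candidates need synthesis, single ones only validation.
    routes.set(foodKey, count > 1 ? "synthesis" : "validation");
  });
  return routes;
}

const routes = routeSubmissions([
  { foodKey: "0001", submittedBy: "a", calories: 190 },
  { foodKey: "0001", submittedBy: "b", calories: 200 },
  { foodKey: "0002", submittedBy: "c", calories: 120 },
]);
```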

It took several iterations to reach satisfying results with these assistants. Initially, we structured data as CSV, but this led to inconsistencies. Fields weren’t always aligned properly, and validation results sometimes contradicted rejection reasons. We transitioned to JSON for greater structure and clarity (at the cost of more tokens), and later adopted OpenAI’s Structured Outputs feature, which enforces strict schema validation. We also integrated Zod to catch edge cases, improving consistency and eliminating unexpected AI behaviors.

To further refine moderation feedback, we experimented with different rejection formats and settled on per-field validation comments. Now, the AI categorizes each field with an outcome of changed (e.g., typo or capitalization fix), warning (requires moderator verification), or rejected (likely invalid). This approach provides actionable insights at a glance.
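The per-field feedback could be modeled roughly as follows. The `FieldReview` shape, the field names, and the `parseFieldReview` helper are hypothetical, and the hand-written runtime check merely stands in for a Zod schema so that the sketch runs without dependencies.

```typescript
// The three outcomes named in the post.
const OUTCOMES = ["changed", "warning", "rejected"] as const;
type Outcome = (typeof OUTCOMES)[number];

// Hypothetical shape of one per-field validation comment.
interface FieldReview {
  field: string;
  outcome: Outcome;
  comment: string;
}

// Runtime check on untrusted model output; a stand-in for a Zod schema.
function parseFieldReview(input: unknown): FieldReview {
  if (typeof input !== "object" || input === null) {
    throw new Error("review must be an object");
  }
  const { field, outcome, comment } = input as Record<string, unknown>;
  if (typeof field !== "string" || typeof comment !== "string") {
    throw new Error("field and comment must be strings");
  }
  if (typeof outcome !== "string" || !OUTCOMES.includes(outcome as Outcome)) {
    throw new Error(`unknown outcome: ${String(outcome)}`);
  }
  return { field, outcome: outcome as Outcome, comment };
}

const review = parseFieldReview(
  JSON.parse('{"field":"brand","outcome":"changed","comment":"Fixed capitalization"}')
);
```

Restricting the outcome to a closed set means an unexpected label from the model fails loudly instead of reaching a moderator.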

Since our top priority is ensuring the accuracy of calories, carbohydrates, protein, and fat, we also fine-tuned our rejection logic. Instead of discarding entire submissions over uncertain micronutrient values, we selectively remove questionable fields while preserving the most important data points. When a later user submits a correction to the food with additional metadata, the food will be re-evaluated.
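A minimal sketch of this selective removal is shown below. The `pruneSubmission` helper and the data shapes are hypothetical, and the rule that a rejected core field rejects the whole submission is an assumption; the post only states that questionable micronutrient fields are dropped while the core fields are preserved.

```typescript
// The four fields the post names as top priority.
const CORE_FIELDS = ["calories", "carbohydrates", "protein", "fat"];

type Nutrients = Record<string, number>;

interface PruneResult {
  accepted: boolean;
  nutrients: Nutrients;
  removed: string[];
}

// Drops rejected non-core fields. Assumes (not stated in the post) that a
// rejected core field rejects the whole submission.
function pruneSubmission(nutrients: Nutrients, rejectedFields: string[]): PruneResult {
  const rejected = new Set(rejectedFields);
  if (CORE_FIELDS.some((f) => rejected.has(f))) {
    return { accepted: false, nutrients: {}, removed: [] };
  }
  const kept: Nutrients = {};
  const removed: string[] = [];
  for (const [name, value] of Object.entries(nutrients)) {
    if (rejected.has(name)) removed.push(name);
    else kept[name] = value;
  }
  return { accepted: true, nutrients: kept, removed };
}

// Hypothetical submission with an implausible potassium value.
const result = pruneSubmission(
  { calories: 190, carbohydrates: 7, protein: 8, fat: 16, potassium: 9000 },
  ["potassium"]
);
```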

System Architecture

To efficiently moderate the high volume of submissions, we designed a batch-oriented architecture that integrates four components:

Postgres: Database store for all foods and contribution metadata.
OpenAI Batch API: Processes moderation tasks in bulk, ensuring cost efficiency and scalability.
S3: Tracks pending work and AI-generated suggestions.
Interval: Node.js-based frontend platform that powers the moderator interface.
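To make the batch step concrete, the sketch below builds an input file in the JSONL format the OpenAI Batch API expects, where each line carries a `custom_id`, `method`, `url`, and `body`. The `PendingSubmission` shape, the `validate-` id prefix, and the placeholder prompt are hypothetical. Uploading the file, creating the batch, and storing results are omitted.

```typescript
// One line of an OpenAI Batch API input file (JSONL).
interface BatchRequestLine {
  custom_id: string;
  method: "POST";
  url: "/v1/chat/completions";
  body: {
    model: string;
    messages: { role: "system" | "user"; content: string }[];
  };
}

// Hypothetical pending submission, as it might be loaded from Postgres.
interface PendingSubmission {
  id: string;
  name: string;
  nutrients: Record<string, number>;
}

// Builds the JSONL payload: one JSON request per line, keyed by custom_id
// so results can be matched back to submissions when the batch completes.
function buildBatchInput(submissions: PendingSubmission[]): string {
  const lines = submissions.map((s): BatchRequestLine => ({
    custom_id: `validate-${s.id}`,
    method: "POST",
    url: "/v1/chat/completions",
    body: {
      model: "gpt-4o",
      messages: [
        { role: "system", content: "Validate this food submission." }, // placeholder prompt
        { role: "user", content: JSON.stringify({ name: s.name, nutrients: s.nutrients }) },
      ],
    },
  }));
  return lines.map((line) => JSON.stringify(line)).join("\n");
}

const jsonl = buildBatchInput([
  { id: "101", name: "Peanut Butter", nutrients: { calories: 190, protein: 8 } },
  { id: "102", name: "Olive Oil", nutrients: { calories: 120, fat: 14 } },
]);
```

Results in a batch output file are not guaranteed to come back in input order, so a stable `custom_id` per submission is what ties each suggestion to its food.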

Performance & Scalability Improvements

Our first priority when building the new system was ensuring it could function efficiently at our current scale. The old moderation system had become painfully slow under the weight of the growing backlog. At times, it was virtually unusable.

Thankfully, many of the performance bottlenecks were resolved simply by optimizing our database queries and adding missing indexes. The previous system relied on nested subqueries and inefficient filtering, which led to sluggish response times. By leveraging Postgres’s JSON and aggregation capabilities, we significantly reduced the number of queries required per operation, dramatically improving load times.

As we optimized the queries, we decided to address another issue: prioritization. Now, we process items using a FILO (First In, Last Out) approach instead of FIFO (First In, First Out). Given the huge backlog, we decided it was better to prioritize newer changes rather than older ones that had been stuck in the queue for years.
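In code, the change amounts to reversing the sort order of the review queue. The `QueueItem` shape and the in-memory sort are assumptions for illustration; in practice this ordering may well be done in the database query instead.

```typescript
interface QueueItem {
  id: string;
  submittedAt: Date;
}

// Returns a copy ordered newest first, so the most recent submission is reviewed first.
function newestFirst(items: QueueItem[]): QueueItem[] {
  return [...items].sort((a, b) => b.submittedAt.getTime() - a.submittedAt.getTime());
}

// Hypothetical queue contents.
const queue = newestFirst([
  { id: "old", submittedAt: new Date("2021-03-01T00:00:00Z") },
  { id: "new", submittedAt: new Date("2025-01-20T00:00:00Z") },
  { id: "mid", submittedAt: new Date("2023-07-15T00:00:00Z") },
]);
```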

Shifting to a batch-oriented architecture also had an unintended but welcome benefit: it reduced the database load caused by the moderation UI. Previously, the moderation UI ran expensive queries on every page load. Now, these queries run only when a new batch job is triggered, significantly improving responsiveness and efficiency.

Improved Moderation User Experience

A well-designed moderation tool is essential. Reviewing food submissions is a tedious task, and a poorly optimized UI can lead to frustration and errors.

We decided to adopt Interval for the UI, as it allowed us to focus on backend business logic rather than designing UI components. While this occasionally restricted our ability to create the exact UI we envisioned, it ultimately saved us a lot of time. Interval was acquired while we were in the middle of development, but thankfully, they open-sourced their product, so we simply moved to self-hosting.

Compared to the previous moderation UI, these are the notable improvements:

Bulk Operations: When possible, moderators can review and process several items at a time.
Usability Improvements: Clear visual indicators and color coding make it easier to scan for issues, and handy links to relevant Google searches allow for quick verification.
Data Traceability: Moderators can access the raw, original data to investigate discrepancies.
Performant AI Suggestions: Foods are sent to the AI assistants in the background, and the outputs are cached before being shown to moderators, allowing them to take advantage of the suggestions without waiting for any processes to complete.
Reduced Database Load: The SQL queries that power the UI have been optimized. Instead of multiple queries, a single query is now scoped to just the foods being viewed. Additionally, necessary joins and filters have been refined, improving responsiveness and eliminating historical concerns about high database load.

New Foodnoms moderation tool — Screenshot of the new moderation tool showing multiple user contributions after being initially reviewed by AI.

These enhancements make the moderation process faster, smoother, and more efficient, ultimately ensuring that high-quality food data reaches the community sooner.

Real-World Impact

The new AI-assisted moderation system went live in mid-January, and since then, we’ve been hard at work refining workflows and addressing issues based on moderator feedback.

So far, we’ve updated 2,655 foods in the Foodnoms database based on 10,346 user contributions. While we still have a sizable backlog—tens of thousands of submissions to review—the system is already making an impact.

Barcode scan success rates in early 2025 have also been slightly higher than in any month of 2024—an early, positive sign of things to come.

Barcode scan success rate in US for logged-in users — Barcode scan success rate for logged-in users in the US over the last 12 months.

Future Opportunities

We’ve identified several ways to improve the system and make moderation even more efficient:

Automated Batching: OpenAI calls can be triggered automatically as new submissions arrive, eliminating the need for manually initiating batch processing.
Enhanced Autoapproval: As the new system continues to prove itself, we may disable rule-based autoapproval in an effort to improve quality.

The new AI-powered moderation system marks a significant step forward for the Foodnoms database. While this update works behind the scenes, its effects will become increasingly noticeable as the database grows and improves in quality, ultimately making Foodnoms a better product.

If you’re a Foodnoms user, we hope you start noticing improvements in barcode scan accuracy and search results in the coming months. Thank you for your patience over the years as we’ve worked to build a scalable and responsible approach to moderation.

This post was co-authored with Scotty Waggoner, who led development of the new AI-assisted moderation system. Huge thanks to Scotty for his contributions to both this post and all of the technology behind it!