New Contextual AI Model Surpasses GPT-4o in Accuracy

Contextual AI has just introduced its new grounded language model (GLM), asserting that it delivers the most accurate factual responses in the industry. The startup claims it outperforms popular AI systems from Google, Anthropic, and OpenAI on a critical truthfulness benchmark. Specifically, Contextual AI reports an 88% factuality score on the FACTS benchmark, while Google’s Gemini 2.0 Flash comes in at 84.6%, Anthropic’s Claude 3.5 Sonnet hits 79.4%, and OpenAI’s GPT-4o stands at 78.8%.
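To put the reported figures side by side, the short Python sketch below computes each competitor's gap to the GLM in percentage points. The numbers are the ones Contextual AI reports above; the dictionary and the function name margins_over are illustrative only, and the sketch treats the reported 88% as exactly 88.0.

```python
# Scores as reported by Contextual AI for the FACTS benchmark (percent).
# Treating "88%" as 88.0 is an assumption; the article gives no decimals.
reported_scores = {
    "Contextual AI GLM": 88.0,
    "Gemini 2.0 Flash": 84.6,
    "Claude 3.5 Sonnet": 79.4,
    "GPT-4o": 78.8,
}


def margins_over(baseline, scores):
    """Return each other model's gap to the baseline, in percentage points."""
    top = scores[baseline]
    return {
        name: round(top - score, 1)
        for name, score in scores.items()
        if name != baseline
    }


margins = margins_over("Contextual AI GLM", reported_scores)
for name, gap in margins.items():
    print(f"{name}: {gap} points behind")
```

On these figures the lead is 3.4 points over Gemini 2.0 Flash, 8.6 over Claude 3.5 Sonnet, and 9.2 over GPT-4o.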

Factual accuracy has long been a challenge in enterprise software, especially with large language models. These models can generate impressive text but often include “hallucinations,” or made-up information. Contextual AI aims to solve this reliability gap by crafting a model optimized for enterprise retrieval-augmented generation (RAG). In such settings, accuracy is paramount.

“Part of the solution is a technique called RAG — retrieval-augmented generation,” stated Douwe Kiela, CEO and cofounder of Contextual AI, during an exclusive conversation with VentureBeat. “I was one of RAG’s co-inventors, so our company revolves around executing RAG correctly and taking it to the next level.”

Why Contextual AI Differs from General-Purpose LLMs

Unlike broad-based AI systems such as ChatGPT or Claude, which handle everything from creative writing to coding tips, Contextual AI focuses solely on high-stakes environments. According to Kiela, companies in regulated industries cannot afford guesswork. When dealing with financial, healthcare, or telecommunications data, every detail matters.

“If you face a RAG problem in a strictly regulated enterprise setting, you have zero tolerance for hallucinations,” Kiela explained. “A general-purpose model might be a good fit for the marketing department. But it’s not ideal where you need absolute certainty and can’t risk mistakes.”

Groundedness as the New Standard for Enterprise AI

Contextual AI’s central philosophy is “groundedness,” meaning every response must stick to the provided data or explicitly admit when it lacks information. For example, if you share a recipe or formula stating “this is only valid in most scenarios,” standard AI models might ignore that nuance. Contextual AI’s GLM, however, will highlight the limitation by saying, “It’s actually only valid for most cases.” This extra clarity is vital in fields where ambiguous statements can lead to compliance problems or financial risk.

Another powerful capability is the model’s willingness to say “I don’t know.” In many enterprise situations, admitting uncertainty is far safer than fabricating a response. Kiela emphasized that this admission of uncertainty is “incredibly powerful in regulated settings, where trust and transparency are critical.”
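The abstain-when-unsupported behavior can be illustrated with a deliberately simple Python sketch. This is not how the GLM works internally (the GLM is a trained language model, and the article does not describe its mechanism); the word-overlap check, the min_overlap threshold, and the function names are all assumptions made for illustration. The point is only the control flow: answer from the supplied context when it supports an answer, otherwise say so.

```python
def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,?!").lower() for word in text.split()]


def grounded_answer(question, context_passages, min_overlap=2):
    """Return the best-supported passage, or abstain if none is supported.

    Support is approximated by shared words between the question and a
    passage. This is a toy stand-in for a real groundedness check.
    """
    question_words = set(tokenize(question))
    best_passage, best_score = None, 0
    for passage in context_passages:
        score = len(question_words & set(tokenize(passage)))
        if score > best_score:
            best_passage, best_score = passage, score
    if best_score < min_overlap:
        return "I don't know."
    return best_passage


# Hypothetical context, invented for this example.
context = ["Refunds are processed within 14 days."]

supported = grounded_answer("How many days until refunds are processed?", context)
unsupported = grounded_answer("What is the CEO's salary?", context)
print(supported)
print(unsupported)
```

The second question shares no words with the context, so the sketch abstains rather than inventing an answer.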

Inside Contextual AI’s RAG 2.0 Framework

Contextual AI’s platform rests on what the company calls “RAG 2.0.” It aims to avoid the usual piecemeal approach to retrieval-augmented generation. Many RAG systems rely on stitched-together components: an off-the-shelf model for embeddings, a vector database for retrieval, and a large language model for generation. These elements often function independently, which can limit effectiveness.

By contrast, Contextual AI develops every component in a unified manner. The system uses a “mixture-of-retrievers,” which analyzes a query and then plans a strategy for retrieving the most relevant information. The retrieved candidates then pass to a “re-ranker,” which filters out irrelevant or low-value passages before anything reaches the GLM. According to the company, this orchestrated pipeline yields more accurate, grounded responses.
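The retrieve-then-rerank flow described here can be sketched in a few lines of Python. This is a toy under stated assumptions, not Contextual AI's implementation: the two retrievers (word overlap and word-pair overlap), the Jaccard-based re-ranker, the sample documents, and every function name are hypothetical stand-ins for the learned components the company describes.

```python
# Toy corpus; the contents are invented for illustration.
DOCS = [
    "Wire transfers above 10000 USD require manager approval.",
    "Refunds are processed within 14 days of the request.",
    "Manager approval is logged in the compliance database.",
]


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,?!").lower() for word in text.split()]


def bigrams(text):
    """Return the set of adjacent word pairs in the text."""
    tokens = tokenize(text)
    return set(zip(tokens, tokens[1:]))


def keyword_retriever(query, docs, k=2):
    """Hypothetical retriever 1: rank documents by shared words."""
    query_words = set(tokenize(query))
    scored = [(len(query_words & set(tokenize(d))), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def bigram_retriever(query, docs, k=2):
    """Hypothetical retriever 2: rank documents by shared word pairs."""
    query_pairs = bigrams(query)
    scored = [(len(query_pairs & bigrams(d)), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]


def rerank(query, candidates, k=1):
    """Stand-in re-ranker: order candidates by Jaccard word similarity."""
    query_words = set(tokenize(query))

    def jaccard(doc):
        doc_words = set(tokenize(doc))
        union = query_words | doc_words
        return len(query_words & doc_words) / len(union) if union else 0.0

    return sorted(candidates, key=jaccard, reverse=True)[:k]


def retrieve(query, docs):
    """Pool candidates from every retriever, de-duplicate, then re-rank."""
    pooled = []
    for retriever in (keyword_retriever, bigram_retriever):
        for doc in retriever(query, docs):
            if doc not in pooled:
                pooled.append(doc)
    return rerank(query, pooled)


top = retrieve("Do wire transfers require manager approval?", DOCS)
print(top)
```

In a production system the passages that survive re-ranking would be handed to the language model as context; here the sketch stops at the re-ranked list.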

Integration Beyond Text: Charts, Databases, and Diagrams

Though the GLM focuses on text, Contextual AI’s wider platform can process diverse data forms. It connects seamlessly to databases like BigQuery, Snowflake, Redshift, and Postgres. It also handles multimodal content such as charts, circuit diagrams, and more. Kiela believes the most complex enterprise challenges often combine structured and unstructured data. For instance, you might have transaction records, policy documents, and procedure manuals all needing to interact.

“I’m particularly excited about merging structured and unstructured data,” Kiela said. “That’s where many of today’s critical enterprise problems lie—somewhere between database entries and text-based policies, with additional reference materials in the mix.”

Looking Ahead: Reliability That Drives Real-World ROI

Contextual AI plans to release its specialized re-ranker shortly after the GLM launch. It will also expand how it processes and comprehends documents. Future updates will likely include advanced agentic features for automated decision-making under strict accuracy requirements.

The startup, founded in 2023 by Kiela and Amanpreet Singh, already counts organizations like HSBC, Qualcomm, and The Economist among its customers. Many of these businesses are under pressure to show tangible returns on their AI investments. Contextual AI believes its specialized approach will help enterprises achieve those measurable outcomes by drastically reducing hallucinations and boosting factual accuracy.

According to Kiela, a more focused, predictable model can ease concerns in industries where misinformation can damage credibility or create serious liabilities. “While a general-purpose language model might be exciting and creative, it might not be best for a tightly regulated enterprise. Our grounded language model is less flashy but more dependable. It excels at sticking to the context so you can trust it to get the job done.”

Ultimately, Contextual AI offers a path for enterprises to deploy AI with far less risk of misinformation. By refining each step, from intelligent retrieval strategies to a language model that values accuracy above all, the company hopes to usher in an era where groundedness is no longer optional but a prerequisite for AI solutions in mission-critical settings.