CIOInsights - Insights From Technology Leaders

A Better Approach: Cognitive Architectures Are the Future of AI

By Arthur Wielgosz, Technology Expert

Many people talk about AI, but few recognise that the soul of an inference engine is probabilistic and indifferent to the truth. This article explores Artificial Intelligence (AI), Machine Learning (ML), and Large Language Models (LLMs). We look at how these technologies, particularly LLMs, are reshaping the way we understand and interact with digital systems, focusing on their probabilistic nature and on the solutions emerging to address their inherent challenges. Let us start with some simple definitions to put this into context:

AI (Artificial Intelligence) - The broad science of making intelligent machines or systems that can simulate human intelligence processes.
ML (Machine Learning) - A subset of AI focused on algorithms and statistical models that enable computers to perform specific tasks without using explicit instructions, relying on patterns and inference instead.
LLMs (Large Language Models) - Advanced ML models trained on vast amounts of text data to understand and generate human-like text, a pinnacle of current AI research in natural language processing.

Inference Engines and LLMs

Large Language Models (LLMs), such as those behind ChatGPT, function as inference engines, using Machine Learning to analyse and generate text based on vast amounts of data. At each step, the model assigns a probability to every possible next token (a word or word fragment) and builds its reply from those predictions, effectively 'inferring' human-like responses. This capability stems from training on extensive collections of text, which lets the model apply the patterns it has learned to new queries.
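To make the idea of "most probable next word" concrete, here is a minimal sketch in Python. The candidate words and their scores are invented for illustration and do not come from any real model; the point is only that raw scores are turned into probabilities and the highest one wins.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores a model might assign to candidate next words
# after the prompt "The capital of France is". These numbers are made up.
candidates = ["Paris", "Lyon", "London", "banana"]
logits = [6.0, 2.5, 2.0, -3.0]

probs = softmax(logits)

# Greedy choice: take the single most probable candidate.
best_word, best_prob = max(zip(candidates, probs), key=lambda pair: pair[1])

for word, p in zip(candidates, probs):
    print(f"{word:>7}: {p:.3f}")
print("Most probable next word:", best_word)
```

Note that nothing in this procedure checks whether the chosen word is true. It is simply the one with the highest score.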

Inference engines, despite their impressive capabilities, have inherent limitations due to their probabilistic nature. They generate responses based on statistical likelihoods, leading to "best guesses" rather than definitive answers. This approach can cause "hallucinations," where the engine produces incorrect or nonsensical information, especially when faced with queries outside its training data. This challenge underscores the need for advanced mechanisms to improve the reliability and accuracy of these AI systems.
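The "best guess" behaviour can be illustrated with a small simulation. This is a sketch under stated assumptions: the three answer categories and their probabilities are hypothetical, chosen only to show that a system which samples from a probability distribution will sometimes return a plausible but wrong answer, even when the correct one is the most likely.

```python
import random
from collections import Counter

def sample_answers(options, weights, n, seed=0):
    """Draw n answers at random, in proportion to their weights."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return Counter(rng.choices(options, weights=weights, k=n))

# Hypothetical probabilities for three kinds of answer to the same question.
options = ["correct answer", "plausible but wrong", "nonsense"]
weights = [0.70, 0.25, 0.05]

counts = sample_answers(options, weights, n=1000)

for option in options:
    print(f"{option:>20}: {counts[option]} of 1000")
```

Over many runs the correct answer dominates, yet a meaningful share of individual responses are still wrong, which is the pattern the text describes as hallucination.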

Addressing the Challenges

The Mixture of Experts (MoE) approach, used in models such as Mistral AI's Mixtral, represents a significant evolution in tackling the challenges posed by inference engines. Unlike traditional dense models, which pass every input through the same single network, an MoE model uses a set of specialised sub-networks, each an "expert" that learns to handle particular kinds of input. This architecture loosely mirrors the way human expertise is distributed across different fields, allowing for a more nuanced and precise approach to problem-solving, although the specialisation is learned during training rather than assigned by subject.

In a Mixture of Experts (MoE) system such as Mixtral, a "router" network plays a crucial role, dynamically selecting the most relevant experts for each piece of input. This means that only a small subset of the experts is activated at any one time (in Mixtral 8x7B, two of the eight experts per token at each layer), making the process both efficient and effective. The selected experts process the data independently, and their outputs are then combined, weighted by the router's scores, to form the final result.
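The routing step described above can be sketched in a few lines of Python. This is a toy illustration, not Mixtral's actual implementation: the four "experts" are simple arithmetic functions, and the router scores are fixed by hand, whereas in a real model they would be computed from the input by a learned network. All names here (route_top_k, moe_forward) are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_scores, k):
    """Pick the k highest-scoring experts and normalise their weights."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    top = ranked[:k]
    weights = softmax([router_scores[i] for i in top])
    return list(zip(top, weights))

def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and return their weighted sum."""
    selected = route_top_k(router_scores, k)
    output = sum(weight * experts[i](x) for i, weight in selected)
    return output, selected

# Toy experts: stand-ins for the specialised sub-networks.
experts = [
    lambda x: x + 1.0,
    lambda x: x * 2.0,
    lambda x: x * x,
    lambda x: -x,
]

# Hand-picked scores (an assumption); a real router derives these from x.
router_scores = [0.1, 2.0, 1.0, -1.0]

output, selected = moe_forward(3.0, experts, router_scores, k=2)
print("Selected experts and weights:", selected)
print("Combined output:", output)
```

Only two of the four experts run for this input; the other two are skipped entirely, which is where the efficiency gain comes from.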

This methodology addresses several limitations of traditional inference engines. By drawing on multiple experts, the Mixture of Experts approach can help reduce the likelihood of hallucinations, as each expert's specialised knowledge contributes to a more accurate and reliable output. Moreover, it allows the system to handle a wider range of queries with higher confidence, as there is likely to be an expert well suited to any given task.