Large Language Models can be kept up to date by applying Retrieval-Augmented Generation (RAG) to high-quality data such as journals. This should be done for the topics that users actually ask generative AI about; data that no one asks about need not be updated. The updated fragment of the LLM's knowledge should be marked as high quality and given more weight. Time-bound data that has gone stale should be marked as such, and RAG should be performed again, either every time or whenever necessary.

Problem with Present Large Language Models

The problem with present LLMs is that, in the frantic rush to train them, as much of the internet as possible was covered without considering the quality of the data. We need to rectify this. But it is impossible to start from the beginning and correct everything, and there is no need to. We can concentrate on the data that is most needed.

Hallucinations

Hallucination does not come from nowhere. It comes from improper data fed into LLMs during training. If we need to minimize hallucinati...
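
The refresh policy described at the start of this excerpt (weight high-quality sources more, mark time-bound data as stale, and run retrieval again when needed) could be sketched roughly as follows. This is only an illustration under stated assumptions: `Fragment`, `get_fragment`, the `retrieve` callback, and the weight value of 2.0 are hypothetical names and numbers chosen for the example, not part of any particular RAG library.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Dict, Optional


@dataclass
class Fragment:
    """One retrieved piece of knowledge about a topic (hypothetical structure)."""
    topic: str
    text: str
    fetched_at: datetime
    high_quality: bool = False            # e.g. sourced from a journal
    max_age: Optional[timedelta] = None   # None means the data is not time-bound

    def is_stale(self, now: datetime) -> bool:
        # Only time-bound data can go stale.
        return self.max_age is not None and now - self.fetched_at > self.max_age

    def weight(self) -> float:
        # Assumed weighting: high-quality fragments count double.
        return 2.0 if self.high_quality else 1.0


def get_fragment(
    store: Dict[str, Fragment],
    topic: str,
    retrieve: Callable[[str, datetime], Fragment],
    now: datetime,
) -> Fragment:
    """Return the stored fragment, retrieving again only if missing or stale."""
    fragment = store.get(topic)
    if fragment is None or fragment.is_stale(now):
        fragment = retrieve(topic, now)   # perform retrieval again
        store[topic] = fragment
    return fragment
```

Because retrieval is triggered by a lookup, only topics that someone actually asks about are ever refreshed, which matches the point that unneeded data need not be updated.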
In this article, we discuss why Generative AI hallucinates. Does it depend on the quality of the training data? How can we get output with minimal hallucination? Is Artificial Intelligence biased, and how can bias in Generative AI be mitigated?

Introduction

First, you should understand how AI models work. AI takes input from innumerable sources of information such as web pages on the World Wide Web, books, journals, articles, and so on. What the model learns from this data is stored in the connections between the nodes of a neural network. Generative AI just predicts the next most probable word, that is, the next connection path it should take to produce the answer to your question. That means no data scientist can easily examine or predict the output of Generative AI. It is just connections between words. How would you examine connections between words?

Hallucinations of Generative AI

It is a common misbelief that AI Large Language Models hallucinate and give out non-existent information. It does not make up a...
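
The idea from the introduction of this excerpt, that a generative model "just predicts the next most probable word", can be illustrated with a toy bigram counter. This is a deliberately tiny sketch and far simpler than a real LLM, which uses a neural network rather than raw counts; the function names `train_bigrams` and `predict_next` are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]
```

For example, after training on `["the cat sat", "the cat ran", "the dog sat"]`, `predict_next(counts, "the")` returns `"cat"`, simply because "cat" followed "the" most often. The prediction can only reflect what was in the training text, so low-quality text leads to low-quality predictions.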