Wikipedia, the multilingual free online encyclopedia, is getting a new AI tool that can scan thousands of citations simultaneously to help analyze and verify the information they support.
The Demand for Citations
Wikipedia draws on a database of more than 4 million citations. Readers depend on those citations for evidence that an article's claims are accurate. For example, a Wikipedia article states that President Obama traveled to Europe and later to Kenya, where he met his paternal relatives for the first time. Citations and hyperlinks are needed to assure readers that this information is correct and comes from a valid source.
Although hyperlinks do not substantiate every point, they are still helpful in supporting articles. The problem is that in many cases a hyperlink leads to an unrelated page with no relevant information. Readers then either stop reading or abandon the original topic for another one.
Meta Starts Working on AI Tool
Take a claim covered on Wikipedia about Joe Hipp, described as the first Native American boxer to challenge for the World Boxing Association (WBA) heavyweight title. The source cited to support the claim did not mention Joe Hipp or boxing at all.
Examples like this show that Wikipedia can lead people to believe things even when the citations do not actually support them, which could be exploited to spread misinformation worldwide. For this reason, Facebook parent Meta began working with the Wikimedia Foundation through Meta AI, the company's research lab. They claim it is the first machine learning model able to automatically scan thousands of citations at once.
This will save time, since checking each citation manually would take far too long.
Meta AI Efforts
Fabio Petroni, research tech lead manager at Meta AI, told Digital Trends:
I think we were driven by curiosity at the end of the day. We wanted to see what was the limit of this technology. We were absolutely not sure if this AI could do anything meaningful in this context. No one had ever tried to do something similar before.
He further clarified how this tool will work:
With these models, what we have done is to build an index of all these webpages by chunking them into passages and providing an accurate representation for each passage that is not representing word-by-word the passage, but the meaning of the passage. That means that two chunks of text with similar meanings will be represented in a very close position in the resulting n-dimensional space where all these passages are stored.
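The chunk-and-embed pipeline Petroni describes can be sketched in a few lines. The version below is deliberately a toy: a bag-of-words counter and cosine similarity stand in for the learned dense encoder the real system uses, and all function names here are hypothetical. The point it illustrates is the one in the quote: passages with overlapping meaning end up close together in the vector space, while unrelated passages do not.

```python
import math
from collections import Counter

def chunk(text, size=40):
    """Split a page into fixed-size word passages (a toy stand-in for
    the chunking step Petroni describes)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def embed(passage):
    """Toy bag-of-words vector. A real system uses a learned dense
    encoder, so meaning rather than exact wording determines where a
    passage lands in the space."""
    return Counter(passage.lower().split())

def cosine(a, b):
    """Similarity of two vectors: 1.0 means identical direction,
    0.0 means no shared terms at all."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Two passages about the same event sit closer together than two
# passages about unrelated events.
p1 = "Obama traveled to Kenya to meet his relatives"
p2 = "Obama visited Kenya and met his family"
p3 = "The boxer challenged for the heavyweight title"
```

With these pieces, checking a citation reduces to embedding the claim and asking which stored passages land nearest to it, which is exactly why a word-for-word match is not required.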
According to Petroni, the team is still working to bring it to that point:
What we have built is a proof of concept. It’s not really usable at the moment. In order for this to be usable, you need to have a fresh index that indexes much more data than what we currently have. It needs to update constantly, with new information coming every day.
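The "fresh index" requirement in the quote amounts to an index you can append to as new pages arrive each day, then query for the passage nearest a claim. The sketch below is a hypothetical toy (the class name and structure are invented, and word-set overlap stands in for real learned embeddings); it only illustrates the add-then-retrieve loop Petroni is describing.

```python
class PassageIndex:
    """Toy sketch of a constantly updated passage index. Word-set
    overlap replaces the real system's dense embeddings."""

    def __init__(self):
        self.entries = []  # (passage, set-of-words "embedding") pairs

    def add(self, passage):
        # New pages arrive every day; each passage is indexed on arrival,
        # keeping the index fresh.
        self.entries.append((passage, set(passage.lower().split())))

    def nearest(self, claim):
        # Retrieve the stored passage most similar to the claim.
        q = set(claim.lower().split())
        return max(self.entries, key=lambda e: len(q & e[1]))[0]
```

A verifier built on top of this would compare a Wikipedia claim against the nearest indexed passage of its cited source; if nothing similar is found, the citation is flagged for human review.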
Eventually, the tool need not be limited to text: it could also support multimedia, helping verify content on platforms built around images and videos.