How and Why I Started Trustium
By Cedar Milazzo
You might think that a product like Trustium, which exposes misinformation in news articles, was inspired by a high-profile or tragic event involving fake news.
While there has been no shortage of well-documented instances around the world worthy of outrage and action, the origin of the company is, by comparison, less remarkable but much more personal.
It began during a conversation at a family holiday dinner in 2016. The conversation ranged over a variety of topics and at some point landed on residents in the area near Yosemite, California, who were killing opossums en masse because the animals were supposedly spreading rabies to dogs. My father-in-law, who lived in the area, had read about it in an article on the internet and was convinced it was true.
However, my wife was a volunteer at the local wildlife rescue center and had extensive experience caring for injured opossums. In fact, she had chosen to care for opossums specifically because no vaccination was required, since opossums very rarely contract or spread the rabies virus.
In light of the conflicting information, there was confusion.
The article about the ‘dangerous epidemic’ that my father-in-law had read created fear. That fear was real, but its basis, the information in the article, was false.
I thought, “Why isn’t there an app that lets you know if an article is credible or not?”
Over the next few months, I saw discussions and reports of “fake news” and other misinformation come up almost daily. Initially, I thought the big tech companies would quickly put a solution in place to fight this threat. Given their resources and their ability to mobilize small armies of bright people, I was sure the issue would soon be resolved.
It wasn’t. Wondering why, I delved into how the best-positioned tech companies operate. The obstacle seemed to be that fighting misinformation actually ran against their business model. Seriously. Big social media companies such as Facebook, Twitter, and Instagram all make money when people stay on their sites and look at more ads. This “engagement,” as they call it, is actually increased by divisive, anger-inducing, or emotionally charged content. The more upset people get, the more they argue and share opposing content, and the more revenue they generate for the social media platform!
After a few months of investigation and waiting for “big tech” to do something about it, I decided that if they weren’t going to, I should.
In early 2017, I started the company that became Trustium.
I searched the web for articles I thought nobody would dispute as reliable news reporting (mostly coverage of local events), as well as obvious hoaxes and biased reporting. Using this data, I trained a very basic artificial intelligence model to recognize the differences between these types of articles. I discovered that just by looking at the “clickbaityness” of headlines, I could say with a fairly high degree of confidence whether something was reliable or not! With a prototype ML model trained on only a few hundred examples of these different types of articles, I was able to get close to 90% accuracy.
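To give a feel for what a headline-only prototype might look like, here is a minimal sketch of a word-count (naive Bayes) classifier. It is not the original Trustium prototype: the class name `HeadlineNaiveBayes`, the label names, and the four made-up headlines are all assumptions for illustration, and a real model would need hundreds of labeled examples at least.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class HeadlineNaiveBayes:
    """Hypothetical multinomial naive Bayes over headline words."""

    def fit(self, headlines, labels):
        # Count how many headlines each label has, and which words they use.
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for headline, label in zip(headlines, labels):
            self.word_counts[label].update(tokenize(headline))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, headline):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus log likelihood of each word, with add-one smoothing.
            score = math.log(doc_count / total_docs)
            for word in tokenize(headline):
                score += math.log((counts[word] + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training headlines, for illustration only.
train_headlines = [
    "You won't believe this shocking trick",
    "Shocking secret doctors hate",
    "City council approves new budget",
    "Local school opens new library",
]
train_labels = ["clickbait", "clickbait", "reliable", "reliable"]

model = HeadlineNaiveBayes().fit(train_headlines, train_labels)
print(model.predict("Shocking trick revealed"))
print(model.predict("Council approves library budget"))
```

With so few examples the model is only memorizing vocabulary, which is why the amount and quality of training data matters so much in what follows.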
With that prototype in hand, I realized this was a feasible solution that could actually help people differentiate between solid journalism and various forms of misinformation. At that point, I created Trustium and began building the company. I also started working on a way to gather the huge amounts of data necessary to train a high-quality ML model that could recognize misinformation with 95% or higher accuracy, which I thought was the minimum bar for something I could actually put on the market. I fortuitously met Elizabeth Earle through a mutual acquaintance and enlisted her help in applying a more academic method of identifying misinformation.
Elizabeth’s PhD thesis was on rhetoric in the news media, so she had a deep understanding of how journalism is constructed. With her help, the Trustium team was able to identify about 20 basic language indicators that we could use to determine whether an article was legitimate or not. These included things like the proportion of adverbs in the article, the relevance of the title to the actual topic discussed in the article, and the percentage of emotionally charged words present.
Using these new indicators, we were able to create better and better models that slowly approached our target accuracy.
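As a rough illustration of the three indicators named above, the sketch below computes simplified versions of each. Everything in it is an assumption rather than Trustium's actual method: adverbs are approximated as words ending in "ly" (which also catches non-adverbs such as "deadly"), title relevance is the share of title words that reappear in the body, and `EMOTIONAL_WORDS` and `STOP_WORDS` are tiny hypothetical word lists.

```python
import re

# Hypothetical mini-lexicon of emotionally charged words (illustration only).
EMOTIONAL_WORDS = {"shocking", "outrage", "terrifying", "disaster", "deadly", "horrifying"}

# Hypothetical stop-word list, used only when comparing the title to the body.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "is", "are", "for"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def adverb_ratio(words):
    """Crude stand-in for adverb detection: share of words ending in 'ly'."""
    if not words:
        return 0.0
    return sum(1 for w in words if len(w) > 3 and w.endswith("ly")) / len(words)


def emotional_ratio(words):
    """Share of words that appear in the emotional-word lexicon."""
    if not words:
        return 0.0
    return sum(1 for w in words if w in EMOTIONAL_WORDS) / len(words)


def title_relevance(title, body_words):
    """Share of non-stop-word title terms that also occur in the body."""
    title_words = {w for w in tokenize(title) if w not in STOP_WORDS}
    if not title_words:
        return 0.0
    return len(title_words & set(body_words)) / len(title_words)


def indicators(title, body):
    """Return the three simplified indicators for one article."""
    words = tokenize(body)
    return {
        "adverb_ratio": adverb_ratio(words),
        "emotional_ratio": emotional_ratio(words),
        "title_relevance": title_relevance(title, words),
    }


# Made-up example article, for illustration only.
example = indicators(
    "Shocking possum epidemic",
    "Residents are really worried about a deadly possum epidemic.",
)
print(example)
```

Numbers like these would then serve as input features for a model. A more faithful version would likely use a part-of-speech tagger for adverbs and a vetted sentiment lexicon.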
At the same time, we enlisted the help of volunteers across the country to search the web for examples of news articles we could use to train the system. We recruited people from all walks of life and of all ages, political stripes, education levels, and regions of the U.S. to minimize the effect of the inherent bias every human has. Once we had a nice selection of tens of thousands of articles, we sent those articles back to the volunteers for “double checking” and used only the articles whose rating was unanimous across all the reviewers. Obviously this isn’t 100% foolproof, and some articles may have been mis-rated, but after the double and triple checking, we saw our accuracy rates jump quite significantly.
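The unanimous-agreement filter described above can be sketched in a few lines. This is an assumed data layout, not our production pipeline: ratings are taken to be `(article_id, reviewer_id, label)` tuples, and the function name `unanimous_articles` and the `min_reviewers` parameter are hypothetical.

```python
from collections import defaultdict


def unanimous_articles(ratings, min_reviewers=2):
    """Keep only articles that every reviewer labeled the same way.

    ratings: iterable of (article_id, reviewer_id, label) tuples.
    Returns a dict mapping each kept article_id to its agreed label.
    """
    labels_by_article = defaultdict(dict)
    for article_id, reviewer_id, label in ratings:
        # If a reviewer rates the same article twice, the last rating wins.
        labels_by_article[article_id][reviewer_id] = label

    kept = {}
    for article_id, by_reviewer in labels_by_article.items():
        labels = set(by_reviewer.values())
        # Require enough independent reviewers and a single shared label.
        if len(by_reviewer) >= min_reviewers and len(labels) == 1:
            kept[article_id] = labels.pop()
    return kept


# Made-up ratings, for illustration only.
sample_ratings = [
    ("a1", "r1", "reliable"),
    ("a1", "r2", "reliable"),
    ("a2", "r1", "reliable"),
    ("a2", "r2", "hoax"),
    ("a3", "r1", "hoax"),
]
training_set = unanimous_articles(sample_ratings)
print(training_set)
```

Dropping every disputed article shrinks the training set, but it trades quantity for cleaner labels.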
Finally, in January of 2019, we decided we had something good enough to share with the public and created a browser extension to exercise the ML models and let people know whether articles they were viewing were credible or not.
Naturally, this has led us to seek out allies in academia, media, and industry to further test and refine our solutions for consumers and businesses. We keep learning every day, and with partners like the Credibility Coalition, Pro-Truth Pledge, and others, we’re making real progress in the fight.
As we head into another election cycle in the U.S., we anticipate a spike in misinformation campaigns, but this time we will be there to help people be more knowledgeable news consumers who argue a whole lot less.