Combating the intentional injection of misinformation is an ongoing battle at the forefront of modern social media. Misinformation can be difficult even for human reviewers to detect, and the cost and delay of human review are prohibitive. To help combat the problem, an algorithm that classifies the accuracy of content could be integrated directly into social media platforms, provided it reaches a level of accuracy that the general public can trust. This paper proposes a hierarchy of trained and pre-trained neural networks for classifying news articles as fake or real. Because the datasets available for fake news are limited, training a network on that data alone would be challenging. In the solution presented, a lead net relies on a hierarchy of pre-trained subnets to assemble a set of high-level features that serve as its inputs for classification. The advantage is that the subnets can be trained on other datasets for which more data is available. For example, a subnet may learn to recognize equivocation and flag its occurrence in an article; the lead net can then account for equivocation in its final fake-or-real classification. Some of the high-level inputs are generated by methods other than neural networks. The lead net also takes into account general information about each article, such as average word length, number of nouns, number of semicolons, and date, among others. The technique of feeding externally trained subnets into a lead net could be extended to other domains.
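
The feature-assembly idea described above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper: the function names, the keyword-based `toy_equivocation_subnet`, and the single logistic unit standing in for the lead net are all hypothetical placeholders, chosen only to show how subnet outputs and general article statistics could be combined into one input vector.

```python
import math
import re
from typing import Callable, Dict, List

# Hypothetical stand-in for a pre-trained subnet: maps article text to a score in [0, 1].
Subnet = Callable[[str], float]


def surface_features(text: str) -> Dict[str, float]:
    """General article statistics computed without a neural network."""
    words = re.findall(r"[A-Za-z']+", text)
    avg_len = sum(len(w) for w in words) / len(words) if words else 0.0
    return {
        "avg_word_length": avg_len,
        "semicolon_count": float(text.count(";")),
    }


def toy_equivocation_subnet(text: str) -> float:
    """Placeholder for a trained equivocation detector (assumption):
    returns the fraction of words that are common hedging terms."""
    hedges = {"maybe", "possibly", "allegedly", "reportedly"}
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return sum(w in hedges for w in words) / len(words)


def build_feature_vector(text: str, subnets: Dict[str, Subnet]) -> List[float]:
    """Merge surface statistics with subnet scores, ordered by feature name."""
    feats = surface_features(text)
    for name, subnet in subnets.items():
        feats[name] = subnet(text)
    return [feats[key] for key in sorted(feats)]


def lead_net(features: List[float], weights: List[float], bias: float) -> float:
    """Single logistic unit used as a stand-in for the lead net.
    A real lead net would be a trained multi-layer network."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```

In this sketch, `build_feature_vector(article, {"equivocation": toy_equivocation_subnet})` yields the vector that `lead_net` would consume; the weights and bias would come from training on the labeled fake news data, while each subnet could be trained separately on a larger dataset.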