Truthfinder 8
import numpy as np


def compute_source_trust(data, sources):
    '''
    Compute the trustworthiness of every source. The trustworthiness t(w) is the average
    confidence of all facts supplied by source w.
    :param data: DataFrame with all facts for object O ('source' and 'confidence' columns)
    :param sources: dict mapping each unique source to its current score
    :return: the same dict, updated in place with the new scores
    '''
    for source in sources:
        # t(w): trustworthiness of website w, i.e. the average confidence
        # of all facts supplied by website/source w
        t_w = data.loc[data['source'] == source, 'confidence'].mean()
        # tau(w): trustworthiness score of website w
        # as explained in the paper, 1 - t(w) is usually quite small and multiplying many
        # such terms might lead to underflow, so we work with the negative logarithm instead
        # note: t(w) == 1 gives tau(w) = inf, and a source with no facts gives NaN
        tau_w = -np.log(1 - t_w)
        # update the source score to the new score
        sources[source] = tau_w
    return sources
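
The per-source loop above can also be written as a single grouped operation. The following is a minimal sketch, not part of the original gist: the helper name source_trust_vectorized, the toy facts table, and the clipping bound are all assumptions made for illustration. It computes t(w) as the mean confidence per source and then tau(w) = -ln(1 - t(w)), clipping t(w) just below 1 so that a fully confident source does not produce an infinite score.

```python
import numpy as np
import pandas as pd


def source_trust_vectorized(data: pd.DataFrame) -> dict:
    """Hypothetical helper: tau(w) = -ln(1 - t(w)) for every source in `data`."""
    # t(w): mean confidence of the facts each source provides
    t = data.groupby("source")["confidence"].mean()
    # Clip just below 1 so a perfectly confident source does not map to infinity
    # (the bound 1e-12 is an assumption of this sketch, not taken from the paper)
    t = t.clip(upper=1 - 1e-12)
    # log1p(-t) evaluates ln(1 - t) accurately when t is close to 0
    tau = -np.log1p(-t)
    return tau.to_dict()


# Toy data: source "a" supplies two facts, source "b" supplies one
facts = pd.DataFrame({
    "source": ["a", "a", "b"],
    "confidence": [0.9, 0.7, 0.5],
})
scores = source_trust_vectorized(facts)
print(scores)
```

Here source "a" has an average confidence of about 0.8 and therefore receives a higher score than source "b" at 0.5. Unlike the loop version, this sketch only returns sources that actually appear in the facts table, so a source with no facts is absent rather than scored.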