Created
September 30, 2020 15:08
alfredfrancis/f46304960c83093af17c4d0678178847
Python script to remove the signature from an HTML email, using talon and html2text
# Requires the third-party packages talon and html2text.
from talon.signature.bruteforce import extract_signature
import html2text

# Sample Gmail HTML body that ends with a signature block.
raw_text = '<div dir="ltr"><div>\n<span style="color:rgb(0,0,0);font-family:docs-Calibri;font-size:15px;white-space:pre-wrap">I have applied for PF final settlement on Apr 26 but till today i didnt get any update .i am requesting you to please speedup the process and settle amount. i have many commitments and medical emergency so please consider and do the needful earliest possible.</span><br><div><br></div><br><div dir="ltr" class="gmail_signature"><div dir="ltr">\n<div>Best,</div>\n<div><br></div>\n<div>Alfred Francis</div>\n<div>AI Architect, Cogniwide</div>\n<div>+91 701 209 7621</div>\n</div></div>\n</div></div>\n<br>\n\n'

# Convert the HTML to plain text; the brute-force extractor works on text lines.
text = html2text.html2text(raw_text)

# extract_signature returns a (body, signature) pair; keep only the body.
text, signature = extract_signature(text)
print(text)
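For readers who cannot install talon or html2text, the same two-step idea (flatten the HTML to text, then cut at a sign-off line near the end) can be sketched with the standard library alone. This is only an illustration and not talon's actual algorithm: the `SIGN_OFF` phrase list, the helper names `html_to_text` and `split_signature`, and the `max_signature_lines` cutoff are all assumptions made for this sketch.

```python
import re
from html.parser import HTMLParser

# Hypothetical sign-off phrases chosen for this sketch; talon's real
# heuristics are more elaborate than a fixed phrase list.
SIGN_OFF = re.compile(
    r"^\s*(best|regards|best regards|thanks|thank you|cheers|sincerely)"
    r"\s*[,.!]?\s*$",
    re.IGNORECASE,
)


class _TextExtractor(HTMLParser):
    """Collect text nodes, inserting a line break around block-level tags."""

    BLOCK_TAGS = {"div", "p", "br"}

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Rough stand-in for html2text: return the text content, one block per line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def split_signature(text, max_signature_lines=10):
    """Return (body, signature); signature is None if no sign-off line is found."""
    # Strip whitespace and drop blank lines so the comparison is line based.
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line]
    # Only look at the last few lines, scanning upward from the end.
    start = max(0, len(lines) - max_signature_lines)
    for i in range(len(lines) - 1, start - 1, -1):
        if SIGN_OFF.match(lines[i]):
            return "\n".join(lines[:i]), "\n".join(lines[i:])
    return "\n".join(lines), None
```

Calling `split_signature(html_to_text(raw_text))` on an HTML body like the one above gives the message text and the signature as separate strings. A fixed phrase list will miss signatures that do not start with one of those sign-offs, which is one reason to prefer talon for real mail.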