Last active
April 16, 2025 08:39
Save anshoomehra/ead8925ea291e233a5aa2dcaa2dc61b2 to your computer and use it in GitHub Desktop.
How to Parse 10-K Report from EDGAR (SEC)
I have the HTML URL of a 10-K filing, but I don't know how to get the corresponding .txt URL. Once I have that, I can use the notebook code above.
Can anyone help me, please?
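One possible way to derive the .txt URL, sketched under the assumption that the filing follows EDGAR's usual folder layout (`/Archives/edgar/data/<CIK>/<18-digit accession number>/`), where the full-submission text file is named after the accession number with dashes inserted (10-2-6 digits). The function name `full_submission_txt_url` is hypothetical, not part of the notebook:

```python
import re


def full_submission_txt_url(htm_url: str) -> str:
    """Derive the full-submission .txt URL from a filing document URL.

    Assumes the usual EDGAR path layout; raises ValueError otherwise.
    """
    match = re.search(r"/Archives/edgar/data/(\d+)/(\d{18})/", htm_url)
    if match is None:
        raise ValueError("URL does not look like an EDGAR filing document URL")
    cik, accession = match.groups()
    # Accession numbers are written with dashes as 10-2-6 digits
    dashed = f"{accession[:10]}-{accession[10:12]}-{accession[12:]}"
    return f"https://www.sec.gov/Archives/edgar/data/{cik}/{accession}/{dashed}.txt"
```

The resulting URL still has to be requested with a proper User-Agent header, and it is worth checking in a browser that it resolves for your filing.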
Jesus, you saved my life!
I just tried this, and it does not seem to return anything for the example above?
import requests

# The URL must point to the .htm document of the filing
url = "https://www.sec.gov/Archives/edgar/data/1571996/000157199624000036/dell-20240202.htm"
headers = {
    # The SEC expects a descriptive User-Agent; see the SEC website for the required format
    "User-Agent": "get it from sec website",
    "Accept-Encoding": "gzip, deflate",
    "Host": "www.sec.gov",
}
response = requests.get(url, headers=headers)
# Replace non-breaking spaces with regular spaces
html_content = response.text.replace('\xa0', ' ')
You can use this code to fetch a 10-K filing. Once you have the HTML, you can write your own regex function to extract specific content from it, or you can convert the complete 10-K filing to text.
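A minimal sketch of what such a regex function could look like, using only the standard library. The helper names `html_to_text` and `find_item_headings` and the heading pattern are assumptions for illustration; real filings vary in how they format "Item" headings, so the pattern will likely need tuning:

```python
import html
import re


def html_to_text(html_content: str) -> str:
    """Crudely convert filing HTML to plain text (not a full HTML parser)."""
    # Drop script and style blocks together with their contents
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html_content)
    # Replace every remaining tag with a space
    text = re.sub(r"(?s)<[^>]+>", " ", text)
    # Decode entities such as &nbsp; and normalise non-breaking spaces
    text = html.unescape(text).replace("\xa0", " ")
    # Collapse runs of whitespace
    return re.sub(r"\s+", " ", text).strip()


# Assumed heading shape: "Item 1A." / "ITEM 7:" and similar
ITEM_HEADING = re.compile(r"\bItem\s+(\d{1,2}[AB]?)\s*[.:]", re.IGNORECASE)


def find_item_headings(text: str) -> list[tuple[str, int]]:
    """Return (item label, character offset) for each heading-like match."""
    return [(m.group(1).upper(), m.start()) for m in ITEM_HEADING.finditer(text)]
```

Calling `find_item_headings(html_to_text(html_content))` gives offsets you can slice between to pull out a section. Note that table-of-contents entries usually match the same pattern, so each item may appear more than once.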
Does anyone know of a similar script to retrieve 10-Q filings?
@Tarun3679
https://github.com/john-friedman/datamule-python
from datamule import Portfolio

portfolio = Portfolio('10q')
portfolio.download_submissions(submission_type='10-Q', ticker='MSFT')
for document in portfolio.document_type('10-Q'):
    document.parse()
    print(document.data)
Amazing! Thanks for sharing.