Created
December 4, 2022 08:36
Captcha Solver with python
from PIL import Image
from PIL import ImageFilter
from scipy.ndimage import gaussian_filter
import numpy
import pytesseract


def solve_captcha(filename):
    th1 = 140  # threshold for the first stage
    th2 = 140  # threshold after blurring
    sig = 1.5  # the blurring sigma

    original = Image.open(filename)  # reading the image from the request
    original.save("original.png")
    black_and_white = original.convert("L")  # converting to black and white
    black_and_white.save("black_and_white.png")

    # first threshold: pixels brighter than th1 become 255, the rest 0
    first_threshold = black_and_white.point(lambda p: p > th1 and 255)
    first_threshold.save("first_threshold.png")

    blur = numpy.array(first_threshold)  # create an image array
    blurred = gaussian_filter(blur, sigma=sig)
    blurred = Image.fromarray(blurred)
    blurred.save("blurred.png")

    # second threshold on the blurred image, then edge enhancement and sharpening
    final = blurred.point(lambda p: p > th2 and 255)
    final = final.filter(ImageFilter.EDGE_ENHANCE_MORE)
    final = final.filter(ImageFilter.SHARPEN)
    final.save("final.png")

    # --psm 10 treats the image as a single character; the whitelist allows digits only
    number = pytesseract.image_to_string(
        Image.open("final.png"), lang="eng",
        config="--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789").strip()
    print("RESULT OF CAPTCHA:")
    print(number)
    print("===================")
    return number
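The two thresholding steps rely on the expression `p > th and 255`, which can be confusing at first glance. As far as I understand Pillow's `Image.point`, a function passed to it is evaluated once per possible pixel value (0 to 255) to build a lookup table. Here is a small sketch of that table in plain Python; the helper name `threshold_table` is made up for illustration and is not part of the gist or of Pillow.

```python
def threshold_table(th):
    """Build the 256-entry table that `lambda p: p > th and 255` produces."""
    # `p > th and 255` evaluates to False (which counts as 0) or to 255.
    return [int(p > th and 255) for p in range(256)]


table = threshold_table(140)  # 140 is the th1/th2 value used in the gist
print(table[140], table[141])  # 0 255
```

So a pixel at exactly the threshold turns black, and anything brighter turns white.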
@Manedi send more examples
will try
Of course, and thank you for replying. I've gotten it to run and give a result (just not the correct one); I had to edit your script, as it wasn't working for me for some reason. I wrote a simple script to grab captchas from the site I would like to use the solver on, and I'll attach 10 for you.
@MikeyD-rbg Can you send more examples of your captcha, and with better resolution?
This snippet is meant for numbers, as mentioned on line 28:
tessedit_char_whitelist=0123456789
I've sent you 10 more examples. Unfortunately, all the captchas are from one site and are all this size.
How many more examples would you like?
I've managed to solve one of the captchas, but the z is not capitalized. How do I ensure all the letters are capitals?
I added this line, but the z is still lowercase:
config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ'
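One possible workaround for the lowercase output, since the whitelist is not necessarily honoured by every Tesseract version and engine mode, is to clean up the string after OCR. The sketch below assumes the captcha only ever contains digits and uppercase ASCII letters (an assumption, not something confirmed in this thread); `normalize_captcha_text` is a hypothetical helper, not part of pytesseract.

```python
import string

# Assumption: the captcha alphabet is digits plus uppercase ASCII letters.
ALLOWED = set(string.digits + string.ascii_uppercase)


def normalize_captcha_text(raw, allowed=ALLOWED):
    """Uppercase the OCR output and drop characters outside `allowed`."""
    upper = raw.strip().upper()
    return "".join(ch for ch in upper if ch in allowed)


# Hypothetical usage on the string returned by pytesseract.image_to_string(...):
# number = normalize_captcha_text(number)
print(normalize_captcha_text(" 4z9K\n"))  # 4Z9K
```

This would not help if the captcha distinguishes upper and lower case, because that information is lost by the `upper()` call.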
I have got good results with this script. The image preprocessing works really well and is the key to improving the code. Thanks @MrAch26!
@MrAch26 I've now downloaded over 200k captchas 😂 Unfortunately, they are all the same resolution. How many should I upload?