@simonklee
Last active August 29, 2015 13:56
Domain whitelist
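import re


class DomainWhitelist(object):
    # Finds anything that looks like a domain name, e.g. "www.example.com".
    url_re = re.compile(
        r'([a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6})',
        re.IGNORECASE)
    # Accepts kogama.com and kogama.com.br, with an optional scheme and prefix.
    kogama_re = re.compile(
        r'^(http[s]?://)?([a-z0-9.-]{2,8})?kogama\.com(\.br)?',
        re.IGNORECASE)

    def check(self, value):
        # findall() returns (full match, last repeated part) tuples here;
        # only the first item is the domain itself.
        for url in self.url_re.findall(value):
            if not self.kogama_re.match(url[0]):
                return False
        return True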
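
A note on the whitelist pattern: it has no end anchor and its optional prefix does not have to end in a dot, so strings such as kogama.com.evil.org or evilkogama.com still pass. Below is a minimal sketch of a stricter check, not the author's code. The names `DOMAIN_RE`, `ALLOWED_RE` and `is_whitelisted` are made up for illustration, and allowing any number of subdomains is an assumption.

```python
import re

# Hypothetical stricter variant; these names are illustrative only.
# Same domain-finding idea as the gist, without capturing groups.
DOMAIN_RE = re.compile(r'[a-z0-9]+(?:[-.][a-z0-9]+)*\.[a-z]{2,6}', re.IGNORECASE)
# Assumption: any number of subdomains, each followed by a literal dot.
ALLOWED_RE = re.compile(r'(?:[a-z0-9-]+\.)*kogama\.com(?:\.br)?', re.IGNORECASE)


def is_whitelisted(text):
    """Return True if every domain-like token in text is a kogama domain."""
    # fullmatch() (Python 3.4+) requires the whole token to match,
    # so nothing may follow "kogama.com" or "kogama.com.br".
    return all(ALLOWED_RE.fullmatch(m.group(0)) for m in DOMAIN_RE.finditer(text))
```

With this sketch, `is_whitelisted("play at www.kogama.com today")` is true, while `is_whitelisted("kogama.com.evil.org")` and `is_whitelisted("evilkogama.com")` are false. Text with no domain-like tokens passes.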
@alexandrebini

@simonz05 take a look: http://rubular.com/r/LFWxz7kpgd

I just changed the "dot" part:

r'^(http[s]?://)?([a-z0-9]{2,8}.)?kogama.com(.br)?'

@kogama

kogama commented Feb 6, 2014

The modified regex only tests whether a link is on the whitelist. It is the first regex that decides whether something is a link at all, so that is the one that needs to handle every URL case. This is the one: r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'

@alexandrebini

Got it.

The one you sent does not match www.domain.com without http(s):// in front: http://rubular.com/r/FqYasTLz5l

This one does: http://rubular.com/r/FAsg1mTTnB

r'(http[s]?://|www)(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
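
For anyone who wants to check this locally instead of on Rubular, here is a small sketch comparing the two link-detection patterns quoted in this thread. The variable names `with_scheme` and `scheme_or_www` are made up for the example.

```python
import re

# Pattern that requires an explicit http:// or https:// scheme.
with_scheme = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*(),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')
# Variant that also accepts links starting with "www".
scheme_or_www = re.compile(
    r'(http[s]?://|www)(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+')

bare = "www.domain.com"
print(with_scheme.search(bare))             # None: there is no scheme to anchor on
print(scheme_or_www.search(bare).group(0))  # www.domain.com
# Neither pattern sees a domain written without a scheme or "www".
print(scheme_or_www.search("kogama.com"))   # None
```

The last line shows the remaining gap: a plain kogama.com or domain.com is not detected as a link by either pattern.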

@simonklee

Author

I think I'll simplify it to just look for domain names. It should solve the issue.

[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}

http://rubular.com/r/GOuWVw98Qi
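
A quick sketch of what this pattern pulls out of a piece of text, assuming Python's `re` module. The sample text and variable names are made up. Note that the pattern contains one capturing group, so `findall()` returns only that group instead of the whole match, which is probably why the gist wraps the whole pattern in an extra pair of parentheses. `finditer()` with `group(0)` avoids the problem.

```python
import re

# The simplified domain pattern from the comment above, unchanged.
domain_re = re.compile(r'[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}', re.IGNORECASE)

text = "see http://www.kogama.com/x and evil.org"

# group(0) is always the full match, regardless of capturing groups.
domains = [m.group(0) for m in domain_re.finditer(text)]
print(domains)  # ['www.kogama.com', 'evil.org']

# With a single capturing group, findall() yields that group's text
# for each match, not the full domain.
groups_only = domain_re.findall(text)
print(groups_only)
```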

@alexandrebini

Won't we have problems with usernames, though? http://rubular.com/r/nFfQpESoPP

@simonklee

Author

Yes and no. You won't be able to write a comment with the username "simon.kogama" in it. However, that is not something we can solve, since domain names and usernames follow some of the same rules.

However, we do not run these validations on usernames directly. On creation, usernames are checked against a much stricter regex that limits the possibilities. You can create a username like xxx.com, but that is about it.
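
To make the "simon.kogama" case concrete, here is a small sketch. It uses the simplified domain pattern from this thread and the whitelist pattern with its literal dots escaped, which is an assumption on my part. The sample comment text is made up.

```python
import re

# Simplified domain pattern from this thread.
domain_re = re.compile(r'[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,6}', re.IGNORECASE)
# Whitelist pattern; escaping the literal dots is an assumption made here.
kogama_re = re.compile(r'^(http[s]?://)?([a-z0-9.-]{2,8})?kogama\.com(\.br)?', re.IGNORECASE)

comment = "thanks simon.kogama!"
hit = domain_re.search(comment)

# "kogama" is six letters, so it passes as a top-level domain.
print(hit.group(0))                     # simon.kogama
# It is not a kogama.com address, so the comment would be rejected.
print(kogama_re.match(hit.group(0)))    # None
```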