Words Scraper - Selenium Based Web Scraper To Generate Passwords List

Selenium based web scraper to generate passwords list.

# Download Firefox webdriver from https://github.com/mozilla/geckodriver/releases
$ tar xzf geckodriver-v{VERSION-HERE}.tar.gz
$ sudo mv geckodriver /usr/local/bin # Make sure it is in your PATH
$ geckodriver --version # Make sure webdriver is properly installed
$ git clone https://github.com/dariusztytko/words-scraper
$ sudo pip3 install -r words-scraper/requirements.txt

Use cases

Scraping words from the target's pages
$ python3 words-scraper.py -o words.txt https://www.example.com https://blog.example.com
Such generated words list can be used to perform online brute-force attack or for cracking password hashes:
$ hashcat -m 0 hashes.txt words.txt
Use --depth option to scrape words from the linked pages as well. Optional --show-gui switch may be used to track the progress and make a quick view of the page:
$ python3 words-scraper.py -o words.txt --depth 1 --show-gui https://www.example.com
Generated words list can be expanded by using words-converter.py script. This script removes special chars and accents. An example Polish word źdźbło! will be transformed into the following words:
  • źdźbło!
  • zdzblo!
  • źdźbło
  • zdzblo
$ cat words.txt | python3 words-converter.py | sort -u > words2.txt

Scraping words from the target's Twitter
Twitter page is dynamically loaded while scrolling. Use --max-scrolls option to scrape words:
$ python3 words-scraper.py -o words.txt --max-scrolls 300 --show-gui https://twitter.com/example.com

Scraping via Socks proxy
$ ssh -D 1080 -Nf {USER-HERE}@{IP-HERE} >/dev/null 2>&
$ python3 words-scraper.py -o words.txt --socks-proxy https://www.example.com

usage: words-scraper.py [-h] [--depth DEPTH] [--max-scrolls MAX_SCROLLS]
[--min-word-length MIN_WORD_LENGTH]
[--page-load-delay PAGE_LOAD_DELAY]
[--page-scroll-delay PAGE_SCROLL_DELAY] [--show-gui]
[--socks-proxy SOCKS_PROXY] -o OUTPUT_FILE
url [url ...]

Words scraper (version: 1.0)

positional arguments:
url URL to scrape

optional arguments:
-h, --help show this help message and exit
--depth DEPTH scraping depth, default: 0
--max-scrolls MAX_SCROLLS
maximum number of the page scrolls, default: 0
--min-word-length MIN_WORD_LENGTH
default: 3
--page-load-delay PAGE_LOAD_DELAY
page loading delay, default: 3.0
--page-scroll-delay PAGE_SCROLL_DELAY
page scrolling delay, default: 1.0
--show-gui show browser GUI
--socks-proxy SOCKS_PROXY
socks proxy e.g.
-o OUTPUT_FILE, --output-file OUTPUT_FILE
save words to file

Known bugs
  • Native browser dialog boxes (e.g. file download) freeze scraper

Please see the CHANGELOG

Via: feedproxy.google.com
Words Scraper - Selenium Based Web Scraper To Generate Passwords List Words Scraper - Selenium Based Web Scraper To Generate Passwords List Reviewed by Anónimo on 8:33 Rating: 5