
wget2 | Cheatsheet

GNU Wget2 is the successor of GNU Wget, a file and recursive website downloader. Designed and written from scratch, it wraps around libwget, which provides the basic functions needed by a web client. Wget2 works multi-threaded and uses many features to allow fast operation.

In many cases Wget2 downloads much faster than Wget 1.x thanks to HTTP/2, HTTP compression, parallel connections and use of the If-Modified-Since HTTP header.
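
A quick way to try these features (a sketch; https://example.com/ is a placeholder and the flag values are arbitrary examples):

```bash
# HTTP/2, compressed transfer, a few parallel threads, and -N to skip files
# that have not changed on the server (If-Modified-Since)
wget2 --http2=on --compression=gzip --max-threads=5 -N https://example.com/
```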


Installation

emerge --ask net-misc/wget2
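
The emerge command above is for Gentoo; on other distributions the package is usually named wget2 as well (hedged examples, availability varies by release):

```bash
# Debian/Ubuntu
sudo apt install wget2
# Fedora
sudo dnf install wget2
```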

Crawl the URLs listed in a file named 5.txt and only save content for responses with HTTP status 200

wget2 --max-threads=250 --spider -i 5.txt --save-content-on=200
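
The file passed to -i simply lists one URL per line, for example (5.txt and the URLs are placeholders):

```bash
cat > 5.txt <<'EOF'
https://example.com/
https://example.org/index.html
EOF
```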

Grab the headers of a website

wget2 https://www.nr1.nu -S
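
To inspect the headers without writing anything to disk, -S can be combined with --spider:

```bash
wget2 --spider -S https://www.nr1.nu
```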

Read an exported bookmarks file and crawl ALL bookmarked sites

wget2 --spider --force-html -i bookmarks_5_1_22.html 
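
If the bookmarks file contains relative links, --base should resolve them against a reference URL (a sketch; the base URL is a placeholder):

```bash
wget2 --spider --force-html --base=https://example.com/ -i bookmarks_5_1_22.html
```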

Download a specific file type

wget2 https://www.nr1.nu \
    --method=GET  \
    --http-user='fbi@info.gov' \
    --http-password='hidden@mail.gov' \
    --referer='https://fbi.gov/secr3t/crawler' \
    --user-agent='(FBI Crawler/v1.0.1|ForRealSeriousCrime|WeCrawlingForObtainingEvidence) AppleIsMalware/v1.0'  \
    --save-headers \
    --auth-no-challenge \
    --header="Accept-Encoding: all" \
    --secure-protocol=auto \
    --http2=on \
    --https-enforce=soft \
    -A '*.html' -r  
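
-A/--accept takes a comma-separated list of suffixes or patterns, so several file types can be collected in one recursive run (a sketch with a placeholder URL and an arbitrary depth):

```bash
wget2 -r -l 2 -A '*.pdf,*.jpg,*.png' https://example.com/docs/
```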

??? note "Mirror any website as a pro - wuseman edition"

    ```bash
    wget2 --method=GET --password=yourFriend --user=yourFriend \
        --http-user=yourFriend --http-password=yourFriend \
        --referer='https://random.gov/secr3t/crawler' \
        --user-agent='(random Crawler/v1.0.1) Hunter' \
        --adjust-extension -o ~/logs/wget2/wget2.log \
        --stats-site=h:~/logs/wget2/stats-site.log \
        --stats-server=h:~/logs/wget2/stats-server.log \
        --stats-tls=h:~/logs/wget2/stats-tls.log \
        --stats-ocsp=h:~/logs/wget2/stats-ocsp.log \
        --stats-dns=h:~/logs/wget2/stats-dns.log \
        --progress=bar --backups=1 --force-progress \
        --server-response --quota=0 -e robots=off \
        --inet4-only --tcp-fastopen --chunk-size=10M \
        --local-encoding=utf-8 --remote-encoding=utf-8 \
        --verify-save-failed --header='Accept-Charset: iso-8859-2' \
        --max-redirect=250 --dns-caching --http2-request-window=250 \
        --cut-dirs=100 --unlink --mirror --limit-rate=20k --random-wait \
        https://example.com/ # placeholder: replace with the site to mirror
    ```
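
For a less elaborate mirror, a minimal invocation could look like this (https://example.com/ is a placeholder target):

```bash
wget2 --mirror --adjust-extension --convert-links --page-requisites https://example.com/
```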