Uncovering Cloudflare

Common Techniques to Uncover Cloudflare

Use a service that provides historical DNS records for the domain. The web page may still be running on an IP address that was exposed before the site moved behind Cloudflare.
The same can be achieved by checking historical SSL certificates, which may point to the origin IP address.
Check also DNS records of other subdomains pointing directly to IPs, as it's possible that other subdomains are pointing to the same server (maybe to offer FTP, mail or any other service).
Review all DNS-only A/AAAA/CNAME siblings and boring records such as direct, origin, api, admin, cpanel, autodiscover, mail, mx, staging, or forgotten IPv6-only hostnames. A single unproxied record is enough to leak the box.
Audit DNS-only TXT/SPF/MX data and the mail infrastructure. If web and mail share the same machine, bounces for invalid users, SMTP banners, or Received headers can reveal the real IP address.
If you find an SSRF inside the web application, you can abuse it to obtain the IP address of the server.
Search for a unique string from the web page in search engines such as Shodan (and possibly Google and similar). You may find an IP address serving that content.
Similarly, instead of looking for a unique string, you can search for the favicon with https://github.com/karma9874/CloudFlare-IP or https://github.com/pielco11/fav-up
This won't work very often, because the server must send the same response when it's accessed by IP address, but you never know.
When you collect candidate IPs, validate them before trusting the first hit:
Send the right Host header and SNI (curl --resolve, openssl s_client -servername ...).
Compare status code, title, favicon hash, body hash, headers and error pages.
A direct-origin response usually won't expose Cloudflare-specific behavior such as cf-ray, cf-cache-status, or /cdn-cgi/trace.
Compare the TLS certificate SANs, web server banner, and optional JA3/JARM style fingerprints to eliminate shared-hosting false positives.
If the target uses Cloudflare Tunnel or is originless (for example, Workers/Pages acting as the origin), origin-IP hunting may be a dead end and you should pivot to alternate hostnames or application-layer bugs.
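Many of the sources above (historical DNS dumps, bounce messages, Received headers) return a mix of real candidates and Cloudflare edge addresses. The following is a minimal POSIX shell sketch for pulling IPv4-looking tokens out of arbitrary text and discarding those inside Cloudflare's published ranges. The function names and the `cf-ranges.txt` file name are illustrative assumptions, and the helper handles IPv4 only.

```shell
#!/bin/sh
# Sketch: extract IPv4 candidates from text and drop Cloudflare edge addresses.
# Function names are illustrative. Octets with leading zeros are not handled.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Return 0 if the IP in $1 falls inside the CIDR in $2 (e.g. 104.16.0.0/13).
ip_in_cidr() (
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    [ $(( ip & mask )) -eq $(( net & mask )) ]
)

# Read text on stdin and print the unique IPv4-looking tokens it contains.
extract_ipv4() {
    grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}' | sort -u
}

# Read one IP per line on stdin and print those outside every CIDR in file $1.
filter_non_cloudflare() {
    while read -r ip || [ -n "$ip" ]; do
        hit=0
        while read -r cidr || [ -n "$cidr" ]; do
            [ -n "$cidr" ] || continue
            if ip_in_cidr "$ip" "$cidr"; then hit=1; break; fi
        done < "$1"
        if [ "$hit" -eq 0 ]; then echo "$ip"; fi
    done
}

# Example usage (not executed here):
#   curl -s https://www.cloudflare.com/ips-v4 -o cf-ranges.txt
#   extract_ipv4 < bounce.eml | filter_non_cloudflare cf-ranges.txt
```

Anything that survives the filter is only a candidate and still needs the Host/SNI validation described in the list above.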

For more recon pivots around favicon hashes, CT logs, passive DNS and related-domain discovery:

../../generic-methodologies-and-resources/external-recon-methodology/README.md

Tools to uncover Cloudflare

Search for the domain inside http://www.crimeflare.org:82/cfs.html or https://crimeflare.herokuapp.com. Or use the tool CloudPeler (which uses that API)
Search for the domain in https://leaked.site/index.php?resolver/cloudflare.0/
CF-Hero is a comprehensive reconnaissance tool that combines current DNS, historical DNS, Shodan, Censys, ZoomEye and SecurityTrails, and can validate candidates with response matching to reduce false positives.
CloudFlair searches Censys certificates containing the target name, extracts candidate IPv4 hosts and compares their responses. Note that, as of late 2024, free Censys accounts no longer expose the API access CloudFlair expected.
CloakQuest3r: a Python tool built to uncover the true IP address of web servers hidden behind Cloudflare and similar protection services.
Censys
Shodan
Bypass-firewalls-by-DNS-history
If you have a set of potential IPs where the web page is located you could use https://github.com/hakluke/hakoriginfinder

# Install and run CF-Hero with extra data sources enabled
# API keys are configured in ~/.config/cf-hero.yaml

go install -v github.com/musana/cf-hero/cmd/cf-hero@latest
cf-hero -f domains.txt -shodan -censys -securitytrails -title "Target title"

Fast validation of candidate origins

TARGET=target.com
IP=1.2.3.4

# Send the real hostname in SNI + Host and inspect headers/body
curl -sk --resolve ${TARGET}:443:${IP} https://${TARGET}/ -D - -o /tmp/${TARGET}.body

# Compare the certificate directly exposed by the candidate IP
openssl s_client -connect ${IP}:443 -servername ${TARGET} </dev/null \
  | openssl x509 -noout -subject -issuer -ext subjectAltName
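To decide quickly whether a candidate answered as the origin or as another Cloudflare edge, the header dump produced by the curl command above can be scanned for Cloudflare markers. This is a small sketch; the function name `cloudflare_markers` is hypothetical and the marker list (cf-ray, cf-cache-status, server: cloudflare) is a heuristic, not an exhaustive signature.

```shell
#!/bin/sh
# Sketch: print Cloudflare-specific header lines found in an HTTP header dump
# read from stdin. Exit status is 0 when at least one marker is present.
cloudflare_markers() {
    # Strip CRs so anchored patterns match headers as curl -D prints them.
    tr -d '\r' | grep -iE '^(cf-ray|cf-cache-status|server: *cloudflare)'
}

# Example usage (not executed here):
#   curl -sk --resolve "${TARGET}:443:${IP}" "https://${TARGET}/" -D - -o /dev/null \
#     | cloudflare_markers && echo "still Cloudflare" || echo "no markers: possible origin"
```

The absence of markers is a hint rather than proof, so it should be combined with the certificate and body comparisons.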

# You can check if the tool is working with
prips 1.0.0.0/30 | hakoriginfinder -h one.one.one.one

# If you know the company is using AWS you could use the previous tool to search the
# web page inside the EC2 IPs
DOMAIN=something.com
WIDE_REGION=us
for ir in $(curl -s https://ip-ranges.amazonaws.com/ip-ranges.json | jq -r --arg r "$WIDE_REGION" '.prefixes[] | select(.service=="EC2") | select(.region|test("^" + $r)) | .ip_prefix'); do
    echo "Checking $ir"
    prips "$ir" | hakoriginfinder -h "$DOMAIN"
done

Uncovering Cloudflare from Cloud infrastructure

Note that even if this was done for AWS machines, it could be done for any other cloud provider.

For a better description of this process check:

https://trickest.com/blog/cloudflare-bypass-discover-ip-addresses-aws/?utm_campaign=hacktrics&utm_medium=banner&utm_source=hacktricks

# Find open ports
sudo masscan --max-rate 10000 -p80,443 $(curl -s https://ip-ranges.amazonaws.com/ip-ranges.json | jq -r '.prefixes[] | select(.service=="EC2") | .ip_prefix' | tr '\n' ' ') | grep "open"  > all_open.txt
# Format results
cat all_open.txt | sed 's,.*port \(.*\)/tcp on \(.*\),\2:\1,' | tr -d " " > all_open_formated.txt
# Search actual web pages
httpx -silent -threads 200 -l all_open_formated.txt -random-agent -follow-redirects -json -no-color -o webs.json
# Format web results and remove eternal redirects
cat webs.json | jq -r "select((.failed==false) and (.chain_status_codes | length) < 9) | .url" | sort -u > aws_webs.json

# Search via Host header
httpx -json -no-color -list aws_webs.json -header "Host: cloudflare.malwareworld.com" -threads 250 -random-agent -follow-redirects -o web_checks.json

Bypassing Cloudflare through Cloudflare

Authenticated Origin Pulls

This mechanism relies on client TLS certificates to authenticate connections between Cloudflare’s reverse-proxy servers and the origin server, an arrangement known as mutual TLS (mTLS).

Cloudflare supports global, zone-level, and per-hostname AOP. The important detail for attackers is that global AOP uses a Cloudflare-provided certificate shared across all Cloudflare accounts, so it only proves that the request came from the Cloudflare network, not from the victim's specific zone.

Caution

Therefore, if the victim trusts the global/shared AOP certificate, an attacker can just place their own domain in Cloudflare, point it to the victim origin IP, and send traffic from Cloudflare to the victim while bypassing the victim's hostname-specific WAF/bot/rate-limit setup.

This is especially interesting when the target only verifies "is this request coming from Cloudflare?" instead of "is this request coming from my Cloudflare configuration?".
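One hedged way to guess which AOP flavour an origin uses is to look at the client-certificate CA names it advertises during the TLS handshake. The sketch below parses `openssl s_client` output from stdin. It assumes that the shared Cloudflare origin-pull CA shows up with the common name `origin-pull.cloudflare.net`; the function name `aop_hint` and its output labels are hypothetical. A server can request a certificate without advertising CA names, so a negative result is not conclusive.

```shell
#!/bin/sh
# Sketch: classify the client-certificate request seen in s_client output.
# Assumption: the shared origin-pull CA carries CN=origin-pull.cloudflare.net.
aop_hint() (
    out=$(cat)
    case $out in
        *"Acceptable client certificate CA names"*)
            case $out in
                *origin-pull.cloudflare.net*)
                    echo "client-cert-requested cloudflare-origin-pull-ca" ;;
                *)
                    echo "client-cert-requested other-ca" ;;
            esac ;;
        *)
            echo "no-ca-names-advertised" ;;
    esac
)

# Example usage (not executed here):
#   openssl s_client -connect "${IP}:443" -servername "${TARGET}" </dev/null 2>/dev/null | aop_hint
```

If the shared CA is advertised, the scenario described in the caution above is worth testing with your own Cloudflare zone.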

Allowlist Cloudflare IP Addresses

This will reject connections that do not originate from Cloudflare's IP address ranges. However, this is vulnerable to the previous setup too: an attacker can proxy traffic through their own Cloudflare tenant and still reach the victim from a valid Cloudflare source IP.

If you have the origin IP, test both situations:

Directly contacting the origin with the victim Host header.
Reaching the origin through your own Cloudflare zone/Worker/custom domain and comparing which protections disappear.
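To see which protections disappear between two paths, it can help to compare the response header names of each. The following sketch assumes two header dumps saved with `curl -D` (the file names in the usage comment are placeholders) and prints header names that appear only in the first one. The helper names are illustrative.

```shell
#!/bin/sh
# Sketch: list header names present in one response dump but not in another.

# Read an HTTP header dump on stdin and print lowercase, unique header names.
header_names() {
    tr -d '\r' \
        | sed -n 's/^\([A-Za-z0-9-][A-Za-z0-9-]*\):.*/\1/p' \
        | tr 'A-Z' 'a-z' \
        | sort -u
}

# Print header names found in file $1 but missing from file $2.
headers_only_in_first() (
    a=$(mktemp)
    b=$(mktemp)
    header_names < "$1" > "$a"
    header_names < "$2" > "$b"
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
)

# Example usage (not executed here):
#   curl -sk https://victim.example/ -D victim.hdr -o /dev/null
#   curl -sk https://my-own-zone.example/ -D mine.hdr -o /dev/null
#   headers_only_in_first victim.hdr mine.hdr
```

Headers that vanish on the second path (challenge, cache, or rate-limit related ones, for example) suggest rules that were bound to the victim's hostname only. Status codes and bodies deserve the same comparison.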

Cloudflare-managed alternate hostnames

Sometimes you will not discover the origin IP, but you can still reach the application through Cloudflare-owned hostnames that the target forgot to disable:

Workers: <worker>.<account>.workers.dev
Workers preview URLs: <preview>-<worker>.<account>.workers.dev
Pages: <project>.pages.dev

This does not reveal the origin IP, but it can bypass hostname-specific WAF, Access, caching, or rate-limit rules that were only tuned for the vanity domain. If you see any of these names leaked in JavaScript, source maps, CSP headers, CI logs, screenshots or public repos, hit them directly and compare the behavior with the main domain.
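As a rough sketch of the leak hunting described above, the helper below greps arbitrary text (JavaScript bundles, source maps, CI logs) for hostnames under `workers.dev` and `pages.dev`. The function name `extract_cf_hostnames` and the file path in the usage comment are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: read text on stdin and print unique *.workers.dev / *.pages.dev
# hostnames in lowercase.
extract_cf_hostnames() {
    grep -oiE '[a-z0-9][a-z0-9.-]*\.(workers|pages)\.dev' \
        | tr 'A-Z' 'a-z' \
        | sort -u
}

# Example usage (not executed here):
#   cat dist/*.js dist/*.map | extract_cf_hostnames
```

Each hostname found this way can then be requested directly and compared with the main domain.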

Bypass Cloudflare for scraping

Cache

Sometimes you only want to bypass Cloudflare to scrape the web page. There are some options for this:

Use Google cache: https://webcache.googleusercontent.com/search?q=cache:https://www.petsathome.com/shop/en/pets/dog (Google retired its public cache during 2024, so this may no longer work)
Use other cache services such as https://archive.org/web/
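For the archive.org option, a minimal sketch is shown below. It builds a Wayback Machine snapshot URL from a timestamp and the original URL, and queries the Wayback availability endpoint for the closest capture. The function names are hypothetical, and the JSON returned by the lookup should be inspected by hand since its handling is not shown here.

```shell
#!/bin/sh
# Sketch: helpers for fetching archived copies instead of the live site.

# Build a snapshot URL. $1 = timestamp (YYYYMMDDhhmmss or a prefix such as
# a year), $2 = original URL.
wayback_snapshot_url() {
    printf 'https://web.archive.org/web/%s/%s\n' "$1" "$2"
}

# Ask the availability endpoint for the closest capture of the URL in $1.
wayback_lookup() {
    curl -sG 'https://archive.org/wayback/available' --data-urlencode "url=$1"
}

# Example usage (not executed here):
#   wayback_lookup 'https://www.example.com/some/page'
#   curl -sL "$(wayback_snapshot_url 2023 'https://www.example.com/some/page')"
```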

Tools

Some tools like the following ones can bypass (or were able to bypass) Cloudflare's protection against scraping:

https://github.com/sarperavci/CloudflareBypassForScraping

Cloudflare Solvers

There have been a number of Cloudflare solvers developed:

Fortified Headless Browsers

Use a headless browser that isn't detected as an automated browser (you might need to customize it for that). Some options are:

Puppeteer: The stealth plugin for puppeteer.
Playwright: The stealth plugin is coming to Playwright soon. Follow developments here and here.
Selenium: SeleniumBase is a modern browser automation framework featuring built-in stealth capabilities. It offers two modes: UC Mode, an optimized Selenium ChromeDriver patch based on undetected-chromedriver, and CDP Mode, which can bypass bot detection, solve CAPTCHAs, and leverage advanced methods from the Chrome DevTools Protocol.

Smart Proxy With Cloudflare Built-In Bypass

Smart proxies are continuously updated by specialized companies that aim to outmaneuver Cloudflare's security measures (that's their business).

Some of them are:

ScraperAPI
Scrapingbee
Oxylabs
Smartproxy

These providers are noted for their proprietary Cloudflare bypass mechanisms.

For those seeking an optimized solution, the ScrapeOps Proxy Aggregator stands out. This service integrates over 20 proxy providers into a single API, automatically selecting the best and most cost-effective proxy for your target domains, thus offering a superior option for navigating Cloudflare's defenses.

Reverse Engineer Cloudflare Anti-Bot Protection

Reverse engineering Cloudflare's anti-bot measures is a tactic used by smart proxy providers, suitable for extensive web scraping without the high cost of running many headless browsers.

Advantages: This method allows for the creation of an extremely efficient bypass that specifically targets Cloudflare's checks, ideal for large-scale operations.

Disadvantages: The downside is the complexity involved in understanding and deceiving Cloudflare's deliberately obscure anti-bot system, requiring ongoing effort to test different strategies and update the bypass as Cloudflare enhances its protections.

Find more info about how to do this in the original article.