How to Build a Custom Wordlist with CeWL & Crunch
Generic wordlists like rockyou.txt are useful, but custom wordlists tailored to your target are far more effective. This guide covers building targeted wordlists with CeWL and Crunch, then extending them with mutation rules and OSINT and trimming them before use.
Why Custom Wordlists?
The Problem with Generic Lists
- rockyou.txt: 14 million passwords, but generic
- Most are irrelevant to your target
- Miss organization-specific passwords
- Time wasted testing unlikely passwords
Custom Wordlist Advantages
- Target-specific terms
- Company name variations
- Industry terminology
- Local references
- Higher success rate
CeWL: Custom Word List Generator
CeWL spiders a website and builds a wordlist from the content it finds.
Basic Usage
cewl https://target.com -w output.txt
Common Options
# -d 2            Spider depth
# -m 5            Minimum word length
# -w FILE         Output file
# --with-numbers  Include words with numbers
# -e              Include email addresses
# --email_file    Email output file
cewl https://target.com -d 2 -m 5 -w wordlist.txt --with-numbers -e --email_file emails.txt
Advanced Usage
Spider deeper:
cewl https://target.com -d 4 -m 6 -w deep_wordlist.txt
Follow external links:
cewl https://target.com -d 2 -o -w wordlist.txt
Include metadata from files:
cewl https://target.com -d 2 -m 5 -a -w wordlist.txt
Practical Example
Targeting a company website:
# Get words from main site
cewl https://acmecorp.com -d 3 -m 5 -w acme_words.txt
# Get words from about page
cewl https://acmecorp.com/about -d 1 -m 4 >> acme_words.txt
# Get words from blog
cewl https://acmecorp.com/blog -d 2 -m 5 >> acme_words.txt
# Deduplicate
sort -u acme_words.txt -o acme_words.txt
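When several CeWL runs feed one file, it can help to normalize the lines before deduplicating. The sketch below assumes the inputs may contain blank lines or Windows line endings; `merge_wordlists` is a hypothetical helper name, not part of CeWL.

```shell
#!/usr/bin/env bash
# merge_wordlists FILE... (hypothetical helper): concatenate wordlist files,
# strip carriage returns, drop empty lines, and print unique sorted words.
merge_wordlists() {
  cat -- "$@" | tr -d '\r' | awk 'NF' | LC_ALL=C sort -u
}
```

For example, `merge_wordlists acme_main.txt acme_blog.txt > acme_words.txt` (file names are illustrative) writes the merged list to a new file instead of sorting in place.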
Crunch: Pattern-Based Generator
Crunch generates wordlists based on specified patterns and character sets.
Basic Syntax
crunch <min-len> <max-len> [charset] [options]
Character Sets
# Lowercase letters
crunch 4 6 abcdefghijklmnopqrstuvwxyz
# Numbers only
crunch 4 4 0123456789
# Mixed
crunch 6 8 abcdef123
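Before running a charset-based job, it is worth estimating how many lines it will produce: a charset of n characters yields n^len strings for each length. The function below is a rough sketch of that arithmetic; `charset_count` is a hypothetical name, and it assumes the charset contains no repeated characters.

```shell
#!/usr/bin/env bash
# charset_count MIN MAX CHARSET (hypothetical helper): number of strings of
# length MIN..MAX that can be built from CHARSET, i.e. the sum of n^len.
charset_count() {
  local min=$1 max=$2 n=${#3} total=0 len
  for ((len = min; len <= max; len++)); do
    total=$((total + n ** len))
  done
  echo "$total"
}
```

`charset_count 4 6 abcdefghijklmnopqrstuvwxyz` gives 321254128 candidates, which shows how quickly even short lowercase-only lists grow.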
Using Patterns
The -t option allows patterns:
- @: Lowercase letters
- ,: Uppercase letters
- %: Numbers
- ^: Special characters
Examples:
# Company name + 4 digits
crunch 8 8 -t acme%%%%
# Capitalize first letter + numbers
crunch 8 8 -t ,@@@%%%%
# Name + year pattern
crunch 8 8 -t @@@@2024
# Common password patterns
crunch 9 9 -t ,@@@@@%%%
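The size of a pattern-based list is the product of the set sizes at each placeholder. The sketch below estimates that product, assuming Crunch's default sets of 26 lowercase letters, 26 uppercase letters, 10 digits, and 33 symbols; `pattern_count` is a hypothetical helper, and the symbol count should be checked against your installed version.

```shell
#!/usr/bin/env bash
# pattern_count PATTERN (hypothetical helper): candidates a -t pattern would
# produce. Assumed set sizes: @ = 26, "," = 26, % = 10, ^ = 33.
# Any other character is treated as a literal and contributes a factor of 1.
pattern_count() {
  local pattern=$1 total=1 i ch
  for ((i = 0; i < ${#pattern}; i++)); do
    ch=${pattern:i:1}
    case "$ch" in
      '@' | ',') total=$((total * 26)) ;;
      '%') total=$((total * 10)) ;;
      '^') total=$((total * 33)) ;;
    esac
  done
  echo "$total"
}
```

A company name plus four digits is only 10,000 candidates, while one uppercase letter, three lowercase letters, and four digits is already about 4.5 billion.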
Output Options
# Write to file
crunch 6 6 -t @@@@%% -o passwords.txt
# Compress output
crunch 6 8 abc123 -o START -z gzip
# Split into chunks
crunch 6 6 abc123 -c 1000 -o START
Practical Example
Generate company-specific patterns:
# Company name variations (min and max length must equal the pattern length)
crunch 8 8 -t acme%%%% -o acme_patterns.txt
crunch 8 8 -t Acme%%%% >> acme_patterns.txt
crunch 8 8 -t ACME%%%% >> acme_patterns.txt
# With seasons: four letters + two digits, which includes acmefall23 among many non-words
crunch 10 10 -t acme@@@@%% -o acme_seasons.txt
# Longer season names such as acmespring24 still need to be added manually
# Common suffixes
crunch 7 7 -t 'acme%%!' -o acme_suffix.txt
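Season suffixes such as acmefall23 can also be generated directly rather than added by hand. This is a minimal sketch; `season_words` is a hypothetical helper, and the season list and two-digit year format are assumptions to adjust for your target.

```shell
#!/usr/bin/env bash
# season_words BASE YEAR... (hypothetical helper): print BASE + season +
# two-digit year for each year given, e.g. acmefall23.
season_words() {
  local base=$1 season year
  shift
  for year in "$@"; do
    for season in spring summer fall autumn winter; do
      printf '%s%s%02d\n' "$base" "$season" "$((year % 100))"
    done
  done
}
```

Running `season_words acme 2023 2024` yields ten entries, far fewer than a brute-forced letter pattern and all of them plausible.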
Combining Tools
Workflow: Company-Targeted Wordlist
Step 1: Gather base words with CeWL
cewl https://target.com -d 3 -m 4 -w base_words.txt
Step 2: Add common terms
echo "password" >> base_words.txt
echo "welcome" >> base_words.txt
echo "login" >> base_words.txt
echo "admin" >> base_words.txt
Step 3: Generate variations
# Add company name
echo "acmecorp" >> base_words.txt
echo "AcmeCorp" >> base_words.txt
echo "ACMECORP" >> base_words.txt
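The case variations in Step 3 can be produced for every base word instead of being typed by hand. The following is a sketch using awk; `case_variants` is a hypothetical helper, and it only covers lowercase, uppercase, and first-letter capitalization.

```shell
#!/usr/bin/env bash
# case_variants (hypothetical helper): read words on stdin and print each one
# as-is plus lowercase, UPPERCASE, and Capitalized forms, skipping repeats.
case_variants() {
  awk '{
    lower = tolower($0)
    upper = toupper($0)
    cap = toupper(substr(lower, 1, 1)) substr(lower, 2)
    n = split($0 "\n" lower "\n" upper "\n" cap, v, "\n")
    for (i = 1; i <= n; i++) if (!seen[v[i]]++) print v[i]
  }'
}
```

For example, `case_variants < base_words.txt > base_cased.txt` (output name is illustrative) keeps mixed-case originals like AcmeCorp alongside the generated forms.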
Step 4: Apply mutations
Password Mutations
Using John the Ripper Rules
John has powerful mangling rules:
john --wordlist=base_words.txt --rules --stdout > mutated.txt
Custom Rules
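Create targeted rules in a named section of john.conf (the section name Custom here is an arbitrary example) and select it with --rules=Custom:
[List.Rules:Custom]
# Append years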
$2$0$2$3
$2$0$2$4
$2$0$2$5
# Append common suffixes
$!
$1
$1$!
$1$2$3
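Year rules like the ones above follow a simple shape: each `$c` appends the character c, so a year becomes four append operations. The sketch below generates them for a range of years; `year_rules` is a hypothetical helper, and the range is an assumption to adapt.

```shell
#!/usr/bin/env bash
# year_rules FIRST LAST (hypothetical helper): print one append rule per year,
# e.g. 2023 -> $2$0$2$3, by prefixing every digit with a dollar sign.
year_rules() {
  local year
  for ((year = $1; year <= $2; year++)); do
    printf '%s\n' "$year" | sed 's/./$&/g'
  done
}
```

The output of `year_rules 2020 2025` can be pasted under a rules section in john.conf. Hashcat uses the same `$` append syntax, so the lines should also work in a Hashcat rule file.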
Using Hashcat Rules
hashcat --stdout -r /usr/share/hashcat/rules/best64.rule base_words.txt > mutated.txt
Manual Mutations with Bash
# Capitalize first letter (requires GNU sed for \u)
sed 's/^./\u&/' base_words.txt >> mutated.txt
# Add numbers (read line by line to avoid word splitting and globbing)
while IFS= read -r word; do
  printf '%s\n' "${word}1"
  printf '%s\n' "${word}123"
  printf '%s\n' "${word}2024"
done < base_words.txt >> mutated.txt
# Add special characters
while IFS= read -r word; do
  printf '%s\n' "${word}!"
  printf '%s\n' "${word}@"
  printf '%s\n' "${word}#"
done < base_words.txt >> mutated.txt
OSINT-Based Wordlists
Employee Names
Gather from:
- Company website
- Social media
- Press releases
# Create name variations
# John Smith becomes:
echo "jsmith" >> names.txt
echo "johnsmith" >> names.txt
echo "john.smith" >> names.txt
echo "smithj" >> names.txt
echo "johns" >> names.txt
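The same name formats can be generated for any employee. This is a minimal sketch covering only the five formats listed above; `name_variants` is a hypothetical helper and assumes a single first name and a single last name.

```shell
#!/usr/bin/env bash
# name_variants FIRST LAST (hypothetical helper): print common username
# formats in lowercase: flast, firstlast, first.last, lastf, firstl.
name_variants() {
  local first last f l
  first=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  last=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  f=${first:0:1}
  l=${last:0:1}
  printf '%s\n' \
    "${f}${last}" \
    "${first}${last}" \
    "${first}.${last}" \
    "${last}${f}" \
    "${first}${l}"
}
```

Calling `name_variants John Smith >> names.txt` appends jsmith, johnsmith, john.smith, smithj, and johns.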
Tools for Names
Username Anarchy:
./username-anarchy "John Smith" >> usernames.txt
Location-Based Terms
Add local references:
- City names
- Sports teams
- Local landmarks
- Street names
Industry Terms
Research industry-specific vocabulary:
- Healthcare: patient, hipaa, medical
- Finance: banking, trading, invest
- Tech: admin, root, server, cloud
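Context terms like these are most useful when joined to company words. The sketch below pairs every line of one list with every line of another; `combine_lists` is a hypothetical helper, and simple concatenation with no separator is an assumption.

```shell
#!/usr/bin/env bash
# combine_lists A B (hypothetical helper): for each term in file B, print
# every word from file A immediately followed by that term.
combine_lists() {
  awk 'FILENAME == ARGV[1] { a[++n] = $0; next }
       { for (i = 1; i <= n; i++) print a[i] $0 }' "$1" "$2"
}
```

For example, `combine_lists acme_base.txt industry_terms.txt` (file names are illustrative) turns acme and hipaa into acmehipaa. Hashcat's combinator attack mode (`-a 1`) does the same pairing at crack time, so a generated file is only needed for tools without that mode.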
Content Discovery Wordlists
Building API Endpoint Lists
# Common patterns
echo "api" >> endpoints.txt
echo "v1" >> endpoints.txt
echo "v2" >> endpoints.txt
echo "users" >> endpoints.txt
echo "admin" >> endpoints.txt
# Generate API paths
while IFS= read -r endpoint; do
  echo "/api/${endpoint}"
  echo "/api/v1/${endpoint}"
  echo "/api/v2/${endpoint}"
done < endpoints.txt > api_paths.txt
Technology-Specific Lists
For WordPress:
cat >> wp_specific.txt << EOF
wp-admin
wp-content
wp-includes
wp-login.php
xmlrpc.php
wp-config.php.bak
EOF
For .NET:
cat >> dotnet_specific.txt << EOF
web.config
web.config.bak
elmah.axd
trace.axd
aspnet_client
EOF
Wordlist Optimization
Removing Duplicates
sort -u wordlist.txt -o wordlist.txt
Filtering by Length
# Keep only 6-12 character words
awk 'length >= 6 && length <= 12' wordlist.txt > filtered.txt
Removing Unlikely Passwords
# Remove lines containing non-ASCII or non-printable characters
# (LC_ALL=C makes [:print:] match printable ASCII only)
LC_ALL=C grep -v '[^[:print:]]' wordlist.txt > clean.txt
# Remove very long lines
awk 'length <= 20' wordlist.txt > reasonable.txt
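The length and character filters can be applied in a single pass. This is a sketch under the assumption that only printable ASCII candidates are wanted; `clean_wordlist` is a hypothetical helper that reads stdin.

```shell
#!/usr/bin/env bash
# clean_wordlist MIN MAX (hypothetical helper): keep stdin lines whose length
# is within MIN..MAX and that contain only printable ASCII (space to tilde).
clean_wordlist() {
  LC_ALL=C awk -v min="$1" -v max="$2" \
    'length($0) >= min && length($0) <= max && $0 !~ /[^ -~]/'
}
```

For example, `clean_wordlist 6 20 < wordlist.txt > filtered.txt` drops short words, very long lines, and lines with accented or control characters in one step.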
Prioritizing
Put most likely passwords first:
# Common patterns at top
cat likely_passwords.txt wordlist.txt > prioritized.txt
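Concatenating lists this way can leave duplicates, and `sort -u` would destroy the priority order. A common awk idiom removes repeats while keeping the first occurrence; `dedupe_keep_order` is a hypothetical wrapper name for it.

```shell
#!/usr/bin/env bash
# dedupe_keep_order FILE... (hypothetical helper): concatenate files and drop
# repeated lines, keeping the first occurrence so earlier entries stay on top.
dedupe_keep_order() {
  awk '!seen[$0]++' "$@"
}
```

For example, `dedupe_keep_order likely_passwords.txt wordlist.txt > prioritized.txt` keeps the likely passwords first and removes their later repeats from the larger list.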
Practical Scenarios
Scenario 1: Corporate Pentest
# 1. Gather from website
cewl https://acmecorp.com -d 3 -m 5 -w acme_cewl.txt
# 2. Add company variations
echo -e "acme\nAcme\nACME\nacmecorp\nAcmeCorp" > acme_base.txt
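# 3. Generate patterns (min and max length must equal the pattern length)
crunch 8 8 -t acme%%%% >> acme_patterns.txt
crunch 8 8 -t Acme%%%% >> acme_patterns.txt
# 4. Combine and deduplicate (name inputs explicitly so the glob cannot pick up output files)
cat acme_cewl.txt acme_base.txt acme_patterns.txt | sort -u > acme_final.txt
# 5. Apply mutations
hashcat --stdout -r /usr/share/hashcat/rules/best64.rule acme_final.txt > acme_mutated.txt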
Scenario 2: WiFi Assessment
# 1. Gather SSID-related terms
echo "coffeeshop" > wifi_base.txt
echo "CoffeeShop" >> wifi_base.txt
echo "freewifi" >> wifi_base.txt
# 2. Add common patterns
crunch 8 8 -t coffee%% >> wifi_patterns.txt
# 3. Common WiFi passwords
cat common_wifi.txt >> wifi_base.txt
# 4. Final wordlist (name inputs explicitly so wifi_final.txt is not read as its own input)
cat wifi_base.txt wifi_patterns.txt | sort -u > wifi_final.txt
Scenario 3: CTF Challenge
# 1. Analyze challenge context
# (name, theme, hints)
# 2. Build contextual list
echo "flag" > ctf_words.txt
echo "capture" >> ctf_words.txt
echo "secret" >> ctf_words.txt
# 3. Add challenge-specific terms
# (from challenge description)
# 4. Generate variations
Tools Summary
| Tool | Purpose | Best For |
|---|---|---|
| CeWL | Web scraping | Target-specific terms |
| Crunch | Pattern generation | Structured passwords |
| John | Mutations | Rule-based variations |
| Hashcat | Mutations | Large-scale processing |
| Username Anarchy | Names | User enumeration |
Best Practices
- Start small — Begin with targeted lists
- Iterate — Refine based on results
- Document — Track what works
- Legal — Only use with authorization
- Efficient — Remove unlikely entries