How to Build a Custom Wordlist with CeWL & Crunch

Create targeted wordlists for password attacks and content discovery using CeWL, Crunch, and other techniques.

Generic wordlists like rockyou.txt are useful, but a wordlist tailored to your target usually finds valid passwords with far fewer guesses. This guide covers building targeted wordlists with CeWL, Crunch, rule-based mutations, and OSINT.

Why Custom Wordlists?

The Problem with Generic Lists

  • rockyou.txt: 14 million passwords, but generic
  • Most are irrelevant to your target
  • Miss organization-specific passwords
  • Time wasted testing unlikely passwords

Custom Wordlist Advantages

  • Target-specific terms
  • Company name variations
  • Industry terminology
  • Local references
  • Higher success rate

CeWL: Custom Word List Generator

CeWL spiders a website and builds a wordlist from the content it finds.

Basic Usage

cewl https://target.com -w output.txt

Common Options

# -d: spider depth              -m: minimum word length
# -w: output file               --with-numbers: include words containing digits
# -e: collect email addresses   --email_file: file to write the addresses to
cewl https://target.com \
    -d 2 \
    -m 5 \
    -w wordlist.txt \
    --with-numbers \
    -e \
    --email_file emails.txt

Advanced Usage

Spider deeper:

cewl https://target.com -d 4 -m 6 -w deep_wordlist.txt

Follow external links:

cewl https://target.com -d 2 -o -w wordlist.txt

Include metadata from files:

cewl https://target.com -d 2 -m 5 -a -w wordlist.txt

Practical Example

Targeting a company website:

# Get words from main site
cewl https://acmecorp.com -d 3 -m 5 -w acme_words.txt

# Get words from about page
cewl https://acmecorp.com/about -d 1 -m 4 >> acme_words.txt

# Get words from blog
cewl https://acmecorp.com/blog -d 2 -m 5 >> acme_words.txt

# Deduplicate
sort -u acme_words.txt -o acme_words.txt
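
Merged CeWL output often contains the same word in several capitalizations, and sort -u treats those as distinct entries. The following is a minimal sketch of a cleanup pass, assuming the merged acme_words.txt file from the example above. The function name normalize_words and its default minimum length are illustrative and not part of CeWL.

```shell
# Hypothetical helper: strip carriage returns, lowercase, drop short
# words, and deduplicate a wordlist read from stdin.
normalize_words() {
    min_len="${1:-5}"    # assumed default, mirrors the -m 5 used above
    tr -d '\r' \
        | tr '[:upper:]' '[:lower:]' \
        | awk -v min="$min_len" 'length($0) >= min' \
        | LC_ALL=C sort -u
}

# Example: normalize_words 5 < acme_words.txt > acme_words_clean.txt
```

Lowercasing here is a design choice: capitalization can be reintroduced later by mutation rules, so the base list stays small.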

Crunch: Pattern-Based Generator

Crunch generates wordlists based on specified patterns and character sets.

Basic Syntax

crunch <min-len> <max-len> [charset] [options]

Character Sets

# Lowercase letters
crunch 4 6 abcdefghijklmnopqrstuvwxyz

# Numbers only
crunch 4 4 0123456789

# Mixed
crunch 6 8 abcdef123

Using Patterns

The -t option allows patterns:

  • @ — Lowercase letters
  • , — Uppercase letters
  • % — Numbers
  • ^ — Special characters

Examples:

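# Company name + 4 digits (min and max length must equal the pattern length)
crunch 8 8 -t acme%%%%

# One uppercase letter + 3 lowercase letters + 4 digits
crunch 8 8 -t ,@@@%%%%

# Name + year pattern
crunch 8 8 -t @@@@2024

# One uppercase letter + 5 lowercase letters + 3 digits (very large output)
crunch 9 9 -t ,@@@@@%%%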
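
Each placeholder multiplies the size of the output, so it is worth estimating the line count before running a pattern. For instance, one uppercase letter, five lowercase letters, and three digits expand to roughly 309 billion lines. The sketch below assumes Crunch's default character sets (26 lowercase, 26 uppercase, 10 digits, 33 symbols); pattern_size is a hypothetical helper, not a Crunch feature.

```shell
# Hypothetical helper: count the lines a crunch -t pattern expands to.
# Assumes default charsets: @ = 26, "," = 26, % = 10, ^ = 33 symbols.
# Literal characters do not change the count.
pattern_size() {
    rest=$1
    total=1
    while [ -n "$rest" ]; do
        ch=${rest%"${rest#?}"}    # first character of the remaining pattern
        rest=${rest#?}            # drop that character
        case $ch in
            @|,) total=$(( total * 26 )) ;;
            %)   total=$(( total * 10 )) ;;
            ^)   total=$(( total * 33 )) ;;
        esac
    done
    echo "$total"
}

pattern_size 'acme%%%%'    # 10000
pattern_size ',@@@%%%%'    # 4569760000
```

Crunch also prints its own size estimate when it starts, so this is mainly useful for comparing candidate patterns in a script.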

Output Options

# Write to file
crunch 6 6 -t @@@@%% -o passwords.txt

# Compress output
crunch 6 8 abc123 -o START -z gzip

# Split into chunks
crunch 6 6 abc123 -c 1000 -o START

Practical Example

Generate company-specific patterns:

# Company name variations (min and max length must equal the pattern length)
crunch 8 8 -t acme%%%% -o acme_patterns.txt
crunch 8 8 -t Acme%%%% >> acme_patterns.txt
crunch 8 8 -t ACME%%%% >> acme_patterns.txt

# Company name + 3 letters + 2 digits
# This does not produce season names; add those by hand (acmefall23, acmespring24, etc.)
crunch 9 9 -t acme@@@%% -o acme_seasons.txt

# Common suffixes
crunch 7 7 -t acme%%! -o acme_suffix.txt
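
Crunch placeholders cannot express a fixed vocabulary such as season names, so those candidates are easier to build with a small loop. A minimal sketch follows; season_candidates is a hypothetical helper, and the season list and two-digit year format are assumptions for illustration.

```shell
# Hypothetical helper: print <base><season><yy> for every season and year.
# Usage: season_candidates <base> <yy> [<yy> ...]
season_candidates() {
    base=$1
    shift
    for season in spring summer fall autumn winter; do
        for year in "$@"; do
            printf '%s%s%s\n' "$base" "$season" "$year"
        done
    done
}

# Example: season_candidates acme 23 24 >> acme_seasons.txt
```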

Combining Tools

Workflow: Company-Targeted Wordlist

Step 1: Gather base words with CeWL

cewl https://target.com -d 3 -m 4 -w base_words.txt

Step 2: Add common terms

echo "password" >> base_words.txt
echo "welcome" >> base_words.txt
echo "login" >> base_words.txt
echo "admin" >> base_words.txt

Step 3: Generate variations

# Add company name
echo "acmecorp" >> base_words.txt
echo "AcmeCorp" >> base_words.txt
echo "ACMECORP" >> base_words.txt
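
The case variations in Step 3 can also be generated rather than typed. This is a rough sketch using only tr and cut; case_variants is a hypothetical helper. Mixed-case forms such as AcmeCorp cannot be derived from the lowercase word and still need to be added by hand.

```shell
# Hypothetical helper: print lowercase, capitalized, and uppercase
# forms of each argument.
case_variants() {
    for word in "$@"; do
        lower=$(printf '%s' "$word" | tr '[:upper:]' '[:lower:]')
        upper=$(printf '%s' "$word" | tr '[:lower:]' '[:upper:]')
        first=$(printf '%s' "$lower" | cut -c1 | tr '[:lower:]' '[:upper:]')
        printf '%s\n%s%s\n%s\n' "$lower" "$first" "${lower#?}" "$upper"
    done
}

# Example: case_variants acmecorp >> base_words.txt
```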

Step 4: Apply mutations (covered in the Password Mutations section that follows)

Password Mutations

Using John the Ripper Rules

John has powerful mangling rules:

john --wordlist=base_words.txt --rules --stdout > mutated.txt

Custom Rules

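Create targeted rules in john.conf. Rules live in a named section; the name Custom below is only an example, selected at run time with --rules=Custom:

[List.Rules:Custom]
# Append years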
$2$0$2$3
$2$0$2$4
$2$0$2$5

# Append common suffixes
$!
$1
$1$!
$1$2$3
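
Year rules follow a fixed shape: one $X append command per digit. A short sketch that generates them for a range of years is shown here; year_rules is a hypothetical helper, and its output is meant to be pasted into a rule section or rule file.

```shell
# Hypothetical helper: emit one append rule per year in a range.
# sed prefixes every digit with "$", the append-character command.
year_rules() {
    year=$1
    end=$2
    while [ "$year" -le "$end" ]; do
        printf '%s\n' "$year" | sed 's/./$&/g'
        year=$(( year + 1 ))
    done
}

year_rules 2023 2025
```

The three lines printed by the call above match the year rules listed earlier in this section.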

Using Hashcat Rules

hashcat --stdout -r /usr/share/hashcat/rules/best64.rule base_words.txt > mutated.txt

Manual Mutations with Bash

# Capitalize first letter (\u requires GNU sed)
sed 's/\b\(.\)/\u\1/' base_words.txt >> mutated.txt

# Add numbers (read line by line so words are not split or glob-expanded)
while IFS= read -r word; do
    echo "${word}1"
    echo "${word}123"
    echo "${word}2024"
done < base_words.txt >> mutated.txt

# Add special characters
while IFS= read -r word; do
    echo "${word}!"
    echo "${word}@"
    echo "${word}#"
done < base_words.txt >> mutated.txt

OSINT-Based Wordlists

Employee Names

Gather from:

  • LinkedIn
  • Company website
  • Social media
  • Press releases

# Create name variations
# John Smith becomes:
echo "jsmith" >> names.txt
echo "johnsmith" >> names.txt
echo "john.smith" >> names.txt
echo "smithj" >> names.txt
echo "Johns" >> names.txt
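
With more than a handful of employees, typing each variation becomes tedious. A minimal sketch that produces the same formats from a first and last name follows; name_variants is a hypothetical helper, and all output is lowercased.

```shell
# Hypothetical helper: print common username formats for one person.
# Usage: name_variants <first> <last>
name_variants() {
    first=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    last=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
    f_init=$(printf '%s' "$first" | cut -c1)    # first initial
    l_init=$(printf '%s' "$last" | cut -c1)     # last initial
    printf '%s\n' \
        "${f_init}${last}" \
        "${first}${last}" \
        "${first}.${last}" \
        "${last}${f_init}" \
        "${first}${l_init}"
}

name_variants John Smith
```

For "John Smith" this prints jsmith, johnsmith, john.smith, smithj, and johns, one per line.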

Tools for Names

Username Anarchy:

./username-anarchy "John Smith" >> usernames.txt

Location-Based Terms

Add local references:

  • City names
  • Sports teams
  • Local landmarks
  • Street names

Industry Terms

Research industry-specific vocabulary:

  • Healthcare: patient, hipaa, medical
  • Finance: banking, trading, invest
  • Tech: admin, root, server, cloud

Content Discovery Wordlists

Building API Endpoint Lists

# Common patterns
echo "api" >> endpoints.txt
echo "v1" >> endpoints.txt
echo "v2" >> endpoints.txt
echo "users" >> endpoints.txt
echo "admin" >> endpoints.txt

# Generate API paths
while IFS= read -r endpoint; do
    echo "/api/${endpoint}"
    echo "/api/v1/${endpoint}"
    echo "/api/v2/${endpoint}"
done < endpoints.txt > api_paths.txt

Technology-Specific Lists

For WordPress:

cat >> wp_specific.txt << EOF
wp-admin
wp-content
wp-includes
wp-login.php
xmlrpc.php
wp-config.php.bak
EOF

For .NET:

cat >> dotnet_specific.txt << EOF
web.config
web.config.bak
elmah.axd
trace.axd
aspnet_client
EOF
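
Both lists above include backup copies of configuration files (wp-config.php.bak, web.config.bak). The same idea can be applied to any list of filenames. The sketch below is one way to do it; backup_variants is a hypothetical helper, and the suffix list is an assumption to adjust per target.

```shell
# Hypothetical helper: for each filename on stdin, print the name plus
# a few backup-style variants. The suffixes are illustrative only.
backup_variants() {
    while IFS= read -r name; do
        [ -n "$name" ] || continue    # skip empty lines
        printf '%s\n' "$name" "${name}.bak" "${name}.old" "${name}~"
    done
}

# Example: printf 'web.config\nwp-config.php\n' | backup_variants >> backups.txt
```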

Wordlist Optimization

Removing Duplicates

sort -u wordlist.txt -o wordlist.txt

Filtering by Length

# Keep only 6-12 character words
awk 'length >= 6 && length <= 12' wordlist.txt > filtered.txt

Removing Unlikely Passwords

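# Remove lines containing non-ASCII or non-printable bytes
# (LC_ALL=C makes [:print:] mean printable ASCII regardless of locale)
LC_ALL=C grep -v '[^[:print:]]' wordlist.txt > clean.txt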

# Remove very long lines
awk 'length <= 20' wordlist.txt > reasonable.txt

Prioritizing

Put most likely passwords first:

# Common patterns at top
cat likely_passwords.txt wordlist.txt > prioritized.txt
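
Concatenating two lists leaves duplicates wherever a likely password also appears in the main list, and sort -u would destroy the priority order. A common awk idiom keeps the first occurrence of each line and preserves order. The wrapper name dedupe_keep_order is hypothetical; the file names follow the example above.

```shell
# Keep only the first occurrence of each line, preserving input order.
# Reads the files given as arguments, or stdin when none are given.
dedupe_keep_order() {
    awk '!seen[$0]++' "$@"
}

# Example: dedupe_keep_order likely_passwords.txt wordlist.txt > prioritized.txt
```

Note that awk holds every distinct line in memory, so this suits targeted lists better than multi-gigabyte ones.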

Practical Scenarios

Scenario 1: Corporate Pentest

# 1. Gather from website
cewl https://acmecorp.com -d 3 -m 5 -w acme_cewl.txt

# 2. Add company variations
echo -e "acme\nAcme\nACME\nacmecorp\nAcmeCorp" > acme_base.txt

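# 3. Generate patterns (min and max length must equal the pattern length)
crunch 8 8 -t acme%%%% >> acme_patterns.txt
crunch 8 8 -t Acme%%%% >> acme_patterns.txt

# 4. Combine and deduplicate (name the inputs so earlier output files are not re-read)
cat acme_cewl.txt acme_base.txt acme_patterns.txt | sort -u > acme_final.txt

# 5. Apply mutations
hashcat --stdout -r /usr/share/hashcat/rules/best64.rule acme_final.txt > acme_mutated.txt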

Scenario 2: WiFi Assessment

# 1. Gather SSID-related terms
echo "coffeeshop" > wifi_base.txt
echo "CoffeeShop" >> wifi_base.txt
echo "freewifi" >> wifi_base.txt

# 2. Add common patterns
crunch 8 8 -t coffee%% >> wifi_patterns.txt

# 3. Common WiFi passwords
cat common_wifi.txt >> wifi_base.txt

# 4. Final wordlist (name the inputs so wifi_final.txt is not read back in)
cat wifi_base.txt wifi_patterns.txt | sort -u > wifi_final.txt

Scenario 3: CTF Challenge

# 1. Analyze challenge context
# (name, theme, hints)

# 2. Build contextual list
echo "flag" > ctf_words.txt
echo "capture" >> ctf_words.txt
echo "secret" >> ctf_words.txt

# 3. Add challenge-specific terms
# (from challenge description)

# 4. Generate variations

Tools Summary

Tool | Purpose | Best For
CeWL | Web scraping | Target-specific terms
Crunch | Pattern generation | Structured passwords
John | Mutations | Rule-based variations
Hashcat | Mutations | Large-scale processing
Username Anarchy | Names | User enumeration

Best Practices

  1. Start small — Begin with targeted lists
  2. Iterate — Refine based on results
  3. Document — Track what works
  4. Legal — Only use with authorization
  5. Efficient — Remove unlikely entries
