OverTheWire Bandit Level 9: Finding Unique Lines with sort and uniq
Level 9 introduces text processing, one of the most powerful skills in Linux. When you're dealing with a file containing many duplicate lines and only one unique line, you need tools to filter and analyze it. That's where sort and uniq come in.
Level 9 teaches you:
- Using `sort` to organize lines in a file
- Using `uniq` to find unique or duplicate lines
- Combining commands with pipes (`|`)
- Using `grep` to filter results
- Processing text data efficiently
This level is where you start seeing the real power of Linux: combining simple commands to solve complex problems. This skill is essential for log analysis, data processing, and security assessments.
The Objective
After logging into bandit8, your goal is to find the password for Level 9. The password is in a file called data.txt in your home directory. The file contains many lines, and almost all of them are repeated multiple times. Only one line appears exactly once; that's your password.
What Level 9 teaches:
- Using `sort` to organize file contents
- Using `uniq` to find unique lines
- Understanding how pipes connect commands
- Using `grep` to filter output
- Processing large text files efficiently
The challenge: The file has thousands of lines, and most are duplicates. You need to find the one line that appears only once. Manual searching would be impossible; that's why sort and uniq exist.
Understanding the Problem
Let's start by connecting to Level 8 and seeing what we're dealing with:
sshpass -p `cat bandit8` ssh bandit8@bandit.labs.overthewire.org -p 2220
Once connected, let's check the data.txt file:
ls -la data.txt
cat data.txt | head -20
You should see many lines, and if you look closely, you'll notice many duplicates. The file might have thousands of lines, but only one is unique.
The problem: How do you quickly find the one line that appears only once among thousands of duplicate lines?
The answer: Use sort to group duplicates together, then uniq to find unique lines, then filter with grep to get the one that appears once.
Understanding sort and uniq
Let's dive deeper into these commands, because they're incredibly useful:
The sort Command
The sort command organizes lines in a file alphabetically (or numerically):
sort filename
What it does:
- Reads all lines from a file
- Sorts them alphabetically
- Outputs the sorted lines
Why this matters: When lines are sorted, duplicates become adjacent (next to each other). This is crucial for uniq to work properly.
Example:
# Before sorting:
apple
banana
apple
cherry
banana
# After sorting:
apple
apple
banana
banana
cherry
Notice how duplicates are now next to each other.
The uniq Command
The uniq command filters out duplicate adjacent lines:
uniq filename
Important: uniq only works on adjacent duplicates. If duplicates aren't next to each other, uniq won't catch them. That's why you need sort first.
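You can see this adjacency requirement with a quick experiment (the fruit names here are just sample data):

```shell
# Without sorting, the second "apple" is not adjacent to the first,
# so uniq leaves it in place:
printf 'apple\nbanana\napple\n' | uniq
# apple
# banana
# apple

# Sorting first groups the duplicates together, so uniq can collapse them:
printf 'apple\nbanana\napple\n' | sort | uniq
# apple
# banana
```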
Common uniq options:
- `uniq` - Removes adjacent duplicates (keeps one copy)
- `uniq -u` - Shows only unique lines (lines that appear once)
- `uniq -d` - Shows only duplicate lines
- `uniq -c` - Shows the count of each line
Example:
# Input (must be sorted first):
apple
apple
banana
banana
cherry
# uniq (removes duplicates):
apple
banana
cherry
# uniq -c (shows count):
2 apple
2 banana
1 cherry
Combining Commands with Pipes
The pipe (|) connects commands together:
command1 | command2
What it does:
- Takes the output of `command1`
- Feeds it as input to `command2`
- Allows you to chain multiple commands
For Level 9:
cat data.txt | sort | uniq -c | grep -v "10"
This:
- Sorts `data.txt` (groups duplicates together)
- Pipes to `uniq -c` (shows the count of each line)
- Pipes to `grep -v "10"` (filters out lines containing "10", leaving only the line with a count of 1)
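To see what each stage of this pipeline does before running it on the real file, here is a toy reproduction on a tiny hand-made file (one line repeated three times, one line appearing once; file and line names are invented for the demo):

```shell
# Build a toy file: one line repeated three times, one line appearing once.
printf 'repeated\nrepeated\nunique_value\nrepeated\n' > demo.txt

# sort groups the duplicates; uniq -c prefixes each line with its count:
sort demo.txt | uniq -c
# prints (counts left-padded with spaces):  3 repeated  /  1 unique_value

# Filter out the majority count (3 here, 10 in the real level):
sort demo.txt | uniq -c | grep -v "3"
# prints only:  1 unique_value

rm demo.txt
```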
Step-by-Step Walkthrough
Step 1: Connect to Level 8
sshpass -p `cat bandit8` ssh bandit8@bandit.labs.overthewire.org -p 2220
Step 2: Check the File
Let's see what we're working with:
ls -la data.txt
wc -l data.txt
The wc -l command counts lines. You'll see the file has many lines (possibly thousands).
Step 3: Sort and Find Unique Lines
Use sort, uniq, and grep to find the unique line:
cat data.txt | sort | uniq -c | grep -v "10"
Breaking this down:
- `cat data.txt` - Reads the file contents
- `| sort` - Sorts all lines alphabetically, grouping duplicates together
- `| uniq -c` - Shows the count of each line (how many times it appears)
- `| grep -v "10"` - Filters out lines containing "10" (the `-v` means "invert", so it shows lines that DON'T contain "10")
What you'll see: Most lines will show a count of 10 (they appear 10 times). One line will show a count of 1 (it appears only once). The grep -v "10" filters out all the lines with count 10, leaving only the line with count 1.
Output: You should see one line that looks like:
1 UsvVyFSfZZWbi6wgC7dAFyFuR6jQQUhR
The number 1 at the beginning indicates this line appears only once. The string after it is your password.
Step 4: Extract the Password
The password is the string that appears after the count of 1. Copy that stringโthat's your password for Level 9.
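If you want the shell to print just the password string with no count column, you can strip it with awk. This is a sketch assuming the standard two-column `uniq -c` output (count, then the line itself):

```shell
# Select the line whose count field ($1) is exactly 1,
# then print only the second field (the password string itself).
sort data.txt | uniq -c | awk '$1 == 1 {print $2}'
```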
Step 5: Save the Password
Copy the password and save it:
On Linux/macOS:
echo "PASSWORD_HERE" > bandit9
On Windows (PowerShell):
"PASSWORD_HERE" | Out-File -FilePath bandit9 -NoNewline
Step 6: Connect to Level 9
sshpass -p `cat bandit9` ssh bandit9@bandit.labs.overthewire.org -p 2220
Understanding the Solution
Let's break down why this command works:
Why Sort First?
Without sorting, duplicates are scattered throughout the file:
apple
banana
apple
cherry
banana
uniq only removes adjacent duplicates, so it won't catch the duplicate "apple" and "banana" because they're not next to each other.
After sorting, duplicates are adjacent:
apple
apple
banana
banana
cherry
Now uniq can properly identify duplicates.
Why uniq -c?
The -c flag shows the count of each line:
2 apple
2 banana
1 cherry
This tells us:
- "apple" appears 2 times
- "banana" appears 2 times
- "cherry" appears 1 time (unique!)
Why grep -v "10"?
In Level 9, most lines appear 10 times. The output would look like:
10 password1
10 password2
10 password3
1 unique_password
The grep -v "10" filters out all lines containing "10", leaving only the line with count 1:
1 unique_password
The -v flag: Inverts the match; it shows lines that DON'T match the pattern.
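A minimal illustration of `-v` (the sample words are invented for the demo):

```shell
# grep -v keeps only the lines that do NOT contain the pattern.
# Note it matches substrings: "catfish" is dropped too, because it contains "cat".
printf 'cat\ndog\ncatfish\n' | grep -v "cat"
# dog
```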
Alternative Methods
Here are different ways to find the unique line:
Method 1: sort | uniq -c | grep -v "10" (Recommended)
cat data.txt | sort | uniq -c | grep -v "10"
Pros: Fast, efficient, standard approach. Cons: Not bulletproof; if the unique password itself happened to contain "10", this filter would discard it too.
Method 2: Using sort | uniq -u
sort data.txt | uniq -u
Pros: Simpler; directly shows unique lines. Cons: Might show multiple lines if there are several unique ones (unlikely for Level 9).
Note: uniq -u shows lines that appear exactly once, but it requires the file to be sorted first.
Method 3: Using sort | uniq -c | grep "^\s*1\s"
cat data.txt | sort | uniq -c | grep "^\s*1\s"
Pros: More explicit pattern matching. Cons: More complex regex pattern.
What this does:
- `^\s*1\s` - Matches lines starting with optional whitespace, then "1", then whitespace
- More precise than `grep -v "10"`, but more complex
Method 4: Manual Search (Not Recommended)
cat data.txt
# Manually scan through thousands of lines...
Pros: Simple, no new commands. Cons: Extremely slow, error-prone, nearly impossible.
For Level 9, use Method 1; it's the most efficient and teaches you valuable skills.
Real-World Context
Why does this matter in penetration testing?
In real security assessments, you'll constantly need to process and analyze text data:
1. Log File Analysis
Log files often contain:
- Repeated entries (normal activity)
- Unique entries (anomalies, attacks)
- Error messages
- Access attempts
Example: Finding IP addresses that appear only once in an access log:
cat access.log | awk '{print $1}' | sort | uniq -u
2. Finding Unique Errors
When analyzing logs, unique errors might indicate:
- New attack patterns
- System issues
- Unusual activity
Example: Finding unique error messages:
cat error.log | grep "ERROR" | sort | uniq -u
3. Password List Processing
When working with password lists:
- Removing duplicates
- Finding unique passwords
- Organizing wordlists
Example: Creating a unique wordlist:
sort passwords.txt | uniq > unique_passwords.txt
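As an aside, `sort -u` combines the sorting and deduplication into a single command, which is handy for wordlist cleanup (the sample lines here are just placeholders):

```shell
# These two pipelines produce identical output:
# each distinct line exactly once, in sorted order.
printf 'b\na\nb\nc\n' | sort | uniq
printf 'b\na\nb\nc\n' | sort -u
# a
# b
# c
```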
4. Finding Unique Users or IPs
During enumeration, you might need to:
- Find unique users in logs
- Identify unique IP addresses
- Extract unique email addresses
Example: Finding users that appear only once in a log:
cat auth.log | awk '{print $9}' | sort | uniq -u
5. Data Analysis
When analyzing data dumps:
- Finding unique entries
- Identifying patterns
- Filtering duplicates
Example: Finding unique entries in a database dump:
cat dump.txt | sort | uniq -u
6. Combining with Other Tools
Real-world analysis often combines multiple tools:
Example: Finding unique failed login attempts:
cat auth.log | grep "Failed password" | awk '{print $11}' | sort | uniq -u
The skill you're learning: How to efficiently process and analyze text data. This is essential when:
- Analyzing log files
- Processing data dumps
- Finding unique entries
- Filtering duplicates
- Identifying patterns in text data
- Working with large files
Common Mistakes
Mistake 1: Using uniq Without sort
Wrong:
cat data.txt | uniq -c
# Won't work! Duplicates aren't adjacent
Right:
cat data.txt | sort | uniq -c
# Sort first, then uniq
Why: uniq only works on adjacent duplicates. Without sorting, duplicates are scattered throughout the file, so uniq won't catch them.
Mistake 2: Wrong grep Pattern
Wrong:
cat data.txt | sort | uniq -c | grep "10"
# Shows lines with "10", not lines without "10"!
Right:
cat data.txt | sort | uniq -c | grep -v "10"
# The -v inverts the match
Why: We want lines that DON'T contain "10" (the unique line with count 1). The -v flag inverts the match.
Mistake 3: Forgetting the Pipe
Wrong:
cat data.txt sort uniq -c
# Syntax error - missing pipes
Right:
cat data.txt | sort | uniq -c
# Pipes connect the commands
Why: The pipe (|) is required to connect commands. Without it, you're passing arguments incorrectly.
Mistake 4: Reading the Wrong Output
Confusion: "I see multiple lines in the outputโwhich one is the password?"
Clarification:
- If `grep -v "10"` returns multiple lines, those are all unique (each appears once)
- For Level 9, there should be exactly one unique line
- If you see multiple lines, double-check your command
For Level 9: The output should be exactly one line with count 1; that's your password.
Mistake 5: Not Understanding How uniq -c Works
Confusion: "Why does uniq -c show counts?"
Clarification:
uniq -cshows how many times each line appears- Most lines will show count
10(they appear 10 times) - One line will show count
1(it appears once) - That's the unique lineโyour password
Example output:
10 CV1DtqXWVFXTvM2F0k09SHz0YwRINYA9
10 UsvVyFSfZZWbi6wgC7dAFyFuR6jQQUhR
1 tRUYK7jIDDON0p5I4gRMbau3Ma2sM8z8
The line with count 1 is your password.
Practice Exercise
Try these to reinforce what you learned:
1. Create a test file with duplicates:
   echo -e "apple\nbanana\napple\ncherry\nbanana\nunique_line" > test.txt
2. View the file:
   cat test.txt
3. Sort it (notice duplicates are now adjacent):
   sort test.txt
4. Find unique lines (should show cherry and unique_line):
   sort test.txt | uniq -u
5. Show the count of each line:
   sort test.txt | uniq -c
6. Filter for lines with count 1:
   sort test.txt | uniq -c | grep "^\s*1\s"
7. Clean up:
   rm test.txt
Understanding uniq Options
This is a good time to understand all uniq options:
uniq (Default)
Removes adjacent duplicates, keeping one copy:
sort file.txt | uniq
Example:
Input: apple
apple
banana
banana
Output: apple
banana
uniq -u (Unique Only)
Shows only lines that appear exactly once:
sort file.txt | uniq -u
Example:
Input: apple
apple
banana
cherry
Output: cherry
uniq -d (Duplicates Only)
Shows only lines that appear more than once:
sort file.txt | uniq -d
Example:
Input: apple
apple
banana
cherry
Output: apple
uniq -c (Count)
Shows count of each line:
sort file.txt | uniq -c
Example:
Input: apple
apple
banana
cherry
Output: 2 apple
1 banana
1 cherry
For Level 9: We use uniq -c to see counts, then filter with grep -v "10" to find the line with count 1.
What's Next?
Level 10 introduces base64 encoding, a common way to encode data. You'll learn how to decode base64-encoded strings to reveal hidden passwords.
Before moving on, make sure you:
- Successfully used `sort` to organize file contents
- Understand how `uniq -c` shows line counts
- Know how `grep -v` filters output
- Can combine `sort`, `uniq`, and `grep` to solve problems
- Understand why sorting is necessary before using `uniq`
Key Takeaways
After completing Level 9, you should understand:
- `sort` command - Organizes lines alphabetically, grouping duplicates together
- `uniq` command - Filters duplicate adjacent lines
- `uniq -c` - Shows the count of each line
- `grep -v` - Inverts the match (shows lines that DON'T match)
- Command chaining - Combining simple commands to solve complex problems
Quick Reference
| Problem | Solution | Example |
|---|---|---|
| Sort file contents | Use `sort` | `sort file.txt` |
| Find unique lines | Use `sort \| uniq -u` | `sort file.txt \| uniq -u` |
| Show line counts | Use `sort \| uniq -c` | `sort file.txt \| uniq -c` |
| Filter output | Use `grep -v` | `grep -v "pattern"` |
| Chain commands | Use the pipe `\|` | `cmd1 \| cmd2 \| cmd3` |
Questions about Level 9 or using sort and uniq? Reach out directly:
- Email: m1k3@msquarellc.net
- Phone: (559) 670-3159
- Schedule: Book a free consultation
M Square LLC
Cybersecurity | Penetration Testing | No-Nonsense Advice