A beginner-friendly tutorial for scientists, students and bioinformatics users who want to understand the terminal, navigate folders, inspect files, combine commands, work safely with large datasets and prepare for reproducible NGS data analysis on Linux, macOS or remote servers.
1. What is the Unix command line?
The Unix command line is a text-based way to control a computer. You type commands into a terminal, and the shell interprets those commands. On Linux servers, HPC clusters and many bioinformatics workstations, the command line is the main interface for running analyses.
Terminal: The application window where you type commands and see output.
Shell: The program that reads your commands. Bash and Zsh are common shells.
Commands: Programs or shell features that perform actions such as listing files, copying data or running analyses.
Why learn it? A single command can process thousands of files, repeat an analysis exactly, document your workflow and run tools that have no graphical interface.
2. Terminal, shell and prompt
When you open a terminal, you usually see a prompt. The prompt may show your username, computer name, current directory and a symbol such as $. You type commands after the prompt and press Enter.
Symbol   Meaning                                                      Example
$        Usually indicates a regular (non-root) user shell.           $ ls
#        Often indicates a root or administrator shell. Be careful.   # apt update
~        Your home directory.                                         cd ~/projects
.        The current directory.                                       cp file.txt ./backup/
..       The parent directory.                                        cd ..
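When a prompt is short or customized, a few commands report the same facts directly. The lines below are a minimal sketch for a POSIX-like shell; the variable name uid is chosen only for illustration.

```shell
# The directory your prompt abbreviates, written out in full
pwd

# The home directory that ~ stands for
echo "$HOME"

# A numeric user ID of 0 means a root shell (the prompt usually ends in #)
uid=$(id -u)
if [ "$uid" -eq 0 ]; then
    echo "root shell: be careful"
else
    echo "regular user shell"
fi
```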
3. Filesystem and paths
Unix-like systems organize files in a tree. The top of the tree is the root directory, written as /. Your personal files are usually in your home directory, for example /home/username on Linux or /Users/username on macOS.
Absolute path: A full path from the root directory, such as /home/andrey/projects/sample.fastq.gz.
Relative path: A path starting from the current directory, such as data/sample.fastq.gz or ../results.
Common filesystem locations
/ # root directory
/home/username # user home directory on Linux
/Users/username # user home directory on macOS
/tmp # temporary files
/mnt # mounted disks or network locations on many Linux systems
/project # project storage on some servers or clusters
/scratch # temporary high-performance storage on some clusters
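The difference between absolute and relative paths is easiest to see on a small practice tree. This is a sketch only: the directory paths_demo and the variable abs_path are hypothetical names, and the empty file stands in for real data.

```shell
# Build a small practice tree in the current directory
mkdir -p paths_demo/data paths_demo/results
touch paths_demo/data/sample.fastq.gz

# Relative path: interpreted starting from the current directory
ls paths_demo/data/sample.fastq.gz

# Absolute path: starts at / and means the same thing from anywhere
abs_path="$(pwd)/paths_demo/data/sample.fastq.gz"
echo "$abs_path"
ls "$abs_path"

# .. steps up one level: from data/ up, then into the sibling results/
ls -d paths_demo/data/../results
```

The relative form is shorter, but it only works while you stay in the same directory. Scripts that change directory are often safer with absolute paths.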
4. Navigation: pwd, ls and cd
The first commands to learn are pwd, ls and cd. They show where you are, list files and change directories.
Command   Purpose                             Example
pwd       Print current directory.            pwd
ls        List files.                         ls -lh
cd        Change directory.                   cd ~/projects
cd ..     Move one directory up.              cd ..
cd -      Return to the previous directory.   cd -
Try navigation commands
pwd
ls
ls -lh
ls -la
cd ~
mkdir -p unix_tutorial
cd unix_tutorial
pwd
5. Creating, copying, moving and deleting files
File operations are powerful and sometimes irreversible. Learn them carefully on test files before using them on real data.
Command   Purpose                                  Example
mkdir     Create directory.                        mkdir results
touch     Create empty file or update timestamp.   touch notes.txt
cp        Copy files or directories.               cp notes.txt notes_backup.txt
mv        Move or rename files.                    mv notes.txt project_notes.txt
rm        Remove files.                            rm old_file.txt
rm -r     Remove directories recursively.          rm -r old_folder
Be careful with rm: deleted files are usually not moved to a recycle bin. Avoid commands such as rm -rf * unless you fully understand where you are and what will be removed.
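One cautious routine is to preview, delete, then confirm. The sketch below uses a throwaway directory called rm_demo (a hypothetical name) so that nothing important is at risk.

```shell
# Create a practice folder with two files to delete and one to keep
mkdir -p rm_demo
touch rm_demo/old_1.txt rm_demo/old_2.txt rm_demo/keep.txt

# 1. Preview exactly which files the pattern matches
ls rm_demo/old_*.txt

# 2. Delete with the same pattern; -v prints each file as it is removed
rm -v rm_demo/old_*.txt

# 3. Confirm what is left
ls rm_demo
```

For interactive work, rm -i asks for confirmation before each deletion, which is slower but harder to get wrong.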
6. Viewing and summarizing text files
Many scientific files are plain text or compressed text: FASTQ, FASTA, SAM, VCF, GTF, BED, CSV and TSV. Unix commands let you inspect them quickly.
Command   Purpose                                      Example
cat       Print entire file to screen.                 cat README.txt
less      View long files page by page.                less sample.vcf
head      Show first lines.                            head -n 20 sample.fastq
tail      Show last lines.                             tail -n 20 log.txt
wc        Count lines, words or bytes.                 wc -l genes.txt
zcat      Print decompressed contents of gzip files.   zcat reads.fastq.gz | head
Create a small example file
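# Columns must be separated by tabs, not spaces; printf writes each \t
# explicitly and reuses the format for every group of three values
printf '%s\t%s\t%s\n' \
  sample_id group batch \
  S1 control A \
  S2 control A \
  S3 treated B \
  S4 treated B > samples.tsv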
cat samples.tsv
head -n 2 samples.tsv
wc -l samples.tsv
7. Wildcards, quoting and tab completion
Wildcards help you select many files at once. Quoting protects spaces and special characters from being interpreted by the shell.
*: Matches any number of characters, including none. Example: ls *.fastq.gz
?: Matches exactly one character. Example: ls sample?.txt
Quotes: Use quotes around file names with spaces: cat "my file.txt"
Tab completion: Press Tab to autocomplete file names and reduce typing errors.
Wildcards are expanded before the command runs. Always test with ls before using wildcards in destructive commands.
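The behaviour of wildcards and quotes can be checked safely with empty files. This is a small sketch in a hypothetical directory named glob_demo; echo is used to show the list of names the shell hands to a command after expansion.

```shell
# Create empty practice files, one of them with a space in its name
mkdir -p glob_demo
touch glob_demo/sample1.txt glob_demo/sample2.txt glob_demo/sample10.txt
touch "glob_demo/my file.txt"

# * matches any number of characters: all three sample files
ls glob_demo/sample*.txt

# ? matches exactly one character: sample1.txt and sample2.txt only
ls glob_demo/sample?.txt

# echo shows the expanded list that a command would actually receive
echo glob_demo/sample?.txt

# Quotes keep a name with a space together as one argument
ls "glob_demo/my file.txt"

# Quotes also switch off expansion: this prints the pattern itself
echo "glob_demo/sample*.txt"
```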
8. Pipes and redirection
Pipes and redirection are central Unix ideas. A pipe sends the output of one command into another command. Redirection saves output to a file or reads input from a file.
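The operators described above can be tried on a tiny gene list. This is a minimal sketch; the directory pipes_demo and the gene names are made up for the example.

```shell
# Create a small input file; > creates the file or overwrites it
mkdir -p pipes_demo
printf '%s\n' geneB geneA geneC geneA > pipes_demo/genes.txt

# Pipe: sort the lines, then collapse the duplicates
sort pipes_demo/genes.txt | uniq

# > saves the output to a file, replacing any previous content
sort pipes_demo/genes.txt | uniq > pipes_demo/unique_genes.txt

# >> appends to the end of a file instead of replacing it
echo "geneD" >> pipes_demo/unique_genes.txt

# < reads input from a file
wc -l < pipes_demo/unique_genes.txt

# 2> saves error messages separately from normal output
# (|| true only keeps the demo going after the expected error)
ls pipes_demo/missing_file 2> pipes_demo/errors.log || true
```

After these commands, unique_genes.txt holds four lines (geneA to geneD) and errors.log holds the error message from ls.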
9. Text processing: grep, sort, uniq, cut, awk and sed
Unix text-processing tools are especially useful for logs, metadata tables, genomic intervals and annotation files.
Command   Purpose                             Example
grep      Search for text patterns.           grep treated samples.tsv
sort      Sort lines.                         sort names.txt
uniq      Collapse repeated adjacent lines.   sort names.txt | uniq -c
cut       Extract columns or characters.      cut -f 1 samples.tsv
awk       Process columns and patterns.       awk '$2=="treated"' samples.tsv
sed       Stream editing and substitutions.   sed 's/treated/case/g' samples.tsv
Text-processing examples
# Show treated samples
grep treated samples.tsv
# Count samples per group
tail -n +2 samples.tsv | cut -f 2 | sort | uniq -c
# Extract sample IDs from batch A
awk 'BEGIN{FS="\t"} NR>1 && $3=="A" {print $1}' samples.tsv
10. Permissions and executable files
Unix permissions control who can read, write or execute a file. Use ls -l to see permissions.
Inspect permissions
ls -l script.sh
# Example output:
# -rwxr-xr-x 1 user user 120 Jan 01 10:00 script.sh
Permission   Meaning for files                      Meaning for directories
r            Read file contents.                    List directory contents.
w            Modify file.                           Create, delete or rename files inside directory.
x            Execute file as a program or script.   Enter directory with cd.
Make a script executable
chmod +x script.sh
./script.sh
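In a listing such as -rwxr-xr-x, the first character is the file type and the next nine are three rwx groups for the owner, the group and everyone else. The sketch below creates a throwaway script in a hypothetical directory perm_demo and shows how the listing changes when permissions are edited.

```shell
# Write a two-line script into a practice folder
mkdir -p perm_demo
printf '#!/bin/sh\necho "hello from script"\n' > perm_demo/script.sh

# A newly created file is normally not executable
ls -l perm_demo/script.sh

# Add execute permission, then run the script by its path
chmod +x perm_demo/script.sh
./perm_demo/script.sh

# Numeric form: 755 means rwxr-xr-x
# (owner may read, write and execute; others may read and execute)
chmod 755 perm_demo/script.sh
ls -l perm_demo/script.sh
```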
11. Processes, jobs and stopping commands
A running command is a process. Long analyses may run for minutes, hours or days. Learn how to monitor and stop commands safely.
Command / shortcut   Purpose                                  Example
Ctrl-C               Interrupt the current command.           Stop a command that is running in the terminal.
Ctrl-Z               Suspend the current command.             Pause a foreground process.
ps                   List processes.                          ps aux | grep python
top / htop           Monitor CPU and memory.                  htop
jobs                 List shell jobs.                         jobs
fg / bg              Move jobs to foreground or background.   fg %1
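Keyboard shortcuts cannot be shown in a script, but starting, checking and stopping a background process can. In this sketch, sleep 30 stands in for a long analysis and bg_pid is an illustrative variable name.

```shell
# Start a long-running command in the background with &
sleep 30 &
bg_pid=$!     # $! holds the process ID of the most recent background command
echo "started sleep with PID $bg_pid"

# List jobs started from this shell
jobs

# kill -0 sends no signal; it only checks that the process still exists
kill -0 "$bg_pid" && echo "process $bg_pid is running"

# Ask the process to stop (plain kill sends the TERM signal)
kill "$bg_pid"

# Wait for it to exit; || true ignores the non-zero status of a killed job
wait "$bg_pid" 2>/dev/null || true
```

Use kill only on process IDs you started yourself; on shared servers, other users' processes are normally protected from you anyway.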
12. Working on remote servers
Many bioinformatics analyses run on remote Linux servers or HPC clusters. You usually connect with SSH and transfer files with SCP or rsync.
SSH and file transfer examples
# Connect to a remote server
ssh username@server.example.org
# Copy one file to the server
scp sample.fastq.gz username@server.example.org:/project/data/
# Copy a folder recursively
scp -r results username@server.example.org:/project/results/
# Synchronize folders efficiently
rsync -avh --progress data/ username@server.example.org:/project/data/
For long-running remote work, tools such as screen, tmux, SLURM job scripts or workflow engines are safer than leaving commands in an ordinary SSH session.
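As a minimal illustration of detaching work from the terminal, the sketch below uses nohup. Note that nohup is not one of the tools named above; it is shown only because it needs no extra software, and tmux, screen or a scheduler remain the more robust choices. The directory remote_demo, the log file name and the echo and sleep commands are placeholders for a real analysis.

```shell
mkdir -p remote_demo

# nohup makes the command ignore the hangup signal sent when a session closes;
# & runs it in the background; all output goes to a log file, not the terminal
nohup sh -c 'echo "analysis started"; sleep 1; echo "analysis finished"' \
    > remote_demo/run.log 2>&1 &
job_pid=$!

# Wait here only so that the demo can show the finished log
wait "$job_pid"
cat remote_demo/run.log
```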
13. PATH, software and help pages
The shell uses the PATH variable to find programs. When you type samtools, the shell searches directories listed in PATH until it finds an executable called samtools.
Inspect software and PATH
echo $PATH
which bash
which python
which samtools
bash --version
python --version
man ls
ls --help
which: Shows which executable will run when you type a command.
man: Opens the manual page for many Unix commands. Press q to quit.
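The PATH lookup can be observed by adding a directory of your own. This is a sketch under assumptions: path_demo/bin and the program mytool are hypothetical, and the change to PATH lasts only for the current shell session.

```shell
# PATH is a colon-separated list; print one directory per line
echo "$PATH" | tr ':' '\n'

# command -v is a portable way to ask where a program will be found
command -v ls

# Create a personal bin directory containing a tiny program
mkdir -p path_demo/bin
printf '#!/bin/sh\necho "hello from mytool"\n' > path_demo/bin/mytool
chmod +x path_demo/bin/mytool

# Put that directory at the front of PATH for this session only
PATH="$(pwd)/path_demo/bin:$PATH"
export PATH

# The shell can now find mytool by name
command -v mytool
mytool
```

To make such a change permanent you would add the PATH line to a shell startup file such as ~/.bashrc, but check your system's conventions first.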
14. Writing simple shell scripts
A shell script stores commands in a file so that they can be repeated. Scripts are a major step toward reproducible analysis.
Test scripts on small files before running them on full datasets.
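A first script might wrap the group-counting pipeline used earlier in this tutorial. The sketch below assumes Bash is installed; the script name count_groups.sh, the directory script_demo and the variables input and output are hypothetical choices, not a fixed convention.

```shell
# Prepare a small tab-separated sample sheet to test on
mkdir -p script_demo/results
printf '%s\t%s\t%s\n' \
    sample_id group batch \
    S1 control A \
    S2 control A \
    S3 treated B \
    S4 treated B > script_demo/samples.tsv

# Write the script; the quoted 'EOF' stops the shell expanding $1 and $2 now
cat > script_demo/count_groups.sh << 'EOF'
#!/usr/bin/env bash
# count_groups.sh: count samples per group in a tab-separated sample sheet
# Usage: count_groups.sh <samples.tsv> <output.txt>
set -euo pipefail     # stop on errors, unset variables and failed pipes

input="$1"
output="$2"

# Skip the header, keep column 2, then count each distinct value
tail -n +2 "$input" | cut -f 2 | sort | uniq -c > "$output"
echo "Wrote group counts to $output"
EOF

# Make the script executable and run it on the test data
chmod +x script_demo/count_groups.sh
./script_demo/count_groups.sh script_demo/samples.tsv script_demo/results/group_counts.txt
cat script_demo/results/group_counts.txt
```

Taking the input and output paths as arguments, rather than hard-coding them, lets the same script run unchanged on a small test file and on the full dataset.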
15. Command-line examples for bioinformatics
The command line is especially useful for inspecting sequencing files and metadata. The following examples illustrate common operations. Adapt them to your data and tools.
Inspect compressed FASTQ files
# Show the first FASTQ record from a compressed file
zcat sample_R1.fastq.gz | head -n 4
# Count reads in a compressed FASTQ file
echo $(( $(zcat sample_R1.fastq.gz | wc -l) / 4 ))
List FASTQ file sizes
ls -lh *.fastq.gz
# Save file sizes to a report
ls -lh *.fastq.gz > fastq_file_sizes.txt
Check a tab-separated sample sheet
# Show column names
head -n 1 samples.tsv
# Count samples by group
tail -n +2 samples.tsv | cut -f 2 | sort | uniq -c
# Find samples from one batch
# (batch A here; replace "A" with a batch label that occurs in your file)
awk 'BEGIN{FS="\t"} NR>1 && $3=="A" {print $1}' samples.tsv
FASTQ and BAM files can be very large. Prefer streaming commands, compression-aware tools and project scratch storage when working with NGS data.
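Streaming scales naturally to many files with a loop. The sketch below builds two tiny made-up FASTQ files in a hypothetical directory fastq_demo and counts the reads in each without writing any decompressed copy to disk. It uses gzip -dc, which behaves like zcat and is available on both Linux and macOS.

```shell
mkdir -p fastq_demo

# Two tiny made-up FASTQ files (4 lines per read), compressed with gzip
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip > fastq_demo/sampleA_R1.fastq.gz
printf '@r1\nGGCA\n+\nIIII\n' | gzip > fastq_demo/sampleB_R1.fastq.gz

# Stream each file through gzip -dc and divide the line count by 4;
# the output of the whole loop is redirected into one report file
for fq in fastq_demo/*.fastq.gz; do
    n_lines=$(gzip -dc "$fq" | wc -l)
    printf '%s\t%s\n' "$fq" "$(( n_lines / 4 ))"
done > fastq_demo/read_counts.txt

cat fastq_demo/read_counts.txt
```

Dividing by four assumes the common four-line FASTQ layout; files with wrapped sequence lines need a dedicated tool.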
16. Safe command-line habits
The command line is powerful because it does exactly what you tell it. Develop careful habits from the beginning.
Check location first: Run pwd before moving, deleting or overwriting important files.
Preview wildcards: Run ls pattern* before using the same wildcard with rm, mv or cp.
Do not overwrite accidentally: Remember that > replaces files. Use >> only when you want to append.
Keep raw data read-only: Store raw FASTQ data separately and avoid editing or deleting original files.
Use scripts and logs: Commands saved in scripts are easier to review and reproduce than commands typed only once.
Back up important work: Use external drives, institutional storage or version control where appropriate.
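Two of these habits can be enforced by the shell itself. This is a sketch in a hypothetical directory safe_demo: chmod a-w removes write permission from a raw file, and the noclobber option makes > refuse to replace an existing file.

```shell
# Practice files: one stand-in for raw data, one for a report
mkdir -p safe_demo/raw
echo "raw reads" > safe_demo/raw/sample.fastq
echo "first result" > safe_demo/report.txt

# Keep raw data read-only: remove write permission for everyone
chmod a-w safe_demo/raw/sample.fastq
ls -l safe_demo/raw/sample.fastq

# Make > refuse to overwrite existing files in this shell session
set -o noclobber
echo "second result" > safe_demo/report.txt || echo "refused to overwrite report.txt"

# >> still appends as usual
echo "appended line" >> safe_demo/report.txt

# Switch the protection off again
set +o noclobber
cat safe_demo/report.txt
```

Read-only permissions guard against accidents, not against a root user or a deliberate chmod, so they complement backups rather than replace them.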
17. Mini exercises
Practice on a safe folder, not on real project data.
Exercise setup
mkdir -p ~/unix_practice/{data,results,scripts,logs}
cd ~/unix_practice
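# Columns must be separated by tabs, not spaces; printf writes each \t
# explicitly and reuses the format for every group of three values
printf '%s\t%s\t%s\n' \
  sample_id group batch \
  S1 control A \
  S2 control A \
  S3 treated B \
  S4 treated B \
  S5 treated/C B > data/samples.tsv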
Use pwd and ls -lh to inspect the practice folder.
Use head and cat to view data/samples.tsv.
Use cut, sort and uniq -c to count samples per group.
Use grep to find treated samples.
Redirect the group counts to results/group_counts.txt.
Create a script in scripts/ that repeats the analysis.
18. Beginner Unix command cheat sheet
Task                     Command         Example
Show current directory   pwd             pwd
List files               ls              ls -lh
Change directory         cd              cd ~/projects
Create directory         mkdir           mkdir results
Copy file                cp              cp file.txt backup.txt
Move or rename           mv              mv old.txt new.txt
Remove file              rm              rm old.txt
View long file           less            less report.txt
First lines              head            head -n 20 file.txt
Last lines               tail            tail -n 20 log.txt
Search text              grep            grep BRCA variants.tsv
Count lines              wc              wc -l samples.tsv
Extract columns          cut             cut -f 1 samples.tsv
Sort lines               sort            sort names.txt
Unique values            uniq            sort names.txt | uniq -c
Find program             which           which python
Get help                 man or --help   man ls
Official resources and documentation
Use official documentation when you need exact options or advanced behavior for a command.
What is the Unix command line?
The Unix command line is a text-based interface for controlling an operating system. Instead of clicking through menus, you type commands to navigate folders, inspect files, run software, automate tasks and process data.
Is Unix the same as Linux?
Unix is a family of operating-system ideas and standards. Linux is a Unix-like operating system widely used on servers, workstations, clusters and bioinformatics systems. Many command-line skills are shared across Linux, macOS and other Unix-like systems.
Which shell should beginners learn?
Bash is a good default for beginners because it is widely available on Linux and common on bioinformatics servers. Zsh is also popular, especially on macOS. Most basic commands in this tutorial work in both.
Can I damage files with the command line?
Yes. Commands such as rm, mv, chmod and commands using wildcards can change or delete files quickly. Always check your current directory, inspect file names and avoid running destructive commands until you understand them.
Why is the command line important for NGS data analysis?
Most NGS and bioinformatics tools are command-line programs. FASTQ files, BAM files, VCF files and large result tables are often processed more efficiently with shell commands and reproducible scripts than with graphical tools.
How do I get help for a Unix command?
Use commands such as man command, command --help or command -h. You can also search official documentation for Bash, GNU Coreutils, SAMtools, Nextflow and the specific tool you are using.