5. File Archives

Linux Compression & Disk Utilities

1. gzip (GNU zip)

  • Purpose: Compresses individual files using the DEFLATE algorithm.
  • File extension: .gz
  • Usage:
    • Compress: gzip file.txt → Creates file.txt.gz and removes file.txt
    • Decompress: gunzip file.txt.gz → Restores file.txt
    • Keep original file: gzip -k file.txt
    • Compress multiple: gzip file1.txt file2.txt → Creates file1.txt.gz and file2.txt.gz (each file is compressed separately, not bundled)
    • Test integrity: gzip -t file.txt.gz
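
The gzip commands above can be tried end to end on a throwaway file. This is a minimal sketch, assuming a gzip that supports -k (GNU gzip 1.6 or later); the name sample.txt and the scratch directory are placeholders invented for illustration.

```shell
set -eu

# Scratch directory so no real files are touched; removed when the shell exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Build a highly compressible sample: 1000 identical 11-byte lines.
i=0
while [ "$i" -lt 1000 ]; do
    echo "hello gzip"
    i=$((i + 1))
done > sample.txt

gzip -k sample.txt        # writes sample.txt.gz and keeps sample.txt
gzip -t sample.txt.gz     # integrity check; exits non-zero on corruption

# Decompress to stdout so the .gz file is left in place.
gunzip -c sample.txt.gz > restored.txt
cmp sample.txt restored.txt   # silent when the two files are identical

orig_size=$(( $(wc -c < sample.txt) ))
gz_size=$(( $(wc -c < sample.txt.gz) ))
echo "original: $orig_size bytes, compressed: $gz_size bytes"
```

Writing to stdout with -c is a common alternative to -k when the compressed copy should go somewhere other than next to the original.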

2. zip (Windows-Compatible Archive)

  • Purpose: Compresses multiple files and folders into a single .zip archive.
  • Compression: Uses the DEFLATE algorithm by default.
  • File extension: .zip
  • Usage:
    • Create a zip archive: zip archive.zip file1 file2
    • Compress a directory: zip -r archive.zip folder/
    • Extract a zip file: unzip archive.zip
    • View zip contents: unzip -l archive.zip
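
A round trip through zip and unzip might look like the following. This is a sketch, assuming the Info-ZIP zip and unzip tools are installed (it skips itself otherwise); the folder layout and the zip_demo status variable are made up for the example.

```shell
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Small directory tree to archive (hypothetical contents).
mkdir -p folder/sub
echo "alpha" > folder/a.txt
echo "beta"  > folder/sub/b.txt

zip_demo=skipped
if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    zip -r -q archive.zip folder/       # -r recurses into folder/, -q is quiet
    unzip -l archive.zip                # list contents without extracting
    unzip -q archive.zip -d extracted   # extract under extracted/
    diff -r folder extracted/folder     # no output means the trees match
    zip_demo=ok
else
    echo "zip/unzip not installed; skipping" >&2
fi
```

Extracting into a separate directory with -d avoids overwriting the files that were just archived.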

3. bzip2 (Better Compression)

  • Purpose: Compresses individual files using the Burrows-Wheeler transform followed by Huffman coding.
  • Usually compresses better than gzip but is slower.
  • File extension: .bz2
  • Usage:
    • Compress: bzip2 file.txt → Creates file.txt.bz2 and removes file.txt
    • Decompress: bunzip2 file.txt.bz2
    • Keep original file: bzip2 -k file.txt
    • Smallest block size (100 kB; uses less memory, usually compresses less, not much faster): bzip2 -1 file.txt
    • Largest block size (900 kB; the default, usually the best compression): bzip2 -9 file.txt
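
To see how the block-size settings compare with gzip on a given input, the sizes can simply be printed side by side. This is a rough sketch, assuming bzip2 is installed (it skips itself otherwise); the sample data and the bzip2_demo status variable are invented for illustration, and the ranking of sizes depends on the data.

```shell
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Sample input: 2000 numbered lines of text.
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i of the sample log file"
    i=$((i + 1))
done > sample.txt

bzip2_demo=skipped
if command -v bzip2 >/dev/null 2>&1; then
    # -c writes to stdout, so sample.txt itself is never replaced.
    gzip  -c    sample.txt > sample.gz
    bzip2 -1 -c sample.txt > sample.1.bz2   # 100 kB blocks
    bzip2 -9 -c sample.txt > sample.9.bz2   # 900 kB blocks (the default)

    # Byte counts side by side; inputs smaller than one block leave
    # little room for -1 and -9 to differ.
    wc -c sample.txt sample.gz sample.1.bz2 sample.9.bz2

    # Round trip: bzip2 -dc is equivalent to bunzip2 -c.
    bzip2 -dc sample.9.bz2 > restored.txt
    cmp sample.txt restored.txt
    bzip2_demo=ok
else
    echo "bzip2 not installed; skipping" >&2
fi
```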

4. tar (Tape Archive)

  • Purpose: Bundles multiple files/folders into a single archive (.tar file) without compression.
  • Often used with gzip or bzip2 for compression:
    • .tar.gz → Compressed with gzip
    • .tar.bz2 → Compressed with bzip2
  • File extensions: .tar, .tar.gz, .tar.bz2
  • Common flags: c = create, x = extract, t = list, v = verbose, f = archive file name, z = gzip, j = bzip2
  • Usage:
    • Create an archive: tar -cvf archive.tar file1 file2
    • Extract an archive: tar -xvf archive.tar
    • List archive contents: tar -tvf archive.tar
    • Create a compressed archive:
      • gzip: tar -czvf archive.tar.gz folder/
      • bzip2: tar -cjvf archive.tar.bz2 folder/
    • Extract a compressed archive:
      • gzip: tar -xzvf archive.tar.gz
      • bzip2: tar -xjvf archive.tar.bz2
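
The create, list, and extract steps above can be sketched as one short session. It assumes a tar with built-in gzip support via -z (GNU tar and bsdtar both have it); the folder contents and the extracted/ directory are hypothetical.

```shell
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Small directory tree to archive (hypothetical contents).
mkdir -p folder/sub
echo "alpha" > folder/a.txt
echo "beta"  > folder/sub/b.txt

# Create a gzip-compressed archive (v omitted to keep the output short).
tar -czf archive.tar.gz folder/

# List the members without extracting anything.
listing=$(tar -tzf archive.tar.gz)
printf '%s\n' "$listing"

# Extract into a separate directory; -C switches there before unpacking.
mkdir extracted
tar -xzf archive.tar.gz -C extracted

# Confirm the extracted tree matches the original.
diff -r folder extracted/folder
```

Extracting with -C into an empty directory is a cautious habit: it shows exactly what the archive contains before anything is merged into an existing tree.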

5. dd (Disk Duplication)

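  • Purpose: Copies raw data block by block from one device or file to another.
  • Caution: dd overwrites the of= target without asking for confirmation, so verify device names (e.g., with lsblk) before running it.
  • Used for:
    • Disk cloning (backing up entire drives or partitions).
    • Creating disk images (.img or .iso files).
    • Writing disk images (flashing OS images to USB drives).
    • Wiping disks (dd if=/dev/zero overwrites all addressable data with zeros).
  • Usage:
    • Clone a disk: dd if=/dev/sdX of=/dev/sdY bs=4M status=progress
    • Create a disk image: dd if=/dev/sdX of=disk_image.img bs=4M
    • Write an image to a disk: dd if=disk_image.img of=/dev/sdX bs=4M
    • Wipe a disk: dd if=/dev/zero of=/dev/sdX bs=1M (on SSDs, wear leveling and spare blocks mean a zero overwrite alone does not guarantee a secure erase)
    • Check progress: Add status=progress to any of these commands (GNU coreutils 8.24 or later)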
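
Because dd is destructive when pointed at the wrong device, it can help to rehearse the same if=/of= pattern on regular files first. This is a sketch, assuming GNU dd on Linux (for the bs=1M suffix and /dev/urandom); source.img and clone.img are stand-in files, not real devices.

```shell
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Stand-in for a source device: 4 MiB of random data in a regular file.
dd if=/dev/urandom of=source.img bs=1M count=4 2>/dev/null

# "Clone" it the same way a disk would be imaged; only if= and of= differ.
dd if=source.img of=clone.img bs=1M 2>/dev/null

# Verify the copy byte for byte before trusting it.
cmp source.img clone.img

img_size=$(( $(wc -c < clone.img) ))
echo "clone.img: $img_size bytes"
```

The same cmp check works after imaging a real device, provided the device is not modified between the copy and the comparison.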

Purpose of File Archives

A file archive is a single file that contains multiple files and directories, often compressed to save space. Archives serve various purposes, including:


1. File Organization & Storage

  • Groups multiple related files into a single archive for easy management.
  • Reduces clutter by storing multiple files in one place.

📌 Example:
- Storing project files in a .tar archive (tar -cvf project.tar project_folder/).


2. Compression for Space Savings

  • Reduces file size using compression tools (e.g., gzip, bzip2, xz).
  • Saves disk space and makes file transfers faster.

📌 Example:
- A .tar.gz archive (tar -czvf logs.tar.gz /var/log/) compresses logs for storage.


3. Efficient File Transfer

  • Makes it easier to transfer multiple files as a single unit.
  • Commonly used when sending files over networks or email.

📌 Example:
- Sending a .zip file via email instead of attaching multiple files.


4. Backup & Disaster Recovery

  • Archives help back up important files as a single package.
  • Often used in system backups, database dumps, and recovery plans.

📌 Example:
- Creating a backup of /etc/ config files:

tar -czvf etc_backup.tar.gz /etc/
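
A backup is only useful if it can be read back, so a quick verification step is worth adding. The sketch below uses a made-up demo_root/etc directory in place of the real /etc so it runs without root; the date-stamped file name is also just an illustrative convention.

```shell
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Hypothetical stand-in for /etc so the sketch runs without root.
mkdir -p demo_root/etc/ssh
echo "127.0.0.1 localhost" > demo_root/etc/hosts
echo "Port 22" > demo_root/etc/ssh/sshd_config

# Date-stamped archive name (YYYYMMDD).
backup="etc_backup_$(date +%Y%m%d).tar.gz"

# -C changes directory first, so members are stored as relative etc/... paths.
tar -czf "$backup" -C demo_root etc

# Verify: the gzip layer is intact, and every regular file made it in.
gzip -t "$backup"
archived=$(tar -tzf "$backup" | grep -vc '/$')   # count non-directory members
expected=$(( $(find demo_root/etc -type f | wc -l) ))
echo "archived $archived of $expected files into $backup"
```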


5. Software Packaging & Distribution

  • Used to package software for distribution (e.g., .tar.gz, .zip, .deb, .rpm).
  • Helps keep software files together in a structured way.

📌 Example:
- Linux packages (.deb, .rpm) or Python wheels (.whl, which are zip archives).


6. Version Control & Archival

  • Archives preserve old versions of files or projects.
  • Useful for long-term storage of software releases, documents, or logs.

📌 Example:
- Archiving a software release:

zip -r software_v1.0.zip source_code/


7. Encryption & Security

  • Archives can be encrypted for secure storage and transfer.
  • Some formats (e.g., zip, 7z) support built-in encryption. The traditional zip encryption applied by zip -e is weak by modern standards; 7z uses AES-256.

📌 Example:
- Encrypting a .zip file with a password:

zip -e secure_docs.zip confidential.txt


Conclusion

File archives are essential for storage, compression, backup, software distribution, and security. Different formats (tar, zip, gzip, bzip2, 7z) serve different needs based on compression efficiency, compatibility, and encryption support.