Answers
Quick Questions
1. Which command shows all running processes with their resource usage in real-time?
htop provides an interactive, real-time view of processes with resource usage. Alternatives:
- top - classic process monitor
- btop - modern resource monitor with richer visualization
- glances - all-in-one system monitor
- ps aux - static snapshot of all processes
2. How do you make a script executable for the owner only while keeping it readable by the group?
chmod 740 script.sh sets permissions to rwxr----- (owner: read/write/execute; group: read only; others: no access). The often-quoted chmod 750 would also grant the group execute permission, which the question's "executable for the owner only" rules out.
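Octal modes can be hard to read at a glance; a small Python sketch (using a throwaway temp file rather than a real script) that renders permission bits as an rwx string and applies a mode:

```python
import os
import stat
import tempfile

def mode_string(mode):
    """Render permission bits like 0o740 as an rwx string."""
    bits = 'rwxrwxrwx'
    return ''.join(b if mode & (1 << (8 - i)) else '-' for i, b in enumerate(bits))

print(mode_string(0o750))  # rwxr-x---
print(mode_string(0o740))  # rwxr-----

# Apply a mode to a throwaway file and read it back
fd, path = tempfile.mkstemp(suffix=".sh")
os.close(fd)
os.chmod(path, 0o740)
print(mode_string(stat.S_IMODE(os.stat(path).st_mode)))  # rwxr-----
os.remove(path)
```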
3. What’s the difference between apt, snap, and container-based package management?
- apt: traditional package manager using shared system libraries; installs system-wide
- snap: self-contained packages with bundled dependencies and sandboxed execution
- Container packages: Docker/Podman images shipping complete runtime environments
4. Which command shows real-time system resource usage including CPU, memory, and I/O?
htop for processes, iotop for disk I/O, nethogs for per-process network usage. The modern alternative btop combines these metrics in one view.
5. How do you create a service user for a web application with no shell access?
sudo useradd -r -s /usr/sbin/nologin -d /var/lib/webapp webapp creates a system user without shell access.
6. What does /etc/fstab contain and why is it critical for system boot?
Mount point definitions that specify which filesystems to mount at boot, their options, and mount order. Critical for system startup.
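Each fstab entry has six whitespace-separated fields: device, mount point, filesystem type, mount options, dump flag, and fsck pass order. A minimal Python sketch that parses fstab-style lines; the sample UUID is made up:

```python
def parse_fstab_line(line):
    """Return a dict of the six fstab fields, or None for blanks/comments."""
    line = line.strip()
    if not line or line.startswith('#'):
        return None
    fields = line.split()
    keys = ['device', 'mountpoint', 'fstype', 'options', 'dump', 'pass']
    return dict(zip(keys, fields))

entry = parse_fstab_line("UUID=abcd-1234 / ext4 defaults,noatime 0 1")
print(entry['mountpoint'], entry['fstype'])  # / ext4
```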
7. How do you follow logs from multiple services simultaneously?
journalctl -f -u nginx -u postgres or tail -f /var/log/{nginx,postgres}/* for multiple log files.
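When each line begins with a sortable timestamp, separate logs can also be interleaved chronologically; a sketch using heapq.merge on made-up log lines (real code would pass open file handles instead of lists):

```python
import heapq

# Each "file" is a list of lines already sorted by their ISO-timestamp
# prefix, as most daemons emit them.
nginx_lines = ["2024-01-01T10:00:01 nginx started",
               "2024-01-01T10:00:05 nginx request"]
postgres_lines = ["2024-01-01T10:00:03 postgres ready"]

# heapq.merge lazily interleaves the sorted streams by comparing lines;
# this works because the timestamp prefix sorts lexicographically.
for line in heapq.merge(nginx_lines, postgres_lines):
    print(line)
```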
8. Which command identifies which process is using a specific network port?
lsof -i :PORT or ss -tulpn | grep PORT shows the process using a specific port (run with sudo to see processes owned by other users).
9. What’s the difference between systemctl enable and systemctl start?
systemctl start: Starts service immediately (current session)
systemctl enable: Configures service to start automatically at boot
10. How do you securely store secrets in configuration files with proper permissions?
Store in separate files with chmod 600 (owner read/write only), use environment variables, or dedicated secret management tools like HashiCorp Vault.
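A program loading such a secret can refuse to proceed when the file's permissions are too broad; a minimal sketch (the helper name load_secret and the demo temp file are illustrative):

```python
import os
import stat
import tempfile

def load_secret(path):
    """Read a secret file, refusing if it is accessible to group/others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        raise PermissionError(f"{path} has loose permissions {oct(mode)}")
    with open(path) as f:
        return f.read().strip()

# Demo with a throwaway file; real code would point at a config path.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("s3cret\n")
os.chmod(path, 0o600)
print(load_secret(path))  # s3cret
os.remove(path)
```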
Task Solutions
Task 1: User Management Automation
```python
#!/usr/bin/env python3
# user_manager.py
import subprocess

def create_user(username, groups=None, home_dir=True):
    """Create a new user with optional supplementary groups."""
    cmd = ['sudo', 'useradd']
    if home_dir:
        cmd.append('-m')
    if groups:
        cmd.extend(['-G', ','.join(groups)])
    cmd.append(username)
    try:
        subprocess.run(cmd, check=True)
        print(f"User {username} created successfully")
    except subprocess.CalledProcessError as e:
        print(f"Error creating user: {e}")

def add_to_group(username, group):
    """Add a user to a supplementary group."""
    try:
        subprocess.run(['sudo', 'usermod', '-a', '-G', group, username], check=True)
        print(f"Added {username} to group {group}")
    except subprocess.CalledProcessError as e:
        print(f"Error adding to group: {e}")

if __name__ == "__main__":
    create_user("serviceuser", groups=["docker", "sudo"], home_dir=True)
```
Task 2: System Monitoring Dashboard
```python
#!/usr/bin/env python3
# system_monitor.py
import os
import time

import psutil

GB = 1024 ** 3

def display_system_info():
    """Display real-time system metrics."""
    while True:
        os.system('clear')
        # CPU usage (interval=1 samples over one second)
        cpu_percent = psutil.cpu_percent(interval=1)
        print(f"CPU Usage: {cpu_percent}%")
        # Memory usage
        memory = psutil.virtual_memory()
        print(f"Memory: {memory.percent}% ({memory.used/GB:.1f}GB / {memory.total/GB:.1f}GB)")
        # Disk usage
        disk = psutil.disk_usage('/')
        print(f"Disk: {disk.percent}% ({disk.used/GB:.1f}GB / {disk.total/GB:.1f}GB)")
        # Network
        net_io = psutil.net_io_counters()
        print(f"Network: {net_io.bytes_sent/1024/1024:.1f}MB sent, "
              f"{net_io.bytes_recv/1024/1024:.1f}MB received")
        # Top processes, sorted by CPU so we really show the top five
        # (cpu_percent is measured since the previous loop iteration)
        print("\nTop 5 CPU processes:")
        procs = []
        for proc in psutil.process_iter(['pid', 'name', 'cpu_percent']):
            try:
                procs.append(proc.info)
            except (psutil.NoSuchProcess, psutil.AccessDenied):
                pass
        procs.sort(key=lambda p: p['cpu_percent'] or 0, reverse=True)
        for info in procs[:5]:
            print(f"  {info['pid']}: {info['name']} ({info['cpu_percent']}%)")
        time.sleep(5)

if __name__ == "__main__":
    display_system_info()
```
Task 3: Log Analysis Tool
```python
#!/usr/bin/env python3
# log_analyzer.py
import re
import sys
from collections import defaultdict

# Compile the IP pattern once instead of on every line
IP_PATTERN = re.compile(r'\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b')

def analyze_logs(log_file):
    """Analyze a log file for errors, warnings, and frequent IP addresses."""
    error_count = 0
    warning_count = 0
    ip_addresses = defaultdict(int)
    error_messages = []
    try:
        with open(log_file, 'r') as f:
            for line in f:
                # Count errors and warnings
                if 'ERROR' in line.upper():
                    error_count += 1
                    error_messages.append(line.strip())
                elif 'WARNING' in line.upper():
                    warning_count += 1
                # Extract IP addresses
                for ip in IP_PATTERN.findall(line):
                    ip_addresses[ip] += 1
    except FileNotFoundError:
        print(f"Log file {log_file} not found")
        return
    print(f"Log Analysis for {log_file}")
    print(f"Errors: {error_count}")
    print(f"Warnings: {warning_count}")
    print("Top IP addresses:")
    for ip, count in sorted(ip_addresses.items(), key=lambda x: x[1], reverse=True)[:5]:
        print(f"  {ip}: {count} occurrences")
    if error_messages:
        print("\nRecent errors:")
        for error in error_messages[-3:]:
            print(f"  {error}")

if __name__ == "__main__":
    log_file = sys.argv[1] if len(sys.argv) > 1 else "/var/log/syslog"
    analyze_logs(log_file)
```
Task 4: Automated Backup System
```python
#!/usr/bin/env python3
# backup_system.py
import datetime
import os
import tarfile

def create_backup(source_dirs, backup_dir="/backup"):
    """Create a compressed backup of the specified directories."""
    timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    backup_name = f"system_backup_{timestamp}.tar.gz"
    backup_path = os.path.join(backup_dir, backup_name)
    # Ensure the backup directory exists
    os.makedirs(backup_dir, exist_ok=True)
    try:
        with tarfile.open(backup_path, "w:gz") as tar:
            for source_dir in source_dirs:
                if os.path.exists(source_dir):
                    tar.add(source_dir, arcname=os.path.basename(source_dir))
                    print(f"Added {source_dir} to backup")
        print(f"Backup created: {backup_path}")
        # Clean up old backups (keep the last 5)
        cleanup_old_backups(backup_dir, keep=5)
    except Exception as e:
        print(f"Backup failed: {e}")

def cleanup_old_backups(backup_dir, keep=5):
    """Remove old backup files, keeping only the most recent ones."""
    backups = [f for f in os.listdir(backup_dir) if f.startswith("system_backup_")]
    backups.sort(reverse=True)
    for old_backup in backups[keep:]:
        os.remove(os.path.join(backup_dir, old_backup))
        print(f"Removed old backup: {old_backup}")

if __name__ == "__main__":
    important_dirs = ["/etc", "/home", "/var/log"]
    create_backup(important_dirs)
```
Task 5: Process Management Utility
```python
#!/usr/bin/env python3
# process_manager.py
import sys

import psutil

def find_processes(name_pattern):
    """Find processes whose name matches a pattern (case-insensitive)."""
    matching_procs = []
    for proc in psutil.process_iter(['pid', 'name', 'memory_percent', 'cpu_percent']):
        try:
            if name_pattern.lower() in proc.info['name'].lower():
                matching_procs.append(proc)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            pass
    return matching_procs

def terminate_process(pid, force=False):
    """Safely terminate a process (SIGTERM, or SIGKILL when force=True)."""
    try:
        proc = psutil.Process(pid)
        if force:
            proc.kill()
            print(f"Forcefully killed process {pid}")
        else:
            proc.terminate()
            print(f"Terminated process {pid}")
    except psutil.NoSuchProcess:
        print(f"Process {pid} not found")
    except psutil.AccessDenied:
        print(f"Access denied to process {pid}")

def monitor_process(pid):
    """Print resource usage for a specific process."""
    try:
        proc = psutil.Process(pid)
        print(f"Process {pid} ({proc.name()}):")
        print(f"  CPU: {proc.cpu_percent()}%")
        print(f"  Memory: {proc.memory_percent():.1f}%")
        print(f"  Status: {proc.status()}")
    except psutil.NoSuchProcess:
        print(f"Process {pid} not found")

if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python3 process_manager.py <process_name>")
        sys.exit(1)
    for proc in find_processes(sys.argv[1]):
        monitor_process(proc.info['pid'])
```
Task 6: Service Health Monitor
```python
#!/usr/bin/env python3
# service_monitor.py
import smtplib
import subprocess
import time
from email.mime.text import MIMEText

def check_service_status(service_name):
    """Return True if a systemd service is active."""
    try:
        result = subprocess.run(['systemctl', 'is-active', service_name],
                                capture_output=True, text=True)
        return result.stdout.strip() == 'active'
    except Exception:
        return False

def restart_service(service_name):
    """Restart a systemd service."""
    try:
        subprocess.run(['sudo', 'systemctl', 'restart', service_name], check=True)
        print(f"Restarted service: {service_name}")
        return True
    except subprocess.CalledProcessError:
        print(f"Failed to restart service: {service_name}")
        return False

def send_alert(service_name, status):
    """Send an email alert about a service's status."""
    # Configure your SMTP settings here
    smtp_server = "localhost"
    from_email = "admin@localhost"
    to_email = "alerts@localhost"
    subject = f"Service Alert: {service_name}"
    body = f"Service {service_name} is {status}"
    try:
        msg = MIMEText(body)
        msg['Subject'] = subject
        msg['From'] = from_email
        msg['To'] = to_email
        with smtplib.SMTP(smtp_server) as server:
            server.send_message(msg)
        print(f"Alert sent for {service_name}")
    except Exception as e:
        print(f"Failed to send alert: {e}")

def monitor_services(services):
    """Monitor a list of critical services, restarting any that go down."""
    while True:
        for service in services:
            if not check_service_status(service):
                print(f"Service {service} is down, attempting restart...")
                if restart_service(service):
                    send_alert(service, "restarted")
                else:
                    send_alert(service, "failed to restart")
            else:
                print(f"Service {service} is running")
        time.sleep(60)  # Check every minute

if __name__ == "__main__":
    critical_services = ["nginx", "docker", "sshd"]
    monitor_services(critical_services)
```
Task 7: Disk Space Management
```python
#!/usr/bin/env python3
# disk_manager.py
import os
import shutil

GB = 1024 ** 3
MB = 1024 ** 2

def get_directory_size(path):
    """Calculate the total size of a directory tree in bytes."""
    total_size = 0
    try:
        for dirpath, dirnames, filenames in os.walk(path):
            for filename in filenames:
                filepath = os.path.join(dirpath, filename)
                try:
                    total_size += os.path.getsize(filepath)
                except OSError:
                    pass
    except OSError:
        pass
    return total_size

def find_large_files(path, min_size_mb=100):
    """Find files larger than the specified size, largest first."""
    large_files = []
    min_size_bytes = min_size_mb * MB
    try:
        for root, dirs, files in os.walk(path):
            for file in files:
                filepath = os.path.join(root, file)
                try:
                    size = os.path.getsize(filepath)
                    if size > min_size_bytes:
                        large_files.append((filepath, size))
                except OSError:
                    pass
    except OSError:
        pass
    return sorted(large_files, key=lambda x: x[1], reverse=True)

def analyze_disk_usage(path="/"):
    """Analyze disk usage and report the biggest consumers."""
    total, used, free = shutil.disk_usage(path)
    usage_percent = (used / total) * 100
    print(f"Disk Usage Analysis for {path}")
    print(f"Total: {total/GB:.1f} GB")
    print(f"Used: {used/GB:.1f} GB ({usage_percent:.1f}%)")
    print(f"Free: {free/GB:.1f} GB")
    if usage_percent > 90:
        print("WARNING: Disk usage above 90%!")
    # Report the size of a few commonly bloated directories
    print("\nLargest directories:")
    dirs_to_check = ["/var/log", "/tmp", "/home", "/var/cache"]
    for dir_path in dirs_to_check:
        if os.path.exists(dir_path):
            size = get_directory_size(dir_path)
            print(f"  {dir_path}: {size/MB:.1f} MB")
    # Find large files
    print("\nLarge files (>100MB):")
    for filepath, size in find_large_files(path, min_size_mb=100)[:10]:
        print(f"  {filepath}: {size/MB:.1f} MB")

def cleanup_temp_files():
    """Clean up temporary files.

    Caution: this removes every regular file in /tmp and /var/tmp,
    including files in active use; a production script should filter
    by age (e.g. mtime older than a few days).
    """
    temp_dirs = ["/tmp", "/var/tmp"]
    cleaned_size = 0
    for temp_dir in temp_dirs:
        if os.path.exists(temp_dir):
            for item in os.listdir(temp_dir):
                item_path = os.path.join(temp_dir, item)
                try:
                    if os.path.isfile(item_path):
                        size = os.path.getsize(item_path)
                        os.remove(item_path)
                        cleaned_size += size
                except OSError:
                    pass
    print(f"Cleaned {cleaned_size/MB:.1f} MB of temporary files")

if __name__ == "__main__":
    analyze_disk_usage()
    cleanup_temp_files()
```
Task 8: Network Configuration Manager
```python
#!/usr/bin/env python3
# network_manager.py
import re
import socket
import subprocess

def get_network_interfaces():
    """Get all network interfaces, their state, and primary IPv4 address."""
    try:
        result = subprocess.run(['ip', 'addr', 'show'], capture_output=True, text=True)
        interfaces = {}
        current_interface = None
        for line in result.stdout.split('\n'):
            # Match an interface header line, e.g. "2: eth0: <BROADCAST,...>"
            interface_match = re.match(r'^\d+: ([^:]+):', line)
            if interface_match:
                current_interface = interface_match.group(1)
                interfaces[current_interface] = {'ip': None, 'status': 'down'}
            # Match an IPv4 address line
            if current_interface and 'inet ' in line:
                ip_match = re.search(r'inet ([^/]+)', line)
                if ip_match:
                    interfaces[current_interface]['ip'] = ip_match.group(1)
            # Check whether the interface is up
            if current_interface and 'state UP' in line:
                interfaces[current_interface]['status'] = 'up'
        return interfaces
    except Exception as e:
        print(f"Error getting network interfaces: {e}")
        return {}

def test_connectivity(host="8.8.8.8", port=53, timeout=5):
    """Test TCP connectivity to a host, closing the socket afterwards."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def check_dns_resolution(domain="google.com"):
    """Test DNS resolution."""
    try:
        socket.gethostbyname(domain)
        return True
    except socket.gaierror:
        return False

def network_diagnostics():
    """Run comprehensive network diagnostics."""
    print("Network Diagnostics Report")
    print("=" * 30)
    # Check interfaces
    print("Network Interfaces:")
    for interface, info in get_network_interfaces().items():
        print(f"  {interface}: {info['status']} - {info['ip'] or 'No IP'}")
    # Test connectivity
    print("\nConnectivity Tests:")
    print(f"  Internet (8.8.8.8): {'✓' if test_connectivity() else '✗'}")
    print(f"  DNS Resolution: {'✓' if check_dns_resolution() else '✗'}")
    # Check routing
    try:
        result = subprocess.run(['ip', 'route', 'show', 'default'],
                                capture_output=True, text=True)
        if result.stdout:
            print(f"  Default Gateway: {result.stdout.strip()}")
        else:
            print("  Default Gateway: Not configured")
    except Exception:
        print("  Default Gateway: Unable to check")

if __name__ == "__main__":
    network_diagnostics()
```
Task 9: Package Management Automation
```python
#!/usr/bin/env python3
# package_manager.py
import shutil
import subprocess
import sys

class PackageManager:
    def __init__(self):
        self.detect_package_manager()

    def detect_package_manager(self):
        """Detect the available package managers."""
        managers = {
            'apt': ['apt', 'apt-get'],
            'dnf': ['dnf'],
            'yum': ['yum'],
            'pacman': ['pacman'],
            'snap': ['snap'],
        }
        self.available_managers = {}
        for manager, commands in managers.items():
            for cmd in commands:
                if self.command_exists(cmd):
                    self.available_managers[manager] = cmd
                    break

    def command_exists(self, command):
        """Check if a command exists in PATH."""
        # shutil.which is more portable than shelling out to `which`
        return shutil.which(command) is not None

    def update_package_list(self):
        """Update the package list/cache."""
        if 'apt' in self.available_managers:
            return self.run_command(['sudo', 'apt', 'update'])
        elif 'dnf' in self.available_managers:
            # Note: dnf/yum check-update exit with code 100 when updates
            # are available, which run_command will report as a failure
            return self.run_command(['sudo', 'dnf', 'check-update'])
        elif 'yum' in self.available_managers:
            return self.run_command(['sudo', 'yum', 'check-update'])
        return False

    def install_package(self, package_name):
        """Install a package using the appropriate manager."""
        if 'apt' in self.available_managers:
            return self.run_command(['sudo', 'apt', 'install', '-y', package_name])
        elif 'dnf' in self.available_managers:
            return self.run_command(['sudo', 'dnf', 'install', '-y', package_name])
        elif 'yum' in self.available_managers:
            return self.run_command(['sudo', 'yum', 'install', '-y', package_name])
        elif 'snap' in self.available_managers:
            return self.run_command(['sudo', 'snap', 'install', package_name])
        return False

    def remove_package(self, package_name):
        """Remove a package."""
        if 'apt' in self.available_managers:
            return self.run_command(['sudo', 'apt', 'remove', '-y', package_name])
        elif 'dnf' in self.available_managers:
            return self.run_command(['sudo', 'dnf', 'remove', '-y', package_name])
        elif 'yum' in self.available_managers:
            return self.run_command(['sudo', 'yum', 'remove', '-y', package_name])
        return False

    def run_command(self, command):
        """Execute a command and return success status."""
        try:
            subprocess.run(command, check=True, capture_output=True, text=True)
            print(f"Command successful: {' '.join(command)}")
            return True
        except subprocess.CalledProcessError as e:
            print(f"Command failed: {' '.join(command)}")
            print(f"Error: {e.stderr}")
            return False

if __name__ == "__main__":
    pm = PackageManager()
    print(f"Available package managers: {list(pm.available_managers.keys())}")
    if len(sys.argv) > 2:
        action = sys.argv[1]
        package = sys.argv[2]
        if action == "install":
            pm.update_package_list()
            pm.install_package(package)
        elif action == "remove":
            pm.remove_package(package)
        else:
            print("Usage: python3 package_manager.py [install|remove] <package_name>")
```
Task 10: System Security Auditor
```python
#!/usr/bin/env python3
# security_auditor.py
import os
import pwd
import re
import stat
import subprocess

def check_user_permissions():
    """Audit user accounts and permissions."""
    print("User Security Audit")
    print("-" * 20)
    # Check for users with UID 0 (root privileges)
    root_users = [user.pw_name for user in pwd.getpwall() if user.pw_uid == 0]
    print(f"Users with root privileges: {root_users}")
    # Check for users with empty passwords
    try:
        with open('/etc/shadow', 'r') as f:
            empty_passwords = []
            for line in f:
                fields = line.strip().split(':')
                if len(fields) > 1 and fields[1] == '':
                    empty_passwords.append(fields[0])
        print(f"Users with empty passwords: {empty_passwords}")
    except PermissionError:
        print("Cannot check password file (insufficient permissions)")

def check_file_permissions():
    """Check critical file permissions."""
    print("\nFile Permission Audit")
    print("-" * 22)
    critical_files = {
        '/etc/passwd': 0o644,
        '/etc/shadow': 0o640,
        '/etc/group': 0o644,
        '/etc/sudoers': 0o440,
        '/etc/ssh/sshd_config': 0o644,
    }
    for filepath, expected_perms in critical_files.items():
        if os.path.exists(filepath):
            current_perms = stat.S_IMODE(os.stat(filepath).st_mode)
            if current_perms != expected_perms:
                print(f"WARNING: {filepath} has permissions {oct(current_perms)}, "
                      f"expected {oct(expected_perms)}")
            else:
                print(f"OK: {filepath} permissions correct")
        else:
            print(f"INFO: {filepath} not found")

def check_open_ports():
    """Check for open network ports."""
    print("\nOpen Ports Audit")
    print("-" * 17)
    try:
        result = subprocess.run(['ss', '-tuln'], capture_output=True, text=True)
        listening_ports = []
        for line in result.stdout.split('\n'):
            if 'LISTEN' in line:
                parts = line.split()
                if len(parts) >= 5:
                    address = parts[4]
                    if ':' in address:
                        listening_ports.append(address.split(':')[-1])
        print(f"Listening ports: {sorted(set(listening_ports))}")
        # Check for potentially dangerous ports (telnet, ftp, tftp, RPC/SMB)
        dangerous_ports = ['23', '21', '69', '135', '139', '445']
        open_dangerous = [port for port in listening_ports if port in dangerous_ports]
        if open_dangerous:
            print(f"WARNING: Potentially dangerous ports open: {open_dangerous}")
    except Exception as e:
        print(f"Error checking ports: {e}")

def check_ssh_config():
    """Check SSH configuration security."""
    print("\nSSH Configuration Audit")
    print("-" * 24)
    ssh_config = '/etc/ssh/sshd_config'
    if not os.path.exists(ssh_config):
        print("SSH config not found")
        return
    security_checks = {
        'PermitRootLogin': 'no',
        'PasswordAuthentication': 'no',
        'PermitEmptyPasswords': 'no',
        'X11Forwarding': 'no',
    }
    try:
        with open(ssh_config, 'r') as f:
            # Only consider active (non-comment) directives, so a commented
            # "#PermitRootLogin no" does not count as securely configured
            active = {}
            for line in f:
                if line.lstrip().startswith('#'):
                    continue
                match = re.match(r'^\s*(\w+)\s+(\S+)', line)
                if match:
                    active[match.group(1)] = match.group(2).lower()
        for setting, secure_value in security_checks.items():
            if active.get(setting) == secure_value:
                print(f"OK: {setting} is securely configured")
            else:
                print(f"WARNING: {setting} may not be securely configured")
    except PermissionError:
        print("Cannot read SSH config (insufficient permissions)")

def security_recommendations():
    """Provide security recommendations."""
    print("\nSecurity Recommendations")
    print("-" * 25)
    print("1. Regularly update system packages")
    print("2. Use strong passwords and prefer key-based SSH authentication")
    print("3. Configure a firewall (ufw/iptables) to restrict access")
    print("4. Monitor system logs for suspicious activity")
    print("5. Disable unnecessary services")
    print("6. Run regular security audits and penetration tests")
    print("7. Deploy an intrusion prevention tool such as fail2ban")
    print("8. Keep backups and test recovery procedures")

if __name__ == "__main__":
    print("Linux Security Audit Report")
    print("=" * 30)
    check_user_permissions()
    check_file_permissions()
    check_open_ports()
    check_ssh_config()
    security_recommendations()
```
Discussion Topics
1. Linux distribution selection strategy
Consider these factors when choosing distributions:
- Stability vs features: Ubuntu LTS/RHEL for production stability; Fedora or the latest Ubuntu release for newer features
- Package management: apt (Debian-based) vs yum/dnf (Red Hat-based) vs pacman (Arch)
- Support lifecycle: enterprise distributions offer 5-10 year support cycles
- Container optimization: Alpine Linux for minimal container images; Fedora CoreOS for container-focused infrastructure
- Development vs production: development flexibility and production reliability impose different requirements
2. System hardening and security practices
Comprehensive security strategy should include:
- Access control: principle of least privilege, regular user audits, strong authentication
- Network security: firewall configuration, VPN access, network segmentation
- System monitoring: log analysis, intrusion detection, file integrity monitoring
- Regular updates: automated security patches, vulnerability scanning
- Backup strategy: regular backups, tested recovery procedures, offline backup storage
3. Performance optimization methodology
Systematic approach to performance issues:
- Baseline measurement: establish performance metrics before optimizing
- Bottleneck identification: CPU, memory, disk I/O, and network analysis
- Tool selection: htop, iotop, nethogs, perf, strace for different scenarios
- Incremental changes: make one change at a time and measure its impact
- Monitoring: continuous monitoring to catch performance degradation
4. Automation vs manual administration
Automation decision framework:
- Automate first: repetitive tasks, routine maintenance, deployment procedures
- Keep manual: complex troubleshooting, one-off configuration changes, emergency response
- ROI calculation: time invested vs time saved, plus error-reduction benefits
- Start small: begin with simple scripts, then expand to configuration management tools
- Documentation: maintain runbooks for both automated and manual procedures
5. Disaster recovery and business continuity
Comprehensive disaster recovery planning:
- Risk assessment: identify critical systems and acceptable downtime (RTO/RPO)
- Backup strategy: multiple backup types (full, incremental, differential) plus offsite storage
- Recovery testing: regular disaster recovery drills with documented procedures
- Infrastructure redundancy: high availability, geographic distribution, failover mechanisms
- Communication plan: incident response team, stakeholder notification, status updates