DevOps Engineering Guide
This guide covers core DevOps tools and practices: containerization with Docker, infrastructure as code with Terraform/OpenTofu, CI/CD pipelines with GitLab, runtime version management with ASDF, and environment management with Conda.
Docker Containerization
Docker provides containerization technology for packaging applications with their dependencies into standardized units.
Basic Docker Commands
Container Management:
# Run container
docker run <image-name>
docker run -d <image-name> # Detached mode
docker run -it <image-name> /bin/bash # Interactive terminal
docker run -p 8080:80 <image-name> # Port mapping
# List containers
docker ps # Running containers
docker ps -a # All containers
# Container operations
docker start <container-id>
docker stop <container-id>
docker restart <container-id>
docker rm <container-id> # Remove container
docker rm -f <container-id> # Force remove
# Execute into running container
docker exec -it <container-name> /bin/bash
Image Management:
# List images
docker images
# Build image
docker build -t <image-name> .
docker build -t <image-name>:<tag> .
# Pull/push images
docker pull <image-name>
docker push <image-name>
# Remove images
docker rmi <image-name>
docker image prune # Remove dangling images
System Management:
# System cleanup
docker system prune # Remove stopped containers, unused networks, dangling images, and unused build cache
docker system prune -a # Also remove all images not used by any container
docker system prune -a -f # Same as above, without the confirmation prompt
# System information
docker system df # Disk usage
docker system info # System info
Dockerfile Best Practices
Basic Dockerfile:
FROM ubuntu:20.04
# Set working directory
WORKDIR /app
# Install dependencies
RUN apt-get update && apt-get install -y \
curl \
git \
&& rm -rf /var/lib/apt/lists/*
# Copy application code
COPY . .
# Expose port
EXPOSE 8080
# Define startup command
CMD ["./start.sh"]
Multi-stage Build:
# Build stage
FROM golang:1.19-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o main .
# Production stage
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/main .
EXPOSE 8080
CMD ["./main"]
Docker Compose
Basic docker-compose.yml:
version: '3.8'
services:
  web:
    build: .
    ports:
      - "8080:8080"
    environment:
      - NODE_ENV=production
    depends_on:
      - db
  db:
    image: postgres:13
    environment:
      POSTGRES_DB: mydb
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
    volumes:
      - postgres_data:/var/lib/postgresql/data
volumes:
  postgres_data:
Docker Compose Commands:
# Compose V2 is invoked as "docker compose" (with a space); the subcommands below are the same
# Start services
docker-compose up
docker-compose up -d # Detached mode
# Stop services
docker-compose down
# Build and start
docker-compose up --build
# View logs
docker-compose logs
docker-compose logs -f # Follow logs
# Execute commands
docker-compose exec web bash
Infrastructure as Code with Terraform/OpenTofu
Terraform and OpenTofu provide infrastructure as code for provisioning and managing cloud resources.
Basic Terraform Commands
Initialization:
# Initialize working directory
terraform init
# Format configuration files
terraform fmt
# Validate configuration
terraform validate
Planning and Applying:
# Create execution plan
terraform plan
terraform plan -out=tfplan # Save plan to file
# Apply changes
terraform apply
terraform apply tfplan # Apply saved plan
# Destroy infrastructure
terraform destroy
State Management:
# Show state
terraform show
# List resources in state
terraform state list
# Remove resource from state
terraform state rm <resource>
# Import existing resource
terraform import <resource.address> <resource.id>
Terragrunt Commands
Terragrunt is a thin wrapper for Terraform that provides extra tools for working with multiple Terraform modules.
Basic Commands:
# Run Terraform commands through Terragrunt
terragrunt plan
terragrunt apply
terragrunt destroy
# Show the current state or a saved plan (passes through to terraform show)
terragrunt show
# Force unlock state (for stuck locks)
terragrunt force-unlock <lock-id>
Terraform Configuration Examples
Basic EC2 Instance:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}
provider "aws" {
  region = "us-east-1"
}
resource "aws_instance" "web" {
  ami           = "ami-0c55b159cbfafe1d0"
  instance_type = "t2.micro"
  tags = {
    Name = "WebServer"
  }
}
Variables and Outputs:
variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t2.micro"
}
variable "environment" {
  description = "Environment name"
  type        = string
}
output "instance_id" {
  description = "EC2 instance ID"
  value       = aws_instance.web.id
}
output "public_ip" {
  description = "Public IP address"
  value       = aws_instance.web.public_ip
}
Modules:
module "vpc" {
  source      = "./modules/vpc"
  cidr_block  = "10.0.0.0/16"
  environment = var.environment
}
module "ec2" {
  source        = "./modules/ec2"
  vpc_id        = module.vpc.vpc_id
  instance_type = var.instance_type
}
OpenTofu Specific Features
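OpenTofu is an open-source fork of Terraform, maintained under the Linux Foundation, that also adds features of its own. Its CLI, tofu, accepts the same core commands (tofu init, tofu plan, tofu apply).
External Data Source (provided by the external provider, which works with both OpenTofu and Terraform):
# The program must print a single JSON object whose values are all strings
data "external" "list_files" {
  program = ["bash", "-c", "ls -la ../../../ | jq -R -s '{stdout: .}'"]
}
resource "local_file" "list_files_output" {
  content  = data.external.list_files.result["stdout"]
  filename = "list_files_output.txt"
}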
CI/CD with GitLab
GitLab provides comprehensive CI/CD capabilities for automated testing, building, and deployment.
GitLab CI/CD Pipeline
Basic .gitlab-ci.yml:
stages:
  - test
  - build
  - deploy
variables:
  DOCKER_IMAGE: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_SLUG
test:
  stage: test
  image: node:16
  script:
    - npm install
    - npm run test
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - node_modules/
build:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  script:
    # Pass the password on stdin so it does not appear in the process list
    - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
    - docker build -t $DOCKER_IMAGE .
    - docker push $DOCKER_IMAGE
deploy:
  stage: deploy
  script:
    - echo "Deploy to production"
  environment:
    name: production
    url: https://myapp.com
  only:
    - main
GitLab Container Registry
Registry Management:
# List repositories in a project
curl --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
  "https://gitlab.example.com/api/v4/projects/<project-id>/registry/repositories"
# Get details for repository 31, including its tags and size
curl --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
  "https://gitlab.example.com/api/v4/registry/repositories/31?tags=true&tags_count=true&size=true"
# Bulk-delete tags by regex (name_regex_delete=.* matches every tag, so only older_than limits the deletion)
curl --request DELETE \
  --data 'name_regex_delete=.*' \
  --data 'older_than=3months' \
  --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
  "https://gitlab.example.com/api/v4/projects/<project-id>/registry/repositories/31/tags"
Advanced Pipeline Features
Parallel Jobs:
stages:
  - test
test:
  stage: test
  parallel: 3
  script:
    - npm run test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL
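GitLab sets CI_NODE_INDEX (1-based) and CI_NODE_TOTAL in each parallel job. For a test runner without a built-in --shard option, the split can be done by hand. The following is a minimal sketch, assuming test file paths arrive one per line on standard input; shard_files is a hypothetical helper name, not part of GitLab or npm.

```shell
#!/usr/bin/env sh
# Print only the input lines that belong to one shard (round-robin split).
# $1 = node index (1-based, as in CI_NODE_INDEX), $2 = node total.
shard_files() {
  awk -v idx="$1" -v total="$2" '(NR - 1) % total == idx - 1'
}
```

A job could then run something like find tests -name '*.test.js' | sort | shard_files "$CI_NODE_INDEX" "$CI_NODE_TOTAL", where the tests directory and file pattern are assumptions. Sorting first keeps the assignment stable across jobs.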
Environments and Deployments:
deploy_staging:
  stage: deploy
  script:
    - ./deploy.sh staging
  environment:
    name: staging
    url: https://staging.example.com
  only:
    - develop
deploy_production:
  stage: deploy
  script:
    - ./deploy.sh production
  environment:
    name: production
    url: https://example.com
  when: manual
  only:
    - main
Include and Extends:
include:
  - template: Security/SAST.gitlab-ci.yml
  - local: .gitlab/ci/test.gitlab-ci.yml
# Hidden job (leading dot) that holds the shared configuration
.test_template:
  image: node:16
  before_script:
    - npm install
unit_test:
  extends: .test_template
  stage: test
  script:
    - npm run test:unit
integration_test:
  extends: .test_template
  stage: test
  script:
    - npm run test:integration
Version Management with ASDF
ASDF is a version manager for multiple runtime versions on a single machine.
ASDF Installation and Setup
Plugin Management:
# Add plugins
asdf plugin add <name>
asdf plugin add terraform https://github.com/Banno/asdf-hashicorp.git
# List all available plugins
asdf plugin list all
# Update plugins
asdf plugin update --all
Version Installation:
# List available versions
asdf list all <plugin>
asdf list all terraform
# Install specific version
asdf install <plugin> <version>
asdf install terraform 1.7.0
# Install from .tool-versions
asdf install
Version Management:
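# Set local version (project-specific, written to .tool-versions in the current directory)
# Note: asdf 0.16 and later replace "asdf local" and "asdf global" with "asdf set"
asdf local <plugin> <version>
asdf local terraform 1.7.0
# Set global version (per user, written to ~/.tool-versions)
asdf global <plugin> <version>
asdf global terraform 1.7.0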
# Show current versions
asdf current
asdf current terraform
Uninstallation:
# Uninstall specific version
asdf uninstall <plugin> <version>
asdf uninstall opentofu 1.8.1
# Uninstall several versions: asdf uninstall takes one version per call, so list the installed versions and remove each one
asdf list <plugin>
.tool-versions File
Project-specific version pinning:
nodejs 18.17.0
terraform 1.5.0
python 3.11.4
awscli 2.13.0
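Each line of the file is a tool name followed by a version, separated by whitespace. A minimal sketch of reading those pins from a script (for example, to print them in CI logs); list_pinned_tools is a hypothetical helper name, and skipping lines that start with # assumes comments are written that way.

```shell
#!/usr/bin/env sh
# Print "tool=version" for each entry in a .tool-versions file.
# $1 = path to the file (defaults to .tool-versions in the current directory).
list_pinned_tools() {
  while read -r tool version _rest || [ -n "$tool" ]; do
    # Skip blank lines and comment lines.
    case "$tool" in
      ''|'#'*) continue ;;
    esac
    printf '%s=%s\n' "$tool" "$version"
  done < "${1:-.tool-versions}"
}
```

Only the first version on each line is reported; any fallback versions after it are ignored in this sketch.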
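Legacy version file (for example .nvmrc or .node-version; holds a single version and is read only when legacy_version_file = yes is set in ~/.asdfrc and the plugin supports it):
18.17.0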
Environment Management with Conda
Conda is a package and environment management system for Python and other languages.
Conda Environment Management
Initialization:
conda init
Environment Creation:
# Create environment with Python version
conda create -n <env-name> python=<version>
conda create -n teams python=3.13
conda create -n rag python=3.9.20
# Create empty environment
conda create --name s3commands
Environment Activation/Deactivation:
# Activate environment
conda activate <env-name>
conda activate di
# Deactivate current environment
conda deactivate
# Stack environments
conda activate --stack <env-name>
Package Installation:
# Install package in current environment
conda install <package>
conda install scipy
# Install in specific environment
conda install -n <env-name> <package>
conda install -n myenv scipy
Environment Management:
# List environments
conda env list
# Remove environment
conda remove -n <env-name> --all
conda env remove -n <env-name> # Equivalent command
# Clean unused packages
conda clean --all
Environment Files
environment.yml:
name: myenv
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.9
  - numpy
  - pandas
  - scikit-learn
  - pip
  - pip:
      - requests
      - flask
Create/Update from file:
# Create environment from file
conda env create -f environment.yml
# Update environment
conda env update -f environment.yml --prune
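The two commands above can be combined so that one script works whether or not the environment already exists. This is a minimal sketch, assuming the file has a top-level name: line; env_name and sync_env are hypothetical helper names.

```shell
#!/usr/bin/env sh
# Print the environment name declared in an environment.yml file.
env_name() {
  sed -n 's/^name:[[:space:]]*//p' "$1" | head -n 1
}

# Create the environment if it is missing, otherwise update it in place.
# Relies on `conda env list` printing the environment name in the first column.
sync_env() {
  if conda env list | awk '{print $1}' | grep -Fqx -- "$(env_name "$1")"; then
    conda env update -f "$1" --prune
  else
    conda env create -f "$1"
  fi
}
```

Calling sync_env environment.yml is then safe to repeat, which suits automated setup scripts.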
DevOps Best Practices
Infrastructure as Code
- Version Control: Keep all infrastructure code in Git
- Modular Design: Use modules for reusable components
- Testing: Test infrastructure changes before applying
- Documentation: Document infrastructure decisions and processes
- Security: Follow principle of least privilege
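One way to test infrastructure changes before applying them is a pre-flight script that checks formatting, validates the configuration, and produces a plan. The sketch below relies on terraform plan -detailed-exitcode, which exits with 0 for no changes, 1 for an error, and 2 when changes are pending. TF_BIN, describe_plan_status, and preflight are hypothetical names introduced for illustration.

```shell
#!/usr/bin/env sh
# TF_BIN is a hypothetical override so the same checks can run with `tofu`.
TF_BIN="${TF_BIN:-terraform}"

# Translate the exit status of `plan -detailed-exitcode` into a label.
describe_plan_status() {
  case "$1" in
    0) echo "no-changes" ;;
    1) echo "error" ;;
    2) echo "changes-pending" ;;
    *) echo "unknown" ;;
  esac
}

# Run formatting, validation, and a plan; assumes `init` has already run.
preflight() {
  "$TF_BIN" fmt -check -recursive || return 1
  "$TF_BIN" validate || return 1
  plan_status=0
  "$TF_BIN" plan -detailed-exitcode -out=tfplan || plan_status=$?
  describe_plan_status "$plan_status"
}
```

A pipeline could skip the apply step whenever preflight prints no-changes.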
CI/CD Pipeline Best Practices
- Fast Feedback: Run tests early in pipeline
- Parallel Execution: Run independent jobs in parallel
- Artifact Management: Store and version build artifacts
- Environment Promotion: Promote through dev → staging → production
- Rollback Strategy: Always have rollback plans
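The promotion order can be encoded once so that deployment scripts do not hard-code it in several places. This is a minimal sketch; next_environment is a hypothetical helper, and the three environment names are taken from the list above.

```shell
#!/usr/bin/env sh
# Print the environment that follows the given one in the promotion chain.
# Prints nothing for production (end of the chain); fails on unknown names.
next_environment() {
  case "$1" in
    dev) echo "staging" ;;
    staging) echo "production" ;;
    production) return 0 ;;
    *) echo "unknown environment: $1" >&2; return 1 ;;
  esac
}
```

Rejecting unknown names keeps a typo from being treated as a valid promotion target.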
Container Best Practices
- Minimal Images: Use minimal base images (Alpine, Distroless)
- Multi-stage Builds: Separate build and runtime stages
- Security Scanning: Scan images for vulnerabilities
- Resource Limits: Set CPU and memory limits
- Health Checks: Implement proper health checks
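Resource limits and health checks can both be set on the docker run command line with --memory, --cpus, and the --health-* flags. The sketch below only prints the command so that it can be reviewed before running. docker_run_cmd is a hypothetical helper, and the default limits, port 8080, and the presence of wget inside the image are assumptions.

```shell
#!/usr/bin/env sh
# Print a `docker run` command with resource limits and a health check.
# $1 = image, $2 = memory limit (default 256m), $3 = CPU limit (default 0.5).
docker_run_cmd() {
  printf 'docker run -d --memory=%s --cpus=%s ' "${2:-256m}" "${3:-0.5}"
  # The health command runs inside the container; a non-zero exit marks it unhealthy.
  printf '%s ' '--health-cmd="wget -q --spider http://localhost:8080/ || exit 1"'
  printf '%s\n' "--health-interval=30s --health-retries=3 $1"
}
```

The same health check can instead be declared in the Dockerfile with a HEALTHCHECK instruction, which keeps it with the image instead of the run command.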
Environment Management
- Reproducible Environments: Use version pinning and environment files
- Isolation: Keep environments separate for different projects
- Documentation: Document environment setup and dependencies
- Automation: Automate environment creation and updates
- Cleanup: Regularly clean up unused environments and packages
Integration Examples
Docker + Terraform Pipeline
.gitlab-ci.yml:
stages:
  - validate
  - plan
  - apply
validate:
  stage: validate
  image:
    name: hashicorp/terraform:latest
    # The image's entrypoint is the terraform binary; clear it so GitLab can run the script in a shell
    entrypoint: [""]
  script:
    - terraform init
    - terraform validate
    - terraform fmt -check
plan:
  stage: plan
  image:
    name: hashicorp/terraform:latest
    entrypoint: [""]
  script:
    - terraform init
    - terraform plan -out=tfplan
  artifacts:
    paths:
      - tfplan
apply:
  stage: apply
  image:
    name: hashicorp/terraform:latest
    entrypoint: [""]
  script:
    - terraform init
    - terraform apply -auto-approve tfplan
  when: manual
  only:
    - main
Complete Development Workflow
- Development: Use ASDF for runtime versions, Conda for Python environments
- Containerization: Docker for application packaging
- Version Control: Git with feature branches
- CI/CD: GitLab pipelines for automated testing and deployment
- Infrastructure: Terraform/OpenTofu for cloud resource management
- Monitoring: Implement logging and monitoring in containers
This guide provides comprehensive coverage of DevOps tools and practices essential for modern software development and deployment workflows.