GNU Make

Learning Objectives

Understand how Make can automate and document workflows
Learn Make syntax and basic concepts
Create Makefiles for common scientific computing tasks
Integrate Make with version control and data analysis pipelines

Why Make?

Make is a powerful tool for:

Automation: Run a complex, multi-step workflow with a single command
Documentation: A Makefile records the exact commands behind each result, so it doubles as a cheat sheet
Dependency Management: Only rebuild what's necessary
Reproducibility: Rebuild results the same way every time, from the same recorded commands

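Make Basics

Makefile Structure

A basic Makefile consists of rules, each with a target, its prerequisites, and a recipe. Every recipe line must begin with a tab character; leading spaces cause a "missing separator" error: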

target: prerequisites
	recipe

Example:

paper.pdf: paper.tex references.bib
	pdflatex paper.tex
	bibtex paper
	pdflatex paper.tex
	pdflatex paper.tex

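Key Concepts

Targets: Files the rule creates or updates
Prerequisites: Files the target depends on; if any is newer than the target, the target is rebuilt
Recipes: Shell commands that create or update the target
Phony Targets: Targets that name an action rather than a file, declared with .PHONY so they always run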
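
As a minimal sketch of these concepts, the Makefile below has one file target and two phony targets. The file names hello.txt and name.txt are hypothetical, chosen only for illustration.

```makefile
# Hypothetical file names; every recipe line starts with a tab character.
.PHONY: all clean

# "all" is phony: it names an action, not a file.
all: hello.txt

# File target: the recipe runs only when hello.txt is missing
# or older than name.txt.
hello.txt: name.txt
	printf 'Hello, %s\n' "$$(cat name.txt)" > hello.txt

# "clean" is phony too, so it runs even if a file called "clean" exists.
clean:
	rm -f hello.txt
```

With a name.txt in place, the first run of make builds hello.txt; a second run does no work until name.txt changes.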

Common Workflows

Version Control Workflow

.PHONY: commit push pull clean

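# Stage everything and commit, prompting for a message.
# Uses printf and read -r because "read -p" is a bash extension
# and Make runs recipes with /bin/sh by default.
commit:
	git add .
	@printf "Enter commit message: "; \
	read -r msg; \
	git commit -m "$$msg"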

push: commit
    git pull origin main
    git push origin main

pull:
    git pull origin main

# Clean generated files
clean:
    rm -f *.aux *.log *.bbl *.blg *.out
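
For scripted or non-interactive use, the commit message can instead be passed on the command line. The sketch below assumes a variable named MSG, which is a hypothetical name and not part of the Makefile above.

```makefile
# Sketch: run as  make commit MSG="Fix typo in methods"
# MSG is a hypothetical variable name chosen for this example.
.PHONY: commit

commit:
	@test -n "$(MSG)" || { echo 'usage: make commit MSG="your message"' >&2; exit 1; }
	git add .
	git commit -m "$(MSG)"
```

Because $(MSG) is pasted into the shell command as text, messages containing double quotes or dollar signs would need extra escaping; this version is only suited to simple messages.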

LaTeX Document Workflow

# Variables for file names
PAPER = paper
FIGS = $(wildcard figures/*.pdf)

# Main targets
.PHONY: all clean watch

all: $(PAPER).pdf

# Build PDF with automatic bibliography
$(PAPER).pdf: $(PAPER).tex bibliography.bib $(FIGS)
    pdflatex $(PAPER)
    bibtex $(PAPER)
    pdflatex $(PAPER)
    pdflatex $(PAPER)

# Clean LaTeX auxiliary files
clean:
    rm -f *.aux *.log *.bbl *.blg *.out $(PAPER).pdf

# Watch for changes and rebuild (requires inotifywait from inotify-tools, Linux only)
watch:
	while true; do \
		$(MAKE) all; \
		inotifywait -e modify $(PAPER).tex bibliography.bib; \
	done
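
The fixed pdflatex, bibtex, pdflatex, pdflatex sequence can also be delegated to latexmk, a separate tool that decides how many passes are needed. This is a sketch of an alternative, assuming latexmk is installed; it is not the method used above.

```makefile
# Sketch, assuming latexmk is installed; file names follow the example above.
PAPER = paper

.PHONY: all clean

all: $(PAPER).pdf

# latexmk reruns pdflatex and bibtex as many times as needed.
$(PAPER).pdf: $(PAPER).tex bibliography.bib
	latexmk -pdf $(PAPER).tex

# -C removes auxiliary files and the generated PDF.
clean:
	latexmk -C $(PAPER).tex
```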

Data Analysis Workflow

# Directories
DATA_DIR = data
RESULTS_DIR = results
FIGS_DIR = figures

# Data files
RAW_DATA = $(DATA_DIR)/raw_data.csv
CLEAN_DATA = $(DATA_DIR)/clean_data.csv
ANALYSIS_RESULTS = $(RESULTS_DIR)/analysis_results.csv
FIGURES = $(FIGS_DIR)/figure1.pdf $(FIGS_DIR)/figure2.pdf

# Main targets
.PHONY: all clean

all: report.pdf

# Data processing pipeline
$(CLEAN_DATA): $(RAW_DATA) scripts/clean_data.R
    Rscript scripts/clean_data.R

$(ANALYSIS_RESULTS): $(CLEAN_DATA) scripts/analyze_data.R
    Rscript scripts/analyze_data.R

$(FIGS_DIR)/%.pdf: $(ANALYSIS_RESULTS) scripts/make_figures.R
    Rscript scripts/make_figures.R

# Generate report
report.pdf: report.Rmd $(ANALYSIS_RESULTS) $(FIGURES)
    R -e "rmarkdown::render('report.Rmd')"

# Clean generated files
clean:
    rm -f $(CLEAN_DATA) $(ANALYSIS_RESULTS) $(FIGURES) report.pdf
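
If a single run of scripts/make_figures.R writes both figures (an assumption about that script), a grouped target states this explicitly so the recipe is run once for the whole set. The sketch below requires GNU Make 4.3 or newer; older versions do not understand the "&:" separator.

```makefile
# Sketch, assuming GNU Make 4.3+ and that one run of
# scripts/make_figures.R writes both figure files.
FIGS_DIR = figures
ANALYSIS_RESULTS = results/analysis_results.csv
FIGURES = $(FIGS_DIR)/figure1.pdf $(FIGS_DIR)/figure2.pdf

# "&:" declares a grouped target: the recipe runs once
# and is understood to update every file listed on the left.
$(FIGURES) &: $(ANALYSIS_RESULTS) scripts/make_figures.R
	Rscript scripts/make_figures.R
```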

Advanced Make Features

Pattern Rules

Use pattern rules to handle multiple similar files:

# Convert all .tex files to .pdf
%.pdf: %.tex
    pdflatex $<
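
A pattern rule only says how to build a file; something still has to ask for the files. One way, sketched here, is to collect every .tex file in the current directory and make the matching PDFs prerequisites of a phony target.

```makefile
# Sketch: build a PDF for every .tex file in the current directory.
TEX  = $(wildcard *.tex)
PDFS = $(TEX:.tex=.pdf)

.PHONY: all
all: $(PDFS)

# $< is the .tex file matched by the pattern.
%.pdf: %.tex
	pdflatex $<
```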

Variables and Functions

# Variables
CC = gcc
CFLAGS = -Wall -O2

# Functions (wildcard) and substitution references ($(VAR:.c=.o))
SOURCES = $(wildcard src/*.c)
OBJECTS = $(SOURCES:.c=.o)
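
A few other built-in functions are useful when sources and outputs live in different directories. The sketch below uses hypothetical directories src/ and build/ and a phony target named show that prints the computed lists.

```makefile
# Sketch with hypothetical directories src/ and build/.
SOURCES = $(wildcard src/*.c)

# patsubst rewrites each name matching the pattern: src/foo.c -> build/foo.o
OBJECTS = $(patsubst src/%.c,build/%.o,$(SOURCES))

# notdir strips the directory part: src/foo.c -> foo.c
NAMES = $(notdir $(SOURCES))

# Print the lists to check what Make computed.
.PHONY: show
show:
	@echo "sources: $(SOURCES)"
	@echo "objects: $(OBJECTS)"
	@echo "names:   $(NAMES)"
```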

Automatic Variables

$@: File name of the target
$<: Name of the first prerequisite
$^: Names of all prerequisites, with duplicates removed
$*: Stem matched by % in a pattern rule

Example:

%.o: %.c
    $(CC) -c $(CFLAGS) $< -o $@
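
To see what each automatic variable expands to, a small sketch like the following can help. The files a.txt and b.txt and the .upper suffix are hypothetical.

```makefile
# Sketch with hypothetical files a.txt and b.txt.
# Try: touch a.txt b.txt && make combined.txt a.upper
combined.txt: a.txt b.txt
	@echo "target: $@"
	@echo "first prerequisite: $<"
	@echo "all prerequisites: $^"
	cat $^ > $@

# In a pattern rule, $* is the part of the name matched by %.
%.upper: %.txt
	@echo "stem: $*"
	tr a-z A-Z < $< > $@
```

Building combined.txt echoes the target as combined.txt, the first prerequisite as a.txt, and all prerequisites as a.txt b.txt; building a.upper echoes the stem as a.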

Best Practices

1. Directory Structure

Organize your project with a clear directory structure. The snippet below creates the directories every time Make reads the Makefile:

# Directory structure
DIRS = data src results figures docs
$(shell mkdir -p $(DIRS))
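
An alternative is to create a directory only when a target inside it is built, using an order-only prerequisite. This is a sketch; the script and file names are hypothetical.

```makefile
# Sketch: create results/ on demand. Script and file names are hypothetical.
results:
	mkdir -p $@

# Prerequisites after "|" are order-only: they must exist before the
# recipe runs, but their timestamps never trigger a rebuild.
results/summary.csv: data/raw.csv scripts/summarize.R | results
	Rscript scripts/summarize.R data/raw.csv $@
```

The order-only form matters for directories because a directory's timestamp changes whenever a file is added to it, which would otherwise cause needless rebuilds.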

2. Documentation

Include comments and a help target:

.PHONY: help
help:
    @echo "Available targets:"
    @echo "  all      - Build everything"
    @echo "  clean    - Remove generated files"
    @echo "  data     - Process raw data"
    @echo "  figures  - Generate figures"
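
A hand-written help target can drift out of date. One common idiom, sketched below, is to annotate each target with a "## description" comment and let awk extract the list from the Makefile itself. The "##" convention is an assumption of this sketch, not something Make defines.

```makefile
# Sketch: targets annotated with "## description" are listed by "make help".
.PHONY: help all clean

help: ## Show this help
	@awk -F ':.*## ' '/^[a-zA-Z_-]+:.*## / {printf "  %-10s %s\n", $$1, $$2}' $(MAKEFILE_LIST)

all: ## Build everything
	@echo "building"

clean: ## Remove generated files
	rm -f *.aux *.log
```

Make treats everything after # on a rule line as a comment, so the annotations do not affect the build.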

3. Error Handling

Use error checking in recipes:

data/processed.csv: data/raw.csv scripts/process.R
    Rscript scripts/process.R || (rm -f $@; exit 1)
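
GNU Make also offers a special target that applies the same cleanup to every rule, so individual recipes do not need their own rm fallback. A minimal sketch:

```makefile
# Sketch: with .DELETE_ON_ERROR, GNU Make deletes a target file that was
# changed by a recipe which then exited with a non-zero status.
.DELETE_ON_ERROR:

data/processed.csv: data/raw.csv scripts/process.R
	Rscript scripts/process.R
```

This keeps a half-written output from looking up to date on the next run.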

4. Dependency Tracking

Track both code and data dependencies:

results/model.rds: src/train_model.R data/training.csv
    Rscript $< data/training.csv $@

Example: Complete Research Project

# Configuration
R_SCRIPTS = $(wildcard scripts/*.R)
TEX_FILES = $(wildcard paper/*.tex)
BIB_FILES = $(wildcard paper/*.bib)

# Main targets
.PHONY: all paper data clean

all: paper data

# Paper compilation
paper: paper/manuscript.pdf

paper/manuscript.pdf: $(TEX_FILES) $(BIB_FILES) results/analysis.rds figures/plot1.pdf figures/plot2.pdf
    cd paper && pdflatex manuscript
    cd paper && bibtex manuscript
    cd paper && pdflatex manuscript
    cd paper && pdflatex manuscript

# Data analysis pipeline
data: results/analysis.rds figures/plot1.pdf figures/plot2.pdf

results/analysis.rds: scripts/analyze.R data/clean_data.csv
    Rscript $<

figures/%.pdf: scripts/plot.R results/analysis.rds
    Rscript $<

# Data cleaning
data/clean_data.csv: scripts/clean.R data/raw_data.csv
    Rscript $<

# Utility targets
clean:
	rm -f paper/*.aux paper/*.log paper/*.bbl paper/*.blg
	rm -f results/* figures/*

.PHONY: sync
sync: all
    git add .
    git commit -m "Update analysis and paper"
    git push origin main