Your First Project
Getting started with a project template
This guide walks through setting up your first project using lab templates.
What Is targets, and Why Do We Use It?
Most R users start out running scripts in sequence: 01_clean.R, 02_analyze.R, 03_plot.R. This works until your project grows—then you forget which scripts to re-run after a change, you accidentally use stale intermediate results, and reproducing the full analysis becomes error-prone.
targets solves this. It’s a pipeline tool that:
- Tracks dependencies automatically — it knows that your plot depends on your model, which depends on your cleaned data
- Only re-runs what changed — if you edit your plotting code, it won’t re-run the data cleaning or model fitting
- Documents your workflow — the pipeline definition (_targets.R) serves as a readable map of your entire analysis
Think of it like a smart “Run All” button that skips work it doesn’t need to redo.
```r
# A simple pipeline: _targets.R
library(targets)

list(
  tar_target(raw_data, read.csv("data/raw/study.csv")),
  tar_target(clean_data, clean_dataset(raw_data)),
  tar_target(model_fit, fit_model(clean_data)),
  tar_target(summary_table, summarize_results(model_fit))
)
```

Each tar_target() defines one step: the first argument is the target's name, the second is the R expression that produces it. targets works out the execution order from the dependency chain (summary_table needs model_fit, which needs clean_data, and so on).
You’ll learn more as you use it. For now, just know that every lab project uses targets as the backbone of its analysis. See the Targets Pipeline Guide for the full reference.
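The pipeline above calls clean_dataset(), fit_model(), and summarize_results(), which live in your R/ folder. A minimal sketch of what such helpers might look like — the bodies below are illustrative assumptions, not the template's actual code:

```r
# R/functions.R — illustrative sketches; adapt to your analysis

# Cleaning step: drop incomplete rows (assumed rule)
clean_dataset <- function(data) {
  data[complete.cases(data), , drop = FALSE]
}

# Modeling step: a plain linear model (swap in your own)
fit_model <- function(data) {
  lm(value ~ group, data = data)
}

# Reporting step: coefficient table as a data frame
summarize_results <- function(fit) {
  as.data.frame(coef(summary(fit)))
}
```

With helpers written this way, each target stays a one-line call and tar_make() can cache every step independently.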
Choose a Template
We have three main templates:
| Template | Use Case |
|---|---|
| Research Project | General analysis projects |
| Methods Paper | Methodology papers with simulations |
| R Package Development | R packages accompanying a methods paper |
Clone the Template
```bash
# Clone the research project template
git clone git@github.com:rashidlab/template-research-project.git my-first-project

# Navigate into the project
cd my-first-project

# Remove the template's git history
rm -rf .git

# Initialize a fresh git repo
git init
git add .
git commit -m "Initial commit from template"
```

Set Up Data Directory
Data files are never committed to Git. Use the lab project directory and symlinks.
```bash
# Create your project's data folder on Longleaf
# Use /proj/rashidlab/projects/ for private project data
# (the top-level /proj/rashidlab/ is publicly readable on the cluster)
mkdir -p /proj/rashidlab/projects/my-first-project/data/raw
mkdir -p /proj/rashidlab/projects/my-first-project/data/processed

# From your repo root, create a symlink to the data
ln -s /proj/rashidlab/projects/my-first-project/data data

# Your .gitignore already excludes data/
```

Now place any data files in /proj/rashidlab/projects/my-first-project/data/raw/ and they'll be accessible via data/raw/ in your code.
The lab shared space (/proj/rashidlab/) has a 1 TB quota shared across all members and is not backed up. Check usage at service.rc.unc.edu. See Longleaf Setup for the full directory reference.
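Before running the pipeline, it's worth confirming the symlink resolves. The pattern is sketched below in a temporary directory (a stand-in for the real /proj path, so it runs anywhere):

```r
# Stand-in for the shared project space on Longleaf
proj <- file.path(tempdir(), "shared-project")
dir.create(file.path(proj, "data", "raw"), recursive = TRUE, showWarnings = FALSE)

# Stand-in for your cloned repo
repo <- file.path(tempdir(), "repo")
dir.create(repo, showWarnings = FALSE)
old <- setwd(repo)

# Same idea as `ln -s <shared>/data data`
file.symlink(file.path(proj, "data"), "data")

file.exists("data/raw")  # TRUE if the link resolves
Sys.readlink("data")     # prints the shared path the link points to
setwd(old)
```

On Longleaf, the same two checks (does data/raw exist, and where does data point) catch a broken or mistyped symlink before tar_make() fails on a missing file.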
Set Up Dependencies
```r
# Open the project in RStudio, then restore packages.
# One option (assuming the remotes package is installed):
# install everything listed in the DESCRIPTION file.
remotes::install_deps(dependencies = TRUE)
```

Configure the Project
Edit config/config.yml:

```yaml
project:
  name: "My First Project"
  author: "Your Name"
  seed: 2024

analysis:
  alpha: 0.05
```

Run the Pipeline
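Your code can read the config/config.yml settings at run time. A minimal sketch, assuming the yaml package is available (the template may ship its own config loader):

```r
# Pull settings from the project config (assumes the yaml package)
library(yaml)

cfg <- read_yaml("config/config.yml")

set.seed(cfg$project$seed)   # seed from config, for reproducible runs
alpha <- cfg$analysis$alpha  # significance threshold used downstream
```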
```r
# Load targets
library(targets)

# Visualize the pipeline
tar_visnetwork()

# Run the pipeline
tar_make()

# Check results
tar_read(results_summary)
```

Make Your First Change
- Add a new analysis step in _targets.R:

```r
tar_target(
  my_analysis,
  run_my_analysis(clean_data)
)
```

- Create the function in R/analysis.R:

```r
run_my_analysis <- function(data) {
  # Your analysis code (using base R - see the R Style Guide)
  aggregate(value ~ group, data = data, FUN = mean)
}
```

We use base R and data.table instead of the tidyverse. For large datasets, use:
```r
library(data.table)

dt <- as.data.table(data)
dt[, .(mean_value = mean(value)), by = group]
```

See the R Style Guide for details.
- Run and verify:

```r
tar_make()
tar_read(my_analysis)
```

Commit Your Changes
```bash
# Check status
git status

# Stage changes
git add R/analysis.R _targets.R

# Commit with a descriptive message
git commit -m "feat: add group-wise mean analysis"

# Push to GitHub (after creating the remote repo)
git push -u origin main
```

Project Structure
After setup, your project looks like:
```
my-first-project/
├── _targets.R          # Pipeline definition
├── R/                  # Your functions
│   └── analysis.R
├── config/
│   └── config.yml      # Configuration
├── data -> /proj/rashidlab/projects/my-first-project/data   # Symlink (gitignored)
├── results/            # Outputs (gitignored)
├── figures/            # Plots
├── scripts/
│   └── download_data.sh   # For external users
├── DESCRIPTION         # Package dependencies
├── .gitignore          # Excludes data/, results/, _targets/
└── README.md           # Project documentation
```
Best Practices
Each function should do one thing. If a function is longer than ~50 lines, consider splitting it.
Make small, frequent commits with descriptive messages. It’s easier to track changes and revert if needed.
Write comments that explain why, not what. Update the README when you add features.
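As an illustration of the one-job-per-function rule (the function names here are hypothetical):

```r
# Instead of one function that cleans AND summarizes,
# give each job its own function:

drop_missing <- function(data) {
  data[complete.cases(data), , drop = FALSE]  # cleaning only
}

group_means <- function(data) {
  aggregate(value ~ group, data = data, FUN = mean)  # summarizing only
}

# Compose the small pieces where you need the combined behavior
summarize_clean <- function(data) {
  group_means(drop_missing(data))
}
```

Small functets like these are also easy to register as separate targets, so tar_make() can cache the cleaning step independently of the summary.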
Using Claude Code on Your First Project
Claude Code can help you get started and learn the codebase:
```bash
# Start Claude in your project
cd ~/rashid-lab-setup/my-first-project
claude
```

Try these prompts:
> What is the structure of this project?
> Explain how the targets pipeline works
> Help me add a new analysis function
> What do I need to do to run the pipeline?
Claude understands lab conventions and will guide you through using base R, data.table, and targets.
See Claude Code First Session for a detailed walkthrough.
Getting Help
- Pipeline issues: Check targets::tar_meta() for errors
- Package issues: Reinstall packages from DESCRIPTION
- Git issues: Ask in the #computing Teams channel
- Claude Code: See the Claude Code Guide
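To dig into pipeline errors, the target metadata records any error message per target. A short sketch (assumes tar_make() has run at least once, so the _targets/ store exists):

```r
library(targets)

meta <- tar_meta()                              # one metadata row per target
meta[!is.na(meta$error), c("name", "error")]    # failed targets with messages
```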
Next: Coding Standards →