Pipeline Build 12 min read

CRISPR TP53 Guide Design: End-to-End Workflow

A complete walkthrough of designing CRISPR guide RNAs targeting TP53 using the Hordago pipeline — from gene input to ranked guides with off-target analysis and full provenance tracking.

Jeff Jaureguy
Human genome hg38, A549 cell line
CRISPRon Cas-OFFinder CHOPCHOP biocontext7

Overview

TP53 is the most frequently mutated gene in human cancers. Designing CRISPR guide RNAs to knock it out means balancing on-target cutting efficiency against off-target risk. In this episode, we walk through the complete Hordago pipeline for CRISPR guide design, from target definition to ranked guides.

Step 1: Define the Target

We start by specifying the gene and the cell line context; this run uses the hg38 genome assembly:

hordago crispr design --gene TP53 --cell-line A549

The pipeline automatically:

  • Resolves TP53 to its canonical transcript (ENST00000269305)
  • Identifies all exons in the coding sequence
  • Scans for PAM sites (NGG) within exonic regions
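
The PAM scan in the last bullet can be sketched as follows. This is a minimal illustration, not Hordago's implementation: the function name `find_ngg_guides` and the toy sequence are hypothetical, and the sketch assumes a 20-nt protospacer sitting immediately 5' of an NGG PAM, checked on both strands.

```python
import re

# Translation table used to complement a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_ngg_guides(seq: str, guide_len: int = 20):
    """Return (strand, start, protospacer, pam) for every NGG PAM found.

    `start` is the 0-based offset of the protospacer on the strand that
    was scanned (for "-", that is an offset into the reverse complement).
    """
    seq = seq.upper()
    # A lookahead lets overlapping candidates all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Toy input: a made-up 20-mer followed by a TGG PAM, not real TP53 sequence.
toy_exon = "ACGTACGTACGTACGTACGTTGGCA"
for strand, start, protospacer, pam in find_ngg_guides(toy_exon):
    print(strand, start, protospacer, pam)
```

In a real run the input would be each exon's sequence from the chosen transcript, and offsets would still need to be mapped back to genomic coordinates.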

Step 2: On-Target Scoring

Each candidate guide RNA is scored using CRISPRon, which predicts cutting efficiency from sequence features. The pipeline's A549 cell-line context additionally factors in chromatin state:

import pandas as pd

guides = pd.read_csv("guides.tsv", sep="\t")
print(guides[["guide_id", "sequence", "on_target_score"]].head())
guide_id  sequence              on_target_score
TP53-g1   GCAGCCTTTGTGAACCAACA  0.92
TP53-g2   TGGTTCTCACTTGGTGGAAG  0.89
TP53-g3   AGCAGGTCTGTTCCAAGGGA  0.87

Step 3: Off-Target Analysis

Cas-OFFinder scans the entire genome for potential off-target sites, allowing up to 3 mismatches:

cas-offinder input.txt G output.txt
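
For reference, here is a sketch of how the `input.txt` passed to the command above could be generated. It assumes Cas-OFFinder's classic input layout: the genome directory on the first line, the search pattern including the PAM on the second, then one query per line followed by its mismatch limit. The genome path and the helper name `build_cas_offinder_input` are placeholders, not values taken from the Hordago run.

```python
def build_cas_offinder_input(genome_dir, guides, pam="NGG", max_mismatches=3):
    """Build the text of a Cas-OFFinder input file for equal-length guides."""
    if not guides:
        raise ValueError("at least one guide is required")
    guide_len = len(guides[0])
    if any(len(g) != guide_len for g in guides):
        raise ValueError("all guides must have the same length")

    lines = [
        genome_dir,             # directory holding the genome FASTA files
        "N" * guide_len + pam,  # search pattern: any protospacer plus the PAM
    ]
    for guide in guides:
        # The PAM positions of each query are masked with N.
        lines.append(f"{guide.upper()}{'N' * len(pam)} {max_mismatches}")
    return "\n".join(lines) + "\n"


# Guides from the scoring table; the genome path is a placeholder.
text = build_cas_offinder_input(
    "/path/to/hg38_chromosomes",
    ["GCAGCCTTTGTGAACCAACA", "TGGTTCTCACTTGGTGGAAG", "AGCAGGTCTGTTCCAAGGGA"],
)
print(text)
```

Writing `text` to `input.txt` then gives a file in the shape the command expects; the `G` argument in the command selects GPU devices.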

Results show that TP53-g3 has the fewest off-target hits (5), while TP53-g1 has the highest on-target score (0.92) but 12 off-target sites. This is the classic tradeoff between efficiency and specificity.

TP53-g1: 12 off-target sites (max 3 mismatches)
TP53-g2: 18 off-target sites
TP53-g3: 5 off-target sites
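
Per-guide counts like these can be tallied from the Cas-OFFinder output with a few lines of Python. The sketch below assumes tab-separated rows of query, chromosome, position, matched sequence, strand, and mismatch count, and it assumes that zero-mismatch rows are the intended target and should not be counted. The function name `count_off_targets` and the sample rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def count_off_targets(handle, max_mismatches=3):
    """Count off-target hits per query sequence in Cas-OFFinder-style output."""
    counts = Counter()
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank, short, or comment/header lines.
        if len(row) < 6 or row[0].startswith("#"):
            continue
        query, mismatches = row[0], int(row[5])
        # Assumption: a perfect match is the on-target site itself.
        if 0 < mismatches <= max_mismatches:
            counts[query] += 1
    return counts


# Made-up rows for illustration only; coordinates are not real hits.
sample = (
    "GCAGCCTTTGTGAACCAACANNN\tchrA\t100\tGCAGCCTTTGTGAACCAACATGG\t-\t0\n"
    "GCAGCCTTTGTGAACCAACANNN\tchrB\t200\tGCAGCCTTTGTGAACCttCAAGG\t+\t2\n"
    "GCAGCCTTTGTGAACCAACANNN\tchrC\t300\tGCtGCCTTTcTGAACCAtCACGG\t+\t3\n"
)
print(count_off_targets(io.StringIO(sample)))
```

For a real run, pass an open file handle for `output.txt` instead of the in-memory sample.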

Step 4: Final Ranking

The pipeline combines on-target and off-target scores into a composite ranking:

RESULTS: 3 guides ranked
  TP53-g1  GCAGCCTTTGTGAACCAACA  on=0.92  off=0.02  rank=1
  TP53-g3  AGCAGGTCTGTTCCAAGGGA  on=0.87  off=0.01  rank=2
  TP53-g2  TGGTTCTCACTTGGTGGAAG  on=0.89  off=0.04  rank=3
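
The walkthrough does not state how the composite score is computed. One simple possibility that reproduces the order shown above is to subtract a weighted off-target score from the on-target score. The sketch below illustrates that idea only; `composite_score`, `rank_guides`, and the `off_weight` parameter are hypothetical and should not be read as Hordago's actual formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guide:
    guide_id: str
    on_target: float   # higher is better
    off_target: float  # higher is worse


def composite_score(guide: Guide, off_weight: float = 1.0) -> float:
    """Hypothetical composite: on-target score minus weighted off-target score."""
    return guide.on_target - off_weight * guide.off_target


def rank_guides(guides, off_weight: float = 1.0):
    """Return guides sorted from best to worst composite score."""
    return sorted(
        guides,
        key=lambda g: composite_score(g, off_weight),
        reverse=True,
    )


# Scores copied from the results listing above.
guides = [
    Guide("TP53-g1", on_target=0.92, off_target=0.02),
    Guide("TP53-g2", on_target=0.89, off_target=0.04),
    Guide("TP53-g3", on_target=0.87, off_target=0.01),
]
for rank, guide in enumerate(rank_guides(guides), start=1):
    print(rank, guide.guide_id, round(composite_score(guide), 2))
```

With `off_weight=1.0` this orders the guides g1, g3, g2, matching the listing. Setting `off_weight=0` ranks purely by on-target score, which would swap g2 and g3.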

Before & After

Before: Manual workflow

  • 3+ hours of manual tool switching
  • No reproducibility guarantee
  • Results scattered across browser tabs

After: Hordago pipeline

  • 4.2 seconds end-to-end
  • Full provenance manifest
  • Ranked output with composite scoring

Key Takeaways

  1. Automated pipelines reduce the manual errors that creep into multi-tool workflows
  2. Provenance tracking ensures every result can be reproduced
  3. Composite scoring surfaces the best guide by balancing efficiency vs. safety
  4. The pipeline’s cell-line context (A549) factors in chromatin state, which improves on-target predictions

Provenance Manifest

{
  "workflow": "crispr-guide-design",
  "version": "1.3.0",
  "timestamp": "2026-03-14T09:12:43Z",
  "inputs": {
    "gene": "TP53",
    "genome": "hg38",
    "cell_line": "A549",
    "pam": "NGG"
  },
  "tools": {
    "CRISPRon": "1.0.0",
    "Cas-OFFinder": "2.4.1",
    "CHOPCHOP": "3.0.0"
  },
  "git_commit": "a3f8c12",
  "outputs": [
    "guides.tsv",
    "offtargets.tsv",
    "summary.pdf"
  ]
}
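
A manifest like this can also be checked before trusting or rerunning a result. The sketch below is one way to do that, under stated assumptions: the function name `check_manifest` is hypothetical, and the caller is assumed to supply the tool versions found in the current environment, since how those are discovered depends on the installation.

```python
import json
from pathlib import Path


def check_manifest(manifest, installed_versions, output_dir="."):
    """Return a list of human-readable problems; an empty list means no issues."""
    problems = []
    # Compare each recorded tool version with what the environment reports.
    for tool, wanted in manifest.get("tools", {}).items():
        found = installed_versions.get(tool)
        if found != wanted:
            problems.append(f"{tool}: manifest has {wanted}, environment has {found}")
    # Confirm every listed output file is present.
    for name in manifest.get("outputs", []):
        if not (Path(output_dir) / name).exists():
            problems.append(f"missing output: {name}")
    return problems


# A subset of the manifest above, parsed from JSON text.
manifest = json.loads("""
{
  "tools": {"CRISPRon": "1.0.0", "Cas-OFFinder": "2.4.1", "CHOPCHOP": "3.0.0"},
  "outputs": ["guides.tsv", "offtargets.tsv", "summary.pdf"]
}
""")

# Hypothetical environment in which one tool version has drifted.
installed = {"CRISPRon": "1.0.0", "Cas-OFFinder": "2.4.1", "CHOPCHOP": "3.0.1"}
for problem in check_manifest(manifest, installed):
    print(problem)
```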