# How does Typo calculate PR health?

PR Health Scorer is a modular scoring engine that quantifies the overall “health” of a Pull Request (PR) based on a mix of code-level metrics and critical-issue signals.

It produces a weighted score (0–100) and assigns a risk label (Excellent → Very Risky).

It is invoked during PR review to provide a holistic quality indicator that reflects:

* PR scope and size (diff stats)
* Code complexity (cyclomatic complexity)
* Presence of critical issues (from the validation phase)

#### Core Configuration

Categories impacting the score:

```
CATEGORIES = { 
    "diff_size", 
    "files_changed", 
    "code_complexity_score" 
}
```

* Diff size - large diffs are risky.
* Files changed - more files = higher review complexity.
* Cyclomatic complexity - indicates logical complexity and maintainability risk.

Each metric is normalized to a 0–100 scale internally, then combined as a weighted average.

#### Label Definitions

```
LABELS = { 
    "EXCELLENT": "`Excellent` 🔥", 
    "GOOD": "`Good` ✅", 
    "NEEDS_ATTENTION": "`Needs Attention` ⚠", 
    "RISKY": "`Risky` 🚫", 
    "VERY_RISKY": "`Very Risky` 🛑", 
}
```

These are human-readable labels used directly in the AI Code Review summary. They represent progressively decreasing confidence in the PR’s stability and reviewability.

#### Scoring & Aggregation Logic

Category scoring normalizes each raw metric to a 0–100 score. Diff size and file count use hard thresholds designed around empirical risk points; cyclomatic complexity uses a linear decay.

1. Diff Size

| Lines Changed | Score | Interpretation          |
| ------------- | ----- | ----------------------- |
| < 350         | 100   | Compact, easy to review |
| < 700         | 80    | Manageable              |
| < 900         | 60    | Slightly heavy          |
| < 1280        | 30    | Harder to review        |
| < 1500        | 15    | Very heavy              |
| ≥ 1500        | 0     | Overloaded PR           |

2. File Count

| Files Changed | Score | Interpretation     |
| ------------- | ----- | ------------------ |
| < 5           | 100   | Atomic PR          |
| < 15          | 80    | Slightly broad     |
| < 30          | 50    | Needs caution      |
| < 50          | 20    | Very large surface |
| ≥ 50          | 0     | Unreviewable PR    |
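
The two threshold tables above (lines changed and files changed) can be sketched as a single lookup helper. This is a minimal illustration, not Typo's actual implementation: the names `score_from_thresholds`, `DIFF_SIZE_THRESHOLDS`, and `FILES_CHANGED_THRESHOLDS` are hypothetical, and scoring a value that sits exactly on the last boundary (1500 lines, 50 files) as 0 is an assumption.

```python
# Each entry is (exclusive upper bound, score), copied from the tables above.
# The constant and function names are hypothetical, chosen for illustration.
DIFF_SIZE_THRESHOLDS = [(350, 100), (700, 80), (900, 60), (1280, 30), (1500, 15)]
FILES_CHANGED_THRESHOLDS = [(5, 100), (15, 80), (30, 50), (50, 20)]


def score_from_thresholds(value, thresholds):
    """Return the score of the first bucket whose upper bound exceeds value."""
    for upper_bound, score in thresholds:
        if value < upper_bound:
            return score
    # Assumption: anything at or beyond the last bound scores 0.
    return 0


diff_score = score_from_thresholds(420, DIFF_SIZE_THRESHOLDS)      # 350 <= 420 < 700
files_score = score_from_thresholds(12, FILES_CHANGED_THRESHOLDS)  # 5 <= 12 < 15
```

Keeping the thresholds in data rather than in a chain of `if` statements makes it easy to tune a boundary without touching the lookup logic.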

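3. Cyclomatic Complexity

Unlike the two metrics above, this score is not read from a threshold table. A linear decay function is used instead, so the score falls steadily as cyclomatic complexity rises.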
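
The exact parameters of the decay are not given here, so the following is only a sketch of what a linear decay could look like. The function name `score_complexity` and the `best` and `worst` bounds (5 and 30) are hypothetical values picked for illustration, not Typo's real configuration.

```python
def score_complexity(complexity, best=5.0, worst=30.0):
    """Map cyclomatic complexity to a 0-100 score with a linear decay.

    Assumption: `best` and `worst` are illustrative bounds. At or below
    `best` the score is 100, at or above `worst` it is 0, and in between
    it falls along a straight line.
    """
    if complexity <= best:
        return 100.0
    if complexity >= worst:
        return 0.0
    return 100.0 * (worst - complexity) / (worst - best)


# Halfway between the two assumed bounds, the score is halfway down.
midpoint_score = score_complexity(17.5)
```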

#### Combining Scores

Each category (diff size, files changed, and complexity) has a weight that reflects its importance. The scorer multiplies each category’s score by its weight and combines the results into a weighted average, giving one overall health score.

This final score is a number between 0 and 100 that represents the overall quality and risk of the PR. Higher numbers mean the PR is healthier and easier to review.
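
A weighted average of this kind can be sketched as follows. The weights shown (4, 3, 3) and the function name `combine_scores` are hypothetical, since the actual importance levels are not listed here; only the category keys come from `CATEGORIES`.

```python
# Assumption: illustrative weights only, not Typo's real configuration.
WEIGHTS = {
    "diff_size": 4,
    "files_changed": 3,
    "code_complexity_score": 3,
}


def combine_scores(category_scores, weights=WEIGHTS):
    """Return the weighted average of per-category scores (each 0-100)."""
    total_weight = sum(weights[name] for name in category_scores)
    weighted_sum = sum(weights[name] * score for name, score in category_scores.items())
    return weighted_sum / total_weight


health = combine_scores(
    {"diff_size": 80, "files_changed": 100, "code_complexity_score": 50}
)
# (4*80 + 3*100 + 3*50) / 10 = 77.0
```

Dividing by the total weight keeps the result on the same 0–100 scale as the inputs, whatever the weights sum to.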

#### Label Assignment Logic

Once the composite score is computed, it’s mapped to one of the five risk tiers:

| Final Score | Label Key        | Meaning                   |
| ----------- | ---------------- | ------------------------- |
| > 90        | EXCELLENT        | Clean, reviewable PR      |
| > 75        | GOOD             | Solid, minor concerns     |
| > 50        | NEEDS\_ATTENTION | Needs some caution        |
| > 25        | RISKY            | Significant risk          |
| ≤ 25        | VERY\_RISKY      | High complexity or volume |
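
The mapping in the table can be sketched as a chain of comparisons, checked from the highest tier down. The function name `assign_label_key` is hypothetical, and treating a score of exactly 25 as `VERY_RISKY` is an assumption about the boundary.

```python
def assign_label_key(score):
    """Map a 0-100 health score to a label key from the table above."""
    if score > 90:
        return "EXCELLENT"
    if score > 75:
        return "GOOD"
    if score > 50:
        return "NEEDS_ATTENTION"
    if score > 25:
        return "RISKY"
    # Assumption: a score of exactly 25 falls into the lowest tier.
    return "VERY_RISKY"


label_key = assign_label_key(77.0)  # 75 < 77 <= 90
```

The returned key can then be used to look up the display string in `LABELS`.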
