# Understanding Your Security Score

MCP Security Score provides multiple metrics to help you understand your MCP server's security posture. This guide explains each score and how to interpret them.

## The Security Score

Your primary security score is a number from 0 to 100, representing the overall security health of your MCP server.

### How It's Calculated

The score starts at 100 points, and points are deducted for each security finding:

| Severity | Points Deducted |
|----------|-----------------|
| Critical | -20 |
| High | -12 |
| Medium | -6 |
| Low | -3 |
| Info | -1 |
### Diminishing Returns
To prevent a single category from tanking your entire score, we apply diminishing returns:
- First 50 points deducted in a category: Full deduction
- After 50 points: Further deductions are reduced by 50%
This ensures that even a repository with many low-severity findings won't automatically get an F.
### Score Floor

The minimum score is 0; your score cannot go negative.
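Putting the deduction table, diminishing returns, and score floor together, the calculation can be sketched as follows. This is an illustrative model of the rules described above, not the scanner's actual implementation:

```python
# Simplified model of the scoring rules above (illustrative only).
SEVERITY_POINTS = {"critical": 20, "high": 12, "medium": 6, "low": 3, "info": 1}

def category_deduction(severities):
    """Sum deductions for one category, halving everything past 50 points."""
    raw = sum(SEVERITY_POINTS[s] for s in severities)
    if raw <= 50:
        return raw
    return 50 + (raw - 50) * 0.5  # diminishing returns past 50 points

def security_score(findings_by_category):
    """Start at 100, subtract each category's deduction, floor at 0."""
    total = sum(category_deduction(f) for f in findings_by_category.values())
    return max(0, 100 - total)

# Three criticals in one category: 60 raw points -> 50 + 10 * 0.5 = 55 deducted.
print(security_score({"tool_permissions": ["critical"] * 3}))  # 45.0
```

Note how the points past the 50-point threshold in a category count at half value, so a pile of findings in one area erodes the score more slowly.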
## Letter Grades

For quick assessment, scores map to letter grades:

| Score Range | Grade | Meaning |
|-------------|-------|---------|
| 90-100 | A | Excellent security posture |
| 80-89 | B | Good, minor improvements possible |
| 70-79 | C | Fair, some issues to address |
| 60-69 | D | Poor, significant issues found |
| 0-59 | F | Critical vulnerabilities present |
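The grade boundaries in the table translate directly into a simple lookup (a sketch of the mapping, not the tool's code):

```python
def letter_grade(score):
    """Map a 0-100 security score to its letter grade per the table above."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"

print(letter_grade(85))  # prints "B"
```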
### What Each Grade Means

#### Grade A (90-100)
- No critical or high severity findings
- Few medium or low findings
- Well-structured, secure code
- Ready for production use
#### Grade B (80-89)
- No critical findings
- May have a few high severity issues
- Some medium/low findings to address
- Generally safe, but improvements recommended
#### Grade C (70-79)
- May have high severity findings
- Several medium severity issues
- Requires attention before production
- Consider a security review
#### Grade D (60-69)
- High severity vulnerabilities present
- Many security issues across categories
- Not recommended for production
- Needs significant remediation
#### Grade F (0-59)
- Critical vulnerabilities found
- Serious security risks present
- Do not deploy to production
- Immediate action required
## Safety Score
When AI analysis is enabled, you'll also see a Safety Score. This composite score combines:
- 60% - Static analysis score (the main security score)
- 40% - AI trust score (Claude's assessment of behavioral safety)
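For example, a static score of 90 combined with an AI trust score of 70 yields 0.6 × 90 + 0.4 × 70 = 82. A minimal sketch of that weighting:

```python
def safety_score(static_score, ai_trust_score):
    """Composite Safety Score: 60% static analysis, 40% AI trust."""
    return (60 * static_score + 40 * ai_trust_score) / 100

print(safety_score(90, 70))  # 82.0
```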
### Why Two Scores?
Static analysis excels at finding known patterns but can miss:
- Novel attack vectors
- Semantic issues in code logic
- Context-dependent vulnerabilities
AI analysis provides:
- Behavioral understanding
- Intent analysis
- Risk assessment based on capabilities
Together, they give a more complete picture.
### When They Diverge
If your static score and AI trust score differ significantly:
**High static, low AI trust:**
- Code follows best practices but has risky capabilities
- May have legitimate but dangerous operations
- Review AI's specific concerns
**Low static, high AI trust:**
- Many pattern matches but code is contextually safe
- Could be false positives in test files
- Review static findings for accuracy
## Category Breakdown
Your score is broken down across security categories:
### Categories and Weights

| Category | Weight | What It Covers |
|----------|--------|----------------|
| Tool Permissions | 25% | RCE risks, authentication |
| Input Handling | 25% | Filesystem, data validation |
| Dependencies | 20% | Supply chain, CVEs |
| Code Patterns | 15% | Network, secrets |
| Transparency | 15% | MCP-specific best practices |
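Assuming each category reports its own 0-100 score, a weighted combination using these weights would look like the sketch below. The exact aggregation the scanner uses is not specified here, so treat this as illustrative:

```python
# Weights from the table above (illustrative aggregation).
CATEGORY_WEIGHTS = {
    "tool_permissions": 0.25,
    "input_handling": 0.25,
    "dependencies": 0.20,
    "code_patterns": 0.15,
    "transparency": 0.15,
}

def weighted_score(category_scores):
    """Combine per-category 0-100 scores using the weights above."""
    return sum(CATEGORY_WEIGHTS[c] * s for c, s in category_scores.items())

scores = {"tool_permissions": 80, "input_handling": 90, "dependencies": 100,
          "code_patterns": 70, "transparency": 100}
print(round(weighted_score(scores), 1))  # 88.0
```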
### Per-Category Scores
Each category has its own 0-100 score:
- 90-100: Category is secure
- 70-89: Minor issues in this area
- 50-69: Significant concerns
- Below 50: Critical problems in this category
### Which Categories Matter Most?
Focus on categories with the highest weights first:
- Tool Permissions (25%) - RCE vulnerabilities are the most dangerous
- Input Handling (25%) - Path traversal and injection attacks
- Dependencies (20%) - Supply chain attacks are increasingly common
- Code Patterns (15%) - Exposed secrets are easily exploitable
- Transparency (15%) - MCP best practices improve overall safety
## Severity Levels
Individual findings are tagged with severity levels:
### Critical
- Color: Red
- Impact: Immediate security risk
- Examples: Remote code execution, exposed secrets, known CVEs
- Action: Fix immediately before any deployment
### High
- Color: Orange
- Impact: Significant vulnerability
- Examples: Path traversal, disabled TLS, SQL injection
- Action: Address before production deployment
### Medium
- Color: Yellow
- Impact: Moderate risk
- Examples: Missing validation, excessive dependencies
- Action: Plan to fix in near-term
### Low
- Color: Blue
- Impact: Minor issue
- Examples: Missing descriptions, code style concerns
- Action: Good to fix, but not urgent
### Info
- Color: Gray
- Impact: Informational only
- Examples: Hardcoded URLs (for review), debug statements
- Action: Review and decide if intentional
## Improving Your Score
To improve your security score:
### Quick Wins (Big Impact)
- Remove hardcoded secrets and use environment variables
- Fix any `eval()` or `exec()` usage
- Add timeouts to network requests
- Update dependencies with known CVEs
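Two of these quick wins, moving secrets into environment variables and adding request timeouts, can look like this in Python. This is a sketch; `MY_SERVICE_API_KEY` and the helper names are hypothetical:

```python
import os
import urllib.request

def load_api_key():
    """Read the secret from the environment instead of hardcoding it."""
    key = os.environ.get("MY_SERVICE_API_KEY")  # hypothetical variable name
    if key is None:
        raise RuntimeError("MY_SERVICE_API_KEY is not set")
    return key

def fetch(url, api_key, timeout=10):
    """Make a network request with an explicit timeout so it cannot hang forever."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    return urllib.request.urlopen(req, timeout=timeout)
```

For `eval()` on untrusted input, Python's `ast.literal_eval` is a safer replacement when all you need is to parse literal values.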
### Medium Effort
- Add input validation to tool handlers
- Implement path sanitization
- Add tool descriptions for transparency
- Review and limit tool permissions
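Path sanitization, for example, usually means resolving the requested path and verifying it stays under an allowed root. A common pattern, shown here as a sketch (requires Python 3.9+ for `Path.is_relative_to`):

```python
from pathlib import Path

def safe_resolve(root, requested):
    """Resolve `requested` under `root`, rejecting '../' traversal."""
    root = Path(root).resolve()
    candidate = (root / requested).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes allowed root: {requested}")
    return candidate
```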
### Long-term
- Reduce dependency count where possible
- Add security-focused code review process
- Set up automated scanning in CI/CD
- Regular security audits
## Score Caching
Scan results are cached for 24 hours by default:
- Re-scanning the same repository returns cached results
- Use the Rescan button to force a fresh analysis
- Cache helps with rate limits and performance
## Comparing Scores
When comparing MCP servers:
- Same category scores are directly comparable
- Different repository sizes may skew raw scores
- Focus on the severity distribution, not just the total score
- Consider the specific findings, not just their count
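Comparing severity distributions is easy to automate. A sketch, assuming a simple findings format (not the tool's actual output schema):

```python
from collections import Counter

def severity_distribution(findings):
    """Count findings per severity level; compare profiles, not totals."""
    return Counter(f["severity"] for f in findings)

# Eight low findings vs. one critical plus one low:
# similar totals, very different risk profiles.
server_a = [{"severity": "low"}] * 8
server_b = [{"severity": "critical"}, {"severity": "low"}]
print(severity_distribution(server_a))  # Counter({'low': 8})
print(severity_distribution(server_b))
```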
## Next Steps
- Reviewing Findings - How to analyze and fix issues
- Security Checks - Understand each check
- API Reference - Track scores over time