[THM] SOC Metrics and Objectives

This is a full walkthrough with answers and explanations for the TryHackMe room "SOC Metrics and Objectives".

Link to the room: https://tryhackme.com/room/socmetricsobjectives.

[Task 2] Core Metrics

| Metric | Formula | Measures |
| --- | --- | --- |
| Alerts Count | AC = Total Count of Alerts Received | Overall load of SOC analysts |
| False Positive Rate | FPR = False Positives / Total Alerts | Level of noise in the alerts |
| Alert Escalation Rate | AER = Escalated Alerts / Total Alerts | Experience of L1 analysts |
| Threat Detection Rate | TDR = Detected Threats / Total Threats | Reliability of the SOC team |
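
As a rough illustration of the formulas in the table, the metrics can be computed with a few lines of Python. The function name `core_metrics` and the sample counts are hypothetical and not taken from the room.

```python
def core_metrics(total_alerts, false_positives, escalated_alerts,
                 detected_threats, total_threats):
    """Return AC, FPR, AER and TDR using the formulas from the table."""
    if total_alerts <= 0 or total_threats <= 0:
        # The ratios are undefined without at least one alert and one threat.
        raise ValueError("total_alerts and total_threats must be positive")
    return {
        "AC": total_alerts,
        "FPR": false_positives / total_alerts,
        "AER": escalated_alerts / total_alerts,
        "TDR": detected_threats / total_threats,
    }


# Hypothetical month: 200 alerts, 150 false positives, 20 escalations,
# and 9 out of 10 real threats detected.
sample = core_metrics(200, 150, 20, 9, 10)
for name, value in sample.items():
    print(name, value)
```

The ratios come back as fractions between 0 and 1; multiply by 100 to report them as percentages.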

Is zero alerts for one month a good sign for your SOC team? (Yea/Nay)

Nay

An alert count that is too low may indicate an issue in the SIEM or a lack of visibility, which can leave breaches undetected.

What is the False Positive Rate if only 10 out of 50 alerts appear to be real threats?

80%

FPR = False Positives / Total Alerts

  • 50 total alerts, of which 10 are real threats, so 40 are false positives.
  • FPR = 40 False Positives / 50 Total Alerts = 80%

[Task 3] Triage Metrics

| Metric | Common SLA | Description |
| --- | --- | --- |
| SOC Team Availability | 24/7 | Working schedule of the SOC team, often Monday-Friday (8/5) or 24/7 mode |
| Mean Time to Detect (MTTD) | 5 minutes | Average time between the attack and its detection by SOC tools |
| Mean Time to Acknowledge (MTTA) | 10 minutes | Average time for L1 analysts to start triage of the new alert |
| Mean Time to Respond (MTTR) | 60 minutes | Average time taken by SOC to actually stop the breach from spreading |

Imagine a scenario where the SOC team receives a critical alert on Saturday. If the team works 8/5, on which day of the week will they acknowledge the alert?

Monday

On an 8/5 schedule nobody is on shift over the weekend, so Monday is the first working day after Saturday and the earliest the alert can be acknowledged.
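
The same reasoning can be sketched in Python. This is only an illustration: the 09:00-17:00 shift hours, the function name `next_acknowledge_time`, and the sample date are assumptions, since the room only says the team works 8/5.

```python
from datetime import datetime, time, timedelta

WORK_START = time(9, 0)   # assumed shift start, not stated in the room
WORK_END = time(17, 0)    # assumed shift end, not stated in the room


def next_acknowledge_time(alert_time: datetime) -> datetime:
    """Earliest moment a Monday-Friday (8/5) team could pick up an alert."""
    # weekday(): Monday is 0 and Sunday is 6, so values below 5 are workdays.
    is_workday = alert_time.weekday() < 5
    if is_workday and WORK_START <= alert_time.time() < WORK_END:
        return alert_time  # someone is on shift right now
    if is_workday and alert_time.time() < WORK_START:
        return datetime.combine(alert_time.date(), WORK_START)
    # After hours or on a weekend: move forward to the next workday.
    day = alert_time.date() + timedelta(days=1)
    while day.weekday() >= 5:
        day += timedelta(days=1)
    return datetime.combine(day, WORK_START)


# 1 June 2024 was a Saturday.
saturday_alert = datetime(2024, 6, 1, 14, 30)
picked_up = next_acknowledge_time(saturday_alert)
print(picked_up.strftime("%A"))  # Monday
```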

Imagine a scenario where an employee was lured into running data stealer malware.

  1. The SOC team received the “Connection to Redline Stealer C2” alert after 12 minutes.
  2. One of the L1 analysts on shift moved the alert to In Progress 10 minutes later.
  3. After 6 minutes, the alert was escalated to L2, who spent 35 minutes cleaning the malware.

Provide the MTTD, MTTA, and MTTR via comma as your answer (e.g. 10,20,30).

12,10,51

MTTD = 12 min, MTTA = 10 min, MTTR = 10 min + 6 min + 35 min = 51 min
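
To double-check the arithmetic, here is a minimal Python sketch that rebuilds the timeline from the scenario. The start timestamp and all variable names are made up for illustration. Following the calculation above, the response time is measured from the moment the alert fired until the malware was cleaned. For a real "mean" you would average these values over many incidents; this covers a single one.

```python
from datetime import datetime, timedelta

# Hypothetical start time; only the gaps between events come from the scenario.
attack = datetime(2024, 1, 1, 12, 0)
detected = attack + timedelta(minutes=12)        # alert received
acknowledged = detected + timedelta(minutes=10)  # moved to In Progress
escalated = acknowledged + timedelta(minutes=6)  # handed over to L2
remediated = escalated + timedelta(minutes=35)   # malware cleaned


def minutes(delta: timedelta) -> int:
    """Convert a timedelta to whole minutes."""
    return int(delta.total_seconds() // 60)


time_to_detect = minutes(detected - attack)
time_to_acknowledge = minutes(acknowledged - detected)
time_to_respond = minutes(remediated - detected)

print(f"{time_to_detect},{time_to_acknowledge},{time_to_respond}")  # 12,10,51
```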

[Task 4] Improving Metrics

Issue: False Positive Rate over 80%
Recommendations: Your team receives too much noise in the alerts. Try to:
  1. Exclude trusted activities like system updates from your EDR or SIEM detection rules
  2. Consider automating alert triage for most common alerts using SOAR or custom scripts

Issue: Mean Time to Detect over 30 min
Recommendations: Your team detects a threat with a high delay. Try to:
  1. Contact SOC engineers to make the detection rules run faster or with a higher rate
  2. Check if SIEM logs are collected in real-time, without a 10-minute delay

Issue: Mean Time to Acknowledge over 30 min
Recommendations: L1 analysts start alert triage with a high delay. Try to:
  1. Ensure the analysts are notified in real-time when a new alert appears
  2. Try to evenly distribute alerts in the queue between the analysts on shift

Issue: Mean Time to Respond over 4 hours
Recommendations: SOC team can’t stop the breach in time. Try to:
  1. As L1, make everything possible to quickly escalate the threats to L2
  2. Ensure your team has documented what to do during different attack scenarios
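
The limits listed above lend themselves to a simple automated check. The sketch below is one possible way to flag metrics that cross them. The `THRESHOLDS` dictionary, the `flag_issues` function, and the sample measurements are all hypothetical; only the limits themselves (80%, 30 min, 30 min, 4 hours) come from the room.

```python
# Limits from the list above; times are in minutes, FPR is a fraction.
THRESHOLDS = {
    "False Positive Rate": 0.80,
    "Mean Time to Detect": 30,
    "Mean Time to Acknowledge": 30,
    "Mean Time to Respond": 4 * 60,
}


def flag_issues(measured: dict) -> list:
    """Return the names of all metrics that exceed their limit."""
    return [
        name
        for name, limit in THRESHOLDS.items()
        # Metrics missing from the input are skipped rather than flagged.
        if name in measured and measured[name] > limit
    ]


# Hypothetical monthly report for a SOC team.
report = {
    "False Positive Rate": 0.85,
    "Mean Time to Detect": 12,
    "Mean Time to Acknowledge": 45,
    "Mean Time to Respond": 51,
}
problems = flag_issues(report)
print(problems)
```

With this sample report, the False Positive Rate and Mean Time to Acknowledge are flagged, pointing to the first and third sets of recommendations.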

What is the False Positive Rate limit you should aim not to reach?

80%

Once the False Positive Rate exceeds 80%, consider automating alert triage or excluding trusted activities from your EDR or SIEM detection rules.

Should all SOC roles work together to keep metrics improving? (Yea/Nay)

Yea

Metrics are often used to evaluate your performance, and good results lead to career growth and a raise to more senior positions like L2 analyst.

[Task 5] Practice Scenarios

What flag did you get after completing the first scenario?

THM{mttr:quick_start_but_slow_response}

MTTR too high, documentation, send the docs to L2

What flag did you get after completing the second scenario?

THM{mttd:time_between_attack_and_alert}

delayed alert triage, tune to 5 min, SOC review

What flag did you get after completing the third scenario?

THM{fpr:the_main_cause_of_l1_burnout}

FPR too high, FP remediation process, assign SOC to exclude noise from the rules

This post is licensed under CC BY 4.0 by the author.