Microsoft Sentinel - Incident, Detection, Response and Threat Hunting Workflow.

kkalvani
Nov 5, 2024
INTRODUCTION

Various SIEM tools are used across the industry, e.g., Splunk, QRadar, RaptorX, etc. In this article, we're going to talk about Microsoft Sentinel and the steps cybersecurity professionals follow in the process of threat hunting. We are going to focus on the following tabs in Sentinel: Data Connectors, Analytics, Hunting, Incidents, and Automation.

WHAT IS A SIEM? HOW DOES IT WORK?

First off, a SIEM (Security Information and Event Management) is a platform/solution that is deployed within your organization's infrastructure to collect security data, in the form of logs, from various authorized devices such as employee PCs, routers, servers, domain controllers, etc. These logs are usually parsed for easy manual/automatic analysis. This data is then used by the IDR (Incident Detection and Response) team to monitor and address potential security threats.

A SIEM works by aggregating and analyzing all the activity (the logs) from those authorized devices in your organization, which in turn helps it find trends and discover threats.

These findings, a.k.a. "alerts" in the SIEM, indicate a potential threat or suspicious activity based on certain predefined rules or patterns. If a rule matches an attack signature or identifies a malicious pattern in the logs, an alert is generated. Multiple related alerts can be grouped into an "incident", which enables teams to respond in a unified manner rather than dealing with each alert individually.
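To make the rule, alert and incident relationship concrete, here is a minimal Python sketch. It is only an illustration of the idea and not how any particular SIEM implements it: the log fields, the rule names and the choice to group alerts by host are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical, simplified log records; real SIEM logs carry many more fields.
LOGS = [
    {"host": "pc-01", "event": "login_failed", "user": "alice"},
    {"host": "pc-01", "event": "new_process", "process": "rat.exe"},
    {"host": "pc-02", "event": "login_ok", "user": "bob"},
    {"host": "pc-01", "event": "outbound_conn", "dest_ip": "203.0.113.7"},
]

# Each rule is a name plus a predicate deciding whether a log matches it.
RULES = {
    "Failed login": lambda log: log["event"] == "login_failed",
    "Suspicious process": lambda log: log.get("process") == "rat.exe",
    "Known bad IP": lambda log: log.get("dest_ip") == "203.0.113.7",
}


def generate_alerts(logs, rules):
    """Return one alert for every (log, rule) pair where the rule matches."""
    return [
        {"rule": name, "host": log["host"]}
        for log in logs
        for name, matches in rules.items()
        if matches(log)
    ]


def group_into_incidents(alerts):
    """Group alerts that share a host into one incident (assumed grouping key)."""
    incidents = defaultdict(list)
    for alert in alerts:
        incidents[alert["host"]].append(alert["rule"])
    return dict(incidents)


alerts = generate_alerts(LOGS, RULES)
incidents = group_into_incidents(alerts)
```

Here three separate alerts on the same machine end up in one incident, so the team looks at one case instead of three unrelated alerts.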

When an incident occurs, a manual or automatic response mechanism is triggered. Some SIEMs have an automated response process handled by a SOAR (Security Orchestration, Automation, and Response) component. The SOAR tool uses playbooks or workflows to respond to incidents efficiently, reducing response time. A playbook is a predefined set of actions to be performed based on the type of incident found. One of the reasons we use this is so that we can respond to security threats with minimal human assistance; some tools also use AI and ML integration to help automate this process.
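A playbook can be pictured as an ordered list of actions looked up by incident type. The sketch below shows only that idea; the incident types, the action functions and the incident fields are hypothetical, and a real SOAR playbook would call actual security tooling instead of returning strings.

```python
# Hypothetical response actions; each one just reports what it would do.
def isolate_host(incident):
    return f"isolated {incident['host']}"


def block_ip(incident):
    return f"blocked {incident['ip']}"


def notify_user(incident):
    return f"notified {incident['user']}"


# Playbooks: a predefined, ordered set of actions per incident type.
PLAYBOOKS = {
    "rat_compromise": [isolate_host, block_ip, notify_user],
    "brute_force": [block_ip, notify_user],
}


def run_playbook(incident):
    """Run every action of the playbook that matches the incident type."""
    steps = PLAYBOOKS.get(incident["type"], [])  # unknown type: no automatic action
    return [step(incident) for step in steps]


incident = {"type": "rat_compromise", "host": "pc-01",
            "ip": "203.0.113.7", "user": "alice"}
results = run_playbook(incident)
```

An incident type with no playbook simply produces no automatic action here, which would leave it for manual handling.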

The SOAR can be in a separate environment, or it can be integrated with the SIEM as one whole tool. That's what is different with Sentinel. It combines SIEM and SOAR capabilities within a cloud-native platform managed by Microsoft, providing both threat detection and automated response capabilities.

Fig.1. Microsoft Sentinel SIEM Environment

AN OVERVIEW OF A TYPICAL IT INFRASTRUCTURE WITH A CYBERSECURITY TEAM

In the diagram below, I have explained the entire workflow of how logs get collected and sent from an organization to their IDR team's Azure Analytics Workspace and how those logs are accessed by Microsoft Sentinel for further threat hunting and management for the organization.

Fig.2. Organization Structure

IT Office Infrastructure -

The employees' PCs, the server and the domain controller in the IT office have the Azure Monitor Agent (AMA) installed and configured by the IDR team. This agent collects the activity/monitoring data from those devices. The IDR team would create Data Collection Rules (DCRs) to specify what data should be collected from the IT office systems.

They would associate these DCRs with the Azure Monitor Agents installed on the IT office devices. However, some firewalls and routers aren't compatible with AMA, so it's our job to enable remote logging on those devices and configure the destination IP address and port.

The IDR team will have their Azure Monitor services ready, in particular their Azure Analytics Workspace, which is where all the logs collected from the IT office devices are sent. The Azure Analytics Workspace (often referred to as Log Analytics Workspace) acts as the central repository where logs from all compatible devices are collected. The AMA uses HTTPS to securely transmit the data to this workspace over TCP port 443 by default.

(Note: The IDR team, in this case, is located near the IT office, so they could connect to the same network as the IT office via VPN as an extra layer of security for data transmission.)

The IDR team in the cybersecurity office will receive all the logs from the IT office. The Azure Analytics Workspace processes the data through an ingestion pipeline, where it is routed and organized into destination tables in the workspace based on data type (e.g., SecurityEvent for Windows security logs, Syslog for syslog data from Linux systems, etc.).
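The routing step can be sketched in a few lines of Python. This is a toy model of the idea only, not the actual ingestion pipeline: the `source` field, its values and the `Unrouted` catch-all are assumptions made for illustration, while `SecurityEvent` and `Syslog` are the table names mentioned above.

```python
from collections import defaultdict


def destination_table(record):
    """Pick a destination table based on the record's (assumed) source type."""
    if record["source"] == "windows_security":
        return "SecurityEvent"
    if record["source"] == "linux_syslog":
        return "Syslog"
    return "Unrouted"  # hypothetical catch-all, for illustration only


def ingest(records):
    """Route every incoming record into its destination table."""
    tables = defaultdict(list)
    for record in records:
        tables[destination_table(record)].append(record)
    return dict(tables)


incoming = [
    {"source": "windows_security", "message": "An account failed to log on"},
    {"source": "linux_syslog", "message": "sshd: Failed password for root"},
    {"source": "windows_security", "message": "A new process has been created"},
]
workspace = ingest(incoming)
```

Once the records sit in separate tables, a query only has to look at the table that matches the kind of log it cares about.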

In the Azure Analytics Workspace, the IDR team would have set up their SIEM tool, Microsoft Sentinel. Sentinel is natively integrated with the Log Analytics Workspace, meaning that once it is deployed, it connects to the workspace and can access the logs there directly.

The IDR team can then use Sentinel to analyze logs for threat hunting, detection, and response.

Fig.3. Summary mind-map of the workflow

THREAT HUNTING PROCESS & THE 4 PILLARS OF SENTINEL

Now that we have discussed how the logs are available to the IDR team, let us discuss what they do with these logs in Microsoft Sentinel and how they use it to make their organization more cyber secure.

In the following section, I have explained the threat hunting process (stages) that organizations usually follow using Microsoft Sentinel. Additionally, I have aligned the stages with the four pillars of Microsoft Sentinel. I figured this would be the best way to explain two concepts in one section using a real-world example.

Fig.4. Stages of Threat Hunting

Fig.5. Four Pillars of Microsoft Sentinel

STAGE 1: DEVELOP HYPOTHESIS

There is a certain process followed for threat hunting, detection and response. First, we need to know: What specific threat or threats are we looking for? Why now? What drove the organization to begin this hunt? The answer is called a "hypothesis". The cybersecurity team keeps up to date with current attacks and the attack trends of cyber-criminals, and from this information they develop a hypothesis, or we can say an assumption, that a specific cyber-attack is going on. They then review their logs and hunt for those threats in order to test (confirm or deny) the hypothesis.

For example: Suppose the IDR team recently learned about a new phishing campaign targeting organizations similar to theirs. This campaign is known for installing remote access trojans (RATs) on compromised devices.

Hypothesis: Employees may be susceptible to phishing emails. We suspect that an attacker may have installed a RAT on one or more employee systems following a phishing attempt.

In this example, the IDR team may have learned about this attack from various threat intelligence sources, recent phishing attack trends, or even a specific alert indicating unusual behaviour on an employee system. This hypothesis would drive them to look into relevant logs and begin their threat hunting process.

Note: Some hypotheses can also come from continuously monitoring the logs for anything suspicious. For example, if an IDR team member notices a high volume of login attempts at an unusual hour, they may hypothesize a brute-force attack. This 'log-based' hypothesis allows them to then search for additional related indicators or logs, like failed logins across various accounts or unusual IP addresses attempting access.
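As a rough illustration of that log-based hypothesis, the Python sketch below counts failed logins per source IP during unusual hours and flags the sources that cross a threshold. The sample events, the 00:00 to 05:00 window and the threshold of 3 are assumptions chosen for the example, not recommended values.

```python
from collections import Counter
from datetime import datetime

# Hypothetical failed-login events: (timestamp, source IP, targeted account).
FAILED_LOGINS = [
    ("2024-10-29T02:00:00", "198.51.100.23", "alice"),
    ("2024-10-29T02:01:00", "198.51.100.23", "bob"),
    ("2024-10-29T02:02:00", "198.51.100.23", "carol"),
    ("2024-10-29T02:03:00", "198.51.100.23", "dave"),
    ("2024-10-29T03:00:00", "192.0.2.10", "alice"),
    ("2024-10-29T14:00:00", "192.0.2.10", "alice"),
]


def suspicious_sources(events, threshold=3, start_hour=0, end_hour=5):
    """Return source IPs with at least `threshold` failures in the unusual-hours window."""
    counts = Counter(
        ip
        for timestamp, ip, _account in events
        if start_hour <= datetime.fromisoformat(timestamp).hour < end_hour
    )
    return {ip: n for ip, n in counts.items() if n >= threshold}


flagged = suspicious_sources(FAILED_LOGINS)
```

A source that fails against several different accounts in a short night-time window is the kind of pattern that would justify starting a hunt.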

Either way, threat hunting should be performed regularly. Better to be safe than sorry.

Returning to the phishing campaign example, the IDR team hasn't confirmed infections yet. Instead, they’re using intelligence about recent RAT attacks targeting similar organizations as a starting point or hypothesis. Since they know this campaign is a threat, they begin proactively monitoring relevant logs for any signs that might confirm or deny their suspicion.

STAGE 2: DATA REVIEW (PILLAR: COLLECT)

Once the hypothesis is created, the IDR team looks for logs relevant to it. They start by connecting to appropriate data sources from the Data Connectors tab. Data connectors allow the team to gather logs from sources that are relevant to the threat, such as email, endpoint security, and network traffic.

Fig.6. Data Connectors Tab

Fig.7. Various Data Connectors

Some appropriate data sources in our example would be:

· Microsoft 365

· Microsoft Defender for Endpoint

· Any data source specified by the organization.

Endpoint logs from Defender for Endpoint and email logs from Microsoft 365 are pulled specifically because these sources would likely contain relevant indicators for phishing or RAT activity.

The team ensures that Data Connectors are set up in Microsoft Sentinel to collect logs from employee PCs, network traffic, and endpoint security tools. This gives them the relevant data to test the hypothesis and comprehensive visibility for analyzing potential threats.

The team reviews email logs for unusual activity, specifically emails containing malicious links or attachments and messages flagged by employees, along with RAT-related events and activity logs from employee systems. By doing this, they position themselves to identify any indicators of compromise (IOCs) that might confirm or deny their hypothesis in the next stages.

Pillar used in this stage: Collect - the team collects the appropriate logs based on their hypothesis, and these logs are stored in their SIEM solution.

STAGE 3: PLAN HUNT

At this stage, the team will have many kinds of logs from the chosen data sources. They need to outline the strategy for hunting based on the data review and hypothesis. This includes determining the scope, tools, and specific data sources to investigate. The data review stage complements this stage by ensuring the team knows what kinds of logs are available, which lets them plan which logs to focus on.

The team decides to focus their hunt on:

Email logs for phishing attempts.
Endpoint logs to check for any malicious activity, such as unexpected file changes or new processes.
Network traffic logs to identify communication with known malicious IP addresses.

STAGE 4: EXECUTE HUNT (PILLAR: DETECT)

Now that they know exactly what they are looking for, they need to query for it. With all the necessary logs collected, checking them manually isn’t recommended unless necessary. Instead, they would use Kusto Query Language (KQL) queries to search for specific logs and the specific events recorded in them.

The team runs KQL queries to identify:

Emails with links to known phishing sites.
Any endpoint activity showing processes created from these email attachments.
Network traffic to see if any endpoints are communicating with suspicious external servers.
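In Sentinel these three checks would be written as KQL queries. To show just the logic behind them, here is a Python sketch that runs the same kind of matching over small in-memory lists. The record fields, the indicator values and the sample data are all hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical indicators of compromise (IOCs) taken from threat intelligence.
PHISHING_DOMAINS = {"login-update.example"}
MALICIOUS_IPS = {"203.0.113.7"}


def emails_with_phishing_links(emails):
    """Emails containing at least one URL whose host is a known phishing domain."""
    return [
        email for email in emails
        if any(urlparse(url).hostname in PHISHING_DOMAINS for url in email["urls"])
    ]


def processes_from_attachments(process_events, attachment_names):
    """Process events whose parent file is one of the email attachments."""
    return [p for p in process_events if p["parent_file"] in attachment_names]


def connections_to_bad_ips(connections):
    """Network connections whose destination is a known malicious IP."""
    return [c for c in connections if c["dest_ip"] in MALICIOUS_IPS]


EMAILS = [
    {"to": "alice", "urls": ["https://login-update.example/reset"]},
    {"to": "bob", "urls": ["https://intranet.example/news"]},
]
PROCESSES = [
    {"host": "pc-01", "process": "rat.exe", "parent_file": "invoice.docm"},
    {"host": "pc-02", "process": "notepad.exe", "parent_file": "explorer.exe"},
]
CONNECTIONS = [
    {"host": "pc-01", "dest_ip": "203.0.113.7"},
    {"host": "pc-02", "dest_ip": "192.0.2.50"},
]

phishing_hits = emails_with_phishing_links(EMAILS)
process_hits = processes_from_attachments(PROCESSES, {"invoice.docm"})
network_hits = connections_to_bad_ips(CONNECTIONS)
```

When all three checks point at the same user and machine, as they do for this sample data, the hypothesis starts to look confirmed.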

In Microsoft Sentinel, there are two ways of finding threats from the logs:

1) Analytics

Analytics is where threats are detected automatically. Sentinel uses predefined rules (some can be configured by the team) with KQL logic to continuously monitor logs and identify threat patterns.

If an Analytics rule is triggered (i.e., a log pattern matches one of the rules), it creates alerts, which can then be grouped into incidents. These incidents appear in the Incidents tab, where the team can investigate and respond to them.

Fig.8. Analytics Tab

Fig.9. Built-in KQL rule query for attack technique T1098

In Figure 9, we can individually check rule queries for specific attack techniques. This is because Microsoft Sentinel integrates with the MITRE ATT&CK framework, which maps specific tactics, techniques, and procedures (TTPs) that attackers commonly use. By aligning hunting activities with MITRE techniques, analysts can target known attack behaviours. Sentinel includes queries mapped to MITRE techniques, so hunters can search for specific adversary behaviours relevant to their organization’s risk profile.

Note: The MITRE ATT&CK tab, at the top left in Figure 9, contains all the attack tactics and techniques, kept up to date with the framework.
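Filtering by technique works because every query carries its technique IDs as a tag. The Python sketch below shows that tagging idea with a made-up catalogue: the query names are hypothetical, while the technique IDs (T1098 Account Manipulation, T1566 Phishing, T1110 Brute Force) come from the MITRE ATT&CK framework.

```python
# Hypothetical query catalogue; each entry is tagged with ATT&CK technique IDs.
QUERIES = [
    {"name": "User added to privileged group", "techniques": ["T1098"]},
    {"name": "Email with link to flagged domain", "techniques": ["T1566"]},
    {"name": "Many failed sign-ins from one IP", "techniques": ["T1110"]},
    {"name": "New credentials added to an app", "techniques": ["T1098"]},
]


def queries_for_technique(queries, technique_id):
    """Return the names of all queries mapped to the given technique ID."""
    return [q["name"] for q in queries if technique_id in q["techniques"]]


account_manipulation = queries_for_technique(QUERIES, "T1098")
```

Filtering on T1098 here returns only the two account-manipulation queries, much like applying the technique filter in the tab.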

2) Hunting

Hunting is for proactive, manual threat detection. Here, the team would use predefined hunting queries or create custom queries (using KQL) to search through data based on the hypothesis. It’s a way to look for suspicious patterns or activities that haven’t been automatically flagged by analytics.

The Hunting tab doesn’t generate alerts or incidents automatically. Instead, it displays query results based on your search criteria, allowing you to explore potential threats and validate your hypothesis. The team would construct specific KQL-based searches on their own to find signs of the RAT. This makes hunting manual and dynamic, as they customize searches based on known RAT behaviours and adapt as they gain insight.

If hunting reveals a pattern, the team may set up Analytics Rules in Sentinel to automatically detect similar patterns in future logs, shifting from manual detection to rule-based alerts. This helps in keeping the organization more cyber secure for future threats similar to our example and the same hunt procedure can be applied to any other attacker’s TTPs.
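The move from a manual hunt to a rule can be sketched as wrapping the same query in an object that is evaluated repeatedly and raises an alert when a threshold is met. The class, its fields and the watch list below are assumptions for illustration and do not reflect how Sentinel represents Analytics Rules internally.

```python
from dataclasses import dataclass
from typing import Callable


def find_rat_processes(logs):
    """Hunting query: process events whose name is on a (hypothetical) watch list."""
    watch_list = {"rat.exe"}
    return [log for log in logs if log.get("process") in watch_list]


@dataclass
class AnalyticsRule:
    """Wraps a hunting query so it can be re-run automatically on new logs."""
    name: str
    query: Callable[[list], list]
    threshold: int = 1

    def evaluate(self, logs):
        """Return an alert if the query finds at least `threshold` hits, else None."""
        hits = self.query(logs)
        if len(hits) >= self.threshold:
            return {"alert": self.name, "hits": len(hits)}
        return None


rule = AnalyticsRule(name="Possible RAT process", query=find_rat_processes)

new_logs = [
    {"host": "pc-03", "process": "rat.exe"},
    {"host": "pc-04", "process": "chrome.exe"},
]
alert = rule.evaluate(new_logs)
```

The query itself does not change; what changes is that it no longer depends on an analyst remembering to run it.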

Fig.10. Hunting Tab - Built-in Hunting queries

Fig.11. Built-in query. Filter applied: T1098 Account Manipulation.

Fig.12. KQL code of the selected query.

Fig.13. Creation of Custom hunting queries

INVESTIGATE PILLAR

Based on our threat hunting results, if threats such as an increase in emails with attachments from unrecognized senders, or unusual activity on employee systems like unexpected process or file changes, are found through analytics or hunting (or both), we would find them grouped as an “Incident” in the “Incidents” tab. The incident could be titled “Potential system compromise by RAT”. The team investigates this incident and then responds to it.

Incidents in the Incidents section bring together alerts from Analytics (predefined rules and KQL-based detections) and findings from Hunting (manual KQL querying). Analytics alerts are grouped into incidents automatically, while hunting results have to be promoted by the analyst, for example by bookmarking them and adding the bookmark to an incident. Either way, the incident provides a centralized view for further investigation.

The Incidents tab allows you to analyze these alerts in more depth, helping you trace attack paths, assess severity, and understand the context to decide on the appropriate response actions.

Fig.14. Incidents Tab

STAGE 5: RESPOND (PILLAR: RESPOND-AUTOMATION)

While the IDR team investigates the incident(s), the automated response process uses the SOAR capability to respond to them automatically. This takes place in the “Automation” tab. Here, we have the predefined and custom-made automation rules, the active playbooks that specify response actions, and playbook templates.

Fig.15. Automation Tab

Continuing our example,

The hunt revealed that several employees clicked on a malicious link, leading to a RAT being installed on their machines. The team found logs showing unusual activity from the affected systems, which is what raised the “Potential system compromise by RAT” incident. Using the appropriate playbook, the team:

Isolated affected systems from the network.
Blocked the malicious IP addresses.
Notified impacted employees to change their passwords and follow up with incident response protocols.

These were the playbook-driven “actions” performed automatically based on the incident and this is how the automated response process works in Microsoft Sentinel.
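One way to picture how an automation rule decides when a playbook runs is as a set of conditions checked against each new incident. The Python sketch below is a simplified model under assumptions: the condition fields (title text and minimum severity) and the severity ordering are chosen for the example, and the actions are just the three steps listed above written as strings.

```python
from dataclasses import dataclass

# Assumed severity ordering, used only to compare incidents in this example.
SEVERITY = {"Low": 1, "Medium": 2, "High": 3}


@dataclass
class AutomationRule:
    """Conditions that decide whether a playbook runs for an incident."""
    name: str
    title_contains: str
    min_severity: str
    playbook: list

    def applies_to(self, incident):
        """True when the incident title and severity satisfy the rule."""
        return (
            self.title_contains.lower() in incident["title"].lower()
            and SEVERITY[incident["severity"]] >= SEVERITY[self.min_severity]
        )


def respond(incident, rules):
    """Collect the playbook actions of every rule that applies to the incident."""
    actions = []
    for rule in rules:
        if rule.applies_to(incident):
            actions.extend(rule.playbook)
    return actions


rat_rule = AutomationRule(
    name="Contain RAT compromise",
    title_contains="compromise by RAT",
    min_severity="Medium",
    playbook=[
        "Isolate affected systems",
        "Block malicious IP addresses",
        "Notify impacted employees",
    ],
)

incident = {"title": "Potential system compromise by RAT", "severity": "High"}
actions = respond(incident, [rat_rule])
```

An incident with the same title but Low severity would not meet the conditions in this sketch, so it would be left for the team to handle manually.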

STAGE 6: MONITOR

After responding to the incidents, the IDR team would continuously monitor Sentinel for any new signs of threats or anomalies to ensure no further suspicious activity arises related to the RAT or phishing attack. This helps ensure that the response measures are effective and that no new threats emerge.

The team sets up alerts for suspicious email activities and monitors for any unusual login attempts from the affected accounts or machines. They also check the endpoint security solutions for any signs of reinfection.

STAGE 7: IMPROVE

Once the hunt has gone through all the stages, and while staying vigilant through monitoring, the IDR team would analyze the hunting process and outcomes to identify areas for improvement, whether in tools, techniques, or procedures.

After the incident, they would conduct a post-incident analysis, discussing what went well and what could be improved. They might decide to:

Implement better email filtering solutions.
Conduct training sessions for employees on recognizing phishing attempts.
Refine their hunting hypotheses and queries for future hunts based on lessons learned.
Adjust Data Connectors to capture additional logs if needed.

Additionally, the IDR team would document insights from the investigation, improve analytics rules, and refine detection methods based on this incident to enhance future threat-hunting processes.

SUMMARY WITH THE 4 PILLARS

Collect: The team begins by connecting relevant data sources (employee PCs, network logs, firewall, and server logs) through Data Connectors. This makes sure they have a comprehensive dataset to support threat hunting.

Detect: They use both predefined Analytics Rules and custom Hunting Queries to find RAT-related patterns. Analytics Rules catch pre-defined patterns, while Hunting enables dynamic, hypothesis-based searches.

Investigate: As alerts escalate into an incident, the IDR team uses this pillar to analyze related activities, ensuring they understand the RAT’s presence, scope, and impact before taking further steps. What we’ve discussed in the Respond stage shows a simple example where the IDR team would find the incident and respond to it. However, in reality, it goes beyond simply viewing and responding; it’s about building a deeper understanding of the threat's behaviour and scope.

When an incident like a suspected RAT compromise is detected, the team would look into:

Event Correlation: They trace activities related to the incident across multiple sources or events, such as multiple failed login attempts, unusual file access, or communication with external IPs. This helps them understand if these activities are part of a coordinated attack.

Timeline Analysis: They piece together when each suspicious action occurred, determining how long the RAT has been active and if there are patterns that suggest additional risks.

Affected Systems and Users: By identifying all systems or users impacted by the RAT, they can prevent further spread and understand the full scope of the attack.

Lateral Movement and Persistence Techniques: They check if the RAT is spreading to other devices or if it’s establishing persistent access.

This investigation allows the IDR team to understand the “bigger picture” before taking action, ensuring they respond not just to isolated events but to the entire threat by understanding the full attack story to ensure effective and thorough remediation.
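Timeline analysis and scoping can be illustrated with a few lines of Python over the events tied to an incident. The events, field names and timestamps below are made up for the example; in practice they would come from the correlated logs in the workspace.

```python
from datetime import datetime

# Hypothetical events correlated to the RAT incident, in arrival order.
EVENTS = [
    {"time": "2024-10-28T09:15:00", "host": "pc-01", "action": "phishing link clicked"},
    {"time": "2024-10-28T09:17:00", "host": "pc-01", "action": "rat.exe started"},
    {"time": "2024-10-29T11:00:00", "host": "pc-02", "action": "rat.exe started"},
    {"time": "2024-10-28T09:30:00", "host": "pc-01", "action": "outbound connection"},
]


def build_timeline(events):
    """Order the events chronologically."""
    return sorted(events, key=lambda e: datetime.fromisoformat(e["time"]))


def affected_hosts(events):
    """List every host that appears in the incident's events."""
    return sorted({e["host"] for e in events})


def active_duration(events):
    """Time between the first and the last event of the incident."""
    times = [datetime.fromisoformat(e["time"]) for e in events]
    return max(times) - min(times)


timeline = build_timeline(EVENTS)
hosts = affected_hosts(EVENTS)
duration = active_duration(EVENTS)
```

In this sample, the second machine shows RAT activity a day after the first, which is the kind of signal that would prompt a check for lateral movement.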

Respond: Through Sentinel’s SOAR playbooks, automated responses are triggered to block malicious IPs or quarantine compromised devices. This automation minimizes manual tasks and helps contain the threat swiftly.

REFERENCES

Microsoft Azure Sentinel Pillars | Collect | Detect | Investigate | Respond (2022). Available at: https://www.youtube.com/watch?v=pqQx595YrX0 (Accessed: 29 October 2024).

wwlpublish (no date) SC-200: Perform threat hunting in Microsoft Sentinel - Training. Available at: https://learn.microsoft.com/en-us/training/paths/sc-200-perform-threat-hunting-azure-sentinel/ (Accessed: 29 October 2024).
