Incident Response as a Game - New England Safety Partners, LLC

Incident Response training and testing is always a bit of a black art. When a client suggested I look at Black Hills’ Backdoors and Breaches as a way to stage their annual incident response tabletop exercise, I was a bit skeptical. But I was also excited to try to find a way to make my tabletop exercises less arbitrary and less subject to my own biases. I want them to be less a retelling of the most recent newsworthy breach and more relevant to addressing future compromise combinations.

Usually, when we run training exercises for our customers, we have prep sessions where we collaboratively decide what sort of incident may be best for the team to work through. Maybe they had a recent event that caught them off guard, or they have a newly revised policy or procedure that needs to be evaluated. Regardless, it’s up to our consultant to craft a scenario based on their experiences. That sometimes leads to biases in the scenario, and sometimes means that we go down rabbit holes with the client during the exercise. Those explorations are valuable, and the results are actionable, but “doing the same thing every time” becomes a real possibility. Also, when running a scenario, we tend to use our instincts to determine results. Do I believe that the security analyst really knows how to take a forensic image? When was the last time they did it? These problems are surmountable, for sure, but the folks at Black Hills have an approach that might take some of the bias out of it, and it’s intriguing.

Take the following narrative:

The Cloud team is notified that significant web traffic was detected making unusual connections to the web server, causing the server performance to degrade. The company uses standard monitoring tools for general performance monitoring.

The Cloud team was unable to get information from the monitoring tools and elected to review the server logs directly to determine if this was malicious traffic. The moderator determined that their evaluation succeeded: the logs showed what appeared to be an attack underway, with requests for abnormal URLs. At this point they contacted the network operations staff. Network operations worked with cloud operations, and collectively they determined that the traffic was associated with unfamiliar source IP address ranges.

The cloud team concurrently notifies the CISO of a possible event in progress. A group chat is created, and relevant parties are invited.
About this time, customer service contacts cloud operations about the service disruption and associated customer complaints.

Security Analyst resources join the group chat and begin a triage. They take an image of the affected host and verify that an attack has occurred and may have succeeded not only in compromising the host, but possibly in pivoting into additional network resources. They are also able to enumerate specific source IP addresses and countries of origin.
Security team instructs Firewall team to block the source IP addresses.

The Security team elects to conduct a thorough end point analysis and makes the final determination that the attackers had successfully installed malware on the webserver to enable the attack to persist through reboots. Cloud operations knows this is an HA pair and the application can operate effectively without much degradation in service and informs customer service that they are taking the affected host offline for isolation and investigation.

Customer service follows process to notify high-profile customers of possible performance degradation and to inform staff of how to respond to reactive customer inquiries.

Security continues to evaluate the affected host and is looking for signs that the attackers were able to pivot to other system resources beyond the webserver using the signature of the malware and the now known IP addresses. This investigation indicates that the attackers were indeed able to move into the shared environment.

Security asks the Infrastructure team to review the Active Directory security logs. The Infrastructure team determines that a new administrator account has been created. This new account was then used to create additional accounts with database privileges. Cloud disables these new accounts.

It appears that the attackers created a database account and had access to the database. Security creates a forensic image of the database server. Cloud database resources review the database access logs.

Using the data discovered from the earlier efforts, Security was able to discover outbound connections via HTTPS to the command and control servers. Forensic analysis of the server reveals a large database dump file. Outbound firewall logs reveal a transfer to the command and control servers with a byte count matching that of the dump file.

That scenario is pretty typical. We have an attack or Compromise, and after triage, the team discovers that a Pivot to internal systems had occurred, and malware was discovered to enable Persistence. And finally, their efforts revealed the Command and Control and Exfiltration (Exfil). A complete simulated breach. Initial Compromise, Pivot, Persistence, CC/Exfil. The folks at Black Hills have given us a way to create these, then have the actions the teams take feel more meaningful, without guaranteeing success.

It’s a set of cards, with card categories for the elements of a breach (Compromise, Persistence, Pivot, Exfil), policies and procedures (Procedure cards) and random events that affect the team’s ability to execute those procedures (called Injects). It uses a 20-sided die to determine success and failure of individual actions (more on that in a moment).

As mentioned above, four cards comprise a “complete” incident, and these cards are revealed to the participants through their own actions. A Moderator (for the gamers out there, think Dungeon Master) determines through the roll of the dice the success of those actions, adding a random element, regardless of an individual’s technical capability.

For our narrative above, the details of the exercise were selected ahead of time and at random, and the attendees were not informed of the nature of the simulated attack. Attendees were reminded of several existing policies and procedures prior to the start of the exercise but were encouraged to recall the complete catalog as the event progressed. Results of actions were randomly determined to succeed or fail at the discretion of the moderator, and their successful actions revealed the elements of the breach as they progressed through the exercise. The elements of the compromise were:

Initial Compromise: Web Server Compromise
Attackers have compromised a web server with a zero-day vulnerability and have used it to pivot to the production network. In this example, the server was one of two web servers. Those servers are the front end to a platform that contains proprietary and confidential information.

Pivot: Weaponizing Active Directory
Attackers created multiple trusted accounts and mapped trust relationships and privileges in the AD network, including creating SA accounts on SQL Server.

Persistence: Malware
The attackers created new services that start every time the system restarts.

Command and Control and Exfiltration: HTTPS as Exfil
The attackers are using outbound HTTPS for connections to command and control servers and exfiltration of sensitive data.

The Attackers
State actors somewhere in Asia (we decided not to pick on a specific country).

Each element of the compromise had a specific type of action or capability that might allow the team to detect it. These include things like End Point Analysis, SIEM Log Analysis, Firewall Review, DLP and other common forensic capabilities, and those mapped directly to the available Procedure cards. On review of existing policies, we picked four Procedure cards based on those policies, even though this company had a mature set of procedures documented. These were not the only “actions” they could take to investigate and respond, but these cards gave an advantage when determining an outcome. For context, the dice rolls represent a contributor’s ability to execute a particular action, with some uncertainty as to success. Will the participant be able to recall a process? Will they remember how to use a user interface or command line to execute a technical capability? Did the contributor have enough coffee, and were they awake and doing things “correctly” versus potentially mis-keying a command?
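To make the "advantage" idea concrete, here is a minimal sketch of how a moderator might resolve a single action. The success threshold of 11 and the +3 bonus for an action backed by a selected Procedure card are assumptions for illustration, not numbers taken from our exercise, and the function names are hypothetical. Check the rules that come with your deck before relying on them.

```python
import random

# Assumed values for illustration only; not taken from this write-up.
SUCCESS_THRESHOLD = 11  # a modified roll at or above this succeeds (assumption)
PROCEDURE_BONUS = 3     # bonus when a selected Procedure card backs the action (assumption)


def roll_d20(rng=random):
    """Roll one 20-sided die."""
    return rng.randint(1, 20)


def resolve_action(roll, has_procedure_card,
                   threshold=SUCCESS_THRESHOLD, bonus=PROCEDURE_BONUS):
    """Return True if an action succeeds for the given d20 roll.

    A selected Procedure card adds a bonus, so a practiced procedure
    succeeds more often, but a low enough roll still fails.
    """
    if not 1 <= roll <= 20:
        raise ValueError("a d20 roll must be between 1 and 20")
    modified = roll + (bonus if has_procedure_card else 0)
    return modified >= threshold


if __name__ == "__main__":
    roll = roll_d20()
    # "End Point Analysis" stands in for one of the four selected Procedure cards.
    outcome = resolve_action(roll, has_procedure_card=True)
    print(f"End Point Analysis, rolled {roll}: {'success' in str(outcome) or outcome}")
```

With these assumed numbers, a roll of 8 fails for an ad hoc action but succeeds when a selected Procedure card applies, which is the "not enough coffee" uncertainty with a thumb on the scale for documented process.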

Lastly, Injects were selected ahead of time, understanding that at least one of the participants in management might be tempted to guide staff in a way that wouldn’t illuminate weaknesses in process.

So, with that context, we brought the technical team into the room, and we started the exercise. The “game” is supposed to end when all four of the breach cards are revealed to the participants through their successful actions.
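The end condition can be sketched as a simple loop. This is a toy simulation, not how we ran the session: the ten-turn limit and the threshold of 11 are assumptions, the team here works the elements in a fixed order, and a real moderator applies judgment that a loop cannot capture. The card names are the ones from our scenario above; `run_exercise` is a hypothetical helper.

```python
import random

# The four breach cards from the scenario described above.
BREACH_CARDS = {
    "Initial Compromise": "Web Server Compromise",
    "Pivot": "Weaponizing Active Directory",
    "Persistence": "Malware",
    "Command and Control and Exfiltration": "HTTPS as Exfil",
}


def run_exercise(max_turns=10, threshold=11, seed=None):
    """Play turns until every breach card is revealed or turns run out.

    Each turn the team investigates one still-hidden element; a d20 roll at
    or above the threshold reveals it. max_turns and threshold are assumed
    values. Returns (revealed_categories, turns_used).
    """
    rng = random.Random(seed)
    hidden = list(BREACH_CARDS)
    revealed = []
    turns = 0
    while hidden and turns < max_turns:
        turns += 1
        target = hidden[0]  # simplification: investigate elements in a fixed order
        if rng.randint(1, 20) >= threshold:
            hidden.remove(target)
            revealed.append(target)
    return revealed, turns


if __name__ == "__main__":
    revealed, turns = run_exercise(seed=1)
    if len(revealed) == len(BREACH_CARDS):
        print(f"All four cards revealed after {turns} turns")
    else:
        print(f"Turn limit reached with {len(revealed)} of 4 cards revealed")
```

Running it a few times with different seeds gives a feel for how often a team would run out of turns under a given threshold, which is one way to decide how much moderator discretion a particular group needs.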

Did it work?

Well, for starters, we didn’t play the game entirely as directed: some things were selected ahead of time, and the Moderator used some additional discretion in determining results. In true “gamemaster” form, our moderator adjusted things as we progressed. Certain activities, such as log reviews, were not treated as one and done or subject to an arbitrary restriction, and the team had backup resources available. If this were a newly formed team, the randomness of selecting events might be a good spanner to toss into the works, but we decided our modified approach was best for this team.

It was a great way to set up an exercise. Making it a pure game might be a good way to train new participants in thinking about approaches to response. One of the purposes of a tabletop is to make sure the team understands procedural capabilities, what tools they have at their disposal, and what existing processes might be in place to alert on issues or assist in evaluation. It worked for them. We also gave them “credit” on their result attempts if they named and described an existing process that would be applicable to a certain part of the breach after they described it to each other during the event. Rolling dice got some funny looks from the less game-savvy folks, but the context of “not enough coffee” seemed to mollify those naysayers.

We got good participation, enumerated a number of improvements in both existing process and skills development, and created a credible report on the customer’s capabilities. It was really nice to have a selection of events to “pick” from so you can tailor the exercise and make the experience unique.

I will use it again. Maybe we can play it at a conference together?

For more information, including “How to Play” and where to get your own deck, visit Black Hills Information Security!