
UT Austin Information Security Office

The University of Texas at Austin

securus // vigilare // insanus

March 16, 2016, Filed Under: Reverse Engineering

Reverse Engineering Necurs (Part 3 – Patching)

Introduction

In the previous post, we started to step through the Necurs sample using WinDbg. We also used IDA Pro to perform static analysis of the malware sample so we could get an idea of where to set breakpoints. However, while stepping through the code, we jumped to a location in WinDbg that was not disassembled by IDA Pro. When we looked a little closer, the hex values in the memory window in WinDbg did not match up with the hex values displayed in IDA Pro. It looked like the malware sample had “unpacked” itself, and we had jumped to a location that’s called the “original entry point”, or OEP. This post is going to describe how to patch the original executable with the unpacked code so we can continue to use IDA Pro to perform static analysis.

I’m going to spend some time talking about the information in the PE header. Since discussing the PE header in detail is outside the scope of this blog post, I’d highly recommend taking a look at an article in CodeBreakers Magazine called Portable Executable File Format – A Reverse Engineer View. The article was written in 2006, but is applicable to this sample. It is pretty lengthy, but is well written and very interesting. It’s also a lot easier to understand than the formal specification of the PE file format.

Section Header Analysis

Before getting started, recall that we are paused at offset 0x40614C in WinDbg. Take a snapshot of the Windows VM and call the snapshot OEP. Then restore the snapshot of the VM that was paused at the entry point of the program. Enter !dh 0x400000 in the WinDbg command window. This was the command that was used to display the entry point of the executable. This command will also display information about the different sections in the malware sample.  The section we are interested in is section 1, the text section.

Section header for the “text” section
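
The fields that !dh prints come straight out of the PE header, so they can also be read from the file on disk. The sketch below is one possible way to do that with Python's struct module. It is not part of the original analysis, the function name list_sections is made up for illustration, and it assumes a well-formed 32-bit PE file.

```python
import struct

def list_sections(pe_bytes):
    """Return (entry_point_rva, sections) parsed from the raw bytes of a PE file."""
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("PE signature not found")

    coff = pe_offset + 4                      # 20-byte COFF file header
    (section_count,) = struct.unpack_from("<H", pe_bytes, coff + 2)
    (optional_size,) = struct.unpack_from("<H", pe_bytes, coff + 16)

    optional = coff + 20                      # optional header
    (entry_rva,) = struct.unpack_from("<I", pe_bytes, optional + 16)

    table = optional + optional_size          # section table, 40 bytes per entry
    sections = []
    for index in range(section_count):
        header = table + index * 40
        name = pe_bytes[header:header + 8].rstrip(b"\x00").decode("ascii", "replace")
        virtual_size, virtual_address, raw_size, raw_pointer = struct.unpack_from(
            "<IIII", pe_bytes, header + 8)
        sections.append({
            "name": name,
            "virtual_address": virtual_address,
            "virtual_size": virtual_size,
            "size_of_raw_data": raw_size,
            "pointer_to_raw_data": raw_pointer,
        })
    return entry_rva, sections
```

Run against the packed sample, this should report the same virtual address, virtual size, size of raw data and file pointer to raw data that WinDbg displays for the text section.
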
The first item of interest is the virtual address. It has a value of 0x1000. We can determine the location in memory of this section by adding the virtual address value to the module’s base address. Since the start address of the module was 0x400000, this section was loaded in memory at address 0x401000. If we type this address into the Virtual textbox of WinDbg’s memory window, we can view the hex values that were loaded into memory at this location. If we jump to this location in IDA Pro, we will see that the values in IDA Pro match the values displayed in WinDbg.
WinDbg memory window. Hex values at offset 0x401000
IDA Pro Hex View tab. Hex values at offset 0x401000
The next item of interest in the section header data is the file pointer to raw data. This value is the offset of the section within the executable file itself.  It is located at offset 0x400. If we open the malware sample in a hex editor, and look at the values at this offset, we’ll see that they also match up with the values at offset 0x401000 in IDA Pro and WinDbg.
Hex dump of original exe at offset 0x400
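
The relationship between the three views (memory, IDA Pro, and the file on disk) is simple arithmetic. As a small illustration, the sketch below converts between a virtual address and a file offset for the text section, using only the values quoted above. The function names are made up for this example, and the conversion is only valid for addresses that fall inside this one section.

```python
IMAGE_BASE = 0x400000    # module base address reported by WinDbg
TEXT_RVA = 0x1000        # "virtual address" field of the text section
TEXT_RAW_POINTER = 0x400 # "file pointer to raw data" field of the text section

def rva_to_va(rva, image_base=IMAGE_BASE):
    """Address in memory = base address + relative virtual address."""
    return image_base + rva

def va_to_file_offset(va, image_base=IMAGE_BASE,
                      section_rva=TEXT_RVA, raw_pointer=TEXT_RAW_POINTER):
    """Map an in-memory address inside the text section to its offset in the exe."""
    rva = va - image_base
    return rva - section_rva + raw_pointer

print(hex(rva_to_va(TEXT_RVA)))          # 0x401000
print(hex(va_to_file_offset(0x401000)))  # 0x400
```
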
The final two items of interest are size of raw data and virtual size. The size of raw data is the number of bytes this section occupies in the executable (0xF200 bytes). The virtual size is the amount of space needed when loading this section into memory (0xF088 bytes). For performance reasons, sections are written to file in 512 byte increments. If a section does not end at a 512 byte increment, the remaining bytes are null padded. If we inspect the data at location 0x400 +  0xF088 = 0xF488 in the hex editor, we will see a series of null bytes used to pad the section to the 512 byte increment. The virtual size is smaller than the size of raw data because the null bytes do not need to be loaded into memory.
Null padding at end of section in executable file
If we inspect the values at offset 0x401000 + 0xF088 = 0x410088 in IDA Pro and WinDbg, we will see the null padding as well. When an executable is loaded into memory, the memory is allocated in 4096 or 8192 byte chunks.  So, we should see null padding at the end of the text section because the section does not end at a 4096/8192 byte increment.
IDA Pro view of null padding at end of text section
WinDbg view of null padding
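
The padding described above can be checked with a few lines of arithmetic. The following is just a worked example using the numbers from the section header. The 512 byte file alignment comes from the text, while the 4096 byte memory alignment is an assumption (the text allows for either 4096 or 8192).

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

VIRTUAL_SIZE = 0xF088
FILE_ALIGNMENT = 0x200      # 512 bytes, as described above
SECTION_ALIGNMENT = 0x1000  # 4096 bytes, assumed for this example

# On disk: the section is padded out to the next 512 byte boundary.
size_of_raw_data = align_up(VIRTUAL_SIZE, FILE_ALIGNMENT)
file_padding = size_of_raw_data - VIRTUAL_SIZE
first_padding_offset = 0x400 + VIRTUAL_SIZE

# In memory: the section ends part way through a page.
end_in_memory = 0x401000 + VIRTUAL_SIZE
next_boundary = align_up(end_in_memory, SECTION_ALIGNMENT)

print(hex(size_of_raw_data))      # 0xf200
print(hex(file_padding))          # 0x178
print(hex(first_padding_offset))  # 0xf488
print(hex(end_in_memory))         # 0x410088
print(hex(next_boundary))         # 0x411000
```
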
Patching Strategy

We know that the text section is loaded into memory at offset 0x401000. We also know that 0xF088 bytes of memory were allocated for this section.  Finally, we know that the executable has been unpacked in the OEP snapshot. So if we can dump the contents of memory at this offset into a file and replace the text section of the original executable with the memory dump, we should have an unpacked executable that can be disassembled by IDA Pro. We will also have to patch the entry point of the program so that IDA Pro knows where to start disassembling the new executable.

To dump the memory, restore the OEP snapshot. Then in the WinDbg command window, use the following command to dump the memory:

.writemem c:\necurs.bin 0x401000 0x410087

The command window should display that F088 bytes were written. We will use the necurs.bin file to patch the text section of the malware sample.

The entry point of the malware sample is stored at offset 0x118 in the original executable. A hex editor shows the bytes D4 60 at this offset, but the actual value of the entry point is 0x60D4. The entry point is stored in little endian format, so when we patch this value, we need to write the new entry point in little endian format as well. Recall that we jumped to 0x40614C after the executable was unpacked. Subtracting the base address of 0x400000 gives a new entry point of 0x614C, so we’ll change the bytes at offset 0x118 from D4 60 to 4C 61.

Entry point value at offset 0x118 of malware sample
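
The little endian conversion is easy to get backwards by hand, so it can help to let Python do it. This is a small sketch using the struct module. The entry point field is a 4-byte value, and the helper names are made up for this example.

```python
import struct

IMAGE_BASE = 0x400000

def encode_entry_point(virtual_address, image_base=IMAGE_BASE):
    """Return the 4 little endian bytes to write into the entry point field."""
    rva = virtual_address - image_base
    return struct.pack("<I", rva)

def decode_entry_point(raw_bytes):
    """Turn the 4 bytes read from the file back into the entry point value."""
    (rva,) = struct.unpack("<I", raw_bytes)
    return rva

def as_hex(raw_bytes):
    return " ".join(f"{b:02x}" for b in raw_bytes)

print(as_hex(encode_entry_point(0x4060D4)))  # d4 60 00 00 (original entry point)
print(as_hex(encode_entry_point(0x40614C)))  # 4c 61 00 00 (OEP after unpacking)
```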

The Python script shown below was used to patch the malware sample. The original malware sample was named necurs-packed.exe. The memory dump was named necurs.bin, and the patched malware sample was called necurs-unpacked.exe.

python script used to patch packed malware sample
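
The original script was published as a screenshot and is not reproduced here. The following is a minimal sketch of what such a script could look like, built only from the offsets discussed in this post. It is not the script from the screenshot, the function names are illustrative, and the hard-coded offsets apply to this sample only.

```python
import struct

TEXT_RAW_POINTER = 0x400    # file offset of the text section
TEXT_RAW_SIZE = 0xF200      # size of raw data of the text section
ENTRY_POINT_OFFSET = 0x118  # file offset of the entry point field
NEW_ENTRY_POINT = 0x614C    # 0x40614C minus the base address 0x400000

def patch_image(packed, dump):
    """Return a copy of the packed exe with the text section and entry point patched."""
    if len(dump) > TEXT_RAW_SIZE:
        raise ValueError("memory dump does not fit into the text section")
    if len(packed) < TEXT_RAW_POINTER + len(dump):
        raise ValueError("executable is shorter than expected")

    patched = bytearray(packed)
    # Overwrite the packed bytes with the unpacked bytes dumped from memory.
    patched[TEXT_RAW_POINTER:TEXT_RAW_POINTER + len(dump)] = dump
    # Write the new entry point as a 4-byte little endian value.
    struct.pack_into("<I", patched, ENTRY_POINT_OFFSET, NEW_ENTRY_POINT)
    return bytes(patched)

def patch_files(packed_path, dump_path, output_path):
    """Read the packed exe and the dump, and write the patched exe."""
    with open(packed_path, "rb") as handle:
        packed = handle.read()
    with open(dump_path, "rb") as handle:
        dump = handle.read()
    with open(output_path, "wb") as handle:
        handle.write(patch_image(packed, dump))
```

With the file names used in this post, the call would be patch_files("necurs-packed.exe", "necurs.bin", "necurs-unpacked.exe"). Because the dump is 0xF088 bytes long, the null padding at the end of the section in the file is left untouched.
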
If we use IDA Pro to disassemble the unpacked executable, and inspect the start subroutine (at offset 0x40614C), the disassembly in IDA Pro matches up with the disassembly in WinDbg. We’re now able to perform a static analysis of the unpacked code while stepping through the code with WinDbg.
IDA Pro disassembly at offset 0x40614C

 

WinDbg disassembly at offset 0x40614C

Some Things To Keep In Mind

It was relatively easy to patch the Necurs malware sample so we could analyze it in IDA Pro. However, some packing algorithms also compress the executable code. In that case, the unpacked section may be larger than the size of raw data in the executable file, and the memory dump will not “fit” into the text section of the original executable. It’s still possible to patch the executable, but it gets a little bit more complicated. Specifically, the file pointer to raw data field of the sections that follow would need to be patched to make room for the contents of the memory dump.
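
To give a rough idea of the bookkeeping involved, the sketch below works out how far the later sections would have to move if a dump were too big for the text section. Every number in the example call is hypothetical, the function names are made up, and a real patch would also have to update the size of raw data of the text section and move the section data itself.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def plan_section_growth(dump_size, old_raw_size, raw_pointers, text_pointer,
                        file_alignment=0x200):
    """Return (new_raw_size, shift, new_raw_pointers) for an oversized dump.

    raw_pointers is the list of "file pointer to raw data" values of all
    sections; only sections stored after the text section are moved.
    """
    new_raw_size = max(old_raw_size, align_up(dump_size, file_alignment))
    shift = new_raw_size - old_raw_size
    moved = [p + shift if p > text_pointer else p for p in raw_pointers]
    return new_raw_size, shift, moved

# Hypothetical example: a 0x12345 byte dump and two sections after the text section.
new_size, shift, pointers = plan_section_growth(
    0x12345, 0xF200, [0x400, 0xF600, 0x10000], text_pointer=0x400)
print(hex(new_size), hex(shift), [hex(p) for p in pointers])
```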

Also, even though we patched the entry point, the patched executable may not run in a debugger in the same way as the original sample does once it has unpacked itself. Any subroutines that were executed before the jump to the OEP may have written data to memory, and the new start routine may need to access this data. Remember that the main purpose of unpacking the executable in this way was to allow IDA Pro to properly disassemble the malware sample.

Next Steps

In the next blog post, I’ll talk about using scripting in IDA Pro as an alternative to patching the executable.

March 10, 2016, Filed Under: Reverse Engineering

Reverse Engineering Necurs (Part 2 – Unpacking)

Introduction

In the previous post, we talked about using tcpdump on a VM to monitor network traffic produced by another VM infected with Necurs.  We noticed that some “weird” UDP packets were being generated after infection, and used this observation to verify that the malware sample would run in a virtual environment, as well as in a debugger.  In this blog post, we’re going to try to figure out the location of the instructions that are responsible for generating the UDP packets.  We’re not going to succeed (at least not yet), but we will see some interesting things.

Pausing at the Malware’s Entry Point

In the previous post, we located the entry point of the Necurs malware sample in WinDbg so that we could use a debugger to analyze the sample.  While in WinDbg, I like to have several windows opened and docked so I can see the disassembled code, view memory, and issue commands to the debugger.  So I usually use the icons in the toolbar to open a memory window and disassembler window.  The command window opens automatically when you open the executable.  An image of my layout is shown below.  After I set up this layout, I took a snapshot of the VM, so I could quickly return to the WinDbg session paused at the malware’s entry point.

WinDbg Layout

Some things to note about the layout: a breakpoint was set at the entry point of the program, and the program has paused execution at the breakpoint.  This instruction is highlighted in purple in the disassembler window.  The display format in the memory window is “byte”, and the entry in the “virtual” text box is “@$scopeip”.  As long as these settings are used, the memory window will display the hex values currently in memory starting from the offset of the paused instruction in the disassembler.

Analyzing the Start Subroutine

We’re going to begin by analyzing the Start subroutine in IDA Pro.  A snapshot of the routine is shown below.  There are two instructions in the subroutine: a call instruction to sub_406E5D and a jmp to loc_403474.  Remember that our goal is to find the code that generates the UDP traffic we observed in tcpdump.  One would think that either the call to sub_406E5D will eventually execute the instructions that generate the UDP packets or the jmp instruction will do so.  In WinDbg, we are going to step over the call to sub_406E5D and then monitor the output of tcpdump on the Ubuntu VM.  If we start seeing the “weird” UDP packets that we saw when we ran the malware sample in the previous blog, we’ll know that if we further analyze sub_406E5D (and all the subroutines it may call), we should find the code that generates the UDP traffic.  If not, the jmp to loc_403474 should eventually lead us to the code that generates the UDP packets.

Start Subroutine as displayed in IDA Pro

To step over the call to sub_406E5D, enter p into the WinDbg command window.  If you wait a few seconds, you should see the “weird” UDP packets in the tcpdump output.  This means that we need to analyze sub_406E5D.  Next, we want to restore the snapshot so that we are in WinDbg at the malware’s entry point.  Instead of stepping over sub_406E5D, enter t into the command window to step into the subroutine.

Analyzing sub_406E5D

If we look at sub_406E5D in IDA Pro, we’ll notice that there are no call instructions.  Instead, there is a jmp instruction to an address that’s stored in the ecx register.  A few lines above the jmp instruction, you will see the address of sub_402C4B being loaded into the ecx register.  It’s pretty strange to use this set of instructions to jump to a subroutine instead of simply using a call instruction.  When a call instruction is executed, the address of the instruction after the call instruction is pushed onto the stack before code execution is transferred to the subroutine.  Once the subroutine ends, execution can be transferred to the address that was stored on the stack. We’ll see later in this post what happens when jmp is used instead of call in this subroutine.

IDA Pro View of sub_406E5D

Next, I used the p command in the WinDbg command window to step over instructions until I arrived at the first instruction in sub_402C4B.

Analyzing sub_402C4B

If we analyze sub_402C4B, we will see that it is a pretty long subroutine with a number of call instructions.  Many of the call instructions have text strings next to them (for example, the call to LCMapStringA at 0x402D38).  These are calls to Windows API functions.  If you are interested in learning more about the API calls, documentation for most of the API functions can be found on the Microsoft Developer Network website (msdn.com).  There are also calls to sub_xxx subroutines, where xxx is a location within the malware sample.  These subroutines have been disassembled by IDA Pro.  Since none of the Windows API functions in this subroutine appear to be network related, they probably do not generate the UDP packets we are seeing in tcpdump.

So, we are going to focus on the calls to the sub_xxx subroutines.  These are the subroutines that are called in sub_402C4B:

  1. sub_402BE2 (at 0x40308E and 0x4030AA)
  2. sub_401000 (at 0x403188)
One of these subroutines probably contains the instructions that will eventually generate the UDP traffic that we are interested in.  We will set breakpoints at these three locations, then step over the functions and observe the tcpdump output to see which subroutine generates the traffic.  In the WinDbg command window, type the following to set the breakpoints.
  1. bp 0x40308E
  2. bp 0x4030AA
  3. bp 0x403188
Then type g to continue the program. The first breakpoint we should hit is at offset 0x40308e. This line in the WinDbg disassembly window is highlighted in purple, which means this is the next instruction that will be executed when we continue executing the program and that there is a breakpoint set on the line as well. Note that the instruction at 0x40308e is red, which means a breakpoint is set on this line as well.
WinDbg – Program Paused on Break Point

Next, we type p in the command window to step over the subroutine. When program execution pauses, the instruction at 0x403093 is highlighted in blue. This means that this is the next instruction that will be executed, but it’s not purple because no breakpoint is set on that line. If you look at the tcpdump output on the Ubuntu VM, there should be no “weird” UDP packets generated by the infected VM. Since we did not see any UDP packets, the call to this subroutine is not responsible for the UDP traffic.

WinDbg – BreakPoint After stepping over subroutine call

Next, after typing g into the command window, program execution paused at 0x4030aa.  Once again we can type p to step over the subroutine and check the tcpdump output for UDP packets. Once again, no UDP packets are displayed in the tcpdump output, so the call to this subroutine is not responsible for the UDP traffic. Next, we type g to continue executing the program. However, this time, we do not hit another breakpoint. Instead, the program exits.  If we look at the tcpdump output, we will see UDP traffic being generated by the infected VM.

Why Didn’t We Hit Another BreakPoint

Based on the static analysis, it looked like either sub_402BE2 or sub_401000 was responsible for generating the UDP traffic.  But, we found that even though sub_402BE2 was called twice, it was not responsible for generating the UDP traffic, and that sub_401000 was never called.  What happened?

There are a couple of reasons this may have happened. First, we may have hit a jump statement within sub_402C4B that took us someplace unexpected. Second, we may have hit a retn instruction and been returned to an unexpected place.  We will focus on finding a retn statement. First, restore the snapshot of the infected VM where the malware was paused at the entry point. Then, set a breakpoint at 0x4030AA, the last breakpoint that was hit in the previous section, and type g to run to it. Type p into the WinDbg command window to step over the subroutine, then type pt. This command will continue execution until the next retn instruction, and then pause the program.  The program should pause at 0x403180. Then type p to execute the retn instruction. The program will then pause execution at instruction 0x402296.

Remember when we were analyzing sub_406E5D, and execution was transferred to sub_402C4B with a jmp instead of a call instruction? The malware placed the address of sub_402296 on the stack so that when sub_402C4B completed execution, code execution would be transferred to sub_402296. This is an example of a technique used to hinder static analysis in IDA Pro. However, stepping through the program allowed us to determine where code execution resumes, and we now have a new subroutine to analyze.
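
The difference between the two ways of reaching a subroutine can be illustrated with a toy model of the stack. This is only a sketch of the idea, not a model of the actual Necurs code: the ToyCpu class is made up, the exact instructions the malware uses to place the address on the stack are assumed to behave like a push, and AFTER_CALL is a hypothetical placeholder for the address of the instruction following a normal call.

```python
AFTER_CALL = 0x1000  # hypothetical address of the instruction after a call

class ToyCpu:
    """Just enough of a CPU to show where retn sends execution."""

    def __init__(self):
        self.stack = []

    def push(self, value):
        self.stack.append(value)

    def call(self, target, return_address):
        # A call pushes the address of the next instruction, then jumps.
        self.stack.append(return_address)
        return target

    def jmp(self, target):
        # A jmp transfers control without touching the stack.
        return target

    def retn(self):
        # A retn pops whatever is on top of the stack and jumps there.
        return self.stack.pop()

def normal_call():
    cpu = ToyCpu()
    cpu.call(0x402C4B, return_address=AFTER_CALL)
    return cpu.retn()

def push_then_jmp():
    cpu = ToyCpu()
    cpu.push(0x402296)   # address chosen by the malware
    cpu.jmp(0x402C4B)
    return cpu.retn()

print(hex(normal_call()))    # 0x1000
print(hex(push_then_jmp()))  # 0x402296
```

A disassembler that follows call instructions has no reason to connect sub_402C4B with sub_402296, which is why the transfer only showed up while stepping through the code.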

Analyzing sub_402296

Once again, we will start by looking at IDA Pro, and identify the possible subroutines that we want to step over and monitor. The following subroutines are called within sub_402296:

  1. sub_40200A (0x4023A1)
  2. sub_402241 (0x402449)
  3. sub_40320F (0x4024A6)
  4. sub_402AE8 (0x40298D)
  5. sub_4031D4 (0x4029F3)
Once again, we will set breakpoints on each of these locations, step over the function, and then monitor the Ubuntu tcpdump for UDP packets. When we do this, we hit breakpoints at the following locations:
  1. 0x4023A1
  2. 0x402449
  3. 0x4024A6
However, once we continue execution, we do not hit any of the other breakpoints. Once again, the malware completes execution and starts generating UDP packets. We can try using pt in the WinDbg command window to see if the malware hits another retn instruction. But, when we do this, the malware does not break on a retn instruction. Instead, WinDbg shows that the malware is busy running. Something else weird is happening here.
WinDbg – *BUSY* – code execution is no longer paused

Once again, we will restore our snapshot at the program entry point. This time, we need to set a breakpoint at 0x4024A6, the last breakpoint that we successfully hit. Enter g in the WinDbg command window and the malware should pause at the breakpoint.

Next, we’ll take a closer look at the IDA Pro disassembly. We know that we’re not running into an unexpected retn instruction, so we’ll look at the jmp instructions instead. Most of the jmp instructions will jump to a known location in the malware sample. For example, the jmp instruction at 0x4025DB jumps to loc_4025EA. However, the instruction at 0x40290C is a little more interesting. It’s a location determined at runtime, as shown below.

Jump to location determined at run time

If we set a breakpoint on this location, and then type g in the command window, the malware will pause execution at this point. Next, type p to execute the jump instruction. The next instruction that will be executed is at 0x40614c. If you look in IDA Pro, this portion of the malware was not disassembled, as shown below.

Code that was not disassembled by IDA Pro

There’s something else that is odd about this code. If you switch to the hex view in IDA Pro, and jump to address 0x40614C (by selecting “Jump to address” from the “Jump” menu item), you will be able to view a hex dump of the code at that offset. If you look at the hex output in the WinDbg memory window, you will see that the hex outputs do not match.

Hex Output In IDA Pro for Offset 0x40614C
WinDbg Hex Output at Offset 0x40614C

Some of the previous subroutines that we stepped over probably unpacked this portion of the executable. Once unpacked, code execution was transferred to this portion of the malware. So, even if we tried to manually disassemble this portion of code with IDA Pro, since the values at this offset were changed, the disassembly would not match the code being executed in WinDbg.
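
Comparing hex dumps by eye works for a single offset, but a short script can find every region that was rewritten in memory. The sketch below compares two byte strings, for example the text section read from the file and the same range dumped from the debugger, and reports the address ranges that differ. It was not part of the original analysis, and the function name is made up.

```python
def differing_ranges(on_disk, in_memory, base=0):
    """Return a list of (first_address, last_address) ranges where the bytes differ.

    Only the overlapping part of the two inputs is compared. base is added to
    each index so that the results can be read as memory addresses.
    """
    ranges = []
    start = None
    length = min(len(on_disk), len(in_memory))
    for index in range(length):
        if on_disk[index] != in_memory[index]:
            if start is None:
                start = index
        elif start is not None:
            ranges.append((base + start, base + index - 1))
            start = None
    if start is not None:
        ranges.append((base + start, base + length - 1))
    return ranges

# Tiny made-up example: bytes 1-2 and byte 4 were changed.
example = differing_ranges(b"\x00\x01\x02\x03\x04", b"\x00\xff\xff\x03\xee", base=0x401000)
print([(hex(first), hex(last)) for first, last in example])
```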

Next Steps

In the next blog post, I’ll talk about some techniques for patching the malware sample and the data in IDA Pro. After doing so, we’ll be able to use IDA Pro to disassemble the unpacked code so we can continue with our static analysis.

 

 

March 2, 2016, Filed Under: Reverse Engineering

Reverse Engineering Necurs (Part 1 – Preliminaries)

A few weeks ago, a fellow analyst sent me a link to a write-up of a new peer-to-peer botnet called Necurs.  The write-up included a link to a SANS blog entry.  The blog entry included a pcap containing traffic captured from an infected host as well as a sample of the malware.  Since I’m pretty interested in malware that communicates via peer-to-peer mechanisms, I decided to take a look at this sample.  Hopefully, the notes I take during the analysis will be helpful to anybody else that is new to reverse engineering.

A little background about myself (as a reverse engineer): I took the SANS GREM course a number of years ago and learned quite a bit about reverse engineering.  But, even though I made it through the course, it was very difficult to get started analyzing malware samples because of the different types of obfuscation and evasion techniques that I encountered.  The SANS course mentioned a number of tools that could be used to make analysis easier, but I decided to start limiting myself to four tools: IDA Pro (a disassembler), WinDbg (a debugger), Process Explorer (a Windows Sysinternals tool for inspecting running processes on a Windows System), and tcpdump (for capturing/analyzing network traffic).

When I first started analyzing malware, I wanted to be able to analyze a sample even if I couldn’t use a tool that automated the unpacking of the malware.  So, instead of searching for automated unpackers, I spend a lot of time in WinDbg stepping through an executable.

I also make use of a pretty standard virtual machine setup to aid with the analysis.  I use two virtual machines with the following IP address configuration:

  1. WindowsXP With IP Address 192.168.1.2, subnet mask 255.255.255.0, default gateway 192.168.1.1, DNS server 192.168.1.1
  2. Ubuntu with IP address 192.168.1.1, subnet mask 255.255.255.0

The network card on each of the machines is attached to the “Host-only” network so that network traffic does not escape the virtual environment that has been created.  The Ubuntu VM will run tcpdump and capture traffic sent by the Windows XP VM after it has been infected with the malware.  The virtual environment is very important because you can create snapshots of the Windows XP virtual machine at various stages of the infection.  So, if you accidentally step too far in the debugger, you can return to a previous state in very little time.

A final note.  Many malware samples employ defenses to evade analysis.  Some will not run when executed in a debugger.  Some will not run while executing in a virtual environment. So, there are a couple of preliminary checks that I make to verify that the malware will run in the virtual environment before I start the lengthy task of stepping through the executable.   First, we’d like to verify that the malware will run in a virtual machine, then we would like to verify that the malware will run when executed in a debugger.  If we don’t do this, we may spend a lot of time in the debugger stepping through instructions that are used by the malware to evade analysis.  The steps that follow will not guarantee that you will not run into some other type of evasion technique, but they will hopefully verify that you can start stepping through the executable in a virtual environment.

First Run: execute the malware within VM

First, I need to be able to observe something that tells me that I have successfully infected the Windows XP VM.  Since this malware sample will try to send network traffic to other infected hosts within the botnet, we can use tcpdump to monitor the network traffic generated by the Windows XP VM.  I run tcpdump twice on the Ubuntu VM.  I use one tcpdump instance to write to a file for later analysis, and one tcpdump instance that outputs to the console.   Note that the Windows VM does generate some NetBIOS traffic that is unrelated to the malware.  You should see this traffic before executing the malware on the Windows XP VM.  The tcpdump commands used on the Ubuntu VM were as follows:

  1. tcpdump -nnSX -i eth0
  2. tcpdump -nS -i eth0 -w necurs.pcap

When I executed the Necurs sample on the Windows XP VM, the tcpdump instance displayed UDP traffic with a source port of 14820 with a payload size of 29 bytes.  This looks kind of strange, so hopefully we have verified that the malware will run within a virtual environment.
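
The capture file written by the second tcpdump command can also be searched offline. The sketch below is one way to pull UDP source ports and payload sizes out of a classic pcap file using only the Python standard library. It is an illustration rather than a tool used in this analysis, and it assumes an Ethernet capture of IPv4 traffic written in the classic pcap format (not pcapng).

```python
import struct

def udp_packets(pcap_bytes):
    """Yield (source_port, destination_port, payload_size) for each UDP packet."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"
    else:
        raise ValueError("unsupported capture format")

    offset = 24  # skip the global pcap header
    while offset + 16 <= len(pcap_bytes):
        _, _, captured, _ = struct.unpack_from(endian + "IIII", pcap_bytes, offset)
        frame = pcap_bytes[offset + 16:offset + 16 + captured]
        offset += 16 + captured

        # Ethernet header is 14 bytes; EtherType 0x0800 means IPv4.
        if len(frame) < 34 or frame[12:14] != b"\x08\x00":
            continue
        ip_header_length = (frame[14] & 0x0F) * 4
        if frame[14 + 9] != 17:  # IP protocol 17 is UDP
            continue
        udp = 14 + ip_header_length
        if len(frame) < udp + 8:
            continue
        source_port, destination_port, length = struct.unpack_from("!HHH", frame, udp)
        yield source_port, destination_port, length - 8  # UDP header is 8 bytes
```

For example, reading necurs.pcap in binary mode and keeping only the tuples whose payload size is 29 should list the packets described above.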

Second Run: Execute From Within WinDbg

Next, I executed the sample from within WinDbg while monitoring network traffic on the Ubuntu VM using tcpdump.  The following steps were taken after starting WinDbg:

  1. File -> Open Executable …
  2. Selected the Necurs sample in the file dialog
  3. Type g in the command window

Once again, UDP packets from an odd port on the infected Windows VM were displayed in the tcpdump output.  This shows that the malware will run even if it is being executed in a debugger. However, this time, the source port of the packets was 5255.

Output of tcpdump on Ubuntu VM

This suggests that the malware chooses a random port upon infection as opposed to having a port hard-coded into the binary. Another odd thing happened. The UDP packets were being sent even after the malware had finished executing in the WinDbg command window.  This indicates that another process was spawned and that the newly spawned process was responsible for generating the network traffic.  I used Process Explorer to display information about the processes that were running in the infected VM. A process called syshost.exe was running with a PID of 576. The process properties showed that this process was listening on port 5255 for both TCP and UDP traffic, so this is probably the process that the malware spawned.

Process Explorer and syshost.exe properties

Third Step: Break At Program Entry Point

Now that we have verified that the malware sample will run in a debugger in a virtual machine, we need to start stepping through the instructions in the executable.  This is a little bit difficult when using WinDbg because we’ll need to set a breakpoint on the entry point of the executable. The following steps can be used to accomplish this:

  1. Open the malware in WinDbg as described above.
  2. Type the command lmf into the command window.  This command will display modules loaded by WinDbg.  Ordinarily, I would hope to see a module with a name that matches the malware executable, but for some reason, this name was not displayed.  Instead, a module named xl4n6aq.exe with a base address of 0x400000 was included.
Module with weird name displayed in WinDbg command window

  3. Next, I used the command !dh 0x400000 to determine the entry point of the program. The entry point is shown at offset 0x60D4, so the breakpoint should be set at 0x4060D4 (base address + entry point offset).
Determining the entry point of the malware sample
  4. We can now set a breakpoint using the command bp 0x4060D4.
  5. We can type g into the console, and the malware will execute until we reach the breakpoint.

At this point, I took a snapshot of the VM so I can easily get back to this point.  I also opened an IDA Pro session so I could compare the disassembly produced by IDA Pro with the instructions shown in a disassembly window in WinDbg.  The first two disassembly lines in WinDbg matched the disassembly shown in IDA Pro, so we’ve reached a good starting point for debugging the malware.

UT Austin ISO Blog

Find out more about the UT Austin Information Security Office at: security.utexas.edu

Facebook: facebook.com/utaustiniso

Twitter: @UT_ISO


    © The University of Texas at Austin 2022
