Reverse Engineering with Tycho

2019-06-04

by Sebastian Manns

Reverse engineering software is not an easy task, especially if you are doing it for the first time.

Hi, my name is Sebastian Manns. I study "general and digital forensics". I have been a trainee at Cyberus Technology for one month, and my job is software and malware analysis with Tycho.

In my first blog entry I will show you how easy it is to evaluate and manipulate system calls with Tycho using Pafish as an example.

Motivation

To gain some first experience in software analysis with Tycho, I was looking for a simple project that would still be useful.

I chose Pafish because it is a small, malware-like tool that performs various tests to detect whether it is running in a virtual environment. When malware detects virtualization, it may hide its real behavior and act inconspicuously instead. That's why it's important that analysis tools like Tycho are as invisible as possible.

By comparing the two screenshots below, you can see the difference between Pafish's output on a normal Oracle VirtualBox VM and on my Tycho setup. Currently, Tycho is not completely invisible to Pafish:

pafish output on virtualbox, 25 tests failed

pafish output on Tycho machine, only 4 tests failed

What about the tests that failed? Four tests still detect the Tycho VM:

  1. "Checking the difference between CPU timestamp counters (rdtsc) forcing VM exit"
  2. "Using mouse activity"
  3. "Checking if disk size <= 60GB via DeviceIoControl()"
  4. "Checking if disk size <= 60GB via GetDiskFreeSpaceExA()"

Trace number 1 comes from Pafish taking the difference between two CPU timestamp counter values. Between reading those two values, it forces a VM exit in the VM it is potentially running in. Faking timestamp counters is possible, but that is beyond the scope of this article.

Trace number 2 comes from the fact that the mouse remains untouched on the analysis system.

I found traces number 3 and 4 most interesting, because they use two specific system calls to test whether the hard disk is unusually small. My analysis hardware really does have such a small disk, so I would need to patch this anyway unless I got a larger one.

The idea is now to use Tycho to fake the disk size and make Pafish believe that the disk is much larger. The task is manageable and a good way to get familiar with different functions of Tycho. In the end, the Tycho setup passes two more tests, and we get a nice example script for Tycho as a bonus.

Preparation

What exactly are system calls and why are they important for software analysis?

System calls are used to call a kernel service from user land. They switch the CPU from user mode to kernel mode, with the associated privileges. Which system calls are available depends on your operating system.

Tycho gives us full control over the communication from a program to the operating system. We can stop the program at any system call, evaluate and manipulate it and let the program run again. And all this without even looking at a traditional debugger and dealing with assembler code.

What is required:

For this task we only need a typical Tycho setup and Pafish.

First test: check disk size

This test checks if the hard disk is less than 60 GiB, because a normal computer rarely has such a small hard disk. To do this, Pafish uses the Windows function DeviceIoControl. To be able to work with Tycho we have to go deeper and use the system call that is called by the DeviceIoControl function, which is NtDeviceIoControlFile.

Now let's have a look at the system call. All we need to do is to pause Pafish when this system call occurs. How to pause a program with breakpoints on a system call is also described in the previous article Windows system call parameter analysis.

So let's start by locating the system call.

Python Example

First we import the modules we need for the breakpoints and the evaluation of the system calls. Then we connect to the Tycho server, and with service.open_process("pafish.exe") we open a handle to the process. The pafish.pause() call makes sure that the process is stopped as soon as it is started. After that, we wait until pafish.exe has been started.

import struct
import time

from binascii import hexlify, unhexlify

import pyTycho
from pyTycho.syscall_interpreter import SystemCallInterpretationFactory

# Connect to the Tycho server and open a handle to the pafish.exe process.
service = pyTycho.Tycho()
pafish = service.open_process("pafish.exe")

# Request a pause so that the process is stopped as soon as it starts.
pafish.pause()

# Poll once per second until pafish.exe has been started.
while not pafish.is_running():
    time.sleep(1)
    print("pafish.exe is currently not running")

The next piece of code creates a system call breakpoint on our pafish process handle, restricts it to NtDeviceIoControlFile, and activates it.

breakpoint = pafish.get_syscall_breakpoint()
breakpoint.add_syscall_whitelist(pyTycho.Tycho.syscalls.NtDeviceIoControlFile)
breakpoint.enable()

In this snippet of code we create a factory object which retrieves the input and output parameters of a system call. And finally, we're waiting for Pafish to use it.

factory = SystemCallInterpretationFactory(pafish)
pafish.resume()
event = pafish.wait_for_breakpoint()
if event.event_category == event.SYSCALL_BP:
    handle_syscall_event(factory, pafish, event)

If Pafish then uses the system call, the script returns from the wait_for_breakpoint function and gets an event object with information about the system call. The handle_syscall_event function is invoked when a system call breakpoint is hit, records the name and input/output parameters for a given system call, and then prints all this information for further inspection.

    {'after': {'IoStatusBlock': {'type': 'IO_STATUS_BLOCK*', 'value': 0x8EC60},
               'OutputBuffer': {'type': 'VOID*', 'value': 0x28F930}},
     'before': {'ApcContext': {'type': 'VOID*', 'value': 0},
                'ApcRoutine': {'type': 'IO_APC_ROUTINE*', 'value': 0},
                'Event': {'type': 'HANDLE', 'value': 0},
                'FileHandle': {'type': 'HANDLE', 'value': 0xD0},
                'InputBuffer': {'type': 'VOID*', 'value': 0},
                'InputBufferLength': {'type': 'ULONG', 'value': 0},
                'IoControlCode': {'type': 'ULONG', 'value': 0x7405C},
                'OutputBufferLength': {'type': 'ULONG', 'value': 0x8}},
     'name': 'NtDeviceIoControlFile',
     'return_value': 0L}
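
For the next steps, only two values from this record matter: the address of the output buffer and its length. The following sketch shows how they could be picked out, assuming the interpretation result is available as a plain Python dictionary shaped like the printed output. Whether pyTycho hands it back in exactly this form is an assumption; the names syscall_info and param are made up for illustration.

```python
# Subset of the record printed above, written as a plain dict.
# Assumption: the interpretation result has this shape.
syscall_info = {
    "name": "NtDeviceIoControlFile",
    "before": {
        "IoControlCode": {"type": "ULONG", "value": 0x7405C},
        "OutputBufferLength": {"type": "ULONG", "value": 0x8},
    },
    "after": {
        "OutputBuffer": {"type": "VOID*", "value": 0x28F930},
    },
}


def param(info, phase, name):
    """Return the raw value of one parameter ('before' or 'after' the call)."""
    return info[phase][name]["value"]


buffer_address = param(syscall_info, "after", "OutputBuffer")
buffer_length = param(syscall_info, "before", "OutputBufferLength")
print("output buffer: %d bytes at %#x" % (buffer_length, buffer_address))
```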

Syscall interpretation

With the new information we acquired about the system call, the next task is to find out where the size of the hard disk is, and how we can change it.

Looking at NtDeviceIoControlFile in the MSDN docs again, we find the following information:

  • OutputBuffer: A pointer to a buffer that is to receive the device-dependent return information from the target device.

  • OutputBufferLength: Length of the OutputBuffer in bytes.

  • IoControlCode: Code that indicates which device I/O control function is to be executed.

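The IoControlCode 0x7405C is IOCTL_DISK_GET_LENGTH_INFO. For this control code, OutputBuffer points to a GET_LENGTH_INFORMATION structure, which is 8 bytes in size and consists of a single member, Length, that holds the disk size in bytes. MSDN describes the structure as follows:

Contains disk, volume, or partition length information used by the IOCTL_DISK_GET_LENGTH_INFO control code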
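
The IoControlCode value 0x7405C from the recorded call can be checked against this description by splitting it into the fields of the CTL_CODE macro from the Windows headers. The sketch below does that in plain Python; the helper name decode_ioctl is made up for illustration. The result is device type 0x7 (FILE_DEVICE_DISK), function 0x17, buffered method and read access, which matches the definition of IOCTL_DISK_GET_LENGTH_INFO.

```python
# Split a Windows I/O control code into its CTL_CODE fields.
# Bit layout: DeviceType in bits 16-31, Access in bits 14-15,
# Function in bits 2-13, Method in bits 0-1.

def decode_ioctl(code):
    return {
        "device_type": (code >> 16) & 0xFFFF,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,
    }


fields = decode_ioctl(0x7405C)
for name in ("device_type", "access", "function", "method"):
    print("%-12s %#x" % (name, fields[name]))
```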

Return value patching

Now that we have the right system call and know what it does, it is easy to manipulate it with Tycho. First, we read the 8 bytes that the output buffer from above points to:

hex(struct.unpack("<Q",pafish.read_linear(0x28F930, 8))[0])
    '0xEE8156000'

If you divide 0xEE8156000 bytes by 0x400³, you get about 59.6 GiB, which is the size of the hard disk.

With just one more command, Tycho lets us change the size to 500 GiB (0x7D00000000 bytes).
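
The conversion can be reproduced in a few lines of Python. This is only a worked version of the arithmetic above; the variable names are chosen for illustration, and the 60 GiB threshold is taken from the test description.

```python
GIB = 0x400 ** 3  # 1 GiB = 1024 * 1024 * 1024 bytes

disk_size_bytes = 0xEE8156000          # value read from the output buffer
disk_size_gib = disk_size_bytes / float(GIB)
print("%.2f GiB" % disk_size_gib)

# The test flags disks of 60 GiB or less, so this disk is detected.
flagged_as_small = disk_size_gib <= 60
print(flagged_as_small)
```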

pafish.write_linear(0x28F930, struct.pack("<Q", 0x7D00000000))

struct is a Python module that converts between Python values and binary data, with the binary layout specified by a format string. Here "<Q" means a little-endian unsigned 64-bit integer. With struct.pack() we convert the new size value into binary data, and with pafish.write_linear() we write that data into the process memory.
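
To see what actually ends up in memory, the packing step can be tried without Tycho. The sketch below packs the 500 GiB value and unpacks it again; it only uses the standard library.

```python
import struct
from binascii import hexlify

GIB = 0x400 ** 3
new_size = 500 * GIB                   # 0x7D00000000 bytes

# "<Q": little-endian, unsigned 64-bit integer (8 bytes)
packed = struct.pack("<Q", new_size)
print(hexlify(packed))                 # least significant byte comes first

# Unpacking the same bytes yields the original number again.
(unpacked,) = struct.unpack("<Q", packed)
print(hex(unpacked))
```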

Second test: check free partition space

The second test is similar to the first. This time Pafish checks how much space is available on the C: partition. If less than 60 GiB are free, the test fails. To check this, Pafish uses the function GetDiskFreeSpaceExA. This function calls the system call NtQueryVolumeInformationFile, which we can analyze with Tycho.

The approach is the same as in the first test: we look for a different system call, and only the interpretation of its output differs slightly.

We add NtQueryVolumeInformationFile to the whitelist and the system call interpretation retrieves the following information after the breakpoint was hit:

    {'after': {'FileSystemInformation': {'type': 'VOID*', 'value': 0x28F8A8},
               'IoStatusBlock': {'type': 'IO_STATUS_BLOCK*', 'value': 0x8E328}},
     'before': {'FileHandle': {'type': 'HANDLE', 'value': 0xD0},
                'FileSystemInformationClass': {'type': 'FS_INFORMATION_CLASS',
                                               'value': (0x3,
                                                         'FileFsSizeInformation')},
                'Length': {'type': 'ULONG', 'value': 0x18}},
     'name': 'NtQueryVolumeInformationFile',
     'return_value': 0L}

Through Tycho, we know that the system call parameter FileSystemInformationClass has the value 3, which corresponds to FileFsSizeInformation. That means FileSystemInformation points to a FILE_FS_SIZE_INFORMATION structure, which contains the information we are interested in. We can find this 0x18 byte large structure at address 0x28F8A8.

With Tycho, we can easily read the raw bytes from this object.

hexlify(pafish.read_linear(0x28F8A8, 0x18))
    'ff2f75000000000055421900000000000800000000020000'

But the 0x18 bytes of raw output are not easy to read yet. The FILE_FS_SIZE_INFORMATION structure is laid out like this:

         8 Byte          8 Byte           4 Byte   4 Byte
        +----------------+----------------+--------+--------+
        |total           |free            |sector/ |byte/   |
        |cluster:        |cluster:        |cluster:|sector: |
        |0x752FFF        |0x194255        |0x8     |0x200   |
        +----------------+----------------+--------+--------+
 0x28F8A8                                                   0x28F8C0

Let's read the individual parts of it step by step with struct.unpack().

hex(struct.unpack("<Q",pafish.read_linear(0x28F8A8, 8))[0])
    '0x752FFF'

hex(struct.unpack("<Q",pafish.read_linear(0x28F8B0, 8))[0])
    '0x194255'

hex(struct.unpack("<I",pafish.read_linear(0x28F8B8, 4))[0])
    '0x8'

hex(struct.unpack("<I",pafish.read_linear(0x28F8BC, 4))[0])
    '0x200'
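
The same four fields can also be decoded in one step. The sketch below works on the raw hex dump shown earlier instead of live process memory, so it runs without Tycho; the variable names are chosen for illustration. The format string "<QQII" describes two little-endian unsigned 64-bit values followed by two unsigned 32-bit values, 0x18 bytes in total.

```python
import struct
from binascii import unhexlify

# Raw bytes of the structure, as returned by hexlify(pafish.read_linear(...)).
raw = unhexlify("ff2f75000000000055421900000000000800000000020000")

# Two 8-byte cluster counts, then two 4-byte values, all little-endian.
(total_clusters,
 free_clusters,
 sectors_per_cluster,
 bytes_per_sector) = struct.unpack("<QQII", raw)

print("total clusters:      %#x" % total_clusters)
print("free clusters:       %#x" % free_clusters)
print("sectors per cluster: %#x" % sectors_per_cluster)
print("bytes per sector:    %#x" % bytes_per_sector)
```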

We can calculate the free space with the following formula:

free space = (free clusters * sectors per cluster * bytes per sector) / 0x400³

The result is 6.3148 GiB, which is exactly the free space on the C: partition.
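
As a quick check, the formula can be evaluated with the values read from the structure. This is just the arithmetic from above written out in Python.

```python
GIB = 0x400 ** 3  # bytes per GiB

free_clusters = 0x194255
sectors_per_cluster = 0x8
bytes_per_sector = 0x200

free_bytes = free_clusters * sectors_per_cluster * bytes_per_sector
free_gib = free_bytes / float(GIB)
print("%.4f GiB free" % free_gib)
```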

Here, too, it is easy to change the return value with Tycho. As an example, we want to manipulate the system call so that the partition appears to be 300 GiB in size with 150 GiB of free space. For this we have to change the total clusters and the free clusters, so we need to know how many clusters correspond to these sizes. We use the following calculation:

new clusters = (new size in GiB * 0x400³) / (bytes per sector * sectors per cluster)

As a result we get 0x4B00000 total clusters and 0x2580000 free clusters.

pafish.write_linear(0x28F8A8, struct.pack("<Q", 0x4B00000))
pafish.write_linear(0x28F8B0, struct.pack("<Q", 0x2580000))

With this change, the partition appears to be 300 GiB in size with 150 GiB of free space, and Pafish reports success.
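
The two cluster counts written above can be rederived with a small helper. This is a sketch of the calculation only; the function name clusters_for is made up for illustration, and the sector and cluster sizes are the ones reported by the system call.

```python
GIB = 0x400 ** 3
BYTES_PER_SECTOR = 0x200
SECTORS_PER_CLUSTER = 0x8


def clusters_for(size_gib):
    """Number of clusters that make up a volume of size_gib GiB."""
    cluster_size = BYTES_PER_SECTOR * SECTORS_PER_CLUSTER  # 4096 bytes
    return (size_gib * GIB) // cluster_size


total_clusters = clusters_for(300)
free_clusters = clusters_for(150)
print("total: %#x, free: %#x" % (total_clusters, free_clusters))
```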

Summary

Tycho is a cool tool for software analysis. In this blog entry I showed you how easy it is to find, stop, analyze and manipulate system calls with Tycho.

Because you can easily get complete information from a system call before and after execution, you can quickly find out a lot about the software. And all this without using a classic debugger.

With Tycho's Python API it is also easy to script analysis tasks and to integrate them into other tools. Furthermore, Tycho exposes far fewer virtualization artifacts than other off-the-shelf virtualization software, which is a big advantage in malware analysis.

