Malware Creation

Encoder and Decoder

Introduction:

Encoding shellcode as words or other non-executable formats can be a useful technique for bypassing certain types of endpoint detection and response (EDR) systems.

EDR systems typically rely on identifying known malicious code patterns or behavior to detect and prevent malware. By encoding shellcode into a non-executable format, the code's patterns and behavior can be obfuscated, making it more difficult for EDR systems to detect.

In this case, encoding shellcode as words can help disguise the code's purpose and make it more difficult to detect. If the encoded data appears innocuous, for example because it looks like ordinary language, it may not raise any red flags with EDR systems that scan for more obvious indicators of malicious activity.

However, encoding shellcode is only one technique that attackers may use to evade detection, and it is not foolproof. EDR systems take a multilayered approach and do not rely solely on static analysis of the file, so encoding alone will not get around EDR completely.

My Code:

The Encoder

To start with, we need some way of taking the raw shellcode bytes and turning them into something else. This is done before the binary is compiled, so it can be done in whatever language I want. I chose Python to write the encoder. I had never actually used Python before, but it was simple to pick up, as I have mainly programmed in JavaScript.

The idea behind my encoder was to store each hex digit of the shellcode as a word, such as:

0x10 -> one zero
0xf6 -> fruit six

Hex output comes in several different formats that I need to convert, so I gave the script a couple of different ways of reading the file. These are:

The script takes 2 arguments:

Python Code Overview:

The code parses the file in the format selected earlier. The result is a list of strings that represent the hex values of the entire shellcode.

The list is then passed to the encrypt function. This function loops over the strings, separates the first and second character of each, and looks each character up in a hashmap. It then adds the corresponding word to a list, and can replace the most common value in the shellcode, zero, with other words if needed.

After doing this, the script prints out the encoded shellcode and the size of the shellcode (which is sometimes needed).

The Decoder:

This C++ code decodes a sequence of strings using a pre-defined map of words to hexadecimal digits, and then converts the resulting decoded strings to an array of unsigned 8-bit integers (uint8_t).

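#include <map>
#include <string>
#include <vector>

// Decodes one pair of words (e.g. "one zero") into two hex digits ("10").
std::string decode_word(const std::string &encoded)
{
    // Maps each digit word to its hex digit character.
    static const std::map<std::string, char> hex_to_word = {
        {"zero", '0'}, {"one", '1'}, {"two", '2'}, {"three", '3'}, {"four", '4'}, {"five", '5'}, {"six", '6'}, {"seven", '7'}, {"eight", '8'}, {"nine", '9'}};

    // Split the pair at the first space.
    std::vector<std::string> words;
    size_t pos = encoded.find(' ');
    words.push_back(encoded.substr(0, pos));
    if (pos != std::string::npos)
    {
        // Only add a second word when a space was actually found.
        words.push_back(encoded.substr(pos + 1));
    }

    std::string decoded;
    for (const auto &word : words)
    {
        if (hex_to_word.count(word))
        {
            decoded += hex_to_word.at(word);
        }
        else if (!word.empty())
        {
            // Words outside the map stand for a-f: keep their first letter.
            decoded += word[0];
        }
    }
    return decoded;
}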

The decode_word function takes a string encoded using the mapping of words to hexadecimal digits in the hex_to_word map, and returns the decoded string. If a word in the encoded string is not found in the map, the function simply takes the first character of that word as the decoded character.

The decode_words function takes a vector of encoded strings, calls decode_word on each of them, and returns a vector of the resulting decoded strings.
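
The decode_words function is not listed here, so the following is only a minimal sketch of how it might look, assuming it does nothing more than apply decode_word to each element in order. The compact decode_word in the block is a stand-in for the function above, included so that the sketch compiles on its own; the parameter name encoded_words is my own choice.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Stand-in for decode_word above: turns a pair of words such as
// "one zero" into the two hex digits "10".
std::string decode_word(const std::string &encoded)
{
    static const std::map<std::string, char> word_to_hex = {
        {"zero", '0'}, {"one", '1'}, {"two", '2'}, {"three", '3'}, {"four", '4'},
        {"five", '5'}, {"six", '6'}, {"seven", '7'}, {"eight", '8'}, {"nine", '9'}};

    // Split the pair at the first space (if there is one).
    std::vector<std::string> words;
    const std::size_t pos = encoded.find(' ');
    words.push_back(encoded.substr(0, pos));
    if (pos != std::string::npos)
    {
        words.push_back(encoded.substr(pos + 1));
    }

    std::string decoded;
    for (const auto &word : words)
    {
        const auto it = word_to_hex.find(word);
        if (it != word_to_hex.end())
        {
            decoded += it->second; // digit word -> '0'..'9'
        }
        else if (!word.empty())
        {
            decoded += word[0]; // any other word -> its first letter
        }
    }
    return decoded;
}

// Assumed shape of decode_words: decode every element, preserving order.
std::vector<std::string> decode_words(const std::vector<std::string> &encoded_words)
{
    std::vector<std::string> decoded;
    decoded.reserve(encoded_words.size());
    for (const auto &encoded : encoded_words)
    {
        decoded.push_back(decode_word(encoded));
    }
    return decoded;
}
```

With the two examples from the encoder section, passing a vector holding "one zero" and "fruit six" gives back a vector holding "10" and "f6".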

At this stage we have a decoded payload. However, instead of a contiguous array of hex values, we have a vector of strings, so we need a way to turn these strings into raw values.

In this implementation I used std::uint8_t to store the values.

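#include <cstdint>
#include <stdexcept>
#include <string>

// Converts a hex string of up to two digits (e.g. "1f") to its byte value.
uint8_t hexToUInt8(const std::string &hexValue)
{
    uint8_t result = 0;
    for (const auto &c : hexValue)
    {
        result <<= 4; // make room for the next 4-bit digit
        if (c >= '0' && c <= '9')
        {
            result |= (c - '0');
        }
        else if (c >= 'a' && c <= 'f')
        {
            result |= (c - 'a' + 10);
        }
        else if (c >= 'A' && c <= 'F')
        {
            result |= (c - 'A' + 10);
        }
        else
        {
            // Build a std::string first: adding a char to a string literal
            // does pointer arithmetic rather than concatenation.
            throw std::invalid_argument(std::string("Invalid hex character: ") + c);
        }
    }
    return result;
}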

The hexToUInt8 function takes a string of hexadecimal digits, such as "1f", and returns the corresponding unsigned 8-bit integer value. It throws std::invalid_argument if the string contains a character that is not a hex digit.

The convertStringsToUint8 function takes a vector of strings and a pointer to an array of the same size as the vector, and turns each string (e.g. "1f") into its uint8_t value.
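
convertStringsToUint8 is not listed either, so here is a rough sketch of one way it could be written. The signature (a const reference to the vector plus a raw output pointer) is an assumption based on the description, and the caller is assumed to supply an array with at least as many elements as the vector. The short hexToUInt8 in the block is a stand-in for the function above so that the sketch compiles on its own.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for hexToUInt8 above: parses hex digits into a single byte.
std::uint8_t hexToUInt8(const std::string &hexValue)
{
    std::uint8_t result = 0;
    for (const char c : hexValue)
    {
        int digit = 0;
        if (c >= '0' && c <= '9')
        {
            digit = c - '0';
        }
        else if (c >= 'a' && c <= 'f')
        {
            digit = c - 'a' + 10;
        }
        else if (c >= 'A' && c <= 'F')
        {
            digit = c - 'A' + 10;
        }
        else
        {
            throw std::invalid_argument(std::string("Invalid hex character: ") + c);
        }
        // Shift the previous digit up by 4 bits and add the new one.
        result = static_cast<std::uint8_t>((result << 4) | digit);
    }
    return result;
}

// Assumed shape of convertStringsToUint8: writes one byte per input string.
// 'out' must point to at least hexStrings.size() elements.
void convertStringsToUint8(const std::vector<std::string> &hexStrings, std::uint8_t *out)
{
    for (std::size_t i = 0; i < hexStrings.size(); ++i)
    {
        out[i] = hexToUInt8(hexStrings[i]);
    }
}
```

For example, a vector holding "1f", "00" and "A0" fills a three-element array with 0x1f, 0x00 and 0xA0. A std::vector<std::uint8_t> sized from the input would avoid the raw pointer, but the pointer form is kept here to match the description.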

The plan is then to implement this into the process injection code as the way of storing the payload.