Improving code performance by loading binary data instead of text and converting

Hi, I am working with existing C++ code. I normally use VB.NET, and much of what I am seeing is confusing and contradictory to me.

The existing code loads neural network weights from a file that is encoded as follows:

2
model.0.conv.conv.weight 5 3e17c000 3e9be000 3e844000 bc2f8000 3d676000
model.0.conv.bn.weight 7  4006a000 3f664000 3fc98000 3fa6a000 3ff2e000 3f5dc000 3fc94000

The first line gives the number of subsequent lines. Each of these lines has a description, a number representing how many values follow, then the weight values in hex. In the real file there are hundreds of rows and each row might have hundreds of thousands of weights. The weight file is 400MB in size. The values are converted to floats for use in the NN.

It takes over 3 minutes to decode this file. I am hoping to improve performance by eliminating the conversion from hex encoding to binary and storing the values natively as floats instead. The problem is that I can't understand what the code is doing, nor how I should be storing the values in binary. The relevant section that decodes the rows is here:

while (count--)
    {
        Weights wt{ DataType::kFLOAT, nullptr, 0 };
        uint32_t size;

        // Read name and type of blob
        std::string name;
        input >> name >> std::dec >> size;
        wt.type = DataType::kFLOAT;

        // Load blob
        uint32_t* val = reinterpret_cast<uint32_t*>(malloc(sizeof(val) * size));
        for (uint32_t x = 0, y = size; x < y; ++x)
        {
            input >> std::hex >> val[x];
        }
        wt.values = val;

        wt.count = size;
        weightMap[name] = wt;
    }

The Weights class is described here. DataType::kFLOAT is a 32bit float.

I was hoping to add a line or two in the inner loop, below input >> std::hex >> val[x];, so that I could write the float values to a binary file as they are converted from hex, but I don't understand what is going on. It looks like memory is being allocated to hold the values, but sizeof(val) is 8 bytes and a uint32_t is 4 bytes. Furthermore, it looks like the values are being stored in wt.values from val, but val contains integers, not floats. I really don't see what the intent is here.

Could I please get some advice on how to store and load binary values to eliminate the hex conversion? Any advice would be much appreciated.

Answer

Here’s an example program that converts the text format shown into a binary format and back again. I ran the data from the question through it in both directions successfully. My feeling is that it’s better to cook the data with a separate program before the actual application consumes it, so the app’s reading code stays single purpose.

There’s also an example of how to read the binary file into the Weights class at the end. I don’t use TensorRT so I copied the two classes used from the documentation so the example compiles. Make sure you don’t add those to your actual code.

If you have any questions let me know. Hope this helps and makes loading faster.
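
On the confusion about integers versus floats: the loader in the question suggests that each hex token is the raw 32-bit pattern of an IEEE-754 float, so reading it into a uint32_t and handing the buffer over as kFLOAT needs no numeric conversion at all. The sketch below illustrates that under this assumption; bits_to_float, float_to_bits and parse_hex are names I made up for the illustration and are not part of the original code or of TensorRT. As a side note, sizeof(val) in the question is the size of the pointer, so on a 64-bit build the malloc just reserves twice the memory needed; sizeof(*val) was presumably intended.

```cpp
#include <cstdint>
#include <cstring>
#include <sstream>
#include <string>

// Reinterpret the 32 bits of an integer as a single-precision float.
// memcpy copies the bytes unchanged and avoids the undefined behavior
// of casting between unrelated pointer types.
float bits_to_float(uint32_t bits)
{
    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

// The inverse: recover the raw bit pattern of a float.
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

// Parse one hex token the same way the loader in the question does.
uint32_t parse_hex(const std::string &token)
{
    std::istringstream in(token);
    uint32_t bits = 0;
    in >> std::hex >> bits;
    return bits;
}
```

For example, bits_to_float(parse_hex("3e17c000")) gives approximately 0.148, and float_to_bits returns the same 32 bits again. This is why the binary file can store the uint32_t values as they are: the bytes on disk are already the floats.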

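#include <cstdint>   // uint32_t, int32_t, int64_t
#include <cstdlib>   // malloc, free, EXIT_SUCCESS, EXIT_FAILURE
#include <cstring>   // memcpy
#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>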

void usage()
{
    std::cerr << "Usage: convert <operation> <input file> <output file>\n";
    std::cerr << "\tconvert b in.txt out.bin - Convert text to binary\n";
    std::cerr << "\tconvert t in.bin out.txt - Convert binary to text\n";
}

bool text_to_binary(const char *infilename, const char *outfilename)
{
    std::ifstream in(infilename);
    if (!in)
    {
        std::cerr << "Error: Could not open input file '" << infilename << "'\n";
        return false;
    }

    std::ofstream out(outfilename, std::ios::binary);
    if (!out)
    {
        std::cerr << "Error: Could not open output file '" << outfilename << "'\n";
        return false;
    }

    uint32_t line_count;
    if (!(in >> line_count))
    {
        return false;
    }
    if (!out.write(reinterpret_cast<const char *>(&line_count), sizeof(line_count)))
    {
        return false;
    }
    for (uint32_t l = 0; l < line_count; ++l)
    {
        std::string name;
        uint32_t num_values;
        if (!(in >> name >> std::dec >> num_values))
        {
            return false;
        }

        std::vector<uint32_t> values(num_values);
        for (uint32_t i = 0; i < num_values; ++i)
        {
            if (!(in >> std::hex >> values[i]))
            {
                return false;
            }
        }

        uint32_t name_size = static_cast<uint32_t>(name.size());
        bool result = out.write(reinterpret_cast<const char *>(&name_size), sizeof(name_size)) &&
            out.write(name.data(), name.size()) &&
            out.write(reinterpret_cast<const char *>(&num_values), sizeof(num_values)) &&
            out.write(reinterpret_cast<const char *>(values.data()), values.size() * sizeof(values[0]));
        if (!result)
        {
            return false;
        }
    }
    return true;
}

bool binary_to_text(const char *infilename, const char *outfilename)
{
    std::ifstream in(infilename, std::ios::binary);
    if (!in)
    {
        std::cerr << "Error: Could not open input file '" << infilename << "'\n";
        return false;
    }

    std::ofstream out(outfilename);
    if (!out)
    {
        std::cerr << "Error: Could not open output file '" << outfilename << "'\n";
        return false;
    }

    uint32_t line_count;
    if (!in.read(reinterpret_cast<char *>(&line_count), sizeof(line_count)))
    {
        return false;
    }
    if (!(out << line_count << "\n"))
    {
        return false;
    }
    for (uint32_t l = 0; l < line_count; ++l)
    {
        uint32_t name_size;
        if (!in.read(reinterpret_cast<char *>(&name_size), sizeof(name_size)))
        {
            return false;
        }
        std::string name(name_size, 0);
        if (!in.read(name.data(), name_size))
        {
            return false;
        }

        uint32_t num_values;
        if (!in.read(reinterpret_cast<char *>(&num_values), sizeof(num_values)))
        {
            return false;
        }

        std::vector<float> values(num_values);
        if (!in.read(reinterpret_cast<char *>(values.data()), num_values * sizeof(values[0])))
        {
            return false;
        }

        if (!(out << name << " " << std::dec << num_values))
        {
            return false;
        }
        for (float &f : values)
        {
            uint32_t i;
            memcpy(&i, &f, sizeof(i));
            if (!(out << " " << std::hex << i))
            {
                return false;
            }
        }
        if (!(out << "\n"))
        {
            return false;
        }
    }
    return true;
}

int main(int argc, const char *argv[])
{
    if (argc != 4)
    {
        usage();
        return EXIT_FAILURE;
    }

    char op = argv[1][0];
    bool result = false;
    switch (op)
    {
    case 'b':
    case 'B':
        result = text_to_binary(argv[2], argv[3]);
        break;
    case 't':
    case 'T':
        result = binary_to_text(argv[2], argv[3]);
        break;
    default:
        usage();
        break;
    }
    return result ? EXIT_SUCCESS : EXIT_FAILURE;
}

// Possible implementation of the code snippet in the original question to read the weights

// START Copied from TensorRT documentation - Do not include in your code
enum class DataType : int32_t
{
    kFLOAT = 0,
    kHALF = 1,
    kINT8 = 2,
    kINT32 = 3,
    kBOOL = 4
};

class Weights
{
public:
    DataType type;
    const void *values;
    int64_t count;
};
// END Copied from TensorRT documentation - Do not include in your code

bool read_weights(const char *infilename)
{
    std::unordered_map<std::string, Weights> weightMap;

    std::ifstream in(infilename, std::ios::binary);
    if (!in)
    {
        std::cerr << "Error: Could not open input file '" << infilename << "'\n";
        return false;
    }

    uint32_t line_count;
    if (!in.read(reinterpret_cast<char *>(&line_count), sizeof(line_count)))
    {
        return false;
    }

    for (uint32_t l = 0; l < line_count; ++l)
    {
        uint32_t name_size;
        if (!in.read(reinterpret_cast<char *>(&name_size), sizeof(name_size)))
        {
            return false;
        }
        std::string name(name_size, 0);
        if (!in.read(name.data(), name_size))
        {
            return false;
        }

        uint32_t num_values;
        if (!in.read(reinterpret_cast<char *>(&num_values), sizeof(num_values)))
        {
            return false;
        }

        // Normally I would use float* values = new float[num_values]; here, which
        // requires delete [] ptr; to free the memory later.
        // I used malloc to match the original example since I don't know who is
        // responsible for cleaning things up later, and TensorRT might use free(ptr).
        // It makes no real difference as long as new/delete or malloc/free are matched up.
        float *values = static_cast<float *>(malloc(num_values * sizeof(*values)));
        if (num_values != 0 && !values)
        {
            return false; // Allocation failed
        }
        if (!in.read(reinterpret_cast<char *>(values), num_values * sizeof(*values)))
        {
            free(values); // Don't leak the buffer if the read fails
            return false;
        }
        weightMap[name] = Weights { DataType::kFLOAT, values, num_values };
    }
    return true;
}
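
Since the buffers above come from malloc, something has to call free on them eventually. Here is a minimal sketch of a cleanup helper, assuming your own code owns the buffers and that whatever consumes the weights no longer needs them when you call it. WeightsSketch is a hypothetical stand-in for the TensorRT Weights class so the snippet compiles on its own, and free_weights is a name I made up; neither is part of TensorRT.

```cpp
#include <cstdint>
#include <cstdlib>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the TensorRT Weights class, reduced to the
// two members this sketch needs. Use the real class in your own code.
struct WeightsSketch
{
    const void *values;
    int64_t count;
};

// Release every buffer that was allocated with malloc, then empty the map.
// Only call this once nothing else is still reading the weight buffers.
void free_weights(std::unordered_map<std::string, WeightsSketch> &weightMap)
{
    for (auto &entry : weightMap)
    {
        // values is a const void*, so cast away const before calling free.
        std::free(const_cast<void *>(entry.second.values));
        entry.second.values = nullptr;
        entry.second.count = 0;
    }
    weightMap.clear();
}
```

If you switch the allocation to new float[...], the helper has to use delete [] on a float pointer instead, so that allocation and release stay matched up.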