File I/O vs TCP socket: which is faster for data transfer within the same machine?

I have about 10 MB of data to send from the server program to the client program, both of which run on the same machine.

The server is written in C++, while the client is written in Python.

In my current architecture, the server writes the data to temporary binary files and then sends the file names to the client via TCP, to signal that the data has been written and is ready to be retrieved. The Python client then reads the data from these files.

This is slower than I expected. Would it be faster, and use less memory, if I sent the data (in binary format, not as a string) to the client directly via TCP instead of writing and reading files?

Answer

Assuming your software is written to use TCP efficiently, TCP will be faster, since during a TCP transfer the data will never need to go out to a hard drive and then back in again. Rather, it will stay in RAM the entire time, and RAM is much faster than either a spinning disk or a solid-state disk.
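To make "use TCP efficiently" a bit more concrete, here is a minimal sketch of the Python side receiving one binary blob over a loopback TCP connection. The framing (an 8-byte big-endian length prefix) and the names send_blob, recv_exact, recv_blob and demo_roundtrip are my own assumptions, not anything from your code, and a thread stands in for your C++ server, which would need to send the same header before the payload.

```python
import socket
import struct
import threading

# Assumed framing: an 8-byte big-endian length, followed by the raw payload.
HEADER = struct.Struct("!Q")


def send_blob(sock, payload):
    """Send one length-prefixed binary message."""
    sock.sendall(HEADER.pack(len(payload)))
    sock.sendall(payload)


def recv_exact(sock, n):
    """Read exactly n bytes; a single recv() may return fewer than requested."""
    buf = bytearray(n)
    view = memoryview(buf)
    got = 0
    while got < n:
        k = sock.recv_into(view[got:])
        if k == 0:
            raise ConnectionError("peer closed the connection early")
        got += k
    return bytes(buf)


def recv_blob(sock):
    """Receive one length-prefixed binary message."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def demo_roundtrip(payload):
    """Hypothetical stand-in for the C++ server: a thread serves payload on loopback."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: the OS picks a free port
    port = listener.getsockname()[1]

    def serve():
        conn, _ = listener.accept()
        with conn:
            send_blob(conn, payload)

    server_thread = threading.Thread(target=serve)
    server_thread.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        received = recv_blob(client)
    server_thread.join()
    listener.close()
    return received
```

The main thing to get right is the receive loop: recv() is free to hand back partial data, so the client has to keep reading until it has the full length it was told to expect.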

Sending the data via TCP also allows your two programs to run in parallel, i.e. your consumer program can start consuming the data even before your producer program has finished producing it. That can provide some additional speedup (compared to an approach where the reader can’t safely read the files until the writer has finished writing them).
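As a rough illustration of that overlap, the sketch below has the consumer process each chunk as soon as it arrives rather than waiting for the whole transfer. The CRC-32 is just a placeholder for whatever processing your client really does, the producer is again a Python thread standing in for the C++ server, and the names produce, consume and demo_streaming and the 64 KiB chunk size are arbitrary assumptions.

```python
import socket
import threading
import zlib

CHUNK = 64 * 1024  # assumed read size; tune for your workload


def produce(conn, chunks):
    """Send each chunk as soon as it is available, then close to signal end of data."""
    with conn:
        for chunk in chunks:
            conn.sendall(chunk)


def consume(sock):
    """Process data incrementally as it arrives; returns (total_bytes, crc32)."""
    total, crc = 0, 0
    while True:
        data = sock.recv(CHUNK)
        if not data:  # empty read: the producer closed the connection
            break
        crc = zlib.crc32(data, crc)  # placeholder for the real per-chunk work
        total += len(data)
    return total, crc


def demo_streaming(chunks):
    """Run a producer thread and a consumer over a loopback TCP connection."""
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: the OS picks a free port
    port = listener.getsockname()[1]

    def serve():
        conn, _ = listener.accept()
        produce(conn, chunks)

    producer_thread = threading.Thread(target=serve)
    producer_thread.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        result = consume(client)
    producer_thread.join()
    listener.close()
    return result
```

Because the consumer never holds more than one chunk at a time, this style also keeps the client's memory use low, which was the other half of your question.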

If you don’t think you’ll ever want to go over an actual network (i.e. your two programs will always be running on the same host) then you might look at using a pipe instead of a TCP socket, as it would be slightly more efficient than TCP. (A TCP socket also works fine, though).
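One way the pipe variant could look, assuming the Python client is the one that launches the server as a child process and the server writes its binary output to stdout (that is an assumption about your setup, not something you described). The PRODUCER command here is a hypothetical stand-in: a small Python one-liner instead of your real C++ executable.

```python
import subprocess
import sys


def read_from_producer(argv, chunk_size=64 * 1024):
    """Launch a producer process and read everything it writes to stdout via a pipe."""
    received = bytearray()
    with subprocess.Popen(argv, stdout=subprocess.PIPE) as proc:
        while True:
            data = proc.stdout.read(chunk_size)
            if not data:  # empty read: the producer closed its end of the pipe
                break
            received += data
    # Leaving the with-block waits for the child, so returncode is set here.
    if proc.returncode != 0:
        raise RuntimeError(f"producer exited with status {proc.returncode}")
    return bytes(received)


# Hypothetical stand-in for the C++ server: writes 1 MiB of binary data to stdout.
PRODUCER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(bytes(range(256)) * 4096)",
]
```

With a real server you would replace PRODUCER with the path to your executable and its arguments; on the C++ side, make sure stdout is written in binary mode so the bytes are not altered on platforms that translate line endings.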

(Btw, another reason to avoid the write-temporary-files-to-disk approach is that you then don’t have to worry about what to do if the disk is full or read-only, or if your program doesn’t have permission to write to the folder it wants to write to.)