Apache Thrift For Big Gains

Beyond Languages…

As an outsourced software development team, we often work with people whose expertise spans different frameworks and languages. Every tool has its own advantages and shortcomings, and it’s our responsibility to pick the tool best suited to each task so that the client gets a superior product. The first essential in that environment is modularity: separating the system into distinct components. That makes it easier to comprehend what’s going on. Each team focuses on its assigned tasks, and a high-level overview of the other modules is enough. Apache Thrift helps us achieve those goals.

What is Thrift?

Thrift is a set of code-generation tools for implementing RPC (Remote Procedure Call) services, with cross-language support including C++, Python, C#, Java, PHP, Perl and Ruby. It was originally developed at Facebook and is now an Apache open source project. The architecture of the Thrift stack is shown below.

Fig: Thrift architecture.
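
To give a feel for the protocol layer of this stack, here is a minimal sketch in plain Python of how Thrift's binary protocol lays out two kinds of values: an i32 as four big-endian bytes, and a string as an i32 byte length followed by the UTF-8 bytes. The function names and the sample values are ours, chosen for illustration. The sketch covers only the values themselves, not the field headers or message envelope that the real protocol adds, and in practice the generated code does all of this for you.

```python
import struct


def write_i32(value: int) -> bytes:
    """Encode a signed 32-bit integer as 4 big-endian bytes."""
    return struct.pack("!i", value)


def write_string(text: str) -> bytes:
    """Encode a string as an i32 byte length followed by the UTF-8 bytes."""
    data = text.encode("utf-8")
    return write_i32(len(data)) + data


def read_i32(buf: bytes, offset: int = 0):
    """Decode an i32 at `offset`; return (value, offset after the value)."""
    (value,) = struct.unpack_from("!i", buf, offset)
    return value, offset + 4


def read_string(buf: bytes, offset: int = 0):
    """Decode a length-prefixed string; return (text, offset after it)."""
    length, offset = read_i32(buf, offset)
    end = offset + length
    return buf[offset:end].decode("utf-8"), end


if __name__ == "__main__":
    # Illustrative values only: a name followed by a number.
    payload = write_string("camera-1") + write_i32(42)
    name, pos = read_string(payload)
    number, pos = read_i32(payload, pos)
    print(payload.hex())
    print(name, number)
```

Writing this by hand for every message type, in every language, is exactly the repetitive work that code generation removes.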

Developing a web-server-based application involves a lot of repetitive work, such as designing a protocol and writing code to serialize and deserialize messages on it. We also need to deal with sockets and manage concurrency while writing clients in several languages. Thrift does all of this automatically, given a description of the functions the server should expose to clients. Through a simple and straightforward Interface Definition Language (IDL), Thrift lets us define services that can be both consumed and served from numerous languages. From that definition, the code generation tool produces a set of files that can be used to build clients and/or servers. In addition to interoperability, Thrift can be very efficient: its serialization mechanism is economical in both time and space.

A Real-World Case At Codemen

In one of our ongoing Computer Vision projects, we are using Apache Thrift to the full. The solution has OpenCV libraries at its core and involves heavy computation with multithreading and concurrent algorithms. Our aim is to provide real-time analytics by processing live video feeds. To handle the computational load we use GPUs instead of CPUs, and to maximize GPU utilization we had to make the processing parallel. For these reasons, C++ is the natural choice for the backend core, with its efficient implementations of the OpenCV GPU libraries. The solution also uses Python web technologies for a convenient web frontend and cloud support. Thrift readily handles the communication between the Python and C++ modules through RPC. It also manages multithreading and concurrency in the server operations with reasonable overhead.

A Simple Example

Generating code for C++ and Python server-client pairs with RPC capability is as simple as the following example.
Install Thrift on a Linux machine. Installation details can be found here (https://thrift.apache.org/tutorial/). The appropriate version of the associated libraries for each language needs to be installed as well. The following example requires the C++ and Python libraries for Thrift version 0.9.3.

A simple interface definition file named Test.thrift, containing Thrift types and services, may look like this:

    namespace cpp TestService
    namespace py TestService

    service TestService {
        string get_id(1:string p_name),
        i32 ping_server(),
        oneway void start_process(1:string pid),
        oneway void stop_process(1:string pid)
    }

This .thrift file can be used to generate skeleton files for a Python client app and a C++ server. The Python app can then be customized to start and stop independent background processes in the C++ server using RPC. The following commands generate the skeleton files:

    thrift --gen cpp Test.thrift
    thrift --gen py Test.thrift

Thrift’s base types include bool, byte, double, string and integers (i16, i32, i64). Special types are also supported: binary, structs (which are like classes but without inheritance) and containers (list, set, map). These types correspond to constructs commonly available in most programming languages.

Careful!

Several facts need to be considered before using Apache Thrift in production. Windows support is still limited, so in practice you are bound to a Linux platform. Documentation and online resources are also still somewhat limited. In some situations you may be on your own when solving domain-specific industry problems with Thrift. But overall, it’s a great tool to have in your arsenal, and anyone aiming to improve their productivity should take an in-depth look at it.
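
Going back to the Test.thrift example, a Python client built on the generated files might look roughly like the sketch below. Treat it as a starting point under stated assumptions rather than our production code: the host, the port 9090, and the buffered transport with the binary protocol are assumed to match whatever the C++ server is configured with, and `open_client` and `run_job` are hypothetical helper names. The sketch also assumes that the string returned by `get_id` is what `start_process` and `stop_process` expect, which the interface file alone does not guarantee.

```python
import sys


def open_client(host="localhost", port=9090):
    """Connect to the C++ server and return (client, transport).

    host and port are assumptions; use the values your server listens on.
    """
    # Imported here so this file can be loaded without Thrift installed.
    from thrift.transport import TSocket, TTransport
    from thrift.protocol import TBinaryProtocol

    # `thrift --gen py` writes its output to ./gen-py by default.
    sys.path.append("gen-py")
    from TestService import TestService  # generated from Test.thrift

    transport = TTransport.TBufferedTransport(TSocket.TSocket(host, port))
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = TestService.Client(protocol)
    transport.open()
    return client, transport


def run_job(client, name):
    """Hypothetical workflow: ping, look up an id, start and stop a process."""
    status = client.ping_server()  # i32; its meaning is up to the server
    pid = client.get_id(name)      # assumed to be the id the server expects
    client.start_process(pid)      # oneway: returns without waiting for a reply
    client.stop_process(pid)       # oneway as well
    return status, pid


# Usage, with the C++ server running:
#   client, transport = open_client()
#   try:
#       print(run_job(client, "camera-1"))
#   finally:
#       transport.close()
```

Because `start_process` and `stop_process` are declared oneway, the client gets no confirmation that they succeeded. A call such as `ping_server` is a simple way to check that the server is still responding.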
