On Facebook's Thrift semantics, code generation, and OCaml
Note: The obligatory TL;DR section is at the very end of this text.
The ASN.1 rant
After co-founding Echo, I had to put asn1c development on hold for sheer lack of time. (If you don't know what my asn1c is, think of it as the most evolved open source ASN.1 compiler.) Despite suspending development, I've been tracking the evolution of ASN.1, as well as the emergence of newer technologies competing with what ASN.1 has to offer. I am referring to Facebook's Thrift, Google's Protocol Buffers, Cisco's Etch, and the like. Yet to this day I have had no opportunity to actually use any of those in production.
I have my own personal score to settle with the ASN.1 world. The standard is laden with design-by-committee complexity, no doubt evolved to “address the real world demands”. Its mind-numbing standardese and semantics effectively prohibit any newcomer from ever entering this field and producing a decent new compiler. So we're stuck with something like 2.5 living ASN.1 compilers covering some 3 mainstream languages (C++, C#, Java). The commercial products are often cost-prohibitive: they're squarely aimed at the rich telecom market. Where would a small Ruby or Python startup go? (While I am at it: Erlang has a decent free compiler, you know.)
Yet there's an opposite side to this complexity. Many things you struggle with or “invent” for the purpose of better data serialization have already been invented in the ASN.1 world. Things like broiled-to-perfection TLV-based encodings (BER/DER/CER), bitwise Packed Encoding Rules (competing with gzip'ing your serialized binary blob), Information Object Classes (think of SNMP MIB macros on steroids), or Encoding Control Notation have a lot to offer and a lot to teach.
But I should stop kicking that dead horse. Let's try Thrift for a change.
You can download the patch here: http://lionet.info/patches/thrift-trunk-962854.patch
4. Obligatory TL;DR section
The Thrift specification underspecifies several important aspects of the description language's semantics. The Thrift target-language code generators are inconsistent in the way they treat certain parts of the specification. I made an attempt to make the OCaml generator produce somewhat safer and more compliant code, and am sharing a patch with you.
https://issues.apache.org/jira/browse/THRIFT-827
https://issues.apache.org/jira/browse/THRIFT-860
P.S. The above patches have since been merged into Thrift.