Turning automatic code generation upside down

Much ink has been spilled on the Next Big Thing in software development. One of these things has always been “automatic code generation” from high-level models (e.g., from state machines).

But even though many tools on the market today support code generation, their widespread acceptance has grown rather slowly. Of course, many factors contribute to this, but one of the main reasons is that the generated code has simply too many shortcomings, which too often require manual “massaging” of the generated code. But this breaks the connection with the original model. The tool industry’s answer has been “round-trip engineering”, which is the idea of feeding the changes in the code back to the model.

Unfortunately, “round-trip engineering” simply does not work well enough in practice. This should not be so surprising, considering that no other code generation in software history has ever worked that way. You don’t edit by hand the binary machine code generated by an assembler. You don’t edit by hand the assembly code generated by the high-level language compiler. This would be ridiculous. So, why modeling tools assume that the generated code will be edited manually?

Well, the modeling tools have to assume this, because the generated code is hard to use “as-is” without modifications.

First, the generated code might be simply incomplete, such as skeleton code with “TODO” comments generated from class diagrams. I’m not a fan of this, because I think that in the long run such code generation is outright counterproductive.

Second, most code generating tools impose a specific physical design (by physical design I mean partitioning of the code into directories, and files, such as header files and implementation files). For example, for generation of C/C++ code (which dominate real-time embedded programming), the beaten path is to generate <class>.h and <class>.cpp files for every class. But what if I want to put class declaration in a file scope of a .cpp file and not to generate the <class>.h file at all? Actually, I often want to do this to achieve even better encapsulation. A typical tool would not allow me to do this.

And finally, all too often the automatically generated code is hard to integrate with other code, not created by the tool. For example, a class definition might rely on many included header files. But while most tools recognize that and allow inserting some custom beginning of the file, they don’t allow to insert code in an arbitrary place in the file.

But, how about a tool that actually allows you to do your own physical design? How about turning the whole code generation process upside down?

A tool like this would allow you to create and name directories and files instead of the tool imposing it on you. Obviously, this is still manual coding. But, the twist here is that in this code you can “ask” the tool to synthesize parts of the code based on the model. (The “requests” are special tags that you mix in your code.) For example, you can “ask” the tool to generate a class declaration in one place, a class method definition in another, and a state machine definition in yet another place in your code.

This “inversion” of code generation responsibilities solves most of the problems with the integration between the generated code and other code. You can simply “ask” the tool to generate as much or as little code as you see fit. The tool helps, where it can add value, but otherwise you can keep it out of your way.

The idea of “inverting” the code generation is so simple, that I would be surprised if it was not already implemented in some tools. One example I have is the free QM tool from my company (http://www.state-machine.com/qm). If you know of any other tool that works that way, I would be very interested to hear about it.

Leave a Reply

Your email address will not be published. Required fields are marked *