UML in C Language — Class Diagrams
Translating class diagrams into C source code.
Writing software requires both creativity and method. Over-engineering should always be avoided, but messing around with the code without any kind of roadmap will likely put our project on a precarious path.
That’s the reason why every programmer should include a number of generic (yet powerful) software engineering techniques in their toolbelt, the most famous one perhaps being Object-Oriented Programming.
Plenty of languages out there support OOP natively. However, in some real-world scenarios, we don’t have the privilege of choosing the programming language for a given software system: this is the case for legacy embedded systems, which are very often written in plain old C, a so-called “procedural” or “structured” language.
The purpose of today’s discussion is to provide fundamental concepts on how to reason in terms of classes inside your C programs, and how to statically represent your software using standard UML (Unified Modeling Language) Class Diagrams.
What is a class diagram?
A class diagram is a static representation of a system. Therefore, it focuses on the shape of a software module, but it doesn’t tell much about the logic of a system. In other words, what we will usually find in a class diagram is a summary of what a system does. It contains little or no information on how the functionality is implemented.
Why should we care about that?
As software engineers, we sooner or later become conscious that OOP is not really about “representing real-world objects through software”.
Instead, Object-Oriented Design is a way to craft software that is as flexible and maintainable as possible, by splitting the system into submodules with well-defined responsibilities, and letting them interact with each other to achieve the desired functionality.
Class diagrams help us to reason about our software architecture before diving into the nitty-gritty of the implementation (that is, writing the code). Such representation clearly shows what the submodules are, briefly states their responsibility, and also provides information on relationships with other modules.
How do they relate to C code?
Okay but… We’re C developers! No one provided us fancy keywords to explicitly declare an abstract class or an interface. What is a method?
While it’s true that C has no built-in constructs for OOP, the concepts are still valid here. We can achieve encapsulation and polymorphism: we only have to be more aware of what we are doing.
In the following sections, we’re gonna see how structs and functions can be effectively used in place of a class. Moreover, we’ll show how pure virtual classes (i.e. interfaces) can be realized through function pointers, and how writing a plain C function translates to implementing an interface.
Finally, we’ll see how objects interact with each other by making use of composition and dependency mechanisms.
Alright, let’s move on!
1. Big Boxes aka Concrete Classes
In UML, a box is used to represent a class. This can be a concrete class, an abstract class, or an interface. Here we will cover concrete classes and interfaces.
In pure-OO languages, classes are the atoms that make up our programs. Advanced functionality is achieved by allowing a set of functions to operate on a set of data. Having said that, a concrete class is nothing but a container for related functions and data.
A basic demonstration of a UML concrete class is shown in Figure 1.1. The Rectangle class exposes the function area() and two attributes, namely width and height.
In C, this can be represented with the listing in Figure 1.2. As you may already know, C has no construct to declare a class. Therefore, Rectangle_Area() is not explicitly bound to the Rectangle data structure.
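Since the listing itself is not reproduced here, the following is a minimal sketch of what such a translation might look like. The attribute types (double) and the name of the self parameter are assumptions, not taken from the original figure.

```c
#include <assert.h>
#include <stddef.h>

/* Data part of the class: the two attributes shown in the diagram.
 * Using double for both is an assumption. */
typedef struct Rectangle {
    double width;
    double height;
} Rectangle;

/* The area() "method". The instance is passed explicitly as the first
 * argument; only the Rectangle_ prefix ties the function to the struct. */
double Rectangle_Area(const Rectangle *self)
{
    assert(self != NULL);
    return self->width * self->height;
}
```

The naming convention (type name, underscore, method name) does the job that class scoping would do in an OO language.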
I think it’s also useful to show how to invoke methods on Rectangle instances. An example listing is shown in Figure 1.3. It’s worth mentioning that if we had Java, or C++, as our language of choice, the last line would have read r.area().
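As a rough illustration of the invocation described above, here is a small sketch. The helper name example_area() and the sample values are hypothetical; the point is only that the instance goes in as an explicit argument.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Rectangle {
    double width;
    double height;
} Rectangle;

double Rectangle_Area(const Rectangle *self)
{
    assert(self != NULL);
    return self->width * self->height;
}

/* Hypothetical user code: create an instance, then call its "method". */
double example_area(void)
{
    Rectangle r = { .width = 2.0, .height = 5.0 };

    /* In Java or C++ this line would be written r.area(). */
    return Rectangle_Area(&r);
}
```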
2. Enter the Interface
Sometimes it’s useful to operate on different objects in a uniform way. As a famous example, consider area calculation for different geometric shapes: every shape should offer an area() method, albeit using a different set of data. Our user code could reasonably expect to retrieve areas from a list of shapes, regardless of their specific details. This scenario is illustrated in Figure 2.1.
I suggest focusing on the first line and the final for loop: an array of Shape pointers is declared and, later in the code, the area() method is invoked on each object (through the Shape_Area() function call), no matter if it’s a Rectangle or a Circle.
It may be the case that our Shape class offers a set of functions (i.e. methods) but doesn’t implement any of them: in such situations we say that Shape is an interface. UML notation for interfaces is similar to the one used for concrete classes, except that we must use a so-called stereotype (nothing but a keyword enclosed in guillemets, as in «interface») to indicate that our class is an interface, and every unimplemented method has to be written in italics.
Take a look at Figure 2.2. The «interface» stereotype is put above the class name, and the area() method is written in italics.
To translate that into C, we make use of a struct that holds function pointers (see Figure 2.3). AreaFn is just a typedef, that is, a shorter name for a pointer to a function of type double(void *), while ShapeInterface holds a function pointer for each method exposed by Shape.
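A possible shape for these two declarations is sketched below. AreaFn and ShapeInterface are the names used in the text; the field name area, the Rectangle attribute types, and the rectangle_interface table are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* AreaFn: pointer to a function that receives an untyped instance
 * and returns its area. */
typedef double (*AreaFn)(void *instance);

/* One function pointer per method exposed by Shape (only area() here). */
typedef struct ShapeInterface {
    AreaFn area;
} ShapeInterface;

typedef struct Rectangle {
    double width;
    double height;
} Rectangle;

/* A plain C function whose signature matches AreaFn: this is what
 * "implementing the interface" boils down to. */
double Rectangle_Area(void *instance)
{
    assert(instance != NULL);
    const Rectangle *self = instance;
    return self->width * self->height;
}

/* Hypothetical function table for rectangles. */
static const ShapeInterface rectangle_interface = { .area = Rectangle_Area };
```

Note that the function now takes void * instead of a typed pointer, so the compiler can no longer check that the right kind of instance is passed in.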
That sounds good, but we’re still missing something that lets us call Shape_Area() as we’ve seen in Figure 2.1. What we need is a struct that stores both interface functions and related data (i.e. a struct Rectangle instance and Rectangle-related methods).
The solution we’re looking for is shown in Figure 2.4. Here, instance will point to either a Rectangle or a Circle data structure (this is the advantage of using the void * data type).
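One way such a wrapper could be declared is sketched here. The field name instance comes from the text; the field name iface, the choice of storing a pointer to the function table rather than a copy, and the Circle example are assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef double (*AreaFn)(void *instance);

typedef struct ShapeInterface {
    AreaFn area;
} ShapeInterface;

/* The wrapper: untyped data plus the functions that know how to use it. */
typedef struct Shape {
    void *instance;               /* points to a Rectangle, a Circle, ... */
    const ShapeInterface *iface;  /* methods matching that instance */
} Shape;

/* Hypothetical Circle, used only to show instance holding a second type. */
typedef struct Circle {
    double radius;
} Circle;

static const double PI = 3.14159265358979323846;

static double Circle_Area(void *instance)
{
    assert(instance != NULL);
    const Circle *self = instance;
    return PI * self->radius * self->radius;
}

static const ShapeInterface circle_interface = { .area = Circle_Area };
```

Keeping the instance and its function table side by side is what lets user code stay unaware of the concrete type.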
Now it’s time to explain how to make up a Shape, using a kind of constructor. Figure 2.5 shows the creation of a Rectangle: notice how, in the last four lines, Rectangle data and its associated methods are put inside the Shape wrapper.
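The original listing is not available here, so what follows is only a sketch of how such a constructor might be written. The function names Rectangle_CreateShape() and Shape_Destroy(), the use of malloc(), and the field name iface are all assumptions.

```c
#include <assert.h>
#include <stdlib.h>

typedef double (*AreaFn)(void *instance);
typedef struct ShapeInterface { AreaFn area; } ShapeInterface;
typedef struct Shape {
    void *instance;
    const ShapeInterface *iface;
} Shape;

typedef struct Rectangle {
    double width;
    double height;
} Rectangle;

static double Rectangle_Area(void *instance)
{
    const Rectangle *self = instance;
    return self->width * self->height;
}

/* One read-only function table shared by every rectangle. */
static const ShapeInterface rectangle_interface = { .area = Rectangle_Area };

/* Hypothetical constructor: returns NULL if an allocation fails. */
Shape *Rectangle_CreateShape(double width, double height)
{
    assert(width >= 0.0 && height >= 0.0);

    Shape *shape = malloc(sizeof *shape);
    Rectangle *rect = malloc(sizeof *rect);
    if (shape == NULL || rect == NULL) {
        free(shape);
        free(rect);
        return NULL;
    }

    rect->width = width;
    rect->height = height;
    shape->instance = rect;               /* the Rectangle data...        */
    shape->iface = &rectangle_interface;  /* ...and its associated methods */
    return shape;
}

/* Hypothetical destructor: releases the instance and the wrapper. */
void Shape_Destroy(Shape *shape)
{
    if (shape != NULL) {
        free(shape->instance);
        free(shape);
    }
}
```

On embedded targets where dynamic allocation is off the table, the same idea works with statically allocated Rectangle and Shape objects filled in by an init function.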
To conclude this section, here is the pretty straightforward code for Shape_Area().
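A minimal sketch of that function, together with the kind of user loop described for Figure 2.1, could look like this. Everything except the names Shape, ShapeInterface, AreaFn, Shape_Area, Rectangle and Circle is an assumption, including the iface field and the total_area() helper.

```c
#include <assert.h>
#include <stddef.h>

typedef double (*AreaFn)(void *instance);
typedef struct ShapeInterface { AreaFn area; } ShapeInterface;
typedef struct Shape {
    void *instance;
    const ShapeInterface *iface;
} Shape;

/* Forward the call to whatever implementation the shape carries. */
double Shape_Area(const Shape *shape)
{
    assert(shape != NULL && shape->iface != NULL && shape->iface->area != NULL);
    return shape->iface->area(shape->instance);
}

typedef struct Rectangle { double width; double height; } Rectangle;
typedef struct Circle { double radius; } Circle;

static double Rectangle_Area(void *instance)
{
    const Rectangle *self = instance;
    return self->width * self->height;
}

static double Circle_Area(void *instance)
{
    const Circle *self = instance;
    return 3.14159265358979323846 * self->radius * self->radius;
}

static const ShapeInterface rectangle_interface = { .area = Rectangle_Area };
static const ShapeInterface circle_interface = { .area = Circle_Area };

/* Hypothetical user code: sums the areas without knowing concrete types. */
double total_area(Shape *const shapes[], size_t count)
{
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        total += Shape_Area(shapes[i]);
    }
    return total;
}
```

The loop never mentions Rectangle or Circle: the only thing it depends on is the Shape wrapper.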
3. Triangle Arrows aka Inheritance
Arrows are used to represent relationships between classes. There are three main kinds of relationships: inheritance, composition, and dependency. Despite their differences, every relationship expresses the fact that a class knows about another class.
In UML, an arrow with a hollow triangular head, pointing at the parent, is used to represent inheritance between classes (diamonds are reserved for aggregation and composition). Recalling our previous example, we say that Rectangle inherits from Shape: for our purposes in C, implementing an interface and inheriting from it amount to the same thing, although UML draws the former (realization) with a dashed line and the latter (generalization) with a solid one. The class diagram is shown in Figure 3.1.
4. Composition and Dependencies
In OOP, objects need to interact with each other. In a typical client/server application, the client needs to have a way to interact with the server and vice versa. However, this kind of relationship is not restricted to networking applications: if objects of class A make use of objects belonging to class B, then A is a client (or user) of B.
Recall the listing shown in Figure 2.1. Now suppose that the code is inside a GeometryApp class, which makes use of rectangles and circles via the Shape interface. In this situation, we say there is a composition relationship between GeometryApp and Shape. It’s worth mentioning that the connection is not bidirectional: in other words, GeometryApp knows that the Shape class exists, but Shape is completely independent from GeometryApp.
In Figure 4.1 the relationship is drawn as a straight arrow from GeometryApp to Shape (strictly speaking, UML calls a plain solid arrow an association, and marks composition with a filled diamond on the owner’s side). You may have noticed that the attributes of GeometryApp are “reacting” to the composition: the shapes array can be seen as the byproduct of the relationship. Moreover, the N_SHAPES label close to the arrow is used to convey the so-called multiplicity, which simply clarifies how many server objects are known by each user object (in this case, every GeometryApp object stores N_SHAPES pointers to Shape objects).
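In C, the relationship described above might boil down to something like the following sketch. The value given to N_SHAPES, the iface field, and the GeometryApp_TotalArea() method are assumptions; only the shapes array of N_SHAPES pointers comes from the text.

```c
#include <assert.h>
#include <stddef.h>

#define N_SHAPES 2  /* assumed value; the multiplicity shown on the arrow */

typedef double (*AreaFn)(void *instance);
typedef struct ShapeInterface { AreaFn area; } ShapeInterface;
typedef struct Shape {
    void *instance;
    const ShapeInterface *iface;
} Shape;

double Shape_Area(const Shape *shape)
{
    assert(shape != NULL);
    return shape->iface->area(shape->instance);
}

/* GeometryApp knows about Shape; Shape knows nothing about GeometryApp. */
typedef struct GeometryApp {
    Shape *shapes[N_SHAPES];  /* attribute created by the relationship */
} GeometryApp;

/* Hypothetical method that uses the stored shapes. */
double GeometryApp_TotalArea(const GeometryApp *self)
{
    assert(self != NULL);
    double total = 0.0;
    for (size_t i = 0; i < N_SHAPES; i++) {
        total += Shape_Area(self->shapes[i]);
    }
    return total;
}

/* A concrete shape, included only so the sketch can be exercised. */
typedef struct Rectangle { double width; double height; } Rectangle;

static double Rectangle_Area(void *instance)
{
    const Rectangle *self = instance;
    return self->width * self->height;
}

static const ShapeInterface rectangle_interface = { .area = Rectangle_Area };
```

The one-way nature of the relationship shows up in the declarations: Shape is fully defined before GeometryApp appears, and never refers to it.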
To conclude, I’d like to mention the situation in which an object comes to know about another object without keeping it as an attribute. For instance, think of a Rectangle object that gets passed as a parameter to function F. In those cases, we say that F has a dependency on the Rectangle class. Take a look at Figure 4.2, in which F has been put inside class A.
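A dependency of this kind could be sketched as follows. What F actually does is not stated in the text, so the body of A_F() and the last_result attribute are purely illustrative assumptions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Rectangle {
    double width;
    double height;
} Rectangle;

/* Class A has no Rectangle attribute at all. */
typedef struct A {
    double last_result;  /* hypothetical attribute, for illustration only */
} A;

/* F sees a Rectangle only through its parameter: once the call returns,
 * A keeps no reference to it. That is the dependency. */
void A_F(A *self, const Rectangle *rect)
{
    assert(self != NULL && rect != NULL);
    self->last_result = rect->width * rect->height;
}
```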
Wrapping Up
That’s it for today. I hope this reading has helped to clarify the main concepts behind UML class diagrams, and how to apply them in C. However, the Unified Modeling Language has many more capabilities, especially for dynamic modeling. I hope to bring you soon a discussion about applying sequence and interaction diagrams to C programming, which is something I do very frequently in all of my projects!
Thanks so much for your attention. If you have any questions, feel free to contact me through whichever channel you prefer. See you next time! 😊