UML in C Language — Class Diagrams

Translating class diagrams into C source code.

Published in Level Up Coding · 8 min read · Apr 25, 2022

Writing software requires both creativity and method. Over-engineering should always be avoided, but messing around with the code without any kind of roadmap will likely put our project on a precarious path.
That’s why every programmer should keep a few generic (yet powerful) software engineering techniques in their toolbelt, perhaps the most famous being Object-Oriented Programming.

Plenty of languages support OOP natively. However, in some real-world scenarios we don’t have the privilege of choosing the programming language for a given software system: this is the case for legacy embedded systems, which are very often written in plain old C, a so-called “procedural” or “structured” language.

The purpose of today’s discussion is to provide fundamental concepts on how to reason in terms of classes inside your C programs, and how to statically represent your software using standard UML (Unified Modeling Language) Class Diagrams.

What is a class diagram?

A class diagram is a static representation of a system. It focuses on the shape of a software module, but it doesn’t tell much about the logic of the system. In other words, what we usually find in a class diagram is a synthesis of what a system does. It contains little or no information on how the functionality is implemented.

Why should we care about that?

As software engineers, we sooner or later become conscious that OOP is not really about “representing real-world objects through software”.
Instead, Object-Oriented Design is a way to craft software that is as flexible and maintainable as possible, by splitting the system into submodules with well-defined responsibilities and letting them interact with each other to achieve the desired functionality.

Class diagrams help us reason about our software architecture before diving into the nitty-gritty of the implementation (that is, writing the code). Such a representation clearly shows what the submodules are, briefly states their responsibilities, and also provides information on their relationships with other modules.

How do they relate to C code?

Okay but… We’re C developers! No one provided us with fancy keywords to explicitly declare an abstract class or an interface. And what is a method, anyway?

While it’s true that C doesn’t include any built-in construct to implement OOP, the concepts are still valid here. We can achieve encapsulation and polymorphism: we only have to be more aware of what we are doing.

In the following sections, we’re going to see how structs and functions can be effectively used in place of a class. Moreover, we’ll show how pure virtual classes (i.e. interfaces) can be realized through function pointers, and how writing a plain C function translates to implementing an interface.
Finally, we’ll see how objects interact with each other by making use of composition and dependency mechanisms.

Alright, let’s move on!

1. Big Boxes aka Concrete Classes

In UML, a box is used to represent a class. This can be either a concrete class, an abstract class, or an interface. Here we will cover concrete classes and interfaces.

In pure-OO languages, classes are the atoms that make up our programs. Advanced functionality is achieved by allowing a set of functions to operate on a set of data. Having said that, a concrete class is nothing but a container for related functions and data.

A basic example of a UML concrete class is shown in Figure 1.1. The Rectangle class exposes the function area() and two attributes, namely width and height.

In C, this can be represented with the listing in Figure 1.2. As you may already know, C offers no construct to declare a class. Therefore, Rectangle_Area() is not explicitly bound to the Rectangle data structure.
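As a rough idea of what such a listing could look like, here is a minimal sketch. The names struct Rectangle, Rectangle_Area(), width and height come from the discussion above; the parameter name self, the const qualifier and the assert() check are my own assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Attributes of the Rectangle class. */
struct Rectangle {
    double width;
    double height;
};

/* "Method" of Rectangle. Nothing in the language ties it to the struct:
   the instance is passed explicitly as the first argument (named self
   here by convention, which is an assumption of this sketch). */
double Rectangle_Area(const struct Rectangle *self)
{
    assert(self != NULL);
    return self->width * self->height;
}
```

The Rectangle_ prefix is the only thing that tells the reader which "class" the function belongs to, so it pays off to be strict about the naming convention.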

I think it’s also useful to show how to invoke methods on Rectangle instances. An example listing is shown in Figure 1.3. It’s worth mentioning that if we had Java or C++ as our language of choice, the last line would have been equivalent to r.area().
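A call site along those lines might look like the sketch below. Example_RectangleArea() is a hypothetical helper that plays the role of user code, and the 3 by 4 dimensions are arbitrary; the struct and the function are repeated so that the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

struct Rectangle {
    double width;
    double height;
};

double Rectangle_Area(const struct Rectangle *self)
{
    assert(self != NULL);
    return self->width * self->height;
}

/* Hypothetical user code: creates an instance and invokes its method. */
double Example_RectangleArea(void)
{
    /* "Instantiating" the class is just defining a struct variable. */
    struct Rectangle r = { .width = 3.0, .height = 4.0 };

    /* In C++ or Java this would read r.area(). */
    return Rectangle_Area(&r);
}
```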

2. Enter the Interface

Sometimes it’s useful to operate on different objects in a uniform way. As a famous example, consider area calculation for different geometric shapes: every shape should offer an area() method, albeit using a different set of data. Our user code could reasonably expect to retrieve areas from a list of shapes, regardless of their specific details. This scenario is illustrated in Figure 2.1.

Figure 2.1: User code that manipulates shapes

I suggest focusing on the first line and the last for loop: an array of Shape pointers is declared and, later in the code, the area() method is invoked on each object (through the Shape_Area() function call), no matter whether it’s a Rectangle or a Circle.

It may be the case that our Shape class offers a set of functions (i.e. methods) but doesn’t implement any of them: in such situations we say that Shape is an interface. The UML notation for interfaces is similar to the one used for concrete classes, except that we must use a so-called stereotype (a keyword enclosed in angle quotation marks, as in «interface») to indicate that our class is an interface, and every unimplemented method has to be written in italics.

Take a look at Figure 2.2. The «interface» stereotype is placed above the class name, and the area() method is written in italics.

To translate that into C, we make use of a struct that holds function pointers (see Figure 2.3). AreaFn is just an alias for the function pointer type double (*)(void *), while ShapeInterface holds one function pointer for each method exposed by Shape.

Figure 2.3: ShapeInterface struct definition
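A possible shape for these definitions is sketched below. AreaFn and ShapeInterface are the names used in the text; the member name area, the void * flavour of Rectangle_Area() and the RECTANGLE_AS_SHAPE table are assumptions added to show how a concrete class can fill in the interface.

```c
#include <assert.h>
#include <stddef.h>

/* Alias for "pointer to a function that takes an opaque instance
   and returns its area". */
typedef double (*AreaFn)(void *instance);

/* One function pointer per method exposed by Shape. */
struct ShapeInterface {
    AreaFn area;
};

struct Rectangle {
    double width;
    double height;
};

/* Takes void * (rather than struct Rectangle *) so that its type
   matches AreaFn exactly and no function pointer cast is needed. */
double Rectangle_Area(void *instance)
{
    const struct Rectangle *self = instance;
    assert(self != NULL);
    return self->width * self->height;
}

/* Hypothetical table that binds Rectangle's functions to the interface. */
const struct ShapeInterface RECTANGLE_AS_SHAPE = {
    .area = Rectangle_Area,
};
```

Writing a plain function with the right signature and storing its address in the table is all that "implementing the interface" amounts to in C.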

That sounds good, but we’re still missing something that lets us call Shape_Area() as we’ve seen in Figure 2.1. What we need is a struct that stores both the interface functions and the related data (i.e. a struct Rectangle instance and the Rectangle-related methods).
The solution we’re looking for is shown in Figure 2.4. Here, instance will hold either a Rectangle or a Circle data structure (this is an advantage of using the void * data type).

Now it’s time to explain how to build a Shape, using a kind of constructor. Figure 2.5 shows the creation of a Rectangle: notice how, in the last four lines, the Rectangle data and its associated methods are put inside the Shape wrapper.

To conclude this section, here is the pretty straightforward code for Shape_Area().
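Putting the pieces of this section together, a minimal sketch could look like this. Shape, instance, ShapeInterface, AreaFn and Shape_Area() are named in the text; Rectangle_Create(), Shape_Destroy(), the interface member and the use of malloc() are assumptions of mine (on a small embedded target you may well prefer static storage instead).

```c
#include <assert.h>
#include <stdlib.h>

typedef double (*AreaFn)(void *instance);

struct ShapeInterface {
    AreaFn area;
};

/* Wrapper that pairs opaque data with the functions that understand it. */
struct Shape {
    void *instance;                          /* a Rectangle, a Circle, ... */
    const struct ShapeInterface *interface;  /* methods for that instance */
};

struct Rectangle {
    double width;
    double height;
};

static double Rectangle_Area(void *instance)
{
    const struct Rectangle *self = instance;
    return self->width * self->height;
}

/* Hypothetical constructor: returns NULL if an allocation fails. */
struct Shape *Rectangle_Create(double width, double height)
{
    static const struct ShapeInterface rectangle_interface = { Rectangle_Area };

    struct Shape *shape = malloc(sizeof *shape);
    struct Rectangle *rect = malloc(sizeof *rect);
    if (shape == NULL || rect == NULL) {
        free(shape);
        free(rect);
        return NULL;
    }

    /* Put the Rectangle data and its methods inside the Shape wrapper. */
    rect->width = width;
    rect->height = height;
    shape->instance = rect;
    shape->interface = &rectangle_interface;
    return shape;
}

/* Dispatches to whatever area() implementation the shape carries. */
double Shape_Area(const struct Shape *shape)
{
    assert(shape != NULL && shape->interface != NULL);
    return shape->interface->area(shape->instance);
}

/* Hypothetical destructor: releases the instance, then the wrapper. */
void Shape_Destroy(struct Shape *shape)
{
    if (shape != NULL) {
        free(shape->instance);
        free(shape);
    }
}
```

Shape_Area() never mentions Rectangle: it only follows the function pointer stored in the wrapper, which is exactly what makes the call polymorphic.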

3. Triangle Arrows aka Inheritance

Arrows are used to represent relationships between classes. There are three main kinds of relationships: inheritance, composition, and dependency. Despite their differences, every relationship expresses the fact that one class knows about another class.

In UML, inheritance is drawn as an arrow with a hollow triangular head that points from the child class to its parent. Recalling our previous example, we say that Rectangle inherits from Shape: for our purposes, implementing an interface and inheriting from it are the same thing. The class diagram is shown in Figure 3.1.

Figure 3.1: Shapes hierarchy, using inheritance relationships
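In code, the hierarchy boils down to each concrete class supplying its own table of functions for the same interface. The sketch below assumes the Shape wrapper described earlier; Circle_Area(), the radius attribute, the PI macro, the two interface tables and Example_TotalArea() are hypothetical names. To keep it short, the wrappers are filled in with initializers instead of going through constructors.

```c
#include <assert.h>
#include <stddef.h>

#define PI 3.14159265358979323846

typedef double (*AreaFn)(void *instance);

struct ShapeInterface { AreaFn area; };

struct Shape {
    void *instance;
    const struct ShapeInterface *interface;
};

struct Rectangle { double width; double height; };
struct Circle { double radius; };

static double Rectangle_Area(void *instance)
{
    const struct Rectangle *self = instance;
    return self->width * self->height;
}

static double Circle_Area(void *instance)
{
    const struct Circle *self = instance;
    return PI * self->radius * self->radius;
}

/* Each "child class" provides its own implementation of Shape. */
static const struct ShapeInterface RECTANGLE_INTERFACE = { Rectangle_Area };
static const struct ShapeInterface CIRCLE_INTERFACE = { Circle_Area };

double Shape_Area(const struct Shape *shape)
{
    assert(shape != NULL && shape->interface != NULL);
    return shape->interface->area(shape->instance);
}

/* Hypothetical user code: sums the areas of a 3x4 rectangle and a unit circle. */
double Example_TotalArea(void)
{
    struct Rectangle rect = { 3.0, 4.0 };
    struct Circle circle = { 1.0 };
    struct Shape rect_shape = { &rect, &RECTANGLE_INTERFACE };
    struct Shape circle_shape = { &circle, &CIRCLE_INTERFACE };
    const struct Shape *shapes[] = { &rect_shape, &circle_shape };

    double total = 0.0;
    for (size_t i = 0; i < sizeof shapes / sizeof shapes[0]; i++) {
        total += Shape_Area(shapes[i]);  /* same call, different implementation */
    }
    return total;
}
```

Adding a new shape means adding one struct, one function and one table; the loop stays untouched.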

4. Composition and Dependencies

In OOP, objects need to interact with each other. In a typical client/server application, the client needs to have a way to interact with the server and vice versa. However, this kind of relationship is not restricted to networking applications: if objects of class A make use of objects belonging to class B, then A is a client (or user) of B.

Recall the listing shown in Figure 2.1. Now suppose that the code lives inside a GeometryApp class, which makes use of rectangles and circles via the Shape interface. In this situation, we say there is a composition relationship between GeometryApp and Shape. It’s worth mentioning that the connection is not bidirectional: in other words, GeometryApp knows that the Shape class exists, but Shape is completely independent of GeometryApp.

UML composition is represented via straight arrows, as in Figure 4.1. You may have noticed that the attributes of GeometryApp are “reacting” to the composition: the shapes array can be seen as the byproduct of the relationship. Moreover, the N_SHAPES label close to the arrow conveys the so-called multiplicity, which simply states how many server objects are known by each user object (in this case, every GeometryApp object stores N_SHAPES pointers to Shape objects).
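Translated to C, the relationship might be sketched as follows. GeometryApp, shapes and N_SHAPES come from the text; the value 2 for N_SHAPES, the GeometryApp_TotalArea() method and the Rectangle scaffolding (included only so that the snippet compiles on its own) are assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define N_SHAPES 2  /* arbitrary value chosen for this sketch */

typedef double (*AreaFn)(void *instance);

struct ShapeInterface { AreaFn area; };

struct Shape {
    void *instance;
    const struct ShapeInterface *interface;
};

struct Rectangle { double width; double height; };

double Rectangle_Area(void *instance)
{
    const struct Rectangle *self = instance;
    return self->width * self->height;
}

double Shape_Area(const struct Shape *shape)
{
    assert(shape != NULL && shape->interface != NULL);
    return shape->interface->area(shape->instance);
}

/* GeometryApp knows about Shape, but nothing in Shape refers back
   to GeometryApp: the relationship goes one way only. */
struct GeometryApp {
    struct Shape *shapes[N_SHAPES];  /* multiplicity: N_SHAPES */
};

/* Hypothetical method of GeometryApp: sums the areas of its shapes. */
double GeometryApp_TotalArea(const struct GeometryApp *self)
{
    assert(self != NULL);
    double total = 0.0;
    for (size_t i = 0; i < N_SHAPES; i++) {
        total += Shape_Area(self->shapes[i]);
    }
    return total;
}
```

The shapes member is the attribute that "reacts" to the arrow in the diagram: remove the relationship and the array has no reason to exist.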

To conclude, I’d like to mention the situation in which an object comes to know about another object without keeping it as an attribute. For instance, think of a Rectangle object that gets passed as a parameter to a function F. In such cases, we say that F has a dependency on the Rectangle class. Take a look at Figure 4.2, in which F has been put inside class A.
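A dependency could be sketched like this. The text only says that F belongs to class A and receives a Rectangle, so the scale attribute and the body of A_F() are invented purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct Rectangle {
    double width;
    double height;
};

/* Class A keeps no Rectangle among its attributes.
   The scale member is hypothetical. */
struct A {
    double scale;
};

/* A_F depends on Rectangle: it uses one for the duration of the call,
   but no reference to it is stored once the function returns. */
double A_F(const struct A *self, const struct Rectangle *rect)
{
    assert(self != NULL && rect != NULL);
    return self->scale * rect->width * rect->height;
}
```

Compare this with the composition case: there the other object is stored as an attribute, while here it only appears in a function signature.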

Wrapping Up

That’s it for today. I hope this reading has helped to clarify the main concepts behind UML class diagrams, and how to apply them in C. However, the Unified Modeling Language has many more capabilities, especially for dynamic modeling. I hope to bring you a discussion soon about applying sequence and interaction diagrams to C programming, which is something I do very frequently in my projects!

Thanks so much for your attention. If you have any questions, feel free to contact me wherever you find most appropriate. See you next time! 😊