The C++ Programming Language

Part 1: Basic concepts

The objective of this tutorial is to present the basic elements of the language. We will not cover every detail; we will only scratch the surface. Our goal is to get you motivated to start developing your first programs in C++. If you are really interested in the language, and it fits your needs, then you will find plenty of resources, either on the internet or in good bookstores, that will help you understand the language and use it to its fullest potential.

The samples we will create will run on any operating system, since we are only working with the language features, and not any specific operating system features.

In this first part we are going to see some fundamental concepts that will show you the very basics of the language.

Our first program

We start with the classic program that every book or tutorial has started with for as long as I can remember: a program that greets the user, also known as the hello world application.

/* this is our first program
    created: 2024
    purpose: show basic concepts of C++
    copyright: add your name here 
*/
#include <iostream>

// our main function, the entry point to our program
int main()
{
    std::cout << "Hello from part 1!\n";
    return 0;
}

Let us go through the code we see. At the top of the code there is a part that editors usually mark in a different color and that looks like plain English. It starts with /* and ends with */. This block is a comment. The compiler ignores it completely, so we can write anything we want in it. We usually write comments to improve the readability of our code, or to keep track of the history of the file, as is the case with this comment. Further down the code we see another line, marked in the same way, that starts with a double slash //. This is another comment. In C++ we can start a comment with a double slash and the compiler ignores whatever we write from that point until the end of the line.

Next is the line that starts with the directive #include. This instructs the compiler to read another file. In this case it reads the file called iostream, which is part of the C++ Standard Library. It contains information on how to print to the screen, which we will need in our program. For the time being please accept it without much detail; it will become clear as we go along.

After the one line comment we see something strange, at least for a beginner. In C++, the code is organized in functions. This function is called main, and it is the most important function in C++ programming. According to the language specification the execution of our program starts from this function. Every program MUST have a main function, otherwise we get an error message that the main is not found.

A function has two main components: its header and its body. The header is the first line, int main(), and the body is enclosed in curly brackets. In the first line of the body we greet the user by printing a simple message on the screen, and then we return control to the operating system.

Keywords

C++ has a set of reserved keywords that have special meanings to the compiler and cannot be used for naming variables, functions, or other identifiers. Here are some of the commonly used C++ keywords:

All these will be explained in this tutorial.

Variables

Programs are used to handle data and perform operations on them. So, every programming language has to provide us with a mechanism to handle and manipulate data. This is done with variables. Variables are primarily used to store data and then allow us to access and manipulate them.

C++ is a strongly typed language. This means that we must specify variable types so that the compiler knows how much space is required to store them, and more importantly how to handle operations involving them.

C++ has predefined built-in data types. The size of these types depends on the underlying operating system and architecture. First we will introduce the data types and then we will see the space they take up and the values they can handle.

  1. Integers: int and long.
  2. Floating point numbers: float and double.
  3. Characters: char.
  4. Booleans: bool.

The system architectures, also referred to as data models, which determine C++ data sizes are:

  1. LP32 or 2/4/4: int is 2 bytes long, long and pointers are 4 bytes long. This is the Win16 API.
  2. ILP32 or 4/4/4: int, long and pointers are 4 bytes long. This is the Win32 API and Unix-like 32-bit OSes.
  3. LLP64 or 4/4/8: int and long are 4 bytes long, and pointers are 8 bytes long. This is the Win32 API on 64-bit Windows.
  4. LP64 or 4/8/8: int is 4 bytes long, long and pointers are 8 bytes long. This is Unix and Unix-like 64-bit systems.

The size of long in cases 3 and 4 is a fundamental difference between Windows and Linux, which are the most common operating systems. We must be careful if we plan to port our code to both operating systems. Here is the length of each data type in bytes.

Type specifier   C++ standard   LP32   ILP32   LLP64   LP64
char             1              1      1       1       1
bool             1              1      1       1       1
int              At least 2     2      4       4       4
long             At least 4     4      4       4       8
float                           4      4       4       4
double           8              8      8       8       8

All data types except bool can hold negative and positive values; bool can only be true or false. We can change the signed behavior of the int and long data types, and use them to store only non-negative values, with the unsigned modifier. We can change the storage requirements of int and long using the short and long modifiers.

Here is the list of the modified data types and their respective range:

Data type            Size in bytes   Range
int                  4               -2,147,483,648 to 2,147,483,647
unsigned int         4               0 to 4,294,967,295
short                2               -32,768 to 32,767
unsigned short       2               0 to 65,535
long                 4 or 8          -2,147,483,648 to 2,147,483,647 if 4 bytes; -(2^63) to (2^63)-1 if 8 bytes
unsigned long        4 or 8          0 to 4,294,967,295 if 4 bytes; 0 to 18,446,744,073,709,551,615 if 8 bytes
long long            8               -(2^63) to (2^63)-1
unsigned long long   8               0 to 18,446,744,073,709,551,615
float                4               1.2E-38 to 3.4E+38 (magnitude)
double               8               2.2E-308 to 1.8E+308 (magnitude)
char                 1               -128 to 127
unsigned char        1               0 to 255
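Since the ranges depend on the platform, it can help to ask the compiler for the actual limits. The sketch below uses std::numeric_limits from the Standard Library header <limits>; the function names print_ranges and one_below_zero are hypothetical and chosen only for this illustration.

```cpp
#include <iostream>
#include <limits>

// Prints the actual ranges of a few types on the current platform.
void print_ranges()
{
    std::cout << "int:          " << std::numeric_limits<int>::min()
              << " to " << std::numeric_limits<int>::max() << "\n";
    std::cout << "unsigned int: 0 to "
              << std::numeric_limits<unsigned int>::max() << "\n";
    std::cout << "long:         " << std::numeric_limits<long>::min()
              << " to " << std::numeric_limits<long>::max() << "\n";
    // char values are streamed as characters, so convert them to int first
    std::cout << "char:         "
              << static_cast<int>(std::numeric_limits<char>::min()) << " to "
              << static_cast<int>(std::numeric_limits<char>::max()) << "\n";
    // for floating point types min() is the smallest positive normalized value
    std::cout << "double:       " << std::numeric_limits<double>::min()
              << " to " << std::numeric_limits<double>::max() << "\n";
}

// Hypothetical demo: unsigned arithmetic wraps around instead of going negative.
unsigned int one_below_zero()
{
    unsigned int u = 0;
    u = u - 1;      // wraps to the largest unsigned int value
    return u;
}
```

The second function shows why the unsigned modifier needs care: subtracting from zero does not give -1 but the largest value the type can hold.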

Here are some example variables:

int a = -135;
unsigned int b = 782;
long c = 2837465;
double d = 1.234;
char e = 'a';

Constants

C++ introduced the concept of the constant. Constants are values that remain fixed throughout the execution of the program. They are declared using the const keyword. They must be given their value when they are initialized, and it cannot be changed afterwards.

const int c = 10;     // const variable
const int& cr = c;    // reference to const variable
const int* cp = &c;   // pointer to const variable

Constants can be used anywhere in the code where a constant or a variable value is expected.
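A short sketch may make the rule concrete. The names seconds_per_minute, to_seconds and read_constant_three_ways are hypothetical; the commented-out lines are the ones a compiler would be expected to reject.

```cpp
// Hypothetical example: a named constant used in a calculation.
const int seconds_per_minute = 60;

int to_seconds(int minutes)
{
    return minutes * seconds_per_minute;
}

// Reads the same constant directly, through a reference and through a pointer.
int read_constant_three_ways()
{
    const int c = 10;
    const int& cr = c;
    const int* cp = &c;

    // c = 20;     // would not compile: c is const
    // cr = 20;    // would not compile: cr refers to a const int
    // *cp = 20;   // would not compile: cp points to a const int

    return c + cr + *cp;   // 10 + 10 + 10
}
```

Reading a constant is always allowed; only attempts to modify it are refused at compile time.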

Operators

Operating on our data is fundamental to every programming language. Operators are the special symbols we use to perform operations on our variables and data. These operations are usually mathematical or logical.

The operators operate on operands. A simple example we all learned early in our school years is addition:

int a, b, c;   // first declare some variables
a = 10;        // assign a value to a
b = a + 10;    // b is a plus a constant value
c = a + b;     // c is the sum of a and b

C++ has a rich repertoire of operators letting us manage our data with ease. Operators can be divided into categories depending on their nature.

Operators are evaluated in a strict order defined by the standard. This order is called operator precedence. For arithmetic operators, precedence is the same as in mathematics.

First, we will go through the operators available in C++ and then we will examine their order of execution.

Arithmetic Operators

Operator   Syntax      Description
+          a + b       Add two operands
-          a - b       Subtract two operands
*          a * b       Multiply two operands
/          a / b       Divide two operands
%          a % b       Modulus operator, returns the remainder of the integer division
++         ++a / a++   Prefix / postfix increment
--         --a / a--   Prefix / postfix decrement
-          -a          Unary minus, changes the sign of the numeric value

Arithmetic operators are used to perform classic arithmetic operations like addition. The only additions the language makes are the increment and decrement operators, which increase or decrease a value by one.
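Two details of these operators tend to surprise beginners: division between integers drops the fractional part, and the prefix and postfix forms of ++ yield different values. The sketch below illustrates both; all function names are hypothetical.

```cpp
// Integer division drops the fractional part.
int integer_quotient(int a, int b)
{
    return a / b;
}

// The modulus operator gives what is left over from that division.
int integer_remainder(int a, int b)
{
    return a % b;
}

// With floating point operands the division keeps the fraction.
double real_quotient(double a, double b)
{
    return a / b;
}

// Postfix: b receives the old value of a, then a is incremented.
int postfix_result()
{
    int a = 5;
    int b = a++;   // b is 5, a is now 6
    return b;
}

// Prefix: a is incremented first, then b receives the new value.
int prefix_result()
{
    int a = 5;
    int b = ++a;   // a is 6, b is 6
    return b;
}
```

For example, integer_quotient(7, 2) gives 3 and integer_remainder(7, 2) gives 1, while real_quotient(7.0, 2.0) gives 3.5.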

Relational Operators

Operator   Syntax   Description
==         a == b   Checks if two operands are equal; returns true if they are, otherwise false
!=         a != b   Checks if two operands are equal; returns true if they are NOT, otherwise false
>          a > b    Checks if the value on the left is greater than the value on the right
<          a < b    Checks if the value on the left is less than the value on the right
>=         a >= b   Checks if the value on the left is greater than or equal to the value on the right
<=         a <= b   Checks if the value on the left is less than or equal to the value on the right

Relational operators are used to compare variables. Nothing new was added to them by the language. They perform the comparison, and they return true or false.

Logical Operators

Operator   Syntax   Description
&&         a && b   AND operator, returns TRUE if both operands are TRUE
||         a || b   OR operator, returns TRUE if either of the operands is TRUE
!          !a       NOT operator, negates the logical status of the expression

Logical operators perform as we have learned in Boolean algebra. Any value other than zero (0) is considered true and zero is considered false. The Boolean combination of such values produces the respective result.
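Relational and logical operators are usually combined. The following sketch, with hypothetical function names, shows a range check, the treatment of zero as false, and the fact that && does not evaluate its right operand when the left one is already false.

```cpp
// True when value lies between low and high, both included.
bool in_range(int value, int low, int high)
{
    return value >= low && value <= high;
}

// && evaluates its right operand only when the left one is true,
// so the remainder is never computed with a zero divisor.
bool divides_evenly(int a, int b)
{
    return b != 0 && a % b == 0;
}

// Zero counts as false, so !value is true only when value is zero.
bool is_zero(int value)
{
    return !value;
}
```

For instance, divides_evenly(10, 0) simply returns false instead of attempting a division by zero.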

Bitwise Operators

Operator   Syntax   Description
&          a & b    AND between each bit
|          a | b    OR between each bit
<<         a << b   Shift bits to the left as many places as the second operand indicates
>>         a >> b   Shift bits to the right as many places as the second operand indicates
~          ~a       Invert all the bits of the value following
^          a ^ b    XOR between each bit

These operators work on variables at the bit level. This is not so easy to explain right now, since we first have to explain how data is stored in binary format and then how these operators work. We will address this later.
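As a small preview only, here is a sketch of a common use of these operators: treating an unsigned integer as a set of on/off flags. The function names are hypothetical, and the sketch assumes that position is smaller than the number of bits in an unsigned int.

```cpp
// Turns on the bit at the given position (0 is the lowest bit).
unsigned int set_bit(unsigned int value, unsigned int position)
{
    return value | (1u << position);
}

// Turns off the bit at the given position.
unsigned int clear_bit(unsigned int value, unsigned int position)
{
    return value & ~(1u << position);
}

// Flips the bit at the given position.
unsigned int toggle_bit(unsigned int value, unsigned int position)
{
    return value ^ (1u << position);
}

// Reports whether the bit at the given position is on.
bool test_bit(unsigned int value, unsigned int position)
{
    return ((value >> position) & 1u) != 0;
}
```

Setting bit 3 of zero gives 8 (binary 1000), and clearing bit 0 of 15 (binary 1111) gives 14 (binary 1110).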

Assignment Operators

Operator   Syntax    Description
=          a = b     Simple assignment, assign the value on the right to the variable on the left
+=         a += b    Add and assign, add the value on the right to the variable on the left
-=         a -= b    Subtract and assign, subtract the value on the right from the variable on the left
*=         a *= b    Multiply and assign, multiply the variable on the left by the value on the right and store the result in the variable on the left
/=         a /= b    Divide and assign, divide the variable on the left by the value on the right and store the result in the variable on the left
%=         a %= b    Modulo and assign, take the remainder of the division of the variable on the left by the value on the right and store it in the variable on the left
<<=        a <<= b   Left shift and assign, shift the variable on the left to the left as many places as the value on the right indicates and store the result in the variable on the left
>>=        a >>= b   Right shift and assign, shift the variable on the left to the right as many places as the value on the right indicates and store the result in the variable on the left
&=         a &= b    Bitwise AND and assign, bitwise AND the variable on the left with the value on the right and store the result in the variable on the left
^=         a ^= b    Bitwise XOR and assign, bitwise XOR the variable on the left with the value on the right and store the result in the variable on the left
|=         a |= b    Bitwise OR and assign, bitwise OR the variable on the left with the value on the right and store the result in the variable on the left

The first operator clearly assigns a value to a variable. The rest perform an operation on the variable on the left, using whatever is on the right, and then they assign the result to the variable on the left.
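The sketch below walks a single variable through every compound assignment operator, with the value after each step shown in a comment. The function name compound_demo is hypothetical.

```cpp
// Applies each compound assignment operator in turn to the same variable.
int compound_demo()
{
    int a = 10;
    a += 5;    // 10 + 5  = 15
    a -= 3;    // 15 - 3  = 12
    a *= 2;    // 12 * 2  = 24
    a /= 4;    // 24 / 4  = 6
    a %= 4;    // 6 % 4   = 2
    a <<= 3;   // 2 << 3  = 16
    a >>= 1;   // 16 >> 1 = 8
    a &= 12;   // 1000 & 1100 = 1000 = 8
    a ^= 5;    // 1000 ^ 0101 = 1101 = 13
    a |= 2;    // 1101 | 0010 = 1111 = 15
    return a;
}
```

Each line is shorthand for the longer form, so a += 5 means the same as a = a + 5.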

Misc Operators

Operator        Syntax                 Description
sizeof          sizeof(a)              Returns the size of the variable in bytes
Conditional ?:  condition ? a : b      If condition is TRUE returns a, otherwise b
Comma ,         a = (b, c)             Evaluates b and ignores the result, then returns the value of the last expression, so a will take the value of c
. and ->        a.b or a->b            Member access operators for structs, unions and classes
cast            static_cast<type>(a)   Casting operators convert data to different types
& address of    &a                     The address-of operator returns the address of a variable
* dereference   *a                     The dereference operator accesses the variable a pointer points to

Apart from the conditional operator, which selects a result based on a logical condition, the rest of them are used to convert variables, separate list items, or retrieve information about a variable.
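A few of these operators in action, as a minimal sketch with hypothetical function names. The member access operators are left out because structs and classes have not been introduced yet, and static_cast is used here as one of the cast operators of the language.

```cpp
#include <cstddef>

// Conditional operator: picks one of two values based on a condition.
int larger_of(int a, int b)
{
    return a > b ? a : b;
}

// Comma operator: the left expression runs first and its result is
// discarded, then the right expression provides the value.
int comma_result()
{
    int b = 1;
    int a = (b += 1, b * 2);   // b becomes 2, then a = 2 * 2
    return a;
}

// Address-of and dereference: change a variable through a pointer to it.
int write_through_pointer()
{
    int value = 7;
    int* p = &value;   // & takes the address of value
    *p = 9;            // * reaches the variable p points to
    return value;
}

// Cast: converting a double to an int drops the fractional part.
int whole_part(double d)
{
    return static_cast<int>(d);
}

// sizeof: the number of bytes an int occupies on this platform.
std::size_t bytes_in_int()
{
    return sizeof(int);
}
```

Note the parentheses around the comma expression: without them the assignment would be performed before the comma operator is applied.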

Operator precedence

Operators are evaluated according to well-defined precedence. For example, a=7+3*2 will result in 13 and not 20, because multiplication has higher precedence than addition and is executed first. If we write a=(7+3)*2, then the result is 20 because parentheses have higher precedence and the addition inside them executes before the multiplication. Mathematical operators in particular follow the precedence defined in mathematics, because otherwise we would end up with a language that could not be used for any scientific work.

Here is the list with operator precedence as defined in C++. Associativity is the order in which operators of the same precedence are executed when more than one appears in an expression like a+b+c+d.

In the table below precedence decreases as we go down. Remember that operators with higher precedence, so closer to the top of the list, will be evaluated first.

Type             Operator(s)                          Associativity
Postfix          () [] -> . ++ --                     Left to right
Unary            + - ! ~ ++ -- cast * & sizeof        Right to left
Multiplicative   * / %                                Left to right
Additive         + -                                  Left to right
Shift            << >>                                Left to right
Relational       < > <= >=                            Left to right
Equality         == !=                                Left to right
Bitwise AND      &                                    Left to right
Bitwise XOR      ^                                    Left to right
Bitwise OR       |                                    Left to right
Logical AND      &&                                   Left to right
Logical OR       ||                                   Left to right
Conditional      ?:                                   Right to left
Assignment       = += -= *= /= %= >>= <<= &= ^= |=    Right to left
Comma            ,                                    Left to right
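The rules of precedence and associativity can be verified with a few small expressions. The function names in this sketch are hypothetical.

```cpp
// Multiplication binds tighter than addition: 7 + (3 * 2).
int without_parentheses()
{
    return 7 + 3 * 2;      // 13
}

// Parentheses force the addition to happen first.
int with_parentheses()
{
    return (7 + 3) * 2;    // 20
}

// Subtraction associates left to right: (20 - 5) - 3.
int left_to_right()
{
    return 20 - 5 - 3;     // 12, not 20 - (5 - 3) = 18
}

// Assignment associates right to left: a = (b = (c = 4)).
int right_to_left()
{
    int a, b, c;
    a = b = c = 4;
    return a + b + c;      // 12
}

// A mixed expression: 2 + (3 * 4) - (6 / 2).
int mixed()
{
    return 2 + 3 * 4 - 6 / 2;   // 2 + 12 - 3 = 11
}
```

When in doubt, adding parentheses costs nothing and makes the intended order obvious to the reader.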

Statements

In C++, the statements are the building blocks of a program. They are executed sequentially and can be categorized into several types:

Expression statements: These include expressions followed by semicolons, like this:

a = 10;
b = a + 10;

Compound statements: These are groups of statements enclosed in curly braces {}. Here is an example:

{
    int a, b;
    a = 10;
    b = a + 10;
}

Declaration statements: These are the statements where we declare our variables:

int a, b;

All but the compound statements MUST be terminated with a semicolon. This is the way the compiler understands the end of a statement. This convention allows us to write many statements in one line of code:

int i; double d = 0.45; sum = a + b + c; diff = d - e;

Although this is a perfectly valid piece of code, it has proven to be a bad practice, because putting so much stuff together makes the code hard for humans to understand, and eventually upgrade or fix.

Functions

A function is a block of code that performs a specific task. It can be called multiple times within a program, which helps in reducing code redundancy and improving readability. Using functions, we can break down a big and complex task into smaller steps easier to implement and maintain.

We can pass data to functions; these values are referred to as parameters or arguments. Functions are used to perform specific operations, such as counting the words in a sentence or calculating the square root of a number.

As we said in the beginning, C++ code must be inside functions. The language does not support code scattered around the source files, only code organized in functions. Every C++ program has at least one function: the main function, which is the entry point to our program.

Defining functions in C++ is simple. Here is the form of a C++ function:

return_type function_name(argument_list) 
{
    statement(s)
}

A function definition consists of two things. The function header and the function code.

The function header tells us all we need to know if we want to use the function. First, we see the return type of the function. Functions in C++ can return a value so their return type can be anything we can store in a variable or void if the function does not return anything. Then comes the name of the function. We use this name to call the function whenever we need it. At the end we have the list with the arguments the function expects. It is a comma separated list of variable declarations. This list can be empty.

The function code is made up of valid C++ statements. From what we have seen so far, valid statements in C++ are variable declarations, assignments of values to variables and function calls. As we keep learning more about the language we will see more.

Here is an example of a function in C++.

#include <iostream>

void simple_function() {
    std::cout << "this is a simple function\n";
}

This is a simple function that just displays a message on the screen and does not return any value, thus it is declared as void.
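Functions become more useful once they take arguments and return a value. The following sketch shows the three parts of the header in action; the names add, rectangle_area and print_sum are hypothetical.

```cpp
#include <iostream>

// Return type int, name add, two int arguments.
int add(int a, int b)
{
    int sum = a + b;
    return sum;          // hands the result back to the caller
}

// Return type double, so the result can carry a fractional part.
double rectangle_area(double width, double height)
{
    return width * height;
}

// Return type void: the function prints the result instead of returning it.
void print_sum(int a, int b)
{
    std::cout << a << " + " << b << " = " << add(a, b) << "\n";
}
```

A call such as print_sum(2, 3) uses add internally, which is how larger tasks get broken down into smaller functions.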

Header files

Header files in C++ are essential for organizing and managing code, especially in larger projects. They typically contain declarations of functions, variables, classes, and other entities that can be shared across multiple source files.

What are header files

These files allow us to declare the interfaces of our modules or libraries. This means we can declare functions, classes, and variables in one place and then include them wherever needed.

Header files usually have the extension (.h) or (.hpp).

By breaking our projects into multiple files we make them modular and easier to maintain, and we make our code reusable. Header files are key to this way of organizing code.

Typically, we have two main categories of header files. These are the Standard headers versus the User-defined headers. The first are usually located where the Standard Library is and the compiler knows where to look for them, while the latter are located with our code, and we instruct the compiler where to look for them. Here is an example of header usage:

#include <iostream>   // Standard Library header
#include "part1.h"    // custom user-defined header
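As a sketch of what a user-defined header such as part1.h and its source file might contain, consider the following. The contents are assumed purely for illustration: the function add_numbers is hypothetical, and the #ifndef / #define / #endif lines are a common convention (an include guard) that keeps the header from being processed twice. Both files are shown in one block for brevity.

```cpp
// ---- part1.h (hypothetical contents) ----
#ifndef PART1_H
#define PART1_H

// Declaration only: tells the compiler the function exists
// and how it is called.
int add_numbers(int a, int b);

#endif // PART1_H

// ---- part1.cpp (hypothetical contents) ----
// In a real project this file would start with: #include "part1.h"

// Definition: the actual code of the function declared in the header.
int add_numbers(int a, int b)
{
    return a + b;
}
```

Any other source file that includes part1.h can then call add_numbers without seeing its code.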

Namespaces

Large programs like a CAD application or a physics simulation can have several thousand files. The number of the names of all the functions and other globally defined objects, such as user defined types, grows rapidly. It is inevitable that the same name will come up twice creating conflicts.

Namespaces allow us to group together related entities and organize the global symbol names. This is evident in the standard library. It is in the std namespace. This ensures that we can use it in parallel with another library that performs similar tasks and be sure that under no circumstances will we encounter the same name and have a problem.

The syntax of the namespace is simple. All we must do is use the namespace keyword followed by the name we want to use and enclose everything in curly brackets.

In the header file where we declare our code we write:

namespace A {
    // declare some function
    return_type func_name(arg_type arg);
    // continue with other declarations
    other_declarations
}

And in the source file we write:

namespace A {
    return_type func_name(arg_type arg)
    {
        statements
    }
}

Accessing an entity within a namespace requires us to precede the entity name with the name of the namespace and the scope operator '::'.

Here we define our sample_namespace namespace and within this namespace a function that calculates the average of two numbers.

// the declaration in the header
namespace sample_namespace
{
    float calculate_average(float a, float b);
}

// the definition in the source
namespace sample_namespace
{
    float calculate_average(float a, float b)
    {
        float sum = a + b;
        float avg = sum / 2;
        return avg;
    }
}

// the function call
float f = sample_namespace::calculate_average(12.3f, 14.6f);

We can avoid using the namespace name when calling the function if we use the using keyword:

using namespace sample_namespace;
float fv = calculate_average(2.3f, 4.5f);

Although it is very convenient, I would not recommend using it, especially in large projects. It is very useful to know where the function you are using comes from. Functions in the std namespace, for example, are known to be stable, and you should look elsewhere should a problem ever emerge.
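To illustrate the name conflicts that namespaces prevent, here is a minimal sketch in which two namespaces contain a function with the same name. The namespaces metric and imperial and all function names are hypothetical. The last function also shows a narrower form of the using keyword that brings in a single name rather than a whole namespace.

```cpp
namespace metric
{
    // Converts kilometres to metres.
    double to_small_units(double kilometres)
    {
        return kilometres * 1000.0;
    }
}

namespace imperial
{
    // Converts miles to feet. Same name, no conflict.
    double to_small_units(double miles)
    {
        return miles * 5280.0;
    }
}

// Brings only one name into scope, and only inside this function.
double two_kilometres_in_metres()
{
    using metric::to_small_units;
    return to_small_units(2.0);
}
```

Because every call states its namespace, metric::to_small_units(1.5) and imperial::to_small_units(1.0) can live in the same program without ambiguity.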

Basic I/O

One of the basic things a program needs is communication with the user. This can take the form of input, when the program gets data from the user, or of output, when it presents results, execution options, or anything else the programmer could imagine or need. A simple keyboard input or a message printed on the screen are the most basic operations a program should perform.

C++ has no built-in mechanism or commands we can use to perform any of these tasks. The designer of the language took this characteristic from the C programming language. Instead of having a built-in mechanism in the language it completely relies on libraries, usually written in the language itself, to provide this functionality.

The library in C++ is called Standard Library. It is a really big library because over the years it has adopted a lot of functionality covering many areas of programming. It is under the control of the International C++ Standardization Committee so we can be sure that it contains essential functionality and stable code we can certainly rely on in our projects. All the entities of this library are in the std namespace.

This library will give us the input and output functions we are going to use throughout this tutorial. After all the only things we need are reading some user options from the keyboard and displaying some messages.

To use the input/output functionality of the library we need to include the <iostream> header. This contains all the information the compiler requires about the I/O functionality. The Standard Library is broken down into many headers, based on the functionality they provide. This improves compilation time, because reading all the definitions of the library would otherwise take too long.

In the library the input and output mechanisms are treated as streams of information, hence the name iostream given to the header. From this functionality we will need only two streams: cin for input and cout for output.

In our example we prompt the user to input a number, then we read it and finally we display it:

#include <iostream>
void basic_io()
{
    double a;
    std::cout << "insert a:";     // give the user a hint
    std::cin >> a;                // read the value of a from the keyboard
    std::cout << "input was:" << a << "\n";  // print what the user typed
}

Invoking the I/O functionality looks a little strange. We are not using a typical function call syntax. Instead, we are using these odd-looking operators. These are the streaming operators, pushing data to the output stream, or pulling from the input stream.

The direction of the operator is actually showing the flow of data. In the case of output, we have the line

std::cout << "insert a:";

Here we have the output stream, std::cout, and we are pushing the string “insert a:” to the stream to be displayed on the user’s screen.

In the next line

std::cin >> a;

we wait until the user has typed a number and the input stream, std::cin, has copied it to our variable.

Finally, in the line

std::cout << "input was:" << a << "\n";

we create a multiple output by sequentially streaming a label, our variable and the new line character.

Escape Characters

Since we have used it several times so far when printing something on the screen, it is only fair to explain the \n we put at the end of our output. This strange two-character sequence stands for the new line character, a special character in computer systems that moves the cursor to the beginning of the next line. We must point out here that although we write two characters, the compiler translates them into a single new line character, and std::cout simply sends that character to the screen.

This is not the only character sequence that gets converted into something else before it is sent to the system. This group of character sequences is called Escape Characters, and they are used to control text output. Initially they were designed for printers, in the first days of computing when printer-type devices were used instead of monitors. Back then it was essential to make a beep, move backwards or move to the next line.

Here is the list of the Escape Characters:

Escape sequence   Action / character represented
\a                Alert (beep)
\b                Backspace
\n                New line, go to the start of the next line
\r                Carriage return, go to the start of the current line
\\                Print the backslash (\) character
\'                Print single quote
\"                Print double quote
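Escape sequences are most often needed when the text itself contains a backslash or a quote. In the sketch below, with hypothetical function names, each escape sequence occupies a single character in the resulting string even though we type two.

```cpp
#include <string>

// Builds two lines of text that contain a backslash and double quotes.
// When printed it shows:
//   Path: C:\temp
//   Quote: "hello"
std::string escaped_text()
{
    return "Path: C:\\temp\nQuote: \"hello\"\n";
}

// A single quote inside a character literal must be escaped as well.
char single_quote()
{
    return '\'';
}
```

Streaming the result with std::cout << escaped_text(); prints the two lines shown in the comment.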

Summary
