Writing in C

There's strictly no warranty for the correctness of this text. You use any of the information provided here at your own risk.

About C
Compiler
A First Program - Hello World
for-Loops
while-Loops
If-Statements and Comments
break and continue
switch case
Simple Datatypes: int, double, char and size_t
Literals
Integer-Division and Modulo
Power and Square Root of a Number
The Preprocessor. #define. #include. Constants. Import of External Source Code
Read-Only Variables with const
Constant enumerations with "enum"
Functions. Passing Arguments by Value
Declaration, Definition and Initialization
Global Variables. Keyword "extern"
Keyword "static"
typedef
Arrays
Pointers. Passing Data to Functions by Reference
Memory Management with malloc(), realloc() and free()
Arrays and Functions
The Relation between Arrays and Pointers
Strings in C - The Whole Story
Arrays of Strings
Strings with more than one Line
Format String Symbols of "printf()" and others
Getting User-Input: getchar(), scanf(), fgets(), getline()
Example-Program "Cookie Monster"
Structures with struct
Unions
Emulating Object-Oriented Programming in C
Type Casting. Pointer to void. Array of Pointers to Different Datatypes
Iterating over a struct isn't supported
Memory Areas of C-programs
Managing More Complex Data Using an Array of Structs
Linked Lists (Dynamically Changeable Lists)
Getting a Random Number
Peek and Poke
Alternative Syntaxes
Working with Files
Keyword Summary of Variable Datatypes

1. About C

The programming-language C was created at the beginning of the 1970s by Dennis Ritchie (1941-2011) to (re-)implement the operating-system UNIX.
It is a somewhat laconic programming-language that has only a minimal set of instructions. Everything else, even functions for string-manipulation for example, has to be imported from libraries.

C-code is close to the system and runs fast. Usually, only carefully hand-written assembler-code would be faster. Programs can be linked together from numerous files, so huge applications can be written in C by large teams of programmers.

Someone once said "C is like a small, sharp knife". Data is usually manipulated on the level of bytes.

It's not always easy (or necessary) to handle such a "small, sharp knife". In many cases, C is not the easiest language to write code in. In 1987 Larry Wall created the interpreted language "Perl" to make manipulation of text easier for the programmer than in C. He wrote the Perl-interpreter in C though and also borrowed a lot of C's syntax for his language. So it's in any case useful to be familiar with C.

If your task just requires a smaller script, that doesn't have to run too fast and can be altered or even thrown away later, and if you want to have faster development and an easier life, learn and use Perl. Or Python.

But if you want to write serious, compiled applications, that run at about maximum speed and will be used by everyone for many years, if you want your code as almost "cast in stone", take the time and make the effort and write in C.

C is not object-oriented. To add classes, the language C++ was designed by Bjarne Stroustrup in the 1980s, but C++, though largely compatible with C, can be seen as a different language. This text is just about C.

Quite a classic is the book "The C Programming Language" by Kernighan and Ritchie (often called "K&R").

2. Compiler

C is a compiled language. That means, you write some code with an editor (like vim) or with an IDE (like Geany) in a file called "h.c", for example. Then you run a compiler on that file. If everything is fine with your code, this creates an executable file, which is called "a.out", if you don't tell the compiler to use another name.

I use gcc, the C-compiler of the GNU-project. It is free software. It comes with basically every GNU/Linux-distribution.

So, if you have a file "h.c", you can run

gcc h.c

and get a file "a.out", which you can run then with

./a.out

As very large and complicated applications can be written in C, and the executable often has to be linked against many libraries, there are lots of options to gcc. This is a topic of its own.

But concerning the very small example-programs on this page, you should be fine with the simple gcc-command above. It may be a good idea to use the "-Wall"-option, so "gcc -Wall h.c", to get all warnings.

3. A First Program - Hello World

Here's a "Hello World" in C:

#include <stdio.h>

int main(void) {
    puts("Hello World!");
    return 0;
}

The first line imports the standard library for input/output. How this line works, will be explained later.

The function "main()" is found in every C-program. It is executed, when the program is run.
The datatypes and names of the arguments that are passed to a function are written in round brackets after the name of the function. The term "void" represents "nothing". So in this case, no argument will be passed to "main()".
The datatype of the value, the function returns, is written before the name of the function, in this case, it's "int" before the name "main", as the function will return the integer 0 at the end, "return 0;" (notice, that "return" is not a function).

C is case-sensitive.

4. for-Loops

Here's an example of a for-loop in C:

#include <stdio.h>

int main(void) {
    int i;
    for (i = 1; i <= 10; i++) {
        printf("%d\n", i);
    }
    return 0;
}

Variables, like the integer "i" here, have to be declared, before they can be used.

To print integers, the function "printf()" has to be used. The first argument to it can be a "format-string", like the one in the example: "%d" means, the second argument will be an integer, it means "print the next argument as an integer". "\n" is the newline-character on Linux.

The term in brackets of the for-loop "(i=1; i<=10; i++)" has the following meaning:

i=1: Iterator-variable is "i". Loop starts at 1.
i<=10: Condition for the loop to run. So as soon as i is no longer smaller than or equal to 10, the loop stops.
i++: Step of the loop. "i++" is short for "i = i + 1" (which is also equal to "i += 1").

5. while-Loops

while-loops work similarly. The while-loop runs as long as the condition is met:

#include <stdio.h>

int main(void) {
    int x = 1;
    while (x <= 10) {
        printf("%d\n", x);
        x++;
    }
    return 0;
}

6. If-Statements and Comments

Comments are written between "/*" and "*/". They can be longer than one line.
In general, whitespace characters are ignored by the C-compiler.

#include <stdio.h>

int main(void) {
    int a = 1;
    int b = 2;

    /* If there is just one line after the condition, the curly brackets
       can be left out. Use with care: */
    if (a == 1)
        printf("a is %d.\n", a);

    if (b > a)
        printf("b is greater than %d.\n", a);

    /* Meanings of symbols for logical operators:
       == : equal
       != : not equal
       && : and
       || : or
        ! : not
    */

    if (a < b && b == 2)
        puts("a is less than b and b is 2.");

    if (a == 10) {
        puts("a is 10.");
    } else {
        puts("a is not 10.");
    }
    if (a != 10)
        puts("a is really not 10.");
    return 0;
}

When debugging, you often want to comment out large ranges of code.
But when there's already a comment inside that range, it won't work, because the first "*/" inside the range ends the comment early (comments can't be nested).
In C++, you can also write "// ...", and the rest of the line is a comment then. Since the language definition "C99", this syntax is part of C too, and the compiler "gcc" knows it. I suggest using it.

There is also "else if":

if ( ... ) {
    ...
} else if ( ... ) {
    ...
}

Notice that every statement inside the curly brackets, including the last one before the right curly bracket, has to be terminated by a semicolon.

As already shown in the example above, the curly brackets can be left out altogether if there is just a single statement following the if-condition. This somewhat breaks the general rule of writing code blocks. But if there are many conditions followed by just one line each, it makes code more readable. Obviously, this exception from the rule should be used with care.

7. break and continue

With "continue", the rest of the current round of a loop (a for-loop or a while-loop) is skipped, and the loop goes on with the next round.

With "break", a loop can be left altogether. Example:

#include <stdio.h>

int main(void) {
    int i;
    puts("");
    for (i=1; i<1000; i++) {
        if (i == 3) {
            puts("");
            continue;
        }
        printf("%d\n", i);
        if (i == 10) {
            puts("End.\n");
            break;
        }
    }
    return 0;
}

8. switch case

If a variable needs to be compared to several values (integers or single chars, that can be seen as integers), you could write many if/else if-statements. There's an alternative construction in C, called "switch case-statement":

#include <stdio.h>

int main(void) {
    int a = 3;
    int i;
    switch (a) {
        case 1:
            puts("a is 1.");
            break;
        case 2:
            puts("a is 2.");
            break;
        case 3:
            for (i=0; i<5; i++) {
                printf("%d\n", i);
            }
            puts("a is 3.");
            break;
        default:
            puts("a is none of these.");
            break;
    }
    return 0;
}

Notice the colons (":") at the end of the "case"-lines.
The "default"-part is executed, if no condition of any case is met (like the "else" in "if, else if, else").

9. Simple Datatypes: int, double, char and size_t

Built-in datatypes in C are:

int
Actually, there is:
- short int
- long int
- long long int
And also the attributes:
- signed
- unsigned
"int" means "signed int". Its size depends on the compiler and the platform: the C standard only requires at least 16 bits, on today's desktop systems it's usually 32 bits.
"printf()"'s format-string: "%d".
To specify "long int", use "%ld".
(For "long long int", "printf"'s format-string is "%lld".)

float, double, long double
Datatypes for floating-point numbers. Often, "double" is what you want.
"printf"'s format-string: "%f". "%.2f" rounds to two decimal places.
If you pass a "float" to printf, it becomes a "double" anyway.

char
Used to store ASCII-characters. Use single quotes for the character (like 'a'). It can also be interpreted as a small number (usually 0 to 255 as "unsigned char" and -128 to 127 as "signed char").
"printf"'s format-string: "%c" as a character, "%d" as a number.
It depends on the compiler whether a plain "char" is interpreted as signed or unsigned. Therefore, when you use it for numbers, you should specify if you want "signed char" or "unsigned char".

void
"void" means just "nothing". It can be used as the datatype of a return-value of functions (when nothing is returned). It can't be used for variables.

Let's see:

#include <stdio.h>

int main(void) {
    int a = 10;
    double pi = 3.14159265359;
    char c  = 'm';
    printf("%d\n", a);
    printf("%f \t %.3f \n", pi, pi);
    printf("%c \t %d \n", c, c);
    puts("");
    /* sizeof yields a size_t, which is printed with "%zu": */
    printf("%zu\n", sizeof(a));
    printf("%zu\n", sizeof(pi));
    printf("%zu\n", sizeof(c));
    return 0;
}

"\t" in a format-string of "printf" means "tabulator", so it moves text away a bit in the line.

Notice, there are certain limits to the datatypes of numbers. So, if you want to use very large integers or floating point numbers with many decimal places, you'll have to read more about this.

"sizeof()" is useful sometimes: It returns the amount of memory, a variable occupies, in bytes. (Actually, "sizeof ()" is not a function, but an operator. But it works kind-of like a function.)
This amount (and therefore the result of "sizeof()") depends on the compiler and on the platform (processor and operating-system).

"sizeof()" returns a special datatype called

size_t

Often this is just an "unsigned int" (it is on my system); on 64-bit systems it is usually an "unsigned long". (It's defined by a "typedef" (I'll explain later)). It's used to store the size of something in memory. As it's "unsigned", it can't hold negative values. "printf"'s format-string for it is "%zu".

10. Literals

Stand-alone numbers and strings in the code like '5' or '"Hello"' are called "literals". They are constants, so they cannot be changed.

An integer literal without a prefix is decimal, with the prefix '0x' or '0X', it's hexadecimal, and with '0', it's octal.
So for example '0xA5' is a hexadecimal integer literal meaning '165' (as decimal).

An integer literal can have a suffix 'U', 'L' or 'UL', to indicate that it is 'unsigned' or 'long'. So '0xAAB5UL' would be a hexadecimal integer literal that is explicitly 'unsigned long'.

String literals are valid C-code, although C doesn't have a datatype "string" for variables (only "arrays of char"). This has a few consequences, that are described later.

Notice, that literals are also used, when initializing variables, like in "int a = 5;".

11. Integer-Division and Modulo

Mathematically, "10 / 3" would be 3.33333.... . And if you use the datatype "double" for the division, you will get that:

#include <stdio.h>

int main(void) {
    double a = 10;
    double b = 3;
    double c = a / b;
    printf("%f\n", c);
    return 0;
}

However, if you use the datatype "int", the decimal places are cut, and "10 / 3" would be just "3" then (which is mathematically incorrect):

#include <stdio.h>

int main(void) {
    int a = 10;
    int b = 3;
    int c = a / b;
    printf("%d\n", c);
    return 0;
}

When using literal numbers (without variables), there's a difference between for example "9" (an int) and "9." (a double):

#include <stdio.h>

int main(void) {
    printf("%d\n", 9 / 5);
    printf("%f\n", 9 / 5);
    printf("%f\n", 9. / 5.);
    return 0;
}

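gives output like this (passing the int "9 / 5" to "%f" in the second line is actually undefined behaviour, so the "0.000000" may look different on other systems, and "gcc -Wall" warns about it):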

1
0.000000
1.800000

So actually, in the first example above, there is an internal conversion when the int number is assigned to a double variable.
You have to take care of that. In some situations, you may even make use of that though.

With the modulo-operator "%", you can get the remainder (the "rest") of the division: "10 / 3" is "3, remainder 1". With modulo, you get this "1":

#include <stdio.h>

int main(void) {
    int a = 10;
    int b = 3;
    int c = a % b;
    printf("%d\n", c);
    return 0;
}

12. Power and Square Root of a Number

You get power and square root of a number like this:

#include <stdio.h>
#include <math.h>

int main(void) {
    int a = 2;
    int b = 3;
    int c = 49;
    double d = pow(a, b);
    double e = sqrt(c);
    printf("%f\n", d);
    printf("%f\n", e);
    return 0;
}

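Notice that you have to include the header of the "math"-library here. And you need to tell gcc to link against that library with the option "-lm" (which means: Link the program with the library "m"). The option should be written after the name of the source-file, because the linker may otherwise not find the functions:

gcc -Wall prog.c -lm

Output is then: 8.000000 and 7.000000.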

13. The Preprocessor. #define. #include. Constants. Import of External Source Code

Macros

In word processor programs like "Microsoft Word" or "LibreOffice Writer", there is a function, that automatically replaces text. You can tell it for example to replace "hl" to "hello", and the next time you write "hl", it is automatically turned into "hello".
In the process of compiling C programs, there is a stage where similar simple text replacement is done by the so-called "preprocessor".
In your program, you can write certain instructions for the compiler's preprocessor, and it will replace the texts as defined. These instructions are called "preprocessor directives". If you write for example

#define TRUE 1

every occurrence of "TRUE" in your code will be changed to "1" during compilation by the preprocessor.
You won't notice it though, because the texts won't be changed in the source files, and you usually don't look into the compiled executables as they are hardly readable for humans (you can only take a look with a hex editor like "ht").

These "#define"-lines, that are called "macros", are used, because you can then write for example the word "TRUE" in your source code instead of "1", when you want to check, if an expression evaluates to "true".
These macros are also used to define constants, that should be visible in the whole program, like for example:

#define GRAVITY 9.81

So, when the word "GRAVITY" appears later on in the source code, it means "9.81". But not because "GRAVITY" was a variable (which it is not), but because it is replaced with the term "9.81" by the preprocessor. It's also not necessary then to think about the way that value is passed into functions. It's simply written directly into these functions as a number by the preprocessor.

When defining macros for strings, quotation marks have to be used, like:

#define MESSAGE "Hello World"

According to Wikipedia, the word "macro" is an abbreviation of the general term "macroinstruction". When a macro(instruction) is applied to a text, the text is changed to another text.
The C compiler's preprocessor just does simple text substitution. (While a "macro" in "Microsoft Word" is a small program, that can change a text in a more complicated way. So that's something slightly different.)
The C macro is written with the keyword "#define", followed by the word that is to be replaced and the text it is replaced with, separated by a space character.
Unlike regular lines of C code, lines with preprocessor directives like macros are not terminated by a semicolon.

Conditions for the preprocessor can be programmed with:

#if
#if defined
#ifdef
#ifndef
#elif
#else
#endif

This is especially useful in large programs: Often, macros are defined only under the condition, that they aren't defined already.
Macros can be unset with the keyword "#undef".
When using the compiler gcc, a list of the defined macros of file "hello.c" (for example) can be printed with:

gcc -dM -E hello.c

#include-Statements

There is another kind of preprocessor directive, that uses the keyword "#include". The "#include" directive is followed by the name of a text-file, either in "< ... >" or in quotation marks (" ... "). When the preprocessor reaches that line, it searches for the file of the given filename. When the file is found, the preprocessor replaces the line of the directive with the whole content of that file. That way, (the header files of) libraries are imported into the source code.

If the filename is passed within quotation marks, the same directory as the source code is searched for the file. (If the filename isn't found there, also the "< ... >"-path is searched, see below.)
If the filename is passed within "< ... >", several directories are searched for the file. The default directories to be searched by gcc are compiled into the program "cpp" and can be listed with the command "cpp -v". Usually, "/usr/include" and "/usr/local/include" are searched. Other directories can be passed to gcc using the "-I"-option.

For example, the line

#include <stdio.h>

is replaced by the preprocessor with the content of the file "stdio.h", which is "/usr/include/stdio.h" on Linux.
With the content of the header file "stdio.h", often necessary I/O routines like "printf()" are imported into the program.

14. Read-Only Variables with const

Back from the preprocessor to ordinary C code.
Variables can be declared as "const". That means, they can't be altered afterwards. Therefore, the initialization has to be in the same line as the definition:

#include <stdio.h>

int main(void) {
    const int a = 10;
    /* a = 15; Wouldn't work */
    printf("%d\n", a);
    return 0;
}

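The compiler rejects code that tries to assign a new value to such a variable. Whether the variable is actually stored in a read-only-area of the program's memory depends on the compiler and on where the variable is defined; local "const"-variables like the one above usually aren't.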

If "const" doesn't behave as expected, it may be due to the problem described here.

15. Constant enumerations with "enum"

Constant integers can also be defined with the datatype "enum":

#include <stdio.h>

enum colours { black, blue, red, magenta, green, cyan, yellow, white };
enum months { january = 1, february };

int main(void) {
    printf("The number of cyan is: %d.\n", cyan);
    printf("February is month number %d.\n", february);
    return 0;
}

You can use these enum statements to define boolean values (like in Python):

enum bool {False, True};
enum none {None};

Then, in your code, "False" and "None" will be evaluated to 0, "True" to 1.

This kind of "new datatype" can then also be used like other datatypes, for example as a return-value of a function:

#include <stdio.h>

enum bool {False, True};

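enum bool test(int a) {
    if (a == 5) {
        return True;
    }
    /* Without this line, the return-value would be undefined
       for every other value of "a": */
    return False;
}

int main(void) {
    enum bool r = test(5);
    printf("%d\n", r);
    return 0;
}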

16. Functions. Passing Arguments by Value

The core of C-programming, which may have been new back in 1970, is structured programming. The functionality of a large program is separated into small pieces, that deal with more specific problems.
These small pieces are code-blocks called "functions". Functions take some arguments, process them and return a return-value. Consider this program:

#include <stdio.h>

int addTen(int b) {
    b += 10;
    return b;
}

int main(void) {
    int a = 5;
    a = addTen(a);
    printf("%d\n", a);
    return 0;
}

Now, this may be a bit complicated, but it is important, to understand it:

"main()" and "addTen()" are different functions, that are completely separated. With the line

a = addTen(a);

"addTen()" is called and variable "a" is passed to it by value (there's another possible way to pass arguments to functions in C, but for now, it's by value).
So variable "b" in "addTen()" gets the value of variable "a" from "main()".
But "addTen()" doesn't know "a" itself. On the other hand, "main()" is totally unaware of variable "b" in "addTen()".
Variable "b" in "addTen()" is created, when "addTen()" is called. When "addTen()" is finished, variable "b" is destroyed.

All "local variables" (these are the variables inside a function) are destroyed, when the function is finished.

17. Declaration, Definition and Initialization

The Declaration of a variable is the statement, that there is a variable of a certain name and type in the program.
Definition of a variable means having memory allocated for it.
Declaration and definition of a variable "a" would just be

int a;

Usually, the two may be only separated, when several source-code-files are used.

Initialization of a variable means giving it a value, like in:

a = 10;

In other words:
A declaration provides basic attributes of a symbol: Its type and its name. Memory isn't allocated for the variable yet.
A definition provides all of the details of that symbol; if it's a variable, where that variable is stored. Memory is allocated for the variable.

All three can be combined in one line: "int a = 10;". The more modern C-language definition "C99" also allows to write declarations anywhere inside a function. If the compiler doesn't support C99, it may be required that all declarations are written at the beginning of a function, so initializations often follow separately after the declarations:

int main(void) {
    int a;
    int b;
    float c;
    char d[50];
    a = 5;
    b = 10;
    return 0;
}

There can also be declarations of functions. They look like the first line of the function, but without the curly brackets, instead the line is terminated by a semicolon:

int addTen(int b);

If you write the function "addTen()" below the function "main()" and try to compile with "gcc -Wall", you'll get a warning about an implicit declaration (newer versions of gcc even treat this as an error).
If you declare the function before it is called, it compiles without warning:

#include <stdio.h>

int addTen(int b); 

int main(void) {
    int a = 5;
    a = addTen(a);
    printf("%d\n", a);
    return 0;
}

int addTen(int b) {
    b += 10;
    return b;
}

There can be several declarations, but just one definition.

This is quite obvious for functions. For variables, it becomes important, if you compile from several files of source-code.
Then, all declarations outside functions are usually written into a central file, which is called "header-file" and has the suffix ".h". It is imported into the other source-files with an "#include"-directive.
If you want to use a variable across these files, you write a definition of it into one of the source-files. Then, you write a declaration of the variable with the keyword "extern" into the header-file. That means, this is just a declaration, the definition is somewhere else (i.e. in the source-file previously mentioned). Then you import the header-file into all source-files. Then, the variable is known everywhere in the program.

18. Global Variables. Keyword "extern"

It is possible to use global variables in C.

If you use just one source-file, you can declare (and define) such variables at the beginning of the file outside any function. Then they are known in every function, without any further declarations inside the functions.

If the global variable is for some reason declared below the function, where it is used, it has to be declared again inside the function using the keyword "extern". Although this case might be rare.
But this declaration with "extern" is also required, if the global variable is declared in another source-file. And this case is rather common.

#include <stdio.h>

int main(void) {
    extern int a;
    printf("%d\n", a); 
    return 0;
}

int a = 15;

Or here's another example, defining the variable outside any function, but setting the value of the variable inside a function. It is then known in other functions too, as it's a global variable after all:

#include <stdio.h>

int a;

void showA(void) {
    printf("%d\n", a);
}

int main(void) {
    a = 5;
    showA();
    return 0;
}

Of course, you should prefer local variables, but global variables are for example useful, if complex data shall be easily made available to several functions.
It's also alright to use global variables, in case you should write C-code for very small systems, like a vintage Sinclair ZX Spectrum with just 48K or an Atari 800 XL with 64K for example. The program can't become that long, that you get confused by your variables then.

19. Keyword "static"

If the keyword "static" is put in front of the declaration of a function, the function is hidden from code outside the source-file it is written in.

If "static" is put in front of the declaration of global variables (outside any functions), the global variable is hidden from code outside the source-file.

If "static" is put in front of the declaration of a local variable, something totally different happens: The local variable doesn't get destroyed when the function closes. That means, when the function is called again, the static variable "remembers" the value it had when the function closed last time. That is pretty weird. Better not use it.

20. typedef

With "typedef", you can define your own names of datatypes. "typedef" can create new names for existing datatypes, but not create new datatypes. If you encounter strange words in code, that look like unknown datatypes, look out for a typedef-declaration somewhere.

#include <stdio.h>

typedef int Number;

Number main(void) {
    Number n = 10;
    printf("%d\n", n);
    return 0;
}

21. Arrays

You can define static arrays of the different types of data. The drawback is, you have to know in advance, how much memory you want to reserve for the arrays. Example:

#include <stdio.h>

int main(void) {
    int n[5] = {10, 11, 12, 13, 14};
    double f[5] = {1.1, 1.2, 1.3, 1.4, 1.5};
    char c[5] = {'a', 'b', 'c', 'd', 'e'};
    int i;
    for (i=0; i<5; i++) {
        printf("%d \t %.1f \t %c \n", n[i], f[i], c[i]);
    }
    return 0;
}

When you combine the definition of the array with its initialization (like above), you can leave out the number in the square brackets ("int n[] = ...;"). The compiler counts the elements itself then.

Of course you can (and probably should) reserve more memory for the array than is really needed. You just have to watch out, that you don't define more elements, than the array can carry, as that would lead to a severe bug.

When you just declare an array, simply nothing is written yet into the reserved memory. Whatever the bytes in the reserved area already hold, is still there, until you write something else into the array.
That means, if you declare an array of let's say 100 elements, and you want to fill it with some numbers, there isn't any built-in way to tell how many numbers the array already holds at the moment, and how many of the elements are still uninitialized. You just have to keep track yourself of how many numbers you have already written:

#include <stdio.h>

int main(void) {
    int a[100];
    int anum = 0;
    a[0] = 10;
    anum++;
    a[1] = 20;
    anum++;
    a[2] = 25;
    anum++;
    printf("%d elements in the array.\n", anum);
    a[2] = 0;
    anum--;
    printf("%d elements in the array.\n", anum);
    return 0;
}

While some compilers (those supporting C99) allow to declare the size of a local array with a variable (like "int arr[b];"), others require constant values (like literal numbers) for that. It's a good idea to define constants (as preprocessor macros) for that, if possible:

#define ARR_SIZE 5

int main(void) {
    char arr[ARR_SIZE];
    return 0;
}

For using arrays of strings, see below.

22. Pointers. Passing Data to Functions by Reference

All variables in programming have certain features. You can add two integers and get another integer, according to mathematics. You may concatenate two strings and get a longer string - which isn't defined in mathematics at all, so it's quite a different process.

A pointer is a variable, that has features of its own, too.
It holds the memory-address of another variable. And it can be "dereferenced". Then the result represents the content of the other variable.
So, there's the pointer itself, and then there's the content, it points to.
Pointers have their own datatypes. These datatypes correspond to the content, they point to. An example:

#include <stdio.h>

int main(void) {
    int a = 10;

    /* The following two lines can also be written as:
       "int *p = &a;"
       But it is not obvious, what they mean then: */

    int *p;
    p = &a;

    printf("%d\n", *p);
    return 0;
}

"int *p" declares a "pointer to int". So pointer "p" is declared as of datatype "int *".
The identifier of the pointer is not "*p". In the declaration, the "*" is part of the name of the datatype called "int *".
The identifier of the pointer is "p".
"p = ...;" points the pointer to the memory-address of another variable.
"&a" means: "The memory-address, where variable 'a' is stored".
By using "*p", the pointer "p" is dereferenced.
"*p" then represents the content. The content at the memory-address, the pointer points to.

Often, functions take the pointer itself as an argument. If the function uses its content, the dereferencing happens inside the function then. So, though you want to do something with the content then, you often don't pass "*p" to the function, but just the pointer, which is "p".
"printf()" uses "*p" here, because with the format-string "%d" it expects an integer, and "*p" (the content) is an integer, while "p", the pointer is not (unless you mean the memory address, which is a very large integer too, but that's probably not what you wanted). printf's format string "%p" would expect a pointer, but would display the memory address, it points to.
printf's format string "%s", which is used to print strings, expects just the pointer to "char" called "p", although the content is printed, not the pointer itself. The pointer is dereferenced inside the "printf()"-function (I believe).

To every human being, pointers are confusing at first, sometimes even later on. Just ask yourself if you want to use the pointer itself (p) or its content (*p), and where the pointer points to at the moment.

Pointers have the advantage, that they can be return-values from functions. So you can return large amounts of data (large arrays for example) by returning a pointer. Often that's the only way to do that.

If you have an ordinary int, and point a "const int *"-pointer to it, you can inspect the value of the int, but not change it. That's sometimes useful.

Pointers are often used to pass larger data-structures to functions.
If you have for example an "array of int" called "a[]", you can point a pointer to that array and pass the pointer to a function. Then, the array is available inside the function through the pointer.
But it's the same array (at the same memory-address) as the one outside the function. So, by passing the pointer, the array doesn't get passed "by value" to the function, but "by reference". That means, if you change the array inside the function, outside the function the array gets changed too, as there is only one array.
Notice, that although the data, the pointer points to (an array for example), gets passed to the function "by reference", the pointer itself gets passed "by value". So you have a pointer inside the function that is different from the pointer outside the function, but both pointers point to the same memory-address of the data-structure. If you change the pointer itself inside the function, the pointer outside the function remains unchanged:

#include <stdio.h>

void test(int *b) {
    /* "b" is different from "p".
       Changes to "b" don't change "p": */
    b++;
    printf("%d\n", *b);
    /* "a[]" can be accessed through "b".
       This affects "a[]" outside the function too.
       So this line changes "a[1]": */
    *b = 10;
}

int main(void) {
    int a[] = {1, 2};
    int *p = a;
    test(p);
    /* Function "test()" didn't change "p": */
    printf("%d\n", *p);
    /* Function "test()" did change "a[]" though: */
    printf("%d\n", a[1]);
    return 0;
}

23. Memory Management with malloc(), realloc() and free()

In C, you have to manage the memory, the data in your program needs, yourself.

For simple datatypes such as a single int, float, double or char, the memory is allocated automatically by declaring the corresponding variable.

When for example declaring an array as "int a[25];", then memory for 25 ints is reserved for this variable. That is "25 * sizeof(int)" bytes, so typically 100 bytes, if an int has 4 bytes.

But another common method to allocate memory for larger portions of data is to use the functions "malloc()" (= "memory allocation"), "realloc()" and "free()" in combination with a pointer. For example

char *a = malloc(100);

reserves 100 bytes somewhere in memory, and defines a "pointer to char", that points to that memory.
(If the memory allocation fails, "malloc()" returns a NULL-pointer. It should be checked, if "malloc()" was successful or not.)
After the memory is allocated, it can be used. It is possible to access it, because of the pointer pointing to it.

strcpy(a, "Hello");

As pointers can easily be passed to functions and be returned from them, larger data structures in memory can be handed over to functions or received this way.
If the size of the memory shall be redefined, "realloc()" can be used:

a = realloc(a, 200);

When the data is no longer needed, the reserved memory has to be freed by calling "free()" with the pointer:

free(a);

It's not necessary to call "free()" in the same function, where the memory was allocated. "free()" can be called from anywhere in the program. It just frees the memory, to which the pointer, that is passed to it, points.

A single "int" occupies several bytes in memory. So when allocating memory for several ints, the operator "sizeof()" should be used to calculate the required number of bytes:

/* Allocating memory for 7 ints: */
int *a = malloc(7 * sizeof(int));

So that's quite a common routine, when writing a C-program: Allocate memory for the data, you want to use using "malloc()", and define a pointer, that points to that memory. Process the data in the reserved memory location using the pointer. Maybe pass the pointer to one or several functions, and process the data there. When done, free the memory by calling "free()" with the pointer as an argument.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *getWorldString(void) {
    char *s = malloc(6);
    strcpy(s, "World");
    return s;
}

void changeYello(char *s) {
    s[0] = 'H';
}

int main(void) {
    /* 12 bytes: "Yello " (6) + "World" (5) + the terminating '\0' (1): */
    char *a = malloc(12);
    strcpy(a, "Yello ");
    printf("%s\n", a);
    changeYello(a);
    printf("%s\n", a);
    char *b = getWorldString();
    strcat(a, b);
    printf("%s\n", a);
    free(a);
    free(b);
    return 0;
}

24. Arrays and Functions

In C, arrays can't be passed to functions directly. But a pointer to an array can be passed to a function. As array-names can be somehow seen as pointers to the array, you can pass the name of the array (together with its size) to a function, that expects a pointer to that array. After that you can just use that pointer, as if it was the array:

#include <stdio.h>

void printArray(int *p, int psize) {
    int i;
    for (i = 0; i < psize; i++) {
        printf("%d\n", p[i]);
    }
}

int main(void) {
    int a[5] = {1, 2, 3, 4, 5};
    printArray(a, 5);
    return 0;
}

Of course this leads to "passing by reference", so any changes made to the array inside the function also affect the array outside the function.

25. The Relation between Arrays and Pointers

Pointers and array-names are closely related. But they are not identical.
When there is an array

char a[] = {'i', 'n', '\0'};

then "a" is, in most expressions, automatically converted to "&a[0]", that is, to a pointer to the first element of the array.
So in a way you can say "The name of an array is a pointer to that array".

In particular, a function can have a parameter of type "pointer to char" ("char *"), and an "array of char" (char a[5];) can be passed in to it. Because "the name of an array is a pointer to that array". That is rather convenient and should be memorized.

But unlike an ordinary pointer, the array name always points to the starting address of the array. It can't be made pointing somewhere else.
So you can't do "a++" with the array-name, though you could do "p++" with a pointer to the array, even if it's a pointer to const data like "const char *p" (there, the data is read-only, not the pointer).

As there are these differences, I suggest not mixing the two. Access array-elements for example with

a[1] = 't';

and if you want to use pointers, point another pointer to the array:

char *p = a;

26. Strings in C - The Whole Story

Introduction. Two Ways to Create a String.

In C, there are two different ways to create a string:

The first one is to use an "array of char". For that, the maximum size of the string has to be known in advance. And such an array is only a good choice, if the string isn't supposed to be returned from a function.
In all other cases, the string should be created by allocating memory for it using the function "malloc()", and setting a "pointer to char" to the newly allocated memory. After that, "strcpy()" is used, to copy the content of the string into the reserved memory.
When the string isn't needed any more, the program has to explicitly free the used memory by calling the function "free()" with the used pointer passed as an argument to the function.
If the size of the string changes throughout the program, different memory can be allocated for it using the function "realloc()".

C doesn't have a datatype "string". It only has the datatype "char", which represents only a single character. In memory, chars have a size of one byte. They have to be written in single quotes, like for example 'A'. This notation in single quotes corresponds directly to their ASCII-value. So in the code, 'A' can also be used to represent 65. chars are integers of one byte. With "printf()", they can be printed as integers or characters, depending on the format string used:

#include <stdio.h>
int main(void) {
    char a = 'A';
    printf("%d\n", a); /* Result: 65 */
    printf("%c\n", a); /* Result: A */
    return 0;
}

In C, there are not just single chars, but also arrays of them. A static string is in C by definition a number of contiguous chars in an array, with the last char being '\0' to terminate the string.

"printf()" with the Format String "%s"

That the function "printf()" can be used with the format string "%s" to print a string, doesn't mean, that there's a datatype "string" in C. Instead, in this case "printf()" expects a "pointer to char". That can be either such a pointer, or the name of an array (which can also be interpreted as a pointer to that array). "printf()" then follows the pointer, and prints everything it finds as a character, until the terminating char '\0' is reached.

String Literals

In C, there's also such a thing as "literals". There are "integer literals" like 5 or 25223 and string literals like "Hello" in the code. So there isn't a datatype "string" for variables in C, but there are string literals. String literals are usually stored in a memory region of the program, that can't be changed. As a result, literals themselves mustn't be changed, they have to be treated as constant (const). It is possible though, to copy the data of the literal into other memory regions, especially into the memory of ordinary variables. These copies of the literal data in variables can then be changed.

Initialization of an "Array of char" using a String Literal

This is what happens, when you declare an "array of char" variable and initialize it with a string literal. The literal is stored in the read-only memory area, but its data is copied by the compiler into the memory area of the "array of char". Therefore the data in the array can be changed:

#include <stdio.h>
int main(void) {
    char a[6] = "Yello";
    printf("%s\n", a);
    a[0] = 'H';
    printf("%s\n", a);
    return 0;
}

Notice, that the string "Hello" has only 5 characters, but one more byte has to be reserved to store the terminating '\0'-character of the string.

To be able to initialize an "array of char" with a string literal is only a feature of the compiler for convenience. It is an abbreviation of:

char a[6] = {'H', 'e', 'l', 'l', 'o', '\0'};

In this case it is also possible to leave out the number of bytes ("char a[] = ..."), because the compiler can determine, how many bytes are needed, by the size of the string literal.

If the compiler didn't have these convenience features, the function "strcpy()" would have to be used, to copy the data of a string literal into an "array of char". It's not always possible to use a string literal for initialization, so the function "strcpy()" is still often needed to create a string.

Accessing the Elements of the "array of char"

The single elements of an array can be accessed by using numbers inside the square brackets of the array name, like "a[5]" for example. This is called "array notation". Here's a simple example:

#include <stdio.h>

int main(void) {
    int i;
    char a[] = "Yello";
    a[0] = 'H';
    for (i = 0; i < 5; i++) {
        printf("%c\n", a[i]);
    }
    return 0;
}

This array notation also works, when using a pointer to char to access the string data.

Working with Strings in Functions

Strings are often manipulated inside functions. But how can the resulting string be retrieved from the function? "Arrays of chars" and other variables, that are declared locally inside a function, are deleted, when the function ends.
The answer is, to use the second method to create the string, that is using "malloc()" plus "strcpy()", and finally "free()".

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *getString(void) {
    char *a = malloc(20);
    strcpy(a, "Hello World");
    return a;
}

int main(void) {
    char *s = getString();
    printf("%s\n", s);
    free(s);
    return 0;
}

Notice, that the reserved memory doesn't have to be freed inside the function it was allocated in. It can be freed anywhere in the program, just as long as there's a pointer, that can be used to access it.

String Functions

You can then use functions on these strings, that are defined in the header file "string.h" of the standard library. Essential functions are:

strcpy()
Copies the data of a string or a string literal into the memory of a string (for example of an "array of char"), taking care of the ending '\0'. "strcpy()" is often used to build a string in the first place.
strlen()
Returns the number of characters of a string (not counting the terminating '\0') as a "size_t". So 'strlen("abc")' would be 3.
strcat()
Adds one string to another, taking care of the ending '\0'.
strcmp()
Compares two strings and returns 0, if they are equal.
strstr()
Checks, if a substring is part of a string. Returns a NULL-pointer, if not.
strdup()
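Returns a copy of the string in newly allocated memory, which has to be freed with "free()" later. ("strdup()" comes from POSIX; it's part of standard C only since C23.)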

The library "stdlib.h" also has the function:

atoi()
Converts a string, that holds a number, into an integer.

The library "stdio.h" has a function, that does the opposite:

sprintf()
Writes "printf()"-type output (like for example integers) into a string. The function is quite powerful. For example, this gives you
```
char s[20];
sprintf(s, "%03d", 5);
```
leading zeros, so that "005" is written into "s".
```
sprintf(s, "%.2f", 32.51245);
```
converts the floating point number into a string and rounds it to two decimal places (like used in currencies).

Here's an example on how to use these functions:

#include <stdio.h>
#include <string.h>

int main(void) {
    char a[20];
    char b[20];
    int c = 34;
    int d = 17;
    strcpy(a, "Hello");
    printf("%zu\n", strlen(a));
    strcpy(b, "World");
    printf("%s\n", a);
    printf("%s\n", b);
    strcat(a, " ");
    strcat(a, b);
    printf("%s\n", a);

    strcpy(a, "Area ");
    sprintf(b, "%d", c + d);
    strcat(a, b);
    printf("%s\n", a);
    return 0;
}

Dynamic Strings with "malloc()" in Combination with a "Pointer to char"

When using "malloc()", the wanted number of bytes is passed to it as an argument. Then "malloc()" asks the operating-system, if the program can have that many bytes. If it can't, "malloc()" returns a so-called NULL-pointer, if it can, the bytes are provided:

char *mystring = malloc(100);

After that, "strcpy()" can be used to copy string data into that memory area.
When the string isn't needed any more, the memory, that was reserved for it, has to be freed, by passing the "pointer to char" to the function "free()".

But there's something, you have to keep in mind: If the string is supposed to be changeable, the "pointer to char" mustn't be assigned directly to a string literal. If you write

char *a = "Hello";

which is tempting of course, the string literal "Hello" is stored in the read-only memory area for literals, and a pointer called "a" is set to that unchangeable memory location.
Nothing more. The data of the string literal is not copied to a writeable memory area of a variable. So it stays a constant string literal, it doesn't become a changeable variable.
If you want a "real" string in a writeable area, you have to use "malloc()" and then "strcpy()" (and finally "free()") as described above.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    /* This is tempting, but don't do it: */
    char *a = "Hello 1";
    puts(a);
    /* Instead do this: */
    char *b = malloc(100);
    strcpy(b, "Hello 2");
    puts(b);
    free(b);
    return 0;
}

If the allocated memory isn't sufficient throughout the program, more memory can be requested with the function "realloc()".

Comparing the two methods to create a string in C: What is an array? It is a number of bytes in memory. They are associated with a certain datatype, so that the program knows, what kind of data is stored there. And the data is accessible through the name of the array.
When "malloc()" is successful, you're again provided with a number of bytes. And with the pointer, there's also a way to access these bytes using a variable name. So in effect, this construction is quite close to an array. But it's dynamic, while arrays are static.

Accessing the String Elements by Pointer

When using dynamic strings, you can use so-called "pointer arithmetic" to access the elements of the string. If you have a "pointer to char" called "a", at first it points to element number 0 of the string. You can raise "a" by 1 ("a++;") or by 5 ("a += 5;") to reach element number 1 or 5 of the string, respectively.
You have to remember though, how far you went into the string, because at the end, you should set the pointer back to element number 0, to be able to call "free()" with the correct memory address.

But surprisingly, you can also use array notation with dynamic strings. Here's an example:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *a = malloc(20);
    strcpy(a, "Hello");
    printf("%s\n", a);
    a++;
    printf("%s\n", a);
    a += 3;
    printf("%s\n", a);
    printf("\n");
    /* Move back to the address "malloc()" returned: */
    a -= 4;
    printf("Moved back to the beginning: %s\n", a);
    printf("\n");
    /* Also using array notation: */
    printf("%c\n", a[3]);
    printf("%c\n", a[4]);
    a[0] = 'Y';
    printf("%s\n", a);
    free(a);
    return 0;
}

27. Arrays of Strings

Often, you want an array of strings. You can have one like this, using the first method mentioned above to create the strings:

#include <stdio.h>

int main(void) {
    char array_of_strings[2][6] = {"Hallo", "Welt"};
    printf("%s\n", array_of_strings[0]);
    printf("%s\n", array_of_strings[1]);
    return 0;
}

In the declaration, the first number between the square brackets is the number of lines (= elements of the array). When you think of the array as a table, the first number inside the square brackets of the array is the y-coordinate (!). The second number is the x-coordinate. (Memorize that.)
The second number is the number of characters of the largest string in the array, plus one for the string's terminating '\0' character.

Arrays of strings can also be passed to functions, using the syntax shown in the following example. The size of the strings has to be passed to the function as well. This is mandatory. As you probably want to have the number of elements of the array in the function too, you pass both values of the array definition separately to the function. It is possible to pass the size of the strings as an integer first, and use the same integer parameter as the string size in the declaration of the array parameter. This relies on the variable-length array syntax introduced with C99, which not every compiler supports. So this works in gcc:

#include <stdio.h>

void printStringArray(int arrlen, int strsize, char (*array_of_strings)[strsize]) {
    int i;
    for (i = 0; i < arrlen; i++) {
        printf("%s\n", array_of_strings[i]);
    }
}

int main(void) {
    char array_of_strings[2][6] = {"Hallo", "Welt"};
    printStringArray(2, 6, array_of_strings);
    return 0;
}

It doesn't work in CC65 though. But this does:

#define STRSIZE_ARR 6
void printStringArray(int arrlen, char (*array_of_strings)[STRSIZE_ARR]) {...}

28. Strings with more than one Line

If you want to write longer strings with many lines, you can use one of the following syntaxes:

#include <stdio.h>
 
int main(void) {
    /* First syntax: */
    char s[] = "This is the first line,\n\
this is the second line.\n";

    /* Second syntax: */
    char s2[] = "This is the third line,\n"
                "this is the fourth line.\n";
    printf("%s", s);
    puts("");
    printf("%s", s2);
    return 0;
}

I recommend the second one, because it can handle the indentation better.

29. Format String Symbols of "printf()" and others

We've used a lot of "printf()"'s format strings already. Time for an overview:

datatype                                       format string

int                                            %d
long long int                                  %lld
unsigned int                                   %u
unsigned long int                              %lu
size_t (an unsigned integer type)              %zu
ssize_t (a signed integer type, from POSIX)    %zd
hexadecimal                                    %x
octal                                          %o
float and double                               %f
float, rounded to two decimal places           %.2f
float in exponential form                      %e
char                                           %c
string (argument is the pointer)               %s
pointer-address as returned by "malloc()"      %p

Even more on format strings here.

30. Getting User-Input: getchar(), scanf(), fgets(), getline()

In C, the user can be asked for input on the console with the function "scanf()". It takes a format-string as the first argument (like printf()), and the memory-address of the variable to send the input to as the second argument (bit tricky). Example:

#include <stdio.h>

int main(void) {
    int a;
    printf("\nEnter number: ");
    scanf("%d", &a);
    printf("\nYour input was: %d.\n\n", a);
    return 0;
}

"scanf()" can be used for getting a single number or a single word.

With the format string "%s", "scanf()" stops reading, when it encounters a whitespace character like for example a simple "space" character.
It is said, that you could then use something like this:

char str[21];
scanf("%20[^\n]", str);

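if you want to read a string of 20 characters plus the "\0" character with "scanf()".
But on my terminal, it unblocked the reading, when reading repeatedly. The reason is, that the newline character stays in the input buffer, so the next call finds it immediately and returns without reading anything. So that didn't work either.
Instead, the function "fgets()" can be used. Its second argument is the size of the array, including the byte for the terminating "\0". So this reads up to 20 characters:

char str[21];
fgets(str, sizeof(str), stdin);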

Unfortunately, this also puts the newline character ("\n") into the string. So you have to chomp the string. By hand, because there isn't such a function in the standard libraries.

That's all too stupid. Probably the best way to read user input, is to write a custom function, that reads in one character after the other, and produces the wanted results:

#include <stdio.h>

int my_input(char *arr, unsigned int arrlen) {
    /* "arr" has to be an array defined outside this function.
       It gets passed by reference to this function, so it's manipulated directly,
       and the results also take effect outside the function. */
    /* "getchar()" returns an int, so that EOF can be told apart from every char: */
    int c;
    unsigned int count = 0;
    c = getchar();
    /* "count + 1 < arrlen" leaves room for the terminating '\0': */
    while (c != EOF && c != '\n' && count + 1 < arrlen) {
        arr[count] = c;
        count++;
        c = getchar();
    }
    arr[count] = '\0';
    return count;
}


int main(void) {
    char a[20];
    int alen;
    printf("Please enter a line:\n");
    alen = my_input(a, 20);
    printf("You entered the string \"%s\", which has %d characters.\n", a, alen);
    return 0;
}

Another alternative: The function "getline()" (advanced topic): "getline()" isn't part of standard C, but it was standardized in POSIX.1-2008 and is declared in "stdio.h" on Linux and other POSIX systems. It can't be used with static arrays, but always uses a "pointer to char". "getline()" takes care of allocating memory for the string itself. But after using the string, you have to free that memory yourself again. Here's an example, how "getline()" can be used:

#include <stdio.h>
#include <stdlib.h>

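int main(void) {
    char *a = NULL;
    /* An unsigned integer type. With a NULL-pointer and size 0,
       "getline()" allocates the buffer itself: */
    size_t buffsize = 0;
    /* A signed integer type. "getline()" returns -1 on failure or end of input: */
    ssize_t alen;
    printf("Please enter a line:\n");
    alen = getline(&a, &buffsize, stdin);
    if (alen == -1) {
        free(a);
        return 1;
    }
    /* Chomp the string, if it ends with a newline: */
    if (alen > 0 && a[alen - 1] == '\n') {
        alen--;
        a[alen] = '\0';
    }
    printf("You entered the string \"%s\", which has %zd characters.\n", a, alen);
    free(a);
    return 0;
}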

31. Example-Program "Cookie Monster"

This is a nice little program from a good Perl-book (Laura Lemay: "Sams Teach Yourself Perl in 21 Days"), I can recommend. If you get to this program and can run it, you already know some of the constructions, a programming-language uses:

#include <stdio.h>
#include <string.h>

int main(void) {
    /* Cookie Monster */
    char cookies[10];
    strcpy(cookies, "");
    while (strcmp(cookies, "COOKIES") != 0) {
        printf("I want COOKIES: ");
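        /* "cookies" already stands for the address of the array, so no "&" is needed.
           "%9s" limits the input to 9 characters plus '\0': */
        scanf("%9s", cookies);
    }
    puts("Mmmm. COOKIES.");
    return 0;
}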

Explanation: We declare an array of 10 char for the string (so the maximum is 9 letters plus '\0').
strcpy() copies a string literal into the string. We get user-input with scanf() as described above. strcmp() compares two strings and returns 0, if the strings are equal.

All in all already pretty cool, I think. :)

Note: As we want to get only one word from the command-line (stdin), it's ok to use "scanf()". If we wanted to get a string of several words, the problems described above would occur.

32. Structures with struct

Structures are probably one of the most often used datatypes in C.

To be able to create static arrays is nice, but they can only hold data of the same type. Whereas structures can hold data of several types (they can hold even more structures).

C doesn't have classes (learn C++ if you want them) (but see below for an emulation of classes). Nevertheless, the easiest way to explain what structures are is (maybe) to think of them as "classes without methods". Actually, you can have something like methods too: How it's done is explained further down in the document.

In Python, defining a simple class and using it would be:

#!/usr/bin/python
# coding: utf-8

class Fruit:
    def __init__(self, name, colour, cent):
        self.name = name
        self.colour = colour
        self.cent = cent

apple = Fruit(name = "Apple", colour = "green", cent = 25)
print apple.colour
print apple.cent
print
banana = Fruit(name = "Banana", colour = "yellow", cent = 35)
print banana.colour
print banana.cent

The class "Fruit" describes in general, what fruits are about. Then, an object of the class "Fruit" called "apple" is instantiated. The attributes inside this object can then be accessed. The same is done with an object "banana".

Ok, now the struct in C:

#include <stdio.h>
#include <string.h>

struct Fruit {
    char name[100];
    char colour[100];
    int cent;
};

typedef struct Fruit Fruit;

int main(void) {
    Fruit apple;
    strcpy(apple.name, "Apple");
    strcpy(apple.colour, "green");
    apple.cent = 25;
    puts(apple.colour);
    printf("%d\n", apple.cent);
    puts("");
    Fruit banana;
    strcpy(banana.name, "Banana");
    strcpy(banana.colour, "yellow");
    banana.cent = 35;
    puts(banana.colour);
    printf("%d\n", banana.cent);
    return 0;
}

The name "Fruit" in "struct Fruit" is called a "structure tag".
As structure-definitions can be rather complicated, often "typedef" is used, to be able to identify the structure only by its name.
The line "Fruit apple;" inside "main()" (referring to the "typedef") works pretty much like the instantiation of an object.
The members of the struct are accessed with the "."-syntax (the "member access operator"), pretty much like the attributes of an object.

The "instantiation" can be done for one or more "objects" right after the declaration of the "class":

#include <stdio.h>

struct Fruit {
    char name[100];
    char colour[100];
    int cent;
} apple, banana;

int main(void) {
    return 0;
}

And the initialization of the members can be done right there too. Take a look at this syntax:

#include <stdio.h>

struct Fruit {
    char name[100];
    char colour[100];
    int cent;
} apple = {
    "Apple",
    "green",
    25},
  banana = {
    "Banana",
    "yellow",
    35
};

int main(void) {
    puts(apple.colour);
    printf("%d\n", apple.cent);
    puts("");
    puts(banana.colour);
    printf("%d\n", banana.cent);
    return 0;
}

The "object"-initializations can also be separated:

#include <stdio.h>

struct Fruit {
    char name[100];
    char colour[100];
    int cent;
} apple = {
    "Apple",
    "green",
    25
};

struct Fruit banana = {
    "Banana",
    "yellow",
    35
};

int main(void) {
    puts(apple.colour);
    printf("%d\n", apple.cent);
    puts("");
    puts(banana.colour);
    printf("%d\n", banana.cent);
    return 0;
}

And: The "class-name" can also be left out, if you want just a single "object" of the "class" (this is not possible in Python):

#include <stdio.h>

struct {
    char name[100];
    char colour[100];
    int cent;
} apple = {
    "Apple",
    "green",
    25
};

int main(void) {
    puts(apple.colour);
    printf("%d\n", apple.cent);
    puts("");
    return 0;
}

It is also possible to point pointers to structure-"objects".
Members are then accessed with the "->"-operator (which is short for "(*pointer).member"):

#include <stdio.h>

struct Fruit {
    char name[100];
    char colour[100];
    int cent;
} apple = {
    "Apple",
    "green",
    25
};

int main(void) {
    struct Fruit *a = &apple;
    puts(a->colour);
    printf("%d\n", a->cent);
    return 0;
}

33. Unions

Unions are declared in the same way as structures, you just have to use the keyword "union" instead of "struct".
The difference between the two has to do with memory allocation: For a structure, memory for all of its members is allocated (at least the sum of their sizes; the compiler may add some padding bytes). For a union, only as much memory as its largest member needs is allocated, and all members share it.
So you can save some memory, when you use a union instead of a structure. The price for that is, that only one member of the union can hold a valid value at a time. Writing to one member overwrites the others.
Although there may be situations, where memory has to be spared, the more memory computers have, the less likely is such a situation.

34. Emulating Object-Oriented Programming in C

As mentioned above, the data in structures can be accessed by using a pointer to them in combination with the "->"-syntax. Pointers to structures can also be passed as arguments to functions. That way, the contents of structures can also be available inside functions. Then, functions can manipulate the data inside the structures, so they basically work as methods then. This was pointed out by Simon Tatham in section 8 of his page about C.
On my page about Perl's OOP, I've posted an OOP-example with a "Lamp"-class (which is derived from an example in a textbook about Python). Actually it is possible to translate that example to C:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* class Lamp: */

    struct Lamp {
        char name[100];
        int lightintensity;
        char state[10];
    };

    struct Lamp *Lamp_new(char *name, int lightintensity, char *state) {
        struct Lamp *self = malloc(sizeof(struct Lamp));
        strcpy(self->name, name);
        self->lightintensity = lightintensity;
        strcpy(self->state, state);
        return self;
    }

    void Lamp_switchOn(struct Lamp *self) {
        strcpy(self->state, "on");
        printf("'%s' is on at %d Watt.\n", self->name, self->lightintensity);
    }

    void Lamp_switchOff(struct Lamp *self) {
        strcpy(self->state, "off");
        printf("'%s' is off.\n", self->name);
    }

    void Lamp_newLightBulb(struct Lamp *self, int light) {
        if (strcmp(self->state, "on") == 0) {
            puts("Light bulb can not be changed.");
            printf("First, '%s' has to be switched off.\n", self->name);
        } else {
            self->lightintensity = light;
            printf("Light bulb in '%s' has been changed.\n", self->name);
            printf("The new bulb has %d Watt.\n", self->lightintensity);
            Lamp_switchOn(self);
        }
    }

    void Lamp_destruct(struct Lamp *self) {
        free(self);
    }

/* End of class Lamp. */


int main(void) {
    struct Lamp *lamp1 = Lamp_new("First Lamp", 50, "off");
    struct Lamp *lamp2 = Lamp_new("Second Lamp", 40, "off");
    Lamp_switchOn(lamp1);
    Lamp_switchOn(lamp2);
    Lamp_newLightBulb(lamp2, 100);
    Lamp_switchOff(lamp2);
    Lamp_newLightBulb(lamp2, 100);
    Lamp_destruct(lamp1);
    Lamp_destruct(lamp2);
    return 0;
}

It seems, the names of the methods need the class-name at the beginning, like for example "Lamp_...". Because otherwise, if there were several classes, there would be several "new()"-methods, which couldn't be distinguished by the program. So it seems, different names like "Lamp_new()" versus "Car_new()" are needed. C++ solves this problem with "namespaces", but these aren't supported directly in C.

But as you can see, you can have basic object-oriented programming in C.

35. Type Casting. Pointer to void. Array of Pointers to Different Datatypes

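It is possible to convert the value of an expression to another type. This is called a "cast". It is done by writing the new type in round brackets in front of the expression. The variable itself keeps its declared type; only the value used at that point is converted:

int a = 5;
double d = (double) a;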

This technique is often used with pointers. There is also a "pointer to void". It is a kind of generic pointer, that can be cast into another type of pointer later on.

In an "array of int", only integers can be stored, in an "array of char", only chars can be stored.
But using an "array of pointers to void", it is also possible, to store pointers to different datatypes in one array:

#include <stdio.h>
#include <string.h>

struct Example {
    char name[5];
};

int main(void) {

    /* Declaring an array of pointers to void: */
    void *arr[3];

    /* Filling the array. A pointer to data is converted to a
       "pointer to void" automatically, when it is assigned: */
    int n = 5;
    int *p = &n;
    arr[0] = p;

    char s[6];
    strcpy(s, "Hello");
    arr[1] = s;

    struct Example e;
    strcpy(e.name, "Test");
    struct Example *ep = &e;
    arr[2] = ep;

    printf("%d\n", *((int *) arr[0]));
    printf("%s\n", (char *) arr[1]);
    printf("%s\n", ((struct Example *) arr[2])->name);

    return 0;
}

In C, a pointer that doesn't point to "void" can be passed to a function that expects a "pointer to void". This is standard C and not special to gcc. This works:

#include <stdio.h>

void test(void *a) {
    /* Cast back to "int *" before dereferencing: */
    printf("%d\n", *((int *) a));
}

int main(void) {
    int a = 5;
    int *p = &a;
    test(p);
    return 0;
}

The "pointer to int" is implicitly converted into a "pointer to void". If the pointer is to be used as a "pointer to int" again (to dereference it for the "printf()"-function, for example), it has to be cast back into such a pointer.

36. Iterating over a struct isn't supported

Consider this Python-code:

#!/usr/bin/python3
# coding: utf-8

def printDictionary(d):
    for i in d.keys():
        print(i, ":" , d[i])

a = { "first"  : 1,
      "second" : "Hello",
      "third"  : (2, 3, 4) }

printDictionary(a)

b = { "first"  : 2,
      "second" : (5, 6, 7),
      "third"  : "Test" }

print()
printDictionary(b)

So in Python, it's possible to iterate over dictionaries like this.
But could a similar function "printStruct()" be written for a struct in C? Not easily, because "C is a statically typed language without reflection": at runtime, a program has no information about the names and types of a struct's members.
Advanced programmers have come up with some "magical" solutions to this problem. But maybe it's better just to keep in mind that it's not possible in an easy way.
It is possible to iterate over arrays though. As described above, arrays can also hold pointers to different datatypes. But then again, the datatype each pointer in the array points to has to be known in advance when iterating over such an array.

C++ offers some run-time type information (RTTI), but as far as I know, it has traditionally lacked full reflection too. So iterating over the members of an unknown struct isn't straightforward there either.
In C, it isn't easily possible.

37. Memory Areas of C-programs

Let's take a look at the memory-layout of a C-program. When a C-program is started, it builds up the following memory-segments in cooperation with the operating-system:

Text segment:
  Contains the executable instructions. Usually read-only.
Data segment:
- Initialized Data Segment
  Contains global variables and static variables that are explicitly initialized with a value other than 0. It is divided into a read-write part and a read-only part.
- Uninitialized Data Segment
  Contains global variables and static variables that are initialized to 0 or not explicitly initialized at all (these are set to 0 at program start). Also called "bss"-segment (especially in the output of the command "size").
Stack:
  Contains the local variables of functions and additional information on functions, arguments and function-calls. Writeable.
Heap:
  Contains memory allocated by "malloc()". Also used by shared libraries. Writeable.

The shell-command "size" displays the sizes of the text- and data-segments.
The shell-command "nm" displays more information about memory-usage of variables and other symbols.

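A pointer variable declared inside a function is itself stored on the stack. The memory it points to can be anywhere: on the heap, if the address came from "malloc()", but just as well on the stack or in the data segment.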

The string literal of

char *a = "Hi";

is stored in the read-only-section of the initialized data segment or even in the (read-only) text-segment.
The string literal of

char a[] = "Hi";

is short for

char a[] = {'H', 'i', '\0'};

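and is therefore stored on the stack, if "a" is a local variable inside a function. The array holds its own, writeable copy of the characters.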

38. Managing More Complex Data Using an Array of Structs

When it comes to managing more complex data, the first idea may be to use arrays. But handling arrays of more than one dimension can quickly become difficult and frustrating, especially when trying to pass them to functions using pointers.
Another idea is a one-dimensional array that holds structs. If you think of the array as a database, each struct represents one data record in it. Here's an example:

#include <stdio.h>
#include <string.h>

#define DATA_ROWS    3
#define DATA_COLUMNS 2

struct Person {
    char name[30];
    char date[11];
};

void getData(struct Person *persons) {
    int i;
    /* 6 strings with a maximum length of 24 + '\0': */
    const char data[DATA_ROWS * DATA_COLUMNS][25] = {"John Smith", "12.03.1975",
                                                   "James Jones", "24.07.1987",
                                                   "Henry Newman", "15.07.1979"};
    for (i = 0; i < DATA_ROWS; i++) {
        strcpy(persons[i].name, data[i * DATA_COLUMNS]);
        strcpy(persons[i].date, data[i * DATA_COLUMNS + 1]);
    }
}

int main(void) {
    struct Person persons[DATA_ROWS];
    getData(persons);
    int i;
    for (i = 0; i < DATA_ROWS; i++) {
        printf("%s\n", persons[i].name);
        printf("%s\n", persons[i].date);
        puts("");
    }
    return 0;
}

Since I discovered class emulation (see above), I might just use that instead.
On the other hand: an array of (emulated) objects really just stores "pointers to structures" too. So it's basically the same idea, just seen from a different point of view.

39. Linked Lists (Dynamically Changeable Lists)

In languages like Perl or Python, there is the datatype "list". It can hold different kinds of data, like integers or strings. And these lists can easily be changed later on, for example extended or shortened at runtime. Consider this basic Python code:

#!/usr/bin/python3
# coding: utf-8

a = []
a.append(3)
a.append(6)
a.append("Hello")
print(len(a))
print(a)

a.pop()
print(len(a))
print(a)

C itself doesn't provide lists with these features. In C, such dynamically changeable lists are typically built as "linked lists". On Linux-systems, there's an implementation that can be included with "#include <sys/queue.h>". Examples of how to use that library can be found online. It's not so easy, it seems.

In the first terms of university-level programming classes, it is a common exercise to write such an implementation of a linked list. My approach can be found on my GitHub-page (doubly linked list). But that code is just experimental, don't use it in production.

40. Getting a Random Number

Here's an example of getting a random number in the range of 0 to 9.
There are several ways to achieve this in C. This is just a basic one for everyday use. Note that "rand() % 10" can favour some numbers very slightly, and that "rand()" isn't suitable for cryptography. If you need higher-quality randomness, you'll have to look for more sophisticated ways.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    int i;
    int r;
    /* srand() seeds the random generator. Only use it once:  */
    srand(time(NULL));
    for (i = 0; i < 20; i++) {
        r = rand() % 10;
        printf("%d\n", r);
    }
    return 0;
}

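On Linux, there are also the functions "random()" and "srandom()". They come from BSD and POSIX and were introduced because older implementations of "rand()" produced poor random numbers, especially in the lower-order bits. With the Linux C library, "rand()" uses the same generator as "random()", so there's little difference. The example above works also on Linux.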

41. Peek and Poke

On the old home computers, you could view the content of a certain memory address with BASIC's PEEK-command and write a certain value into a memory address with the POKE-command.
This is how you would do something similar in C:

#include <stdio.h>
#include <string.h>

typedef unsigned char     byte;
typedef unsigned long int addrtype;

byte my_peek(addrtype address) {
    byte *sysvar;
    sysvar = (byte *) address;
    return *sysvar;
}

void my_poke(addrtype address, byte value) {
    byte *sysvar;
    sysvar = (byte *) address;
    *sysvar = value;
}

addrtype getMemoryAddress(char *a) {
    /* A pointer can be converted to an integer with a cast.
       On Linux, "unsigned long" is wide enough to hold an address;
       "uintptr_t" from <stdint.h> would be the portable choice. */
    return (addrtype) a;
}

int main(void) {
    char a[] = "Hello";
    addrtype l = getMemoryAddress(a);
    printf("%lu\t%d\n", l, my_peek(l));
    my_poke(l, 66);
    printf("%lu\t%d\n", l, my_peek(l));
    printf("%s\n", a);
    return 0;
}

This is what happens here:
If you write "pointer = &intvariable;" (as example programs usually do), the pointer is set to the memory address of "intvariable" (wherever that may be). Dereferencing it with "*pointer" then yields the content of that address, which is the value of the variable.
If you write "pointer = (byte *) intvariable;" instead, the value of the variable itself is interpreted as a memory address, and the pointer is set to that address.
The cast is needed because an integer and a pointer are different types, and the compiler doesn't convert between them without being told to.
At the memory address, the pointer finds a single value in the range from 0 to 255 (a byte). This kind of value is usually represented by the datatype "unsigned char". That's why a "pointer to unsigned char" is used.

In the example, I changed the string "Hello" to "Bello" by poking directly into its memory address. That works, but I can achieve the same thing by setting a pointer to that memory address, without knowing its decimal or hexadecimal value. Like this:

#include <stdio.h>

int main(void) {
    char a[] = "Hello";
    *a = 'B';
    printf("%s\n", a);
    return 0;
}

That's probably the C-way to do it - slightly more high-level, but only a bit. Still very close to manipulating bytes directly.

The C cross-compiler for several old home computers called "CC65" offers "PEEK()" and "POKE()" in the corresponding file. There, they're implemented as "#define"-macros: the address is cast to a "pointer to unsigned char", and that pointer is dereferenced right away to read or write the byte at the address:

#define PEEK(addr)         (*(unsigned char*) (addr))
#define POKE(addr,val)     (*(unsigned char*) (addr) = (val))

42. Alternative Syntaxes

The "? :"-operator (also called conditional or ternary operator) can be used to write certain conditions in a shorter way. You may see that syntax in other people's code. I wouldn't use it, I'd write the condition in the ordinary way, even if it's longer:

#include <stdio.h>

int main(void) {
    int a = 2;
    int b = 10;
    int max;

    /* Ordinary syntax: */
    if (a > b) {
        max = a;
    } else {
        max = b;
    }
    printf("%d\n", max);

    /* With ? : */
    max = (a > b) ? a : b;

    printf("%d\n", max);
    return 0;
}

43. Working with Files

This is how you read from a text-file (called "test"):

#include <stdio.h>
#include <stdlib.h>
 
int main(void) {
    char fname[] = "test";
    FILE *fp = fopen(fname, "r");
    if (fp == NULL) {
        printf("\nCouldn't read file \"%s\". Aborting.\n\n", fname);
        exit(1);
    }
    int c;
    while ((c = getc(fp)) != EOF) {
        putc(c, stdout);
    }
    fclose(fp);
    return 0;
}

"EOF" and "stdout" are defined in "stdio.h". "EOF" means "end of file" (it's just an integer, of "-1" in my case).
"stdout" is the program's standard output-channel. When you send data to stdout, it is usually printed on the screen (in a terminal).
"getc()" reads in one character.
"FILE" is a typedef'ed name for a structure defined in "stdio.h". "fp" is a pointer that points to such a structure. "fopen()" returns such a pointer.

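And this is how you write to a text-file (called "test2"). The function "access()" from "unistd.h", which is used here to check whether the file already exists, is not part of standard C, but it is available on Linux and other POSIX-systems: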

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
 
int main(void) {
    char schiller[] = "Und blicket sie lange verwundert an.\n"
                      "Drauf spricht er: \"Es ist euch gelungen,\n"
                      "Ihr habt das Herz mir bezwungen,\n"
                      "Und die Treue, sie ist doch kein leerer Wahn ...\"\n";
    char fname[] = "test2";
    if (access(fname, F_OK) == 0) {
        printf("\nFile \"%s\" already exists. Aborting.\n\n", fname);
        exit(1);
    }
    FILE *fp = fopen(fname, "w");
    if (fp == NULL) {
        printf("\nCouldn't open file \"%s\" for writing. Aborting.\n\n", fname);
        exit(2);
    }
    /* strlen() returns a size_t, and it only needs to be called once: */
    size_t len = strlen(schiller);
    size_t i;
    for (i = 0; i < len; i++) {
        putc(schiller[i], fp);
    }
    fclose(fp);
    return 0;
}

44. Keyword Summary of Variable Datatypes

Variable Declarations:

Elementary datatypes:

int (= signed int; usually 32 bits wide)
short int, long int, long long int
float, double, long double
char
size_t
void

A variable of each of these datatypes (except void) can be declared as

signed, unsigned (integer types and char only)
const
extern (mostly used in header-files)
static (for local variables that keep their value between function calls, or to limit the visibility of a global variable or function to one file)
an array, for example "int a[10]" or "char b[10]"
a pointer, for example "int *" or "char *"; also "void *"

Special Declarations:

typedef
enum
struct, union


Author: hlubenow2 {at-symbol} gmx.net

datatype	format string
int	%d
long long int	%lld
unsigned int	%u
unsigned long int	%lu
size_t (an unsigned integer type, often "unsigned long")	%zu
ssize_t (signed counterpart of size_t, defined by POSIX)	%zd
hexadecimal	%x
octal	%o
float and double	%f
float, rounded to two decimal places	%.2f
float in exponential form	%e
char	%c
string (argument is the pointer)	%s
pointer-address, for example as returned by "malloc()"	%p
