Compiling a Program With a Single Source Code File:
to compile:
#this creates a program a.out:
gcc myprogram.c
#this creates a program myprogram:
gcc myprogram.c -o myprogram
#run:
./a.out
./myprogram
Compiling a Program With Multiple Source Code Files
gcc file1.c file2.c
Options
-o specify the executable name
-E preprocess only: send preprocessed output to STDOUT--no compile (try it!)
-M produce dependencies for make to stdout(voluble)
-C keep comments in output (used with -E above)
-H print the header dependency tree (you're including this file, which includes that file, etc.)
-dM Tell preprocessor to output only a list of macro defs in effect at end of preprocessing. (used with -E above)
Compiler Options
-c compile only. Use this when making libraries, for example. Your .c files will be compiled into object (.o) files, but the -c will prevent the compiler from trying to create a complete program.
-S send assembler output source to *.s
-w Suppress All Warnings
-W Produce extra warnings that -Wall does not cover (newer versions of gcc call this option -Wextra; sort of like -w in Perl)
-I Specify additional include file paths
-Wall Produce many warnings about questionable practices: implicit declarations, newlines in comments, questionable lack of parentheses, uninitialized variable usage, unused variables, etc.
-pedantic Warn on violations from ANSI compatibility (only reports violations required by
ANSI spec).
-O optimize (levels 0,1,2,3; plain -O is the same as -O1)
-O0 no optimization; this is the default
-O1 base optimizations, no autoinlines, no loops
-O2 performs additional optimizations except inline-functions optimization and loop optimization
-O3 also turns on inline-functions (so instead of calling a function, it just puts the function code right there) and loop optimization
-g include debug info
-save-temps save temp files (foo.i,foo.s, foo.o)
-print-search-dirs print the install, program, and libraries paths
-pg create profiling output for gprof (how much time you're spending in various functions)
-v verbose output (useful at times)
-nostartfiles skip linking of standard start files, like /usr/lib/crt[0,1].o, /usr/lib/crti.o, etc.
-static link only to static (.a=archive) libraries
-shared produce a shared library instead of an executable. (When linking against libraries, gcc already prefers shared libraries over static ones by default.)
Assembler Options
To use these, type gcc -Wa, followed directly by the assembler option (yes, keep that comma), for example: gcc -Wa,-ahl
-ahl generate a listing that includes high-level source and assembly
-as generate a listing of the symbol table
Linker Options
-l lib (default naming convention is lib[libname].a)
-L lib path (in addition to default /usr/lib and /lib)
-s strip final executable code of symbol and relocation tables
You need to include stdio.h:
#include <stdio.h>
Using putchar()
Use this for just printing single characters. This is getchar()'s sibling (and puts()'s cousin).
char a = 'A';
putchar(a);
Using puts()
Use this for printing strings. This is gets()'s sibling (and putchar()'s cousin).
char * str = "Leonardo da Vinci"; // strings use double quotes
puts(str);
Using printf
printf("this is a test.");
printf("I'm %d years old",myAge); // printing variables
printf("I'm 100 years " "old"); // adjacent strings are joined; note lack of '+' sign between strings
Formatting Type
%c Character
%d or %i Signed decimal integer
%f Decimal floating point
%o Unsigned octal
%s String of characters
%u Unsigned decimal integer
%x Unsigned hexadecimal integer
%X Unsigned hexadecimal integer (capital letters)
%p Pointer address
%ld Long decimal
%lx Long hexadecimal
%lo Long octal
%% Just prints a '%'
Flags
Flag: '-'
Means: Left-justify.
Flag: '+'
Means: The result will include a sign.
Flag: ' '
Means: The result will include a leading space (no sign) for positive values and a minus sign for negative values.
Flag: '0'
Means: The result will be zero-padded.
Flag: '#'
Means: Adds an initial 0 for octal values, an initial '0x' or '0X' for hex values, and a decimal point for floating point values (even if there are no digits after the decimal point).
Notice that the [width] param above is the minimum width. So what happens if you want to print something exactly that width? For example, suppose you want to print only 12 characters of a string. What do you do?
You would use this for that:
%12.12s
This prints a string 12 chars wide and LIMITS it to 12 chars.
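As a rough sketch of how width and precision work together, the hypothetical helper below (format_cell is not part of these notes) uses snprintf, which formats like printf but writes into a buffer, so the result can be inspected. The brackets are only there to make the padding visible.

```c
#include <stdio.h>

/* Hypothetical helper: format s into out as a cell that is exactly
   12 characters wide, wrapped in brackets so the padding is visible. */
void format_cell(char *out, size_t out_size, const char *s)
{
    /* width 12 pads short strings, precision .12 truncates long ones */
    snprintf(out, out_size, "[%12.12s]", s);
}
```

Short strings are padded on the left here; adding the '-' flag (%-12.12s) would pad on the right instead.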
// note no semicolon at the end:
#include <stdio.h> // include a system header file
#include "myhed.h" // include one of your own header files
Basic Format
[return type] [function name] ( [params] )
{
....
}
void butler()
{
}
When you create a function, you also need to create a function prototype. Look at the 'Function Prototypes' entry for more information.
If you absolutely, positively want to make sure that your function doesn't actually modify any of the values passed in, declare those parameters as constants:
void butler (const int name, const int age)
{
// you can't change the values of name or age here
}
Note that this doesn't mean that the values you are passing in have to be constants, it just means that they are treated as constants in the function.
You must use single quotes when assigning to a char. Otherwise C thinks you're trying to assign a string.
char myChar = 'A';
Arrays in C are closely tied to pointers: in most expressions, an array name is converted to a pointer to the first element in the array.
You can have two-dimensional, three-dimensional, four-dimensional (and so on) arrays in C.
// an array of chars (a string)
char name[40];
// an array of 100 ints
int hundred[100];
// initializing the array
char name[40] = {'E','n','i','d'};
// if you don't specify an array length, it is automatically as long as the # of elements it's initialized with
// (this one has no room for a terminating '\0', so it is not a valid string):
char name[ ] = {'E','n','i','d'};
// two-dimensional array
// int myarray[rows][columns]
int myarray[100][100];
// initializing (only the first dimension may be left empty):
int myarray[][5] =
{
{0,1,2,3,4},
{10,11,12,13,14},
{20,21,22,23,24}
};
Initialize Array To Zero
If you initialize only some of the elements in an array, C will automatically set the rest of the elements to zero. So if you want to initialize all the elements in an array to zero, you just need to set the first one to zero:
int myarray[100] = {0}; // all 100 elements are now zero.
Getting The Size Of An Array
This is a convoluted operation in C. Here's how you do it:
int foo[10]; // declare the array
int foo_size = sizeof(foo)/sizeof(*foo); // get the size
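One caveat with the sizeof trick above: it only works where the real array is visible. A minimal sketch, assuming a hypothetical ARRAY_LEN macro and sum_ints function (neither is from the original notes), shows the usual pattern of measuring the array at the call site and passing the length along:

```c
#include <stddef.h>

/* Number of elements in a real array (not a pointer). */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Inside this function, values is only a pointer, so sizeof(values)
   would give the size of a pointer. The length has to be passed in. */
long sum_ints(const int *values, size_t count)
{
    long total = 0;
    size_t i;
    for (i = 0; i < count; i++)
    {
        total += values[i];
    }
    return total;
}
```

A call would then look like sum_ints(foo, ARRAY_LEN(foo)), with foo declared as an array in the calling function.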
Pointers To Arrays
Since an array name is converted to a pointer to the first element in the array, you don't use the '&' operator to get that address...you just use the array name. Example:
int test[100];
test[1] = 444;
int * hello = test;
hello++;
printf("%d",*hello);
This will print '444'.
You can't assign pointers to arrays though. That is, suppose we have a function copy_uppercase that returns a pointer to an array. Then:
// this is fine, a pointer assigned to a pointer
char *array2;
array2 = copy_uppercase(array);
// this is not fine, a pointer assigned to an array
char array2[100];
array2 = copy_uppercase(array);
sizeof returns the size of a variable (or a type) in bytes.
int myInt = 400000;
int size = sizeof(myInt); // size now probably holds '4'
You use char arrays to represent Strings in C. For example:
char name[100]; // this is a string
There are a lot of limitations on what you can do with a string. In particular, you cannot copy one string to another with the assignment operator, so although you can do this:
char name[100] = "test";
You cannot do this:
char name[100];
name = "test";
Instead, you have to do this:
#include <string.h>
char name[100];
strcpy(name,"test");
Strings in C are always stored with the terminating null character '\0'. This character marks the end of the string. This means that the array must always have at least one more cell than the number of characters to be stored. For example, the "name" string below uses 12 cells: 11 for the characters and 1 for the '\0':
char name[100] = "ENID BLYTON";
int len = strlen(name); // len now contains '11' (strlen needs string.h and does not count the '\0').
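The "one more cell" rule can be checked before copying. This is a minimal sketch with a hypothetical helper, copy_if_fits, which is an illustration only and not a standard library function:

```c
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits, counting the
   terminating '\0'. Returns 1 on success, 0 if dst is too small. */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1; /* +1 for the '\0' */
    if (needed > dst_size)
    {
        return 0;
    }
    strcpy(dst, src);
    return 1;
}
```

With this, "ENID BLYTON" fits into a char array of 12 cells but not into one of 11.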
You need to include stdio.h:
#include <stdio.h>
General
Use the scanf function.
Note the '&' in this code:
scanf("%d",&income);
You don't give scanf a variable, but rather a pointer to a variable, so that scanf can store the value it reads there.
The name of a character array (string) already acts as a pointer to its first element, so don't add the '&' for strings.
scanf("%s",&name); // incorrect!
scanf("%s",name); // correct
scanf returns the number of values read. Here are some examples. Suppose you have this code:
int len = scanf("%d %d",&num,&num2);
Here's the value of len for the following inputs:
123 123 // 2
asd // 0, because we've specified that we're expecting integers.
So this is a good way to test if the user has given you the input you were looking for.
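To make the return value easy to try out, the sketch below uses sscanf, which behaves like scanf but reads from a string instead of the terminal. The helper name read_two_ints is an assumption made for this example.

```c
#include <stdio.h>

/* Hypothetical helper: try to read two integers from line.
   Returns how many values were converted (0, 1 or 2). */
int read_two_ints(const char *line, int *first, int *second)
{
    int converted = sscanf(line, "%d %d", first, second);
    return converted < 0 ? 0 : converted; /* treat EOF (empty input) as 0 */
}
```

A caller can compare the result with 2 and ask the user again when it is smaller.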
Single Characters
getchar() returns the next character from input:
int ch = getchar(); // int rather than char, so that EOF can be told apart from a real character
Strings
You can get strings using the gets() function:
char mystr[100];
gets(mystr);
But the problem with gets() is that it doesn't check how big the input is. If the input is longer than the array, it overflows into memory that's not yours, and someone can exploit this. gets() was removed from the language in C11. This is why you should use fgets instead of gets:
fgets()
fgets() takes three parameters:
[char * variable][max # of characters to read][which file to read (to read from the terminal, use stdin)]
So:
char name[50]; // an array, so there is actually room for the characters
fgets(name,50,stdin);
The only problem with fgets is that it keeps the newline at the end of the string. So you have to remove that before you can use it. Here's how you remove it:
#include <string.h> // this is needed for strchr
char name[50];
char * find;
fgets(name,50,stdin);
find = strchr(name,'\n');
if (find)
{
*find = '\0';
}
strchr() returns a pointer to the newline character. We set that character to null. Of course, this only works when you have only one newline character in your string and it's at the end of your string. And that's exactly how it will be when you use fgets.
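A commonly used alternative to the strchr approach is strcspn, which needs no separate null check. The sketch below is one possible way to wrap it; strip_newline and read_line are hypothetical names chosen for this example.

```c
#include <stdio.h>
#include <string.h>

/* Cut the string at the first newline, if there is one. */
void strip_newline(char *s)
{
    /* strcspn returns the index of the first '\n', or the length of s
       when there is none, so this is safe in both cases. */
    s[strcspn(s, "\n")] = '\0';
}

/* Hypothetical helper: read one line from in without the newline.
   Returns 1 if a line was read, 0 on end of file or error. */
int read_line(FILE *in, char *buf, int buf_size)
{
    if (fgets(buf, buf_size, in) == NULL)
    {
        return 0;
    }
    strip_newline(buf);
    return 1;
}
```

To read from the terminal, pass stdin as the first argument: read_line(stdin, name, sizeof name).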
In an assignment, the left side is called an Lvalue and the right side is called an Rvalue. For example, in this:
caryear = 2000;
'caryear' is the Lvalue and '2000' is the Rvalue. Technically, 'caryear' is the modifiable Lvalue. This distinction is made because not all Lvalues can have their values changed. Those that can are called modifiable Lvalues.
Casting is how you convert from one type to another. Here's an example of converting a char to an int:
char myc = 'A';
int myi = (int) myc; // myi now contains '65'
In C,
any nonzero value = true and
0 = false.
You can use ints, but C also has another type:
_Bool.
You use it like so:
_Bool mybool = 1;
The advantage of using a _Bool is if you assign any nonzero numeric value to it, it's automatically set to 1.
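A short sketch of that conversion follows. It also uses stdbool.h, which since C99 provides the friendlier names bool, true and false for the same type. The function names are made up for this example.

```c
#include <stdbool.h>

/* Converting to _Bool turns any nonzero value into 1. */
int as_bool(int value)
{
    _Bool b = value;
    return b;
}

/* With stdbool.h (C99), the same type can be written as bool. */
bool is_even(int n)
{
    return n % 2 == 0;
}
```

So as_bool(42) and as_bool(-3) both give 1, while as_bool(0) gives 0.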
// note that in C89 the variable has to be declared before the loop
// (C99 and later also allow: for (int i=0;i<10;i++)).
int i;
for (i=0;i<10;i++)
{
}
// you can also use chars instead of ints
// this prints all lowercase letters from a to z.
char i;
for (i='a';i<='z';i++)
{
printf("%c\n",i);
}
Use the isalpha() function (isalpha(), isdigit(), tolower() and toupper() are all declared in ctype.h):
if (isalpha(myChar))
{
//do something if it's a letter
}
Use the isdigit() function:
if (isdigit(myChar))
{
//do something if it's a number
}
To change case, use tolower() or toupper():
myC = 'A';
lwC = tolower(myC);
upC = toupper(myC);
Or if you wanted to be unnecessarily obscure / cool, you could use these functions written by Tom Duff of Duff's Device fame:
int lower(int c){
return 'A'<=c && c<='Z'?c+'a'-'A':c;
}
int upper(int c){
return 'a'<=c && c<='z'?c+'A'-'a':c;
}
The general form is:
[expression] ? [do this if true] : [do this if false]
Example:
x = (y < 0) ? -y : y;
This is the same as:
if (y < 0)
{
x = -y;
}else
{
x = y;
}
continue;
Skips the rest of this iteration, starts the next one.
break;
Ends the loop.
When you create a new function, you need to put a function prototype at the top of the .c file (after #includes, before main). This lets the compiler know what functions you are declaring in this file. Example:
#include <stdio.h>
void butler(int num);
int main()
{
printf("Test.");
butler(1); // the call has to match the prototype: one int argument
}
void butler(int num)
{
printf ("butler %d here.",num);
}
Functions With No Arguments
For functions with no arguments, put "void" in the function prototype. Example:
void butler (void);
void butler(void)
{
//function code here
}
Functions With Variable Arguments
Use (...). Example:
int printit(char *,...);
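The body of such a function reads its extra arguments with the macros from stdarg.h. Here is a minimal sketch; sum_all is a hypothetical function (not the printit declared above), and it assumes the caller passes the number of extra arguments first, since C gives the function no other way to know how many there are.

```c
#include <stdarg.h>

/* Hypothetical example: add up count ints passed after the first argument. */
int sum_all(int count, ...)
{
    va_list args;
    int total = 0;
    int i;

    va_start(args, count);          /* start after the last named parameter */
    for (i = 0; i < count; i++)
    {
        total += va_arg(args, int); /* fetch the next argument as an int */
    }
    va_end(args);
    return total;
}
```

For example, sum_all(3, 1, 2, 3) adds up the three ints that follow the count.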
You can create constants in two ways:
Manifest Constants
// note that we don't end these in a semi-colon.
#define BEEP '\a'
#define TEE 'T'
The const Modifier
const int MONTHS = 12;
If you put main() in one file and your function definitions in a second file, the first file still needs the function prototypes. Rather than type them in each time you use the function file, you can store the function prototypes in a header file. That is what the standard C library does, placing I/O function prototypes in stdio.h and math function prototypes in math.h, for example. You can do the same for your function files.
Also, if you use the C preprocessor to define constants (using '#define'), you should put those in a header file so you don't have to retype the constants for all your files.
Header files end in a .h extension. You include them in your .c files using:
#include "myheader.h"
...assuming that the header file is in the same folder as your .c file. Note that the double-quote syntax is only for your own headers. For system headers, use angle brackets instead:
#include <stdio.h>
The & operator gives you the address where a variable is stored. If pooh is the name of a variable, &pooh is the address of the variable.
Example:
int pooh = 20;
int* loc = &pooh;
printf("loc is %p\n",loc);
Here, loc is a pointer. See the entry on pointers for more info.
A pointer is a variable whose value is a memory address.
Creating a Pointer
Use the * operator:
int bah = 100;
int * bah_ptr = &bah;
float fah = 200.00;
float * fah_ptr = &fah;
Getting The Value Stored In The Memory Address That The Pointer Holds
Again, use the * operator:
int bah = 100;
int * bah_ptr = &bah;
int value_of_bah = *bah_ptr;
That's the same as just saying:
int bah = 100;
int value_of_bah = bah;
Using Pointers To Communicate Between Functions
int x = 5, y = 10;
interchange(&x,&y);
void interchange(int * u, int * v)
{
int temp;
temp = *u;
*u = *v;
*v = temp;
}
Printing Pointers
You use the '%p' type to print pointers:
int pooh = 20;
int* loc = &pooh;
printf("loc is %p\n",loc);
Include string.h:
#include <string.h>
Use strcmp():
char * str1 = "apple";
char * str2 = "banana";
int cmp = strcmp(str1,str2);
In the above example, cmp will be 0 if the strings are the same, negative if str1 sorts before str2, and positive if str1 sorts after str2.
Include string.h:
#include <string.h>
Use strcpy(). Notice how we're using an array with a specific size for 'copyto'. Declaring an array allocates storage space for data; declaring a pointer only allocates storage space for one address.
char * copyfrom = "hello";
char copyto[100];
strcpy(copyto,copyfrom);
Use stdio.h:
#include <stdio.h>
sprintf is exactly the same as printf, except it writes to a string instead of writing to a display. The first argument is the string to write to, and the rest is just the same as printf:
char name[100];
sprintf(name,"my name is %s.",MY_NAME);
Include stdlib.h:
#include <stdlib.h>
Use atoi() (ASCII to integer) to convert a string to an integer:
char * anum = "1";
int num = atoi(anum);
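atoi() gives no way to tell bad input from a real 0. When that matters, strtol() (also in stdlib.h) can be used instead. The following is a minimal sketch; parse_int is a hypothetical helper name, and rejecting trailing characters is a design choice of this example rather than something strtol requires.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: convert text to an int.
   Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0')
    {
        return 0; /* no digits at all, or trailing junk */
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
    {
        return 0; /* does not fit in an int */
    }
    *out = (int) value;
    return 1;
}
```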
If variables are declared in a function, they are local variables. If they are declared outside a function (usually before the first function), they are global variables:
int globalVar = 10;
int main()
{
int localVar = 5;
}
Global variables are also called "variables with file scope".
Local variables are also called "variables with block scope".
You can specify if you want a global variable to be available to all the files in your program, or just this file. You do that by specifying its linkage.
There are three types of linkage:
external linkage,
internal linkage and
no linkage.
External Linkage
External linkage means that a variable is available in all files in a multifile program. Global variables (aka variables with file scope) have external linkage by default.
If you are using a variable that is defined in another file, you need to mark it as external using the extern keyword:
int myExt; // a variable with external linkage created in this file
extern int yourExt; // a variable with external linkage created in some other file
int main(void) {...}
Internal Linkage
Internal linkage means that a variable is only available in the current file in a multifile program. If you don't want the default external linkage on a global variable, you can specify internal linkage using the static keyword:
static int internal_int = 10;
int main(void)
{
...
}
No Linkage
All local variables have no linkage.
A C variable has one of the following two storage durations:
static storage duration
automatic storage duration.
Static Storage Duration
A variable with static storage duration exists throughout program execution. A global variable has static storage duration.
Automatic Storage Duration
A variable with automatic storage duration exists only while it's needed. A local variable has automatic storage duration. If our butler() function defines a "napkins" variable, memory is allocated for napkins when butler is called, and the memory is freed when the function exits.
You can mark variables explicitly auto using the auto keyword:
auto int napkins;
But this is not required. You only need to do it if you want to make it really obvious to someone that this variable is an automatic variable.
Static Variables With Block Scope
If you want a local variable to have static storage, you just mark it as static:
void butler()
{
static int napkins = 0;
napkins++;
printf("%d\n",napkins);
}
int main()
{
butler();
butler();
butler();
}
This code will print out:
1
2
3
Because we are incrementing napkins with each call to butler and that value is stored in memory. Note that even though we have this line:
static int napkins = 0;
napkins is not reset to 0 each time. It is just initialized once and then after that we just ignore that line.
Variables are normally stored in computer memory. With luck, register variables are stored in the CPU registers or, more generally, in the fastest memory available, where they can be accessed and manipulated more rapidly than regular variables. To declare register variables:
register int quick;
Three Important Things To Remember About Register Variables:
1. When you mark a variable as "register", it's a request to the compiler, and not an order. It's up to the compiler to weigh the pros and cons of making it a register. If it doesn't make it a register, it'll just be a regular variable.
2. If it is made a register, it will be faster than a regular variable.
3. Regardless of whether or not it is actually made a register, when you mark a variable "register" you can not get the address of that variable.
We have to include stdlib.h:
#include <stdlib.h>
This allows you to allocate memory dynamically as the program runs.
malloc() takes one argument: the number of bytes of memory you want. It then finds a suitable block of free memory and returns the address of the first byte of that block.
For example, here we are creating an array of doubles:
double * ptd;
ptd = (double *) malloc (30 * sizeof(double));
// since the block can be indexed like an array, you can now use it like this:
ptd[0] = 1234;
ptd[1] = 5678;
malloc() will return the null pointer if it couldn't find the required memory.
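Since malloc() can fail, its result should be checked before the memory is used. A minimal sketch, assuming a hypothetical helper called make_doubles (the cast on malloc's result is optional in C and is left out here):

```c
#include <stdlib.h>

/* Hypothetical helper: allocate an array of count doubles, all set to 0.
   Returns NULL if the memory could not be allocated. */
double *make_doubles(size_t count)
{
    size_t i;
    double *ptd = malloc(count * sizeof(double));
    if (ptd == NULL)
    {
        return NULL; /* the caller has to handle the failure */
    }
    for (i = 0; i < count; i++)
    {
        ptd[i] = 0.0;
    }
    return ptd;
}
```

The caller checks the returned pointer against NULL before indexing into it.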
We have to include stdlib.h:
#include <stdlib.h>
When you allocate some memory with malloc(), you should free it with free(). free() takes a pointer to a block of memory allocated by malloc(), and frees up that memory:
double * ptd;
ptd = (double *) malloc(30 * sizeof(double));
// do stuff with ptd here
free(ptd);
Suppose you keep allocating memory in a loop but forget to free it:
// BAD CODE
int i;
for (i=0;i<100;i++)
{
double * ptd;
ptd = (double *) malloc(3000 * sizeof(double));
}
The ptd variable only has scope within the for loop, so after each iteration the address is lost and you can no longer free that block. This is called a memory leak and it's not a good scenario.
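One way to repair the loop above is to free each block before its pointer goes out of scope. This is a sketch only; the function name allocate_and_release and its return value are assumptions added so the behaviour can be checked.

```c
#include <stdlib.h>

/* Allocate and release a block in each iteration.
   Returns how many allocations succeeded. */
int allocate_and_release(int rounds)
{
    int i;
    int succeeded = 0;
    for (i = 0; i < rounds; i++)
    {
        double *ptd = malloc(3000 * sizeof(double));
        if (ptd == NULL)
        {
            break; /* out of memory: stop instead of using a null pointer */
        }
        ptd[0] = (double) i; /* do stuff with ptd here */
        succeeded++;
        free(ptd);           /* release it before the pointer goes out of scope */
    }
    return succeeded;
}
```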
From Wikipedia:
"A memory leak or leakage in computer science is a particular type of memory consumption by a computer program where the program is unable to release memory it has acquired."
Command-line arguments are passed as arguments to the main function. The standard way of handling them is to declare main like this:
int main(int argc, char * argv[]) {
// some code
}
argc contains the # of command-line arguments, and argv is an array containing the command-line arguments. Every program is always passed in at least one argument — the name of the program itself. This is stored in argv[0]. So to find out if the user passed in any command-line arguments, you have to do this:
if (argc > 1)
{
//user passed in command-line args
}
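As an illustration of walking through argv, here is a small sketch that looks for one particular argument. find_arg is a hypothetical helper, and real programs with many flags would more likely use getopt.

```c
#include <string.h>

/* Hypothetical helper: return the index of the first argument equal to
   wanted, or -1 if it is not there. The search starts at 1 because
   argv[0] is the program name. */
int find_arg(int argc, char *argv[], const char *wanted)
{
    int i;
    for (i = 1; i < argc; i++)
    {
        if (strcmp(argv[i], wanted) == 0)
        {
            return i;
        }
    }
    return -1;
}
```

Inside main, a call such as find_arg(argc, argv, "-v") would tell you whether the user passed -v.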
Use exit() anywhere, anytime, to stop the program.
If it was a normal termination, pass in 0.
If it was an abnormal termination, pass in a non-zero value.
exit(0); // terminated normally
exit(1); // terminate with an error.
Use the fopen() function:
FILE * fp;
fp = fopen("test.txt","r");
The general format is:
fopen([file to open],[mode to open in]);
See the list of modes below.
fopen returns a file pointer, which the other I/O functions can then use to specify the file.
fopen returns the null pointer if it cannot open the file.
File Modes in fopen()
Mode: "r"
Meaning: Open a text file for reading.
Mode: "w"
Meaning: Open a text file for writing, deleting whatever is currently in the file. Create the file if it doesn't exist.
Mode: "a"
Meaning: Open a text file for writing, appending to the end of an existing file, or creating the file if it doesn't exist.
Mode: "r+"
Meaning: Open a text file for reading and writing.
Mode: "w+"
Meaning: Open a text file for reading and writing, deleting whatever is currently in the file. Create the file if it doesn't exist.
Mode: "a+"
Meaning: Open a text file for reading and writing, appending to the end of an existing file, or creating the file if it doesn't exist.
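Because fopen() returns the null pointer on failure, the result should be checked before the file pointer is used. A minimal sketch, with count_lines as a hypothetical helper that opens a file in "r" mode:

```c
#include <stdio.h>

/* Hypothetical helper: count the lines in a text file.
   Returns -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");
    long lines = 0;
    int ch;

    if (fp == NULL)
    {
        return -1; /* fopen failed, for example because the file is missing */
    }
    while ((ch = getc(fp)) != EOF)
    {
        if (ch == '\n')
        {
            lines++;
        }
    }
    fclose(fp);
    return lines;
}
```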
Use the fclose() function. This function takes one argument, a file pointer. fclose() returns 0 if successful, EOF if not.
if (fclose(fp) != 0)
{
printf("error closing file.\n");
}
Use strcat():
strcat(name," is my name");
Here, " is my name" is added to the end of the name string. It is your job to make sure that the name char array has enough empty spots to hold " is my name".
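One way to do that job is to check the sizes before calling strcat(). The sketch below assumes a hypothetical helper, append_if_fits, that is given the total size of the destination array:

```c
#include <string.h>

/* Hypothetical helper: append src to dst only if the result, including
   the terminating '\0', fits in dst_size bytes. Returns 1 on success. */
int append_if_fits(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    size_t extra = strlen(src);
    if (used + extra + 1 > dst_size)
    {
        return 0; /* not enough empty spots left */
    }
    strcat(dst, src);
    return 1;
}
```

When the helper returns 0, the destination string is left unchanged.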
You can use fread(), fscanf() or fgets() to read from a file.
fread()
Use this after you have a FILE pointer. Use fopen() to get the FILE pointer (not open(), which returns a file descriptor).
fread takes four arguments:
The first is a pointer to the buffer to read into - this could be a char pointer that points to allocated memory, or a char array.
The second argument is the size in bytes of each element to read.
The third argument is the number of elements to read. The buffer needs room for size * number bytes.
The last argument is the FILE pointer.
Example:
char *buf = (char *) malloc (20 * sizeof(char));
bytes_read = fread(buf, 1, 20, fp);
fread() returns the # of elements it read. Since each element here is 1 byte, that is also the # of bytes read. It can be less than you asked for at the end of the file or on an error.
fscanf()
Works just like scanf, except the first argument is a file pointer:
FILE *fp;
fp = fopen("test.txt","r");
fscanf(fp,"%s",somestring);
fgets()
fgets() takes three parameters:
[char * variable][max # of characters to read][file pointer]
fgets(somestring,50,fp);
You can use fwrite(), fprintf() or fputs() to write to a file.
fwrite()
This function takes the same four arguments as fread:
The first is a pointer to the data you want to write into the file, such as a char * or a char array.
The second is the size of each element; for a char that is 1.
The third is the number of elements (here, characters) to write.
Finally, the last argument is the FILE pointer to the file to write to.
Example:
fwrite(buf, 1, strlen(buf), fp);
fwrite() returns the # of elements successfully written; with an element size of 1, that is the # of bytes.
fprintf()
Works just like printf, except the first argument is a file pointer:
FILE *fp;
fp = fopen("test.txt","w");
fprintf(fp,"hello there!");
fputs()
fputs() takes two parameters:
[char * variable][file pointer]
fputs(somestring,fp);
Structs in C are like very, very weak objects. Not even that — a struct is just a way to organize data.
Structure Declaration
Suppose you manage a bookstore and want to keep track of each book's name, author and popularity on a scale from 1-10, 10 being the highest. Here's the struct you would probably use:
struct book {
char title[100];
char author[100];
int popularity;
}; // note the semicolon after the brace.
That's called a structure declaration. This works the same as variables; if you declare a structure in a function, you can only use it in that function. If you declare it outside any functions, you can use it anywhere in your file.
Using Structs
And that's it! When you want to create a new book variable use this:
struct book mybook;
strcpy(mybook.title,"Programming Perl"); // title is a char array, so copy into it (needs string.h)
mybook.popularity = 10;
Here's how to initialize a structure:
struct book wallsbook = {
"Programming Perl",
"Larry Wall et al",
10
}; // note the semicolon
Arrays of structs
struct book library[100];
Pointers To Structures
struct book wallsbook;
struct book * ptr = &wallsbook;
strcpy((*ptr).title,"Programming Perl");
strcpy(ptr->title,"Programming Perl"); // another way of writing the same thing
Copying Structs
This is perfectly legal, even though you can't do the same thing with arrays:
struct book first_book = {"Sweet William","Richmal Crompton",10};
struct book copy_of_first_book = first_book;
Structs As Return Values
struct book getinfo(void)
{
// return a struct book somewhere here
}
When you are passing structs to a function, should you pass-by-reference or pass-by-value?
Pass-by-reference results in code that's a little more convoluted, and you need to make sure you don't accidentally modify something you don't want to (maybe declare the argument const to avoid that).
But pass-by-value wastes time and memory especially if it's a large struct and you only want to access one or two fields.
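The two options look like this in code. This is only a sketch: the book struct is repeated so the example is self-contained, and the function names and the cutoff of 8 are made up for illustration.

```c
struct book {
    char title[100];
    char author[100];
    int popularity;
};

/* Pass by reference: only an address is copied, and const makes the
   compiler reject accidental writes through the pointer. */
int is_popular(const struct book *b)
{
    return b->popularity >= 8; /* 8 is an arbitrary cutoff for this sketch */
}

/* Pass by value: the whole struct (over 200 bytes here) is copied. */
int is_popular_by_value(struct book b)
{
    return b.popularity >= 8;
}
```

The first is called as is_popular(&mybook), the second as is_popular_by_value(mybook).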
Notice that when we're using strings inside a struct, they're generally character arrays:
char author[100];
and not pointers-to-char:
char * author;
You can use pointers-to-char, but a pointer only stores an address and does not allocate any space for the characters, so you should not unless you know what you're doing.
The entire set of information held in a structure is termed a record, and the individual items are fields. This probably sounds familiar to database people.
Use fwrite to write a struct to a file and fread to read a struct from a file:
struct book wallsbook = {
"Programming Perl",
"Larry Wall et al",
10};
fwrite(&wallsbook,sizeof(struct book),1,some_file_pointer);
This code will write the wallsbook struct out to whatever file is pointed to by 'some_file_pointer'. The general form of the fwrite statement is this:
fwrite([address of struct],[size of struct],[number of structs to write (since we specified the size of one struct as the second argument, this is just 1. 99% of the time, it will be 1.)],[file pointer]);
fread() takes the exact same parameters and reads the contents of a file into a struct.
fread() and fwrite() are for binary files, so make sure you specify binary in the file mode (whatever mode you choose, just tack on a 'b' at the end...without any spaces).
Reading Many Structs From a File
What if you had to read more than one struct from the file? One way to do so is to use a while loop:
while(fread(&library[count],size,1,file_pointer)==1)
{
// do something with &library[count]
count++;
}
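To tie writing and reading together, here is a sketch that writes an array of structs to a temporary file and reads them back one at a time. It assumes tmpfile() is usable on your system (it opens a temporary binary file), and roundtrip_books is a hypothetical name for this example.

```c
#include <stdio.h>

struct book {
    char title[100];
    char author[100];
    int popularity;
};

/* Hypothetical helper: write count books to a temporary binary file,
   read them back into out, and return how many were read. */
size_t roundtrip_books(const struct book *in, struct book *out, size_t count)
{
    size_t read_count = 0;
    FILE *fp = tmpfile(); /* opened in binary update mode ("wb+") */

    if (fp == NULL)
    {
        return 0;
    }
    if (fwrite(in, sizeof(struct book), count, fp) == count)
    {
        rewind(fp); /* go back to the start before reading */
        while (read_count < count &&
               fread(&out[read_count], sizeof(struct book), 1, fp) == 1)
        {
            read_count++;
        }
    }
    fclose(fp); /* a tmpfile is removed automatically when closed */
    return read_count;
}
```

Note that a file written this way is only safe to read back with a program that uses the same struct layout.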
You can use enums to declare symbolic names to represent integer constants.
For example:
// enum declaration...we now have a new variable type enum spectrum (really just an int)
enum spectrum {red,orange,yellow,green,blue,violet};
// some enum spectrum variables
enum spectrum color1 = red;
enum spectrum color2 = blue;
if (color1==color2)
{
// do something.
}
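Enums also work well with switch statements. The sketch below maps each constant to a printable name; color_name is a hypothetical helper, and the enum is repeated so the example is self-contained.

```c
enum spectrum {red, orange, yellow, green, blue, violet};

/* Hypothetical helper: map an enum spectrum value to its name. */
const char *color_name(enum spectrum color)
{
    switch (color)
    {
    case red:    return "red";
    case orange: return "orange";
    case yellow: return "yellow";
    case green:  return "green";
    case blue:   return "blue";
    case violet: return "violet";
    }
    return "unknown"; /* a value outside the enum was passed in */
}
```

By default the constants are numbered from 0, so red is 0 and violet is 5.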
typedef enables you to create your own name for a type.
For example, suppose you want to refer to ints as MONKEYs in your program. Here's how:
// new typedef declared
typedef int MONKEY;
MONKEY a1, a2; // really just ints
You can also typedef structs:
typedef struct book {int a; int b;} BOOK;
typedef struct {int a; int b;} BOOK; // alternative without the "book" tag (use one form or the other, not both)
BOOK wallsbook; // notice we dont have to write "struct book wallsbook" anymore
Use getopt:
#include <stdio.h>
#include <unistd.h>
// this tests getopt. Use it with some flags!!
extern int optopt;
extern char * optarg;
int main(int argc, char * argv[])
{
int opt; // getopt returns an int, so that -1 can be told apart from any flag character
if (argc > 1)
{
// the third argument in getopt specifies which flags you can use.
// so for example, here we're using the 'h','f', and 'v' flags.
// the 'f' is followed by a colon, which means it will be followed
// by an argument for 'f'. Look in the switch statement to see how we deal with this.
// every time we call getopt, it returns the next flag, or returns -1 if it's out of flags.
while ((opt= getopt(argc,argv,"hf:v"))!=-1)
{
switch (opt)
{
case 'h':
printf("just put some flags, dammit!");
break;
case 'v':
printf("Made by Aditya Bhargava. All Rights ––– reeee served!");
break;
// if a flag takes an argument, that argument is stored in the
// optarg variable.
// Look at how we first had to predeclare the external variable
// optarg at the top of the file.
case 'f':
printf("Here's the file you passed in: %s",optarg);
break;
// if getopt sees a flag it doesn't recognize, it passes in a '?'
// and stores the unrecognized flag in optopt.
// Look at how we first had to predeclare the external variable
// optopt at the top of the file.
case '?':
printf("I don't know what '%c' means. The two I know are 'v' and 'h'.",optopt);
break;
}
}
}else{
printf("Gimme some flags!");
}
}
Use getenv():
char * home = (char *) getenv("HOME");
printf("your home directory is: %s\n",home);
You need to include fcntl.h when using these:
#include <fcntl.h>
These modes are only used when opening a file with open(). You will most likely want to use fopen() instead. But just in case, here are your choices:
Mandatory Modes
You need to specify one of these modes:
O_RDONLY: Open for read-only
O_WRONLY: Open for write-only
O_RDWR : Open for reading and writing
Optional Modes
In addition, you can specify one or more of these modes:
O_APPEND: Place written data at the end of the file.
O_TRUNC : Set the length of the file to zero, discarding existing contents.
O_CREAT : Create the file, if necessary, with permissions given in mode (see below).
O_EXCL : Used with O_CREAT, ensures that the caller creates the file. This protects against two programs creating the file at the same time. If the file already exists, 'open' will fail.
To use more than one of these modes at the same time, OR them together:
int in = open("jokes.txt",O_WRONLY | O_APPEND);
mode for O_CREAT
These modes are defined in sys/stat.h, so you need to include that before you can use these:
S_IRUSR - Read permission, owner.
S_IWUSR - Write permission, owner.
S_IXUSR - Execute permission, owner.
S_IRGRP - Read permission, group.
S_IWGRP - Write permission, group.
S_IXGRP - Execute permission, group.
S_IROTH - Read permission, others.
S_IWOTH - Write permission, others.
S_IXOTH - Execute permission, others.
The permissions the new file actually gets are these modes with the bits in the user's umask removed (see the umask section in the Unix basics for more info).
Example:
int in = open("jokes.txt",O_WRONLY | O_APPEND | O_CREAT, S_IRUSR | S_IWUSR);
What Is a Static Library?
A static library allows you to reuse code. If you have some functions you'd like to use over and over, make a static or a shared library out of them.
Static vs. Shared
What's the difference? Well,
static = sort of like pass-by-copy
shared = sort of like pass-by-reference
To see the difference, suppose you have a static library and a shared library, and each has a function 'hello'. When you use a static library and call 'hello' in your program, the code for the 'hello' function is physically copied into your program. This happens for each program that uses the function.
Advantages: It's a little more portable. You don't have to worry about missing libraries (dependencies), like with shared libraries (coming up).
Disadvantages: It increases file size because that function is copied in all files that use it. Not only that, it's loaded into memory several times since each program will load their version into memory.
Shared libraries by contrast just provide a link in the relevant files saying "hey computer, this is where you can find this function."
Advantages: Less space occupied in both hard disk and memory.
Disadvantages: This is the same problem as the problem of opening .exes made with Visual Basic. You get an error saying "so-and-so.dll" was not found. Guess what dll stands for? Dynamically linked library! And now you have to go and hunt for that dll online, and you wish that the developer had just added that source code into the .exe file (static library) instead of using a dll (shared library).
Creating a Static Library
Step 1: Create object files from the .c files you want included in the static library:
# here, test.c, test2.c and test3.c are the three files we want in our static library
gcc -c test.c test2.c test3.c
You should now have three new files, test.o, test2.o and test3.o in your folder.
Step 2: Use the object files and the 'ar' command to create the static library:
# here our library is called "lib.a".
ar crv lib.a test.o test2.o test3.o
Your static library is created! You should now have a file called 'lib.a' in your folder.
Using The Library In Your Program
If you have a function hello() in your library, you can just use it in any old .c file you wish. Then when you compile the .c file, just link the library with it. For example:
# here, lib.a is the library and main.c is the file that uses functions from the library
gcc main.c lib.a
To debug a program, you need to compile it with the '-g' flag:
gcc -g main.c -o main
You can debug programs compiled without this flag but it's nowhere near as helpful.
After your code is compiled (let's say it's compiled into an executable called 'main'), run the GNU debugger with your program:
gdb main
And the GNU debugger should start up. From here, use the following commands:
run
Run the program.
break 9
Set a breakpoint on line 9.
step
Step through the program one line at a time.
info break
See what breakpoints you have enabled.
disable break 1
Disable breakpoint #1 (NOT line 1).
print varname
Print the value of the variable varname.
[Enter]
Execute the last command again.
quit
Exit the debugger.
If you want to make sure something is true when your program runs, use assertions. For example, in this program you specify what number to divide 10000 by. Since the user should not be able to specify 0, we assert that the number isn't zero. If it is, assert will write some diagnostic info and abort (quit) the program:
#include <assert.h> // needed for assert()
int divideby(int num)
{
    assert(num!=0); // aborts with a diagnostic if num is zero
    return 10000/num;
}
Make is used to build software, so you don't have to keep track of what has changed, what hasn't, and what needs to be recompiled.
Targets And Dependencies
Make builds the target based on the dependencies of that target. Example:
target1.myapp: dep1.c dep2.c dep3.c
And at the top, we specify an 'all' target with our main app as the dependency:
all: target1.myapp
Rules
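You can tell make how to make files. A line with only a target and its dependencies, like this:
myapp: test1.c test2.c
has no command of its own, so make falls back on its built-in rules. Those only cover simple cases (GNU make, for example, knows how to build myapp from a myapp.c), so don't count on it running something like:
gcc -o myapp test1.c test2.c
To get that command, or to use a flag such as -g, you need to write a rule. Here's the rule we'd want: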
myapp: test1.c test2.c
	gcc -g -o myapp test1.c test2.c
NOTE: Every command line in a rule must start with a tab! This is critical. Without that tab, make will fail. You cannot use spaces instead.
Flags
-n: don't make, but print out what would be done.
-k: keep going after an error instead of stopping (stopping is the default).
-f: Specify which makefile to use for make. The two defaults are:
makefile
Makefile
Macros
Macros are like variables. They give a little flexibility to your makefile. Suppose you have a sourcefile called source.c and it's used all over your makefile. Now you need to rename it to mysource.c. Instead of doing find and replace, just use a macro:
# define the variable
SRC=mysource.c
# use it
myapp: $(SRC)
In the above example, $(SRC) just gets expanded to mysource.c when this runs. Notice we write:
SRC=mysource.c
not:
SRC="mysource.c"
No quotes.
Other Targets
You don't just have to specify your application as a target. Another popular one is 'clean':
clean:
	-rm *.o
So when you type
make clean
it deletes all the .o files in the directory. The hyphen at the beginning means "continue even if you got an error", which in this case means "even if there were no .o files to delete". Note that if you have this:
myapp: test1.c test2.c
clean:
	-rm *.o
and you say:
make clean
It will just delete the object files, it will NOT build myapp.
Suffixes
A suffix rule is a directive that applies rules and macros to generic suffixes.
Step 1:
Tell make about the suffix. Here we want to write a rule for .cpp files, so:
.SUFFIXES: .cpp
Then you tell make that you want to convert .cpp files to .o files:
.cpp.o:
Then, tell it the rule:
	$(CC) -xc++ $(CFLAGS) -I$(INCLUDE) -c $<
Here, $< is a built in suffix macro.
Built In Suffix Macros
$@: The full name of the current target
$?: A list of modified dependencies (a list of files newer than the target on which the target depends)
$<: The name of the first dependency. In a suffix rule, this is the source file the target is built from (e.g., the .cpp file)
$*: The name of the target file, WITHOUT its suffix (i.e., without the .c or .cpp, etc.)
Why would you use $< instead of $? ?
Use $< when the command works on one source file, as in the suffix rule above. Use $? when you want every dependency that is newer than the target.
kill -l
ps
# see all
ps -a
# detailed
ps -l
# you can get the signal number of the various signals by running 'kill -l'
kill -[signal number] [some pid]
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
// the handler must be declared before we register it.
// int sig is the number of the signal (see all numbers using the command 'kill -l')
void interrupted (int sig)
{
    // do something when we get the signal
}
int main(void)
{
    // the format for this call is:
    // signal(signal name,function to call when we get the signal)
    (void)signal(SIGINT,interrupted);
    pause(); // sleep until a signal arrives
    return 0;
}
Just catch the SIGCHLD signal.
From page 293 of "Advanced Programming in the UNIX Environment":
SIGCHLD
Whenever a process terminates or stops, the SIGCHLD signal is sent to the parent. By default, this signal is ignored, so the parent must catch this signal if it wants to be notified whenever a child's status changes. The normal action in the signal-catching function is to call one of the wait functions to fetch the child's process ID and termination status.
You can use locate to find the files:
locate stdio.h
These files will most likely be in /usr/include. Read the file with
less /usr/include/stdio.h
System calls (eg open, creat, read, lseek) return -1 when something goes wrong. Every time there's an error, the error code (that's what tells you what error it is) is stored in the global variable 'errno'. The file errno.h tells you about the various error codes possible. Here are the first 5 of them on my computer:
#define EPERM 1 /* Operation not permitted */
#define ENOENT 2 /* No such file or directory */
#define ESRCH 3 /* No such process */
#define EINTR 4 /* Interrupted system call */
#define EIO 5 /* Input/output error */
To print an error, you can use perror like so:
perror("here's the info about the error");
This prints out the text you give it plus a description of the error. So the above line would print something like this, depending on the error:
here's the info about the error: No such file or directory
// or...
here's the info about the error: Interrupted system call
perror does not exit the program after writing the error.
This is sort of like doing cd within a C program.
Returns:
-1 if error
0 if success
int chdir(const char * path);
//Example (needs <stdlib.h> for getenv and <unistd.h> for chdir):
char * home = getenv("HOME");
if (home != NULL) chdir(home); // getenv returns NULL if HOME is not set
System calls are expensive, so you might want to look into using a function from one of the standard libraries instead, like fopen().
Here's how open() works:
int test = open("test1.txt",O_WRONLY|O_TRUNC);
Here we are opening a file called "test1.txt" for writing. We specify the mode in the second parameter (see the entry on file access modes for all choices).
This returns:
a non-negative integer (the file descriptor) on success.
-1 on failure.
After this call, test is the file descriptor you use when accessing test1.txt for reading or writing.
System calls are expensive, so you might want to look into using a function from one of the standard libraries instead, like fwrite().
Here's how write() works:
int writer = write(fd,"hello!\n",7); // 7 bytes: six characters plus the newline
char * astring = "my name is christopher walken.";
writer = write(fd,astring,strlen(astring));
Here, fd is a file descriptor that we got when we used open().
write() returns:
# of bytes written on success. Since each character is 1 byte, you can interpret this as # of characters written.
-1 on failure.
System calls are expensive, so you might want to look into using a function from one of the standard libraries instead, like fread().
Here's how read() works:
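char buf[128];
int reader = read(fd,buf,sizeof(buf)-1); // leave room for the terminating '\0'
if (reader >= 0)
{
    buf[reader] = '\0'; // read() does not add the terminator itself
    printf("buf is %s\n",buf);
}
Here, fd is a file descriptor that we got when we used open().
read() returns:
# of bytes read on success (0 means end of file). Since each character is 1 byte, you can interpret this as # of characters read.
-1 on failure.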
It's as easy as:
close(fd);
Where fd is the file descriptor to close. It returns 0 if successful, or -1 on error.
Use lseek when you're using the open, read etc system calls. If you're using fopen(), fread() etc instead, use fseek (which works the same way as lseek).
The lseek system call sets the read/write pointer of a file descriptor; i.e. you can use it to set where in the file the next read or write will occur. You can set the pointer to an absolute location in the file or to a position relative to the current position, or the end of file.
#include <unistd.h>
#include <sys/types.h>
lseek(fd, offset, whence); // see below for explanation
Here, we use the offset and the whence parameters to set the pointer. Here are the three choices for whence:
SEEK_SET
Here, you specify an absolute position in 'offset'.
SEEK_CUR
Here, you specify a position relative to the current position in 'offset'.
SEEK_END
Here, you specify a position relative to the end of the file in 'offset'.
Use dup():
int dup_fd = dup(some_fd);
Every file stream (used with fread() etc) has a low-level file descriptor (used with read() etc - system calls) attached to it. To get the file descriptor associated with a stream, use fileno:
int fd = fileno(fs);
It returns the file descriptor on success, and
-1 on failure.
Trying to make this awful process as painless as possible.
Check out the section on static libraries to find out the difference between static and shared libraries.
For every shared library, you need a bunch of symbolic links (more on this later).
Every shared library has three names:
Real name
This is the actual name of the file. It contains a major version, a minor version, and a release number. For example:
libmylib.so.5.1.10
soname
This is a symbolic link. This filename just contains the major version. For example:
libmylib.so.5
The advantage of doing it this way: suppose you make some minor change in your library and update it from 5.1.10 to 5.1.11. Now all the programs that refer to your library by the name '...5.1.10' will break, because you changed the name. Instead of having to change all those programs, we just tell them to use the '...so.5' name instead. As we know, that name is just a symbolic link to the real name. So when the real file name changes, we just change the symbolic link to point to the new file name. None of the programs break because the name of the symbolic link is still the same!
Linker name
This is the name the compiler uses when requesting a library. This is the soname without any version number. For example:
libmylib.so
For example, here's a listing of a shared library:
Naming Requirements
Shared library names end in '.so' (On Mac, the extension is '.dylib' instead).
Shared library names start with 'lib'.
Creating a Shared Library
Note: these directions will work fine on *nix, but to have them work on a Mac requires some different flags, as specified.
Step 1:
Make object files from your .c files:
gcc -c -fPIC my_src_file.c
The -fPIC flag is required. It enables "position independent code".
On a Mac, you don't need the '-fPIC' flag (it's there by default) but you DO need the '-fno-common' flag.
Now we should have an object file called 'my_src_file.o'.
Step 2:
Make your library:
gcc -g -shared -Wl,-soname,libmylib.so.5 -o libmylib.so.5.1.10 my_src_file.o -lc
On a Mac, the equivalent command is:
gcc -dynamiclib -install_name libmylib.dylib -o libmylib.5.1.10.dylib my_src_file.o
The '-install_name' flag on Macs corresponds to the '-soname' flag on *nix. This flag specifies that the executable is to look for the library libmylib.dylib in the same directory as the executable itself.
Another thing to notice is that on macs, the version #'s go before the file extension. We write '...5.1.10.dylib', NOT '...dylib.5.1.10'.
This creates a library called 'mylib'.
Here, we are creating a shared library with
Real name: libmylib.so.5.1.10
soname: libmylib.so.5
From the object file(s):
my_src_file.o
Step 3:
Make your links. These are all the symbolic links we talked about:
ln -sf libmylib.so.5.1.10 libmylib.so.5.1 # from real name to intermediate link
ln -sf libmylib.so.5.1 libmylib.so.5 # from intermediate link to soname link
ln -sf libmylib.so.5.1.10 libmylib.so # from real name to linker name
Remember, on macs, the extension is 'dylib' instead of 'so', and the version #'s go before the extension. For example, on a mac the link from the real name to the linker name would be:
ln -sf libmylib.5.1.10.dylib libmylib.dylib # from real name to linker name
Step 4: Install
Great! Your library is made. Unfortunately, it's a shared library, so if you need to use it in one of your programs, you can't just add it while compiling your program. Instead, you have to put it somewhere that your program will find it when it runs. More importantly, you have to make sure that you put it somewhere so that no matter who else installs your program, the program will find your shared libraries when it's run.
You have three options of where to put it:
1. Put it in a standard directory
Put it somewhere like /usr/lib or /usr/local/lib, which are standard directories that the dynamic linker checks for libraries, just like Perl does with modules.
2. Put it somewhere, add the path using ldconfig
You can put your library in some other folder. Then add the path to that folder to the ld config file (typically /etc/ld.so.conf) and run ldconfig to refresh its cache. This is the file the dynamic linker looks into for paths. So once you've added the path to this folder in there and run ldconfig, you're golden. This is the route most professional applications will take. Macs don't use ldconfig.
3. Add the path to $LD_LIBRARY_PATH
The LD_LIBRARY_PATH environment variable is a colon separated list of directories in which the dynamic linker searches for shared libraries. This is certainly the most convenient way to do it, and this is what people use when they are just testing shared libraries.
If you want to go that third route, you can start a program called "main" by using this:
LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH ./main
Here we add the current directory to LD_LIBRARY_PATH (assuming the current directory is where your shared lib is) and then start main.
Step 5: Use
Suppose the 'mylib' shared library we created above is in the same directory as your source code. You can use this to compile your code with mylib:
gcc -c my_src.c
gcc -o myprogram my_src.o -L. -lmylib
Here we use the '-L' flag to specify the current directory as a location to search for libraries. Then we pass in the library name using the '-l' flag. Note that for this to work, a file with the linker name, 'libmylib.dylib' (or 'libmylib.so' on *nix), needs to be in the current directory; NOT 'libmylib.5.1.10.dylib' or anything along those lines. If you wanted to use version 5.1.10 of the library in particular, you would use '-lmylib.5.1.10'.
You don't need to #include the library in your source code in any way — it will just work:
./myprogram
(This is on a mac. On a *nix system, you do need to use LD_LIBRARY_PATH as explained above).
Once you have your executable built, you can check the shared libraries it is using by using ldd (unix) or otool (mac):
otool -L myprogram
exec() is the name of a whole family of functions, 6 of them. exec() is used to run a different program. For example, if you want to start up 'less' from within your C program, use exec().
An exec function replaces the program running in the current process with a new program specified by the path or file argument. Note that it replaces the current program; i.e. once you call exec from your program, your program no longer exists. If you want your program to still be running, call fork() first and call exec() in the child.
These are the 6 exec functions:
execl
execlp
execle
execv
execvp
execve
You'll notice they all say exec, but with a combination of l, p, v or e after them. This is what those letters stand for:
l vs v
If there's an l, that means all arguments have to go in this call itself, like so:
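execl("/bin/ps","ps","ax",(char *)0);
Notice that you have to end the arguments with a null pointer. It is written as (char *)0 here because a bare 0 in a variable argument list is passed as an int, which is not guaranteed to be the same size as a pointer.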
If there's a v, that means all arguments go in an array of strings, and you pass the array as the argument:
char * ps_argv[] = {"ps","ax",0};
execv("/bin/ps",ps_argv);
Notice we still have that zero at the end, except now it's in the array. Also notice that the first argument you pass is just the name of the program. If you remember, in a C program argv[0] is always just the name of the current program. Usually you don't need to specify argv[0]; it is set automatically. In this case you do need to specify it. Also notice it's not the full path to the program, just the name of the program.
p
Having a p in there means exec will use the $PATH variable to look for programs. For example, you'll notice above we specified the full path to 'ps'. With p, you can just say 'ps' (assuming it is in a directory that's in your path). Example:
execl("/bin/ps","ps","ax",(char *)0); // no p
execlp("ps","ps","ax",(char *)0); // p
e
Normally, the program you exec will get the same environment your current program had. If you want to pass it a new environment, you can use 'e' and pass it an array of strings to be used as the new program environment:
char *ps_envp[] = {"PATH=/bin:/usr/bin","TERM=console",0};
execle("/bin/ps","ps","ax",(char *)0,ps_envp);
The new process started by exec() inherits a lot of things from the original process, like the environment and any open file descriptors.
You can create a new process by calling fork(). This system call duplicates the current process, creating a new entry in the process table with many of the same attributes as the current process. Combined with the exec() functions, fork is all you need to run new programs (aka create new processes) from your program. Here's what a fork looks like:
// whatever code here
pid_t pid;
if((pid=fork())==-1)
{
//fork failed.
}else if (pid==0)
{
//this is the child
}else{
//this is the parent
}
So as we can see, fork() can return three kinds of value.
If it returns -1, fork failed.
If it returns 0, that means this is the child.
If it returns a positive number, that's the process ID of the child.
So what this means is: as soon as your program hits fork, it duplicates itself. The original obviously continues running as usual. The duplicate process runs from the point after the fork call. i.e. in the above example, the duplicated process won't run any of the code marked with '//whatever code here'. It will start by evaluating pid==-1.
The one big difference between the duplicated process (aka child process) and the original process (aka parent process) is the value that fork() returns. In the child, fork() returns 0. In the parent, fork() returns the pid, or process id, of the newly created child. That's how we check to see if we're in the child or the parent so we can run the appropriate code (usually an exec() in the child).
Waiting for a process
Now you have two duplicate processes. They will both be running side by side; so first a little bit of code from one of the processes will run, then a little bit of code from the other process will run, back to the first, and so on. Most likely, you don't want this sort of jumble. You want the child to finish running, do whatever it has to do, and then go back to the parent and resume. This is called "waiting for the child to finish". There are two ways to do this:
wait()
int stat_val;
pid_t child_pid;
child_pid = wait(&stat_val);
This waits until any child finishes running. child_pid contains the pid of the child that finished running. stat_val contains the encoded exit status; use the WIFEXITED() and WEXITSTATUS() macros from <sys/wait.h> to decode it.
waitpid()
int status;
waitpid(child,&status,0);
waitpid() waits for a particular child process to exit.
There are three choices for the first argument:
-1
waits for any child to terminate, same as wait().
0
waits for any child process in the same process group as the current process to terminate.
>0
Waits for the child process with the given pid to exit. You give the pid of the child process as the first argument ("child" in this case). Remember that you get the pid of the child process when you call fork()...so it's a pretty easy matter to just give that pid to the waitpid() function.
Zombie
If a child exits before its parent calls wait, the child sticks around, containing just its return value, waiting for the parent to call wait() so that it can hand over that return value and finally disappear. This child is known as a zombie process. If the parent itself exits without ever calling wait(), init adopts the zombie and calls wait() on it, so it can die and go to zombie heaven.
Orphan
What if a parent dies and the child is still running? This child is called an orphan. But now it doesn't have a parent. In these cases, init adopts the child. So init is the new parent of the child.
Unnamed Pipes are a way to transfer data between processes that are related to one another. Suppose you have a child process that wants to communicate to a parent process. One way to do it would be to create a temp file, have the child process write to the file, then have the parent process read from the file. By using pipes, you're doing something similar; you're creating a pipe that one process can write to and the other process can read from.
To make a pipe, use the pipe() function:
int fd[2];
pipe(fd);
Notice that we pass it an array of two ints. pipe() opens two file descriptors and stores them in the array:
fd[0] is the file descriptor we will read from.
fd[1] is the file descriptor we will write to.
Here's a quick example:
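// make the pipe
int fd[2];
pipe(fd);
// write
write(fd[1],"hello!",6);
// read and print out
char buf[128]; // an array of chars, not of char pointers
int n = read(fd[0],buf,sizeof(buf)-1);
if (n < 0) n = 0;
buf[n] = '\0'; // read() does not terminate the string for us
printf("I read: %s\n",buf);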
This isn't very useful, because the program is just communicating with itself. Here's an example where a child is communicating with a parent:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h> // for waitpid()
int main(int argc, char * argv[])
{
    int fd[2];
    pipe(fd);
    pid_t child;
    if ((child = fork()) < 0)
    {
        perror("fork error");
    }else if (child==0)
    {
        printf("child is writing data...\n");
        write(fd[1],"hello!",6);
    }else
    {
        int status;
        waitpid(child,&status,0);
        printf("parent is reading data...\n");
        char buf[128]; // an array of chars, not of char pointers
        int n = read(fd[0],buf,sizeof(buf)-1);
        if (n < 0) n = 0;
        buf[n] = '\0'; // read() does not terminate the string for us
        printf("parent read: %s\n",buf);
    }
    return 0;
}
Notice that it's the same idea, except now the child is writing and the parent is reading. But there is one restriction:
Unnamed pipes can only be used between related processes.
So you want to transfer data between unrelated processes? Alright, use a named pipe (aka FIFO)! To do this, there are two steps: create the FIFO, and use the FIFO.
Creating a FIFO
Use mkfifo:
int res = mkfifo("test_fifo",0777);
mkfifo returns
0 on success, and
-1 on error.
Using the FIFO
Now that you have a FIFO, you'll have one program open it for reading and the other open it for writing.
Note: don't have a single program open it for both reading and writing; POSIX leaves the result of that undefined, so it is not portable.
Besides that, you now use the FIFO just like any other file, using open(), read() and write() to access it.
So now we can make two simple programs, reader.c and writer.c. writer will write to the FIFO and reader will read it.
reader.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h> // for mkfifo()
int main(int argc, char * argv[])
{
    // make FIFO
    int res = mkfifo("test_fifo",0777);
    if (res==-1) printf("couldn't make fifo..it probably already exists\n");
    // read from FIFO
    res = open("test_fifo", O_RDONLY);
    char buf[128];
    int n = read(res,buf,sizeof(buf)-1);
    if (n < 0) n = 0;
    buf[n] = '\0'; // read() does not terminate the string for us
    printf("I got this: %s\n",buf);
    return 0;
}
writer.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
int main(int argc, char * argv[])
{
int res = open("test_fifo", O_WRONLY);
char * buf = "this is a test.";
write(res,buf,strlen(buf));
return 0;
}
And if you start these programs from two different terminals, you'll see the magic of data being passed between unrelated processes. Note that reader blocks: it pauses at open() until some process opens the FIFO for writing, and then at read() until something is written to the FIFO, before it continues executing the rest of the program. If you don't want it to block, you need to specify O_NONBLOCK as one of the flags when you open the FIFO:
res = open("test_fifo", O_RDONLY | O_NONBLOCK);
You create a new semaphore with the semget function:
int sem_id = semget([some key] , [number of semaphores required], [creation flags]);
Parameters:
1. The key should be a unique integer > 0. This is how multiple processes all use the same semaphore — they specify the same key.
You can specify the constant IPC_PRIVATE here instead, and then the semaphore can be accessed only by the creating process. This rarely has any useful purpose.
2. This is the number of semaphores required. This will usually be 1.
3. The flags are pretty similar to the flags of the 'open' function. You give the permissions, and OR them with IPC_CREAT if you want the semaphore created if it doesn't exist.
Example:
int sem_id = semget((key_t)1234,1,0666 | IPC_CREAT);
Although you now have a new semaphore, you still need to initialize it before you can use it. See the section on 'Initializing and Deleting Semaphores' for more details.
You can think of a semaphore sort of like an airplane bathroom.
If the bathroom is unoccupied, you go in and lock the door. Now the bathroom is occupied and no one else can come in. When you're done with your business, you unlock the door and go out. Whoever is next can now go in. The big idea is that only one person can be in the bathroom at a time, just like a critical section of a code protected by a semaphore can only be accessed by one process at a time.
There are two operations you can do on the bathroom door: lock, and unlock. Those are the same two operations that you do on a semaphore. To operate on a semaphore, you use the semop function:
int semop([semaphore id] , [struct sembuf pointer], [number of structs in param 2]);
Parameters
1. This is just the id you got from the semget function.
2. Here's where things get a little complex. The second parameter is a pointer to an array of structures, each of which has at least the following members:
struct sembuf {
short sem_num;
short sem_op;
short sem_flg;
}
sem_num
The semaphore #. usually 0 unless you're working with an array of semaphores.
sem_op
The value by which the semaphore should be changed. This is what actually locks / unlocks your semaphore. The possible values are:
-1 to lock.
+1 to unlock.
sem_flg
This is usually set to SEM_UNDO. That way, if a process terminates without releasing a semaphore, the operating system will automatically release it.
3. Since you can give an array of structs as param 2, this param says how many structs are in that array.
An Example:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/sem.h>
// here we're just using ONE semaphore.
// This will be the most common scenario.
struct sembuf sem_b;
sem_b.sem_num = 0; // semaphore #. zero since only one semaphore.
sem_b.sem_op = -1; // so we're LOCKING the semaphore here.
sem_b.sem_flg = SEM_UNDO;
if(semop(sem_id,&sem_b,1) == -1) // sem_id is an id we got through a previous call to semget.
{
    // semop failed; check errno for the reason.
    // Note: if some other process holds the semaphore, semop does not fail.
    // It waits until the semaphore is released.
}else
{
    // we've locked the semaphore. We're in the bathroom, so to speak.
}
The sembuf struct that we are using is actually defined in <sys/sem.h>. Here's what the definition looks like:
/*
* Structure of array element for second argument to semop()
*/
struct sembuf {
unsigned short sem_num; /* [XSI] semaphore # */
short sem_op; /* [XSI] semaphore operation */
short sem_flg; /* [XSI] operation flags */
};
You use the semctl function to initialize and delete semaphores:
int semctl( [semaphore id], [semaphore number], [command], [union semun]);
Parameters
1. This is just the id we got from the semget() function.
2. This is the semaphore number. Unless you're working with an array of semaphores, this will just be zero.
3. Here we give the command to initialize or delete the semaphore. Here are the options:
SETVAL: Initialize the semaphore to a known value. The value is passed as the val member of the union semun (more on this in parameter 4).
IPC_RMID: Used for deleting a semaphore identifier.
4. This is a union semun, which must have at least the following members:
union semun {
int val;
struct semid_ds *buf;
unsigned short *array;
}
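Whether <sys/sem.h> defines the semun union depends on the system. On Linux with glibc you have to define it yourself, with the members listed above; other systems, such as macOS, already define it. For example, here's the definition from <sys/sem.h> from my machine: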
union semun {
int val; /* value for SETVAL */
struct semid_ds *buf; /* buffer for IPC_STAT & IPC_SET */
unsigned short *array; /* array for GETALL & SETALL */
};
typedef union semun semun_t;
Example of initializing a semaphore:
// the union used for param 4
union semun sem_union;
// set the value because we're initializing the semaphore here
sem_union.val = 1;
if (semctl(sem_id,0,SETVAL,sem_union) == -1)
{
// semaphore initializing didn't work
}
Example of deleting a semaphore:
union semun sem_union;
if(semctl(sem_id,0,IPC_RMID,sem_union)==-1)
{
// semaphore identifier wasn't deleted.
}
Shared memory is created using the shmget function:
int shmget([a key], [amount of memory required in bytes], [creation flags]);
Notice how similar it is to the semaphore-creating semget() function.
Parameters
1. Just like semget, this is a unique integer > 0. All programs that use this shared memory must put in the same number, and that's how the OS knows which programs the memory is being shared between.
2. Size in bytes. 1024? 2048? You decide.
3. Permissions + IPC_CREAT if you want to create the memory (which of course you do).
Example:
#include <sys/shm.h>
int shmid = shmget((key_t)1234, 2048, 0666 | IPC_CREAT);
// or, to size the segment for a struct of your own:
// int shmid = shmget((key_t)1234,sizeof(struct shared_use_st) , 0666 | IPC_CREAT);
shmget returns -1 on error.
When you first create a shared memory segment using shmget, it's not accessible by any process. To enable access, you must attach it to the address space of a process. You use the shmat function to do this:
void *shmat( [shared memory identifier] , [memory address], [some flags or 0]);
Parameters
1. This is the memory identifier returned by shmget().
2. This is the address at which the shared memory is to be attached. This should almost always be a null pointer — that way the system will choose this by itself.
3. Here you can usually put zero. But you also have a choice of putting one or both of these flags:
SHM_RND: controls the address at which the memory is attached.
SHM_RDONLY: makes the attached memory read-only.
Example:
void *shared_memory = (void *)0; // because shmat returns a void pointer.
shared_memory = shmat(shmid, (void *)0, 0); // notice how we pass in a null pointer.
Returns (void *)-1 on error.
It's as easy as:
shmdt(shared_memory);
where shared_memory is the void pointer you got from shmat. Note that detaching the shared memory doesn't delete it; it just makes that memory unavailable to the current process.
Use the shmctl function:
int shmctl([shared memory identifier] , [some command] , [a struct]);
Parameters:
1. This is the identifier you got from shmget.
2. This is the action to take. You have three choices:
IPC_STAT
Copies the information the system keeps about the shared memory segment (owner, permissions and so on) into the struct (param #3).
IPC_SET
Sets the segment's owner and permissions from the values in the struct (param #3).
IPC_RMID
Deletes the shared memory segment. This is the only one you're really going to use.
3. This is a pointer to a struct shmid_ds, whose shm_perm member has at least these fields:
struct shmid_ds {
uid_t shm_perm.uid;
gid_t shm_perm.gid;
mode_t shm_perm.mode;
}
Example:
shmctl(shmid,IPC_RMID,0); // don't need to pass in a struct, we're just deleting the memory.
Please read the sections on creating, attaching, detaching and controlling shared memory before starting on this one. Here we put it all together.
STEP 1:
Create some sort of structure to describe the shared memory. There are no restrictions on this structure or stuff you must include. Here's our struct for this example:
#define TEXT_SZ 2048
// struct for shared memory
struct shared_use_st {
int written_by_you;
char some_text[TEXT_SZ];
};
STEP 2:
Create the shared memory using shmget and save the returned identifier:
int shmid = shmget((key_t)1234,sizeof(struct shared_use_st) , 0666 | IPC_CREAT);
So now shmid contains our shared memory id. Notice how we just pass in the size of the struct as the second parameter.
STEP 3:
Attach the shared memory to the address space of the current process:
void *shared_memory = (void *)0;
shared_memory = shmat(shmid, (void *)0, 0);
STEP 4:
Now we connect the struct from step 1 and the void pointer from step 3.
struct shared_use_st *shared_stuff; // a pointer to the struct from step 1!
shared_stuff = (struct shared_use_st *)shared_memory; // cast that void pointer into pointer for our struct!
shared_stuff->written_by_you = 0; // now the fields can be accessed like this.
That's it! Our shared memory is set up and ready to go. To read or write to the memory, we just access parts of the struct like so:
int reading = shared_stuff->written_by_you; // reading from shared memory
strcpy(shared_stuff->some_text,"hello!"); // writing to shared memory; remember you have to use strcpy because you can't just do an assignment.
Of course, the big issue with shared memory is concurrency; what happens when two processes try to write to the same location in memory at the same time? You need to avoid things like that by using SEMAPHORES (or something).
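The four steps above can be sketched as one function. This is only an illustration under a few assumptions: the name shm_round_trip is made up, it uses IPC_PRIVATE rather than the key 1234, and it sidesteps the concurrency issue by having the parent simply wait for the child to exit before reading (real programs would want semaphores or similar).

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/wait.h>

#define TEXT_SZ 2048

struct shared_use_st {
    int written_by_you;
    char some_text[TEXT_SZ];
};

/* A child process writes `msg` into shared memory; the parent waits for the
 * child to exit and then copies the text into `out`. Returns 0 on success.
 * (shm_round_trip is a hypothetical name used only for this sketch.) */
int shm_round_trip(const char *msg, char *out, size_t outsz)
{
    int result = -1;
    int shmid = shmget(IPC_PRIVATE, sizeof(struct shared_use_st), 0600 | IPC_CREAT);
    if (shmid == -1)
        return -1;

    void *shared_memory = shmat(shmid, (void *)0, 0);
    if (shared_memory == (void *)-1) {
        shmctl(shmid, IPC_RMID, 0);
        return -1;
    }
    struct shared_use_st *shared_stuff = (struct shared_use_st *)shared_memory;
    shared_stuff->written_by_you = 0;

    pid_t pid = fork();
    if (pid == 0) {
        /* child: the attached segment is inherited across fork() */
        strncpy(shared_stuff->some_text, msg, TEXT_SZ - 1);
        shared_stuff->some_text[TEXT_SZ - 1] = '\0';
        shared_stuff->written_by_you = 1;
        shmdt(shared_memory);
        _exit(0);
    }
    if (pid > 0) {
        /* parent: waiting for the child is our (crude) synchronization */
        waitpid(pid, NULL, 0);
        if (shared_stuff->written_by_you == 1 && outsz > 0) {
            strncpy(out, shared_stuff->some_text, outsz - 1);
            out[outsz - 1] = '\0';
            result = 0;
        }
    }
    shmdt(shared_memory);          /* detach... */
    shmctl(shmid, IPC_RMID, 0);    /* ...and delete the segment */
    return result;
}
```

Note that the child's write is visible to the parent even though they are separate processes; that's the whole point of shared memory.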
You can create and access a message queue using msgget:
int msgget( [SOME KEY] , [SOME FLAGS]);
Note that unlike shared memory or semaphores, you don't need to initialize the queue after you create it or any of that nonsense.
Parameters:
1. A unique integer > 0. This is how the OS knows which queue you're trying to access; processes accessing the same queue will provide the same key. Just like shared memory/semaphores, you can pass in IPC_PRIVATE and have the queue available only to a single process, but this isn't very useful.
2. Again, these are the permissions OR'd with IPC_CREAT to create a new queue.
Example:
#include <sys/msg.h>
int msgid = msgget((key_t)1234, 0666 | IPC_CREAT);
Use the msgsnd function to send messages via a message queue:
int msgsnd([MESSAGE QUEUE ID] , [A STRUCT CONTAINING THE MSG] , [MESSAGE SIZE] , [SOME FLAGS]);
Parameters:
1. This is just the identifier you get from msgget.
2. This is a struct containing the message. The only requirement is the first member must be a long int. This will be the message type. For example, here's a struct:
struct my_msg {
long int msg_type;
char some_text[1024];
};
The message type is used in the receive function. You should also initialize the msg_type to something...anything, as long as you know what it is.
3. This is the size of the message. So basically, this is the size of the struct minus the long int. Don't forget to subtract its size from the total size.
4. Here you can give the flag IPC_NOWAIT. If the queue is full and you have specified IPC_NOWAIT, the function will return immediately without sending the message (return value of msgsnd will be -1). If you haven't specified IPC_NOWAIT and the queue is full, it will keep waiting for the queue to get some space.
Example:
// create the queue with msgget
int msgid = msgget((key_t)1234, 0666 | IPC_CREAT);
// some_data is a struct my_msg (see above) that we declared previously
some_data.msg_type = 1;
strcpy(some_data.some_text,buffer);
// MAX_TEXT is the size of the message text (the struct minus the long int)
// notice that we're not passing IPC_NOWAIT, so if the queue is full, we'll wait.
msgsnd(msgid, (void *)&some_data,MAX_TEXT,0);
Use the msgrcv function to get messages via a message queue:
int msgrcv([MESSAGE QUEUE ID] , [A STRUCT CONTAINING THE MSG] , [MESSAGE SIZE] , [MESSAGE TYPE] , [SOME FLAGS]);
Parameters:
Parameters 1, 2 and 3 are the exact same as for msgsnd. Check out that section for details.
4. This is the type of the message you're getting. There are three choices:
message type = 0
The first message on the queue is retrieved.
message type > 0
The first message on the queue with this message type is retrieved.
message type < 0
The first message with the lowest type that is less than or equal to the absolute value of the message type is retrieved.
5. This is the IPC_NOWAIT flag or zero, same as msgsnd.
Example:
#include <sys/msg.h>
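long int msg_type = 2;
// here's the struct we'll use in msgrcv (the struct my_msg from the msgsnd section).
struct my_msg some_data;
// the message size is the size of the text buffer, not the whole struct.
msgrcv(msgid, (void *)&some_data,sizeof(some_data.some_text),msg_type,0);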
To delete a message queue, use the msgctl function. This is pretty much the same as the shmctl function used for shared memory. Check out that section for a rundown of the parameters.
Example:
msgctl(msgid,IPC_RMID,0);
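To see msgget, msgsnd, msgrcv and msgctl working together, here's a small sketch. The function name msg_queue_demo and the 64-byte MAX_TEXT are assumptions made for this example, and it uses IPC_PRIVATE so the queue is brand new. It sends a type 1 message and then a type 2 message, and asks for type 2 first to show how the message type picks messages out of the queue.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define MAX_TEXT 64

struct my_msg {
    long int msg_type;          /* must come first */
    char some_text[MAX_TEXT];
};

/* Sends two messages (type 1, then type 2) and asks for type 2 first.
 * Copies the received text into `out` (at least MAX_TEXT bytes).
 * Returns the type of the message received, or -1 on error.
 * (msg_queue_demo is a hypothetical name used only for this sketch.) */
long msg_queue_demo(char *out)
{
    struct my_msg first, second, received;
    long got = -1;

    int msgid = msgget(IPC_PRIVATE, 0600 | IPC_CREAT);
    if (msgid == -1)
        return -1;

    first.msg_type = 1;
    strcpy(first.some_text, "first");
    second.msg_type = 2;
    strcpy(second.some_text, "second");

    /* the size is the text only: the struct minus the leading long int */
    if (msgsnd(msgid, &first, sizeof(first.some_text), IPC_NOWAIT) == 0 &&
        msgsnd(msgid, &second, sizeof(second.some_text), IPC_NOWAIT) == 0 &&
        msgrcv(msgid, &received, sizeof(received.some_text), 2, IPC_NOWAIT) != -1) {
        strcpy(out, received.some_text);
        got = received.msg_type;
    }

    msgctl(msgid, IPC_RMID, 0);   /* delete the queue */
    return got;
}
```

Passing 0 as the message type instead of 2 would hand back the "first" message, since that one is at the front of the queue.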
You create a socket with the socket system call:
int socket( [THE DOMAIN] , [THE TYPE] , [THE PROTOCOL]);
Parameters:
1. The two most common domain choices are:
AF_UNIX
Use this for local sockets implemented via Unix and Linux file systems.
AF_INET
Use this for UNIX network sockets, for programs communicating over some sort of network (e.g. the Internet).
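2. You can choose between SOCK_STREAM and SOCK_DGRAM. SOCK_STREAM gives you a reliable, ordered byte stream (TCP when used with AF_INET); SOCK_DGRAM sends individual datagrams (UDP when used with AF_INET) with no delivery guarantee, but is cheaper. Most of the time, you're going to use SOCK_STREAM.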
3. You'll usually put 0 to select the default protocol.
Example:
#include <sys/types.h>
#include <sys/socket.h>
int sockfd = socket(AF_UNIX,SOCK_STREAM,0);
Socket Addresses
Each socket domain (AF_UNIX, AF_INET etc) requires its own address format. Here are the two most common:
AF_UNIX
Uses the following struct defined in <sys/un.h> :
struct sockaddr_un {
unsigned char sun_len; /* sockaddr len including null */
sa_family_t sun_family; /* [XSI] AF_UNIX */
char sun_path[104]; /* [XSI] path name (gag) */
};
So obviously if you're using an AF_UNIX socket you need to include <sys/un.h>.
AF_INET
Uses the following struct defined in <netinet/in.h> :
struct sockaddr_in {
__uint8_t sin_len;
sa_family_t sin_family;
in_port_t sin_port;
struct in_addr sin_addr;
char sin_zero[8]; /* XXX bwg2001-004 */
};
So obviously if you're using an AF_INET socket you need to include <netinet/in.h>.
Now let's see where these structs are actually used.
bind()
To make a socket available for use by other processes, a server program (not the clients) needs to give the socket a name. AF_UNIX sockets are associated with a file system pathname (since it's a socket for local stuff) and AF_INET sockets are associated with an IP address and port number.
The socket is named using the bind function:
#include <sys/socket.h>
int bind([SOCKET FILE DESCRIPTOR] , [ONE OF THOSE STRUCTS] , [SIZE OF STRUCT]);
Parameters
1. The file descriptor is what you got by calling socket().
2. This is the address of one of those structs we discussed above. The following example shows how to use this.
3. This is just the size of the struct, which you can get via sizeof().
Example using AF_UNIX:
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
struct sockaddr_un my_socket_addr; // so this is for the AF_UNIX domain.
my_socket_addr.sun_family = AF_UNIX; // told ya
strcpy(my_socket_addr.sun_path,"whatever_filename_we_want"); // here's where we give it a name
// finally we call bind.
// Here, server_sockfd is a file descriptor that we got before using socket().
bind(server_sockfd,(struct sockaddr *)&my_socket_addr, sizeof(my_socket_addr));
Example using AF_INET:
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
struct sockaddr_in my_socket_addr; // so this is for the AF_INET domain.
my_socket_addr.sin_family = AF_INET;
my_socket_addr.sin_addr.s_addr = inet_addr("127.0.0.1"); // so this is an example using localhost (i.e. the local machine).
// here we specify the port where the server is.
// (sidebar on why we use htons() follows:)
// Port numbers are communicated over socket interfaces as binary numbers, and different computers use different byte ordering for integers. An Intel processor is little-endian: it stores a multi-byte integer with the least significant byte first. Network byte order is big-endian: the most significant byte comes first. So to make sure computers agree on what the number is, we always use htons ("host to network short"), which converts a 16-bit value from the host's byte order to network byte order.
my_socket_addr.sin_port = htons(9734);
// finally we call bind.
// Here, server_sockfd is a file descriptor that we got before using socket().
bind(server_sockfd,(struct sockaddr *)&my_socket_addr, sizeof(my_socket_addr));
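If you want to see what htons actually does to the port number, here's a tiny sketch (the helper name port_bytes is made up for this example). Port 9734 is 0x2606 in hex, so in network byte order the byte 0x26 comes first in memory, no matter what kind of machine you run it on.

```c
#include <string.h>
#include <arpa/inet.h>

/* Writes the two bytes of htons(port), in memory order, into bytes[0..1].
 * (port_bytes is a hypothetical helper used only for this sketch.) */
void port_bytes(unsigned short port, unsigned char bytes[2])
{
    unsigned short net = htons(port);   /* host order -> network order */
    memcpy(bytes, &net, sizeof net);
}
```

The matching function ntohs converts back the other way, so ntohs(htons(9734)) is 9734 again.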
To accept incoming connections on a socket, a server program must create a queue to store pending requests. It does this using the listen system call:
#include <sys/socket.h>
int listen([SOCKET FD] , [MAX. NUMBER OF CONNECTIONS POSSIBLE IN QUEUE]);
Parameters:
1. This is the socket file descriptor you got from the call to socket().
2. This is the max length of the queue. A length of 5 is very common. A max of this many connections is stored in the queue waiting to connect to the socket. So suppose we specified 5 as the max and the queue has 5. Now along comes someone else trying to connect to the socket. There's too many now, so this connection will be refused; this person's connection will fail.
Example:
listen(server_sockfd,5); // server_sockfd is a previously-gotten file descriptor
So obviously, we use listen in the server program, just like bind(). There's no use for it in the client program.
Just like bind() and listen(), this part is done in the server program only. We accept a connection to a socket using the accept system call:
#include <sys/socket.h>
int accept([SOCKET FD] , [STRUCT FOR CLIENT] , [ADDRESS OF SIZE OF STRUCT]);
If you remember the structs in bind(), we'll be using the exact same structs here, except those structs were for the server and these are for whatever client connects. Here's how it works:
1. Some client connects to the socket that the server has created. This client is just the first guy in the socket queue we created.
2. The accept function creates a new socket to communicate with the client. So what accept() returns is the file descriptor of this new socket.
3. Great! Now you have a file descriptor to a socket between the server and a client. You can read / write from it using the usual read / write functions.
Parameters
1. The file descriptor of the socket that we got through our call to socket().
2. A struct, depending on what the domain of the socket is (AF_UNIX or AF_INET). See the section on bind() for full details.
3. An address of a variable containing the size of the struct. Just get this using sizeof().
Example:
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
// this is the struct we'll be using
struct sockaddr_un client_address;
// accept a connection:
socklen_t client_len = sizeof(client_address); // notice how we put this in a separate var...b/c we need the ADDRESS of this var as the third parameter.
int client_sockfd = accept(server_sockfd, (struct sockaddr *)&client_address, &client_len);
// read and write to client on client_sockfd
char ch;
read(client_sockfd,&ch,1); // read a character from the socket (so from the client)
ch++; // increment the character
write(client_sockfd,&ch,1); // write it back to the socket (so back to client)
close(client_sockfd); // close the socket.
If there are no connections pending on the socket's queue, accept() will block (so the program won't continue) until a client does make a connection.
Ok, we've seen how to
1. Create a socket,
2. Name the socket,
3. Create a socket queue and
4. Accept connections.
But how do we actually request a connection? This is the code that goes in the client.
Step 1:
Get a new socket file descriptor:
int sockfd = socket(AF_UNIX,SOCK_STREAM,0);
Step 2:
Use connect() to connect to a server:
int connect([SOCKET FD] , [A STRUCT] , [STRUCT SIZE]);
Parameters:
1. This is the socket file descriptor we got in Step 1.
2. This is a struct, same as the one in bind. Whatever address you gave in the call to bind(), that same address has to be given here. Check out the section on bind() for the two most commonly used domains and their structs.
3. The size of the struct.
Example:
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h> // so we're using AF_UNIX
// this is the struct we'll be using in params 2 and 3.
struct sockaddr_un address;
// give the address, and connect
address.sun_family = AF_UNIX;
strcpy(address.sun_path,"the_same_address_we_gave_in_bind");
connect(sockfd, (struct sockaddr *)&address, sizeof(address));
// now we can use the socket file descriptor to read to / write from the socket
char ch = 'A'; // the character we'll send
write(sockfd,&ch,1); // write the char to the socket
read(sockfd,&ch,1); // read from the socket into the char
printf("char from server = %c\n",ch);
close(sockfd);
If the connection can't be set up immediately, connect will block for an unspecified timeout period. Once the timeout has expired, the connection will be aborted and connect will fail (return -1).
Check out the section on bind for an example on connecting using AF_INET.
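Here's a sketch that puts socket, bind, listen, accept and connect together in one place. It's an illustration, not a real server: the function name echo_plus_one is made up, the socket path is whatever you pass in, and the server (parent) and the client (child) live in the same program via fork() so the whole thing can be run on its own. The client reports what it got back through its exit status.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <sys/wait.h>

/* Server (parent) and client (child) talk over an AF_UNIX socket at `path`.
 * The client sends `ch`; the server increments it and sends it back.
 * Returns the character the client got back, or -1 on error.
 * (echo_plus_one is a hypothetical name used only for this sketch.) */
int echo_plus_one(const char *path, char ch)
{
    struct sockaddr_un addr;
    int status = 0;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    unlink(path);                       /* remove a stale socket file, if any */

    int server_sockfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (server_sockfd == -1)
        return -1;
    if (bind(server_sockfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(server_sockfd, 5) == -1) {
        close(server_sockfd);
        return -1;
    }

    pid_t pid = fork();
    if (pid == 0) {
        /* client: the server socket is already listening, so we can connect */
        char reply = 0;
        int sockfd = socket(AF_UNIX, SOCK_STREAM, 0);
        close(server_sockfd);
        if (sockfd == -1 ||
            connect(sockfd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
            write(sockfd, &ch, 1) != 1 ||
            read(sockfd, &reply, 1) != 1)
            _exit(255);
        close(sockfd);
        _exit((unsigned char)reply);    /* report the reply via the exit status */
    }

    if (pid > 0) {
        /* server: accept one client, read a char, increment it, write it back */
        struct sockaddr_un client_address;
        socklen_t client_len = sizeof(client_address);
        int client_sockfd = accept(server_sockfd,
                                   (struct sockaddr *)&client_address, &client_len);
        if (client_sockfd != -1) {
            char c;
            if (read(client_sockfd, &c, 1) == 1) {
                c++;
                if (write(client_sockfd, &c, 1) != 1)
                    perror("write");
            }
            close(client_sockfd);
        }
        waitpid(pid, &status, 0);
    }
    close(server_sockfd);
    unlink(path);                       /* clean up the socket file */
    if (pid > 0 && WIFEXITED(status) && WEXITSTATUS(status) != 255)
        return WEXITSTATUS(status);
    return -1;
}
```

For example, echo_plus_one("/tmp/my_example.sock", 'A') should come back with 'B' (the path here is just a placeholder). Notice that the server calls bind and listen before the fork, so the client never tries to connect to a socket that isn't there yet.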
With #define, we can write this:
// notice no semicolon
#define TRUE 1
Now we can write things like this:
int isMale = TRUE;
// same as:
int isMale = 1;
So '#define' just does a simple text substitution. The preprocessor does this substituting before the compiler sees the code.
You can also do things like this:
#define AND &&
#define OR ||
#define IS_LEAP year % 4 == 0
// and then:
if (IS_LEAP) { ... }
Notice that in IS_LEAP we are forced to check the value in the variable 'year'. That's not very flexible. This is better:
#define IS_LEAP(y) ((y) % 4 == 0)
// and then:
if (IS_LEAP(year)) { ... }
if (IS_LEAP(another_year)) { ... }
This is generally what's called a macro. Note that we're not giving a type to 'y', because this is again just a find-and-replace sort of a thing.
You can also use the ternary operator. Here's an example that chooses the larger of two values:
#define MAX(x,y) ( ((x) > (y)) ? (x) : (y) )
We need all those parentheses because what if someone calls MAX like this:
MAX(x + 40, x + 80);
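MAX just does a simple text substitution, so without those parens operator precedence can bite: an argument that uses a low-precedence operator, or a MAX call sitting inside a bigger expression like 2 * MAX(a, b), would not evaluate correctly.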
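To make the parentheses issue concrete, here's a small sketch comparing unparenthesized and parenthesized macros. The _BAD and _OK names and the four little functions are made up for this example. (Your compiler may warn about the _BAD versions; that's rather the point.)

```c
/* Unparenthesized versions: plain text substitution goes wrong */
#define IS_LEAP_BAD(y) y % 4 == 0
#define MAX_BAD(x,y) x > y ? x : y

/* Parenthesized versions */
#define IS_LEAP_OK(y) ((y) % 4 == 0)
#define MAX_OK(x,y) ( ((x) > (y)) ? (x) : (y) )

/* expands to: 2001 + 3 % 4 == 0, which is (2001 + 3) == 0, so 0 */
int leap_bad(void) { return IS_LEAP_BAD(2001 + 3); }

/* expands to: ((2001 + 3) % 4 == 0), which is 2004 % 4 == 0, so 1 */
int leap_ok(void)  { return IS_LEAP_OK(2001 + 3); }

/* expands to: 2 * 3 > 4 ? 3 : 4, which is (6 > 4) ? 3 : 4, so 3 */
int max_bad(void)  { return 2 * MAX_BAD(3, 4); }

/* expands to: 2 * ( ((3) > (4)) ? (3) : (4) ), which is 2 * 4, so 8 */
int max_ok(void)   { return 2 * MAX_OK(3, 4); }
```

You can check the expansions yourself by running the file through gcc -E.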
#IFDEF
#ifdef MAC_OS_X
#define OS 1
#else
#define OS 0
#endif
In this example, if the symbol MAC_OS_X has been previously defined, OS will be set to 1. Otherwise OS will be set to 0.
When a program has only one thread, it's only doing one thing at a time.
When a program has multiple threads, it can be loading an XML file, animating something on the screen, getting user input, all at the same time. So having multiple threads allows your program to do multiple things at once, which can speed it up. The downside is, if two threads are both operating on the same data, the same file descriptors etc. without coordinating, you can run into trouble. This is known as a race condition.
Threads are similar to processes. Just like you create a new process with fork(), you create a new thread with pthread_create():
pthread_create([address of thread],[thread attributes],[function to call],[pointer to arguments to function]);
So each time you create a thread, you send it a function it should run. The thread attributes are usually NULL. Here's an example:
pthread_t mythread;
pthread_create(&mythread,NULL,func_call,(void *)arg);
Here we are calling the 'func_call' function with arguments 'arg'. Note that you can only give the address of one argument; so if you have multiple arguments, you need to put them in a struct and send them:
struct mydata {
int item1;
int item2;
};
struct mydata data;
data.item1 = 1;
data.item2 = 2;
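pthread_t mythread;
pthread_create(&mythread,NULL,func_call,(void *)&data); // pass the ADDRESS of the struct
// the thread function must take a void * and return a void *
void *func_call(void *d)
{
struct mydata *somedata = (struct mydata *)d; // notice how we cast it back to a pointer to the struct here
return NULL;
}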
This new thread is part of the same process, so it has the same variables, file descriptors etc. If you call exit() anytime within this thread, the whole process will exit.
When your process creates a new thread, it's like a mom in a supermarket that tells her kids "you go get some milk, and you go get the eggs". Then the mom sits around waiting for the kids to come back. Similarly, your process can choose to wait for a thread to finish, or just go on with its life, buying the tomatoes, jam etc while the kids get the milk and eggs.
Waiting
Use pthread_join():
void *exitcode;
pthread_join(mythread,&exitcode); // exitcode holds the return value of the thread 'mythread'
Not Waiting
Don't use pthread_join(). Easy as that.
You might want to call pthread_detach() though. This makes sure that any resources consumed by this thread are immediately freed when the thread exits:
pthread_detach(mythread);
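Here's a sketch of the "waiting" case from start to finish: pass a struct to a thread, let the thread do some work, and pick up its return value with pthread_join(). The names add_items and add_in_thread (and the extra 'sum' member) are made up for this example. On many systems you'll need to compile with gcc -pthread.

```c
#include <pthread.h>

struct mydata {
    int item1;
    int item2;
    int sum;      /* filled in by the thread */
};

/* Thread function: must take a void * and return a void *.
 * (add_items is a hypothetical name used only for this sketch.) */
static void *add_items(void *d)
{
    struct mydata *data = (struct mydata *)d;
    data->sum = data->item1 + data->item2;
    return d;     /* pthread_join() hands this pointer back to the waiter */
}

/* Runs add_items in a new thread, waits for it, and returns the sum
 * (or -1 if the thread could not be created or joined). */
int add_in_thread(int a, int b)
{
    struct mydata data = { a, b, 0 };
    pthread_t mythread;
    void *exitcode;

    if (pthread_create(&mythread, NULL, add_items, &data) != 0)
        return -1;
    if (pthread_join(mythread, &exitcode) != 0)
        return -1;
    /* exitcode is whatever add_items returned: the address of data */
    return ((struct mydata *)exitcode)->sum;
}
```

Because add_in_thread waits with pthread_join() before returning, the local variable 'data' stays alive for as long as the thread is using it. If you detach the thread instead, the data you pass in has to outlive the function that created the thread.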
A mutex is the thread version of a binary semaphore: only one thread can hold the lock at a time.
Step 1: Create a Mutex
pthread_mutex_t mymutex = PTHREAD_MUTEX_INITIALIZER;
Step 2: Lock and Unlock
Lock your mutex when you enter the critical section. Unlock it when you leave the critical section. Easy as that.
pthread_mutex_lock(&mymutex);
// critical section code here
pthread_mutex_unlock(&mymutex);
So now, when multiple threads want access to that critical section, they can only go through one at a time because first they have to get the lock. Note that mutexes aren't completely failsafe. Suppose in your critical section, you're increasing the value of 'myvar' by 1:
pthread_mutex_lock(&mymutex);
// critical section code here
myvar++;
pthread_mutex_unlock(&mymutex);
and you have another function that also accesses myvar:
void otherFunc(void)
{
myvar++;
}
Since there is no mutex lock around this increment to myvar, it is still possible that two threads will change myvar's value at the same time if one changes it through the regular function and one changes it through otherFunc(). There is no mutex in otherFunc(), so nothing stops another thread from effectively entering your critical section, just through another path. The fix is to lock the same mutex around every access to myvar.
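As a sketch of doing it right, here are two threads that both increment myvar, with every increment going through the same mutex. The function names increment_many and count_with_two_threads are made up for this example. On many systems you'll need to compile with gcc -pthread.

```c
#include <pthread.h>

static int myvar = 0;
static pthread_mutex_t mymutex = PTHREAD_MUTEX_INITIALIZER;

/* Every path that touches myvar takes the same lock.
 * (increment_many is a hypothetical name used only for this sketch.) */
static void *increment_many(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++) {
        pthread_mutex_lock(&mymutex);
        myvar++;                          /* critical section */
        pthread_mutex_unlock(&mymutex);
    }
    return NULL;
}

/* Starts two threads that each add n to myvar; returns the final value,
 * or -1 if a thread could not be created. */
int count_with_two_threads(int n)
{
    pthread_t t1, t2;
    myvar = 0;
    if (pthread_create(&t1, NULL, increment_many, &n) != 0)
        return -1;
    if (pthread_create(&t2, NULL, increment_many, &n) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return myvar;
}
```

With the lock in place, count_with_two_threads(100000) comes back as exactly 200000. If you take the lock and unlock calls out, the two threads can step on each other's increments and the total may come up short.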