Algorithm Analysis: Week 4

Introduction

This week we will look at graphs and graph algorithms.

Note that the links from this page are to handouts that will be distributed the night of class. Only print out those handouts if you could not attend class.

Main topics this week:

Graphs
Graph Properties
Paths
Graph Exercise
Königsberg Bridge Problem
Graph Representation
Graph Exercise #2
Review for Midterm 1
Next Week

Graphs

Graphs are a data structure. Like all other data structures, that means a graph is a collection of other things. Graphs are similar to trees, but have different terminology and different properties. We saw that trees are used to provide more efficient searching, but graphs are usually used to solve different problems.

For example, your student manual describes a famous problem called the Königsberg Bridge Problem. Königsberg was a city built across a river, on both sides of the river and on islands in the river. Bridges connected the different portions of the city.

(draw diagram).

At some point, someone asked the question, "Is it possible to start at some point and return to the same point after crossing each of the bridges only once?" A famous mathematician named Euler finally solved this problem, using graphs.

(draw graph version of the diagram)

That is only one of the types of problems that you can solve using graphs. Other problems include finding the shortest route between two cities, project planning, and analyzing communications networks.

Definition of a Graph

Before we can look at how to use graphs, we need to know what they are. A graph consists of two sets, V(G) and E(G).

V(G) is a set of vertices, and cannot be empty. A vertex is a single node in the graph, similar to a node in a tree.

E(G) is a set of edges, and may be empty. An edge is a link between two vertices.

(show sample graph, and the contents of V(G) and E(G))

Types of Graphs

Just as there were different types of trees, there are different types of graphs. The graphs we have seen so far are called undirected graphs. In an undirected graph, if you can move from vertex A to vertex B along an edge, then you can also move from vertex B to vertex A along that same edge.

We can also have a directed graph (also called a digraph). Essentially, the edges become one-way streets. You can move along them in only one direction. We show the direction with an arrow.

(show directed graph example)

Graph Notation

Since graphs are two sets, we can use set notation to represent them. For example, these two sets would represent an undirected graph:

V(G) = {0, 1, 2, 3}
E(G) = {(0,1), (0,2), (0,3), (1,2), (1, 3), (2, 3)}

The two numbers in parentheses show the two vertices that an edge connects. We have used parentheses to show that the connection is undirected (i.e., two-way).

To show a directed connection, we use angle brackets (<>). For example:

V(G) = {0, 1, 2}
E(G) = {<0,1>, <1, 0>, <1, 2>}
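The set notation above maps fairly directly onto code. As a minimal sketch in Python (the language choice and the names V_undirected, E_undirected, V_directed, E_directed, and the two helper functions are assumptions for illustration, not part of the course material), an undirected edge can be stored as a frozenset, so that (0,1) and (1,0) are the same edge, while a directed edge can be stored as an ordered tuple:

```python
# Undirected graph: each edge is an unordered pair, so use frozenset.
V_undirected = {0, 1, 2, 3}
E_undirected = {frozenset(pair) for pair in
                [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]}

# Directed graph: each edge is an ordered pair <source, destination>.
V_directed = {0, 1, 2}
E_directed = {(0, 1), (1, 0), (1, 2)}


def has_undirected_edge(edges, a, b):
    """True if an undirected edge joins a and b (order does not matter)."""
    return frozenset((a, b)) in edges


def has_directed_edge(edges, a, b):
    """True if a directed edge leads from a to b (order matters)."""
    return (a, b) in edges
```

With these sets, asking for the edge between 1 and 0 succeeds in the undirected graph whichever way round it is written, but in the directed graph the edge <1,2> exists while <2,1> does not.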

You should realize by now that a tree is a graph, but a specialized sort of graph. Graphs can represent things that trees cannot.

Graph Restrictions

Normal graphs (often called simple graphs) have two restrictions:

Cannot have edges that loop (source and destination vertex are the same vertex)
Cannot have multiple edges that are identical
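
One way to check these two restrictions on a list of undirected edges is sketched below in Python. The function name is_normal_graph and the edge-list input format are assumptions made for illustration.

```python
def is_normal_graph(edges):
    """Check an undirected edge list for loops and repeated edges.

    edges is a list of (a, b) pairs; (a, b) and (b, a) count as the
    same undirected edge.
    """
    seen = set()
    for a, b in edges:
        if a == b:               # edge loops back to its own vertex
            return False
        key = frozenset((a, b))  # order-independent form of the edge
        if key in seen:          # an identical edge was already present
            return False
        seen.add(key)
    return True
```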

Graph Properties

Complete Graphs

Just as we could have complete trees, we can have complete graphs. The definition of a complete graph is a bit different from that of a complete tree, though.

A complete graph is a graph that has the maximum number of edges. That means that every vertex has an edge to every other vertex.

For an undirected graph, given N vertices, the maximum number of edges is N * (N - 1) / 2. (show this on a sample)

For a directed graph, given N vertices, the maximum number of edges is N * (N - 1). We don't divide by 2, because each edge goes in only one direction, so the edge from A to B and the edge from B to A are two different edges.
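
Both formulas can be checked by actually generating every possible edge and counting. The following Python sketch does that with the standard itertools module; the function names max_edges and complete_graph_edges, and the choice to number the vertices 0 to N - 1, are assumptions for illustration.

```python
from itertools import combinations, permutations


def max_edges(n, directed=False):
    """Maximum number of edges in a graph with n vertices."""
    return n * (n - 1) if directed else n * (n - 1) // 2


def complete_graph_edges(n, directed=False):
    """List every edge of the complete graph on vertices 0..n-1."""
    vertices = range(n)
    if directed:
        return list(permutations(vertices, 2))  # ordered pairs <a, b>
    return list(combinations(vertices, 2))      # unordered pairs (a, b)
```

For N = 4, the undirected version produces 6 edges and the directed version produces 12, matching the two formulas.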

Is a complete tree also a complete graph?

Adjacency

We can say that vertex B is adjacent to vertex A only if an edge exists leading from vertex A to vertex B.

(show examples on undirected and directed graphs)

We can also say that an edge is incident on vertices A and B if it leads from A to B. The edge would be incident from A, and incident to B.

Subgraphs

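Just as we could look at portions of a tree (subtrees), we can also look at portions of a graph (subgraphs). A subgraph is a graph whose vertices are a subset of the original graph's vertices and whose edges are a subset of the original graph's edges. Every edge in the subgraph must connect two vertices that are also in the subgraph.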
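
As a rough sketch of how a subgraph test might look in Python, assuming vertices are stored in a set and undirected edges as a set of frozensets (the function name is_subgraph and this storage format are illustrative assumptions):

```python
def is_subgraph(sub_v, sub_e, v, e):
    """True if (sub_v, sub_e) is a subgraph of the undirected graph (v, e).

    Vertices are given as sets and edges as sets of frozensets.
    """
    if not sub_v or not sub_v <= v:   # non-empty subset of the vertices
        return False
    if not sub_e <= e:                # subset of the edges
        return False
    # Every edge kept must join two vertices that were also kept.
    return all(edge <= sub_v for edge in sub_e)
```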

Degree of a Vertex

The degree of a vertex is the number of edges attached to that vertex for undirected graphs. For a directed graph, we can measure both the in-degree (the number of edges leading to that vertex), and the out-degree (the number of edges leading away from that vertex). For a directed graph, the degree of a vertex is the sum of the in-degree and the out-degree.
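
The degree definitions can be sketched in Python by walking an edge list once and counting. The function names and the use of collections.Counter are assumptions for illustration; a vertex with no edges simply reports a count of 0.

```python
from collections import Counter


def undirected_degrees(edges):
    """Map each vertex to its degree, given undirected (a, b) pairs."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def directed_degrees(edges):
    """Return (in_degree, out_degree) Counters for directed <a, b> pairs."""
    in_degree, out_degree = Counter(), Counter()
    for src, dst in edges:
        out_degree[src] += 1   # edge leads away from src
        in_degree[dst] += 1    # edge leads to dst
    return in_degree, out_degree
```

For the directed graph with edges <0,1>, <1,0>, and <1,2>, vertex 1 has in-degree 1 and out-degree 2, so its degree is 3.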

Paths

Most work with graphs involves finding our way from one vertex to another, so we deal with paths. A path is any sequence of vertices connected by edges.

(show undirected graph and list some paths)

In a directed graph, a path must follow the direction of the arrows (no going the wrong way on a one way street).

(show directed graph and list some paths)

Simple Paths

A simple path is a path in which all vertices are unique, except possibly for the first and the last.

(show some simple paths and some non-simple paths)

Cycle

A cycle is a path in which the first and last vertices are the same. A graph in which there are no cycles is said to be acyclic.
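
The definitions of path, simple path, and cycle given in these notes can be turned into three small checks on a list of vertices. This is a Python sketch only; the function names and the edge-list format are assumptions, and the checks follow the wording of the definitions above rather than any stricter textbook variant.

```python
def is_path(edges, vertices, directed=False):
    """True if each consecutive pair of vertices is joined by an edge."""
    edge_set = set(edges)
    for a, b in zip(vertices, vertices[1:]):
        if (a, b) in edge_set:
            continue
        if not directed and (b, a) in edge_set:
            continue               # undirected edges work both ways
        return False
    return len(vertices) >= 1


def is_simple_path(edges, vertices, directed=False):
    """A path whose vertices are unique, except possibly first and last."""
    if not is_path(edges, vertices, directed):
        return False
    closed = len(vertices) > 1 and vertices[0] == vertices[-1]
    inner = vertices[:-1] if closed else vertices
    return len(set(inner)) == len(inner)


def is_cycle(edges, vertices, directed=False):
    """A path of at least one edge that ends where it started."""
    return (len(vertices) > 1 and vertices[0] == vertices[-1]
            and is_path(edges, vertices, directed))
```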

Connected Graphs

A connected graph is an undirected graph in which a path exists between every pair of vertices. In other words, there is no set of vertices that is cut off from the other vertices. This can be clearly seen when you draw a graph that is not connected.
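
A common way to test whether an undirected graph is connected is to start at any vertex and see whether a search reaches all the others. The Python sketch below uses a breadth-first search, which is an assumption here and not something these notes have introduced; the function name is_connected and the input format are also illustrative.

```python
from collections import deque


def is_connected(vertices, edges):
    """True if every vertex can be reached from every other vertex.

    vertices is a non-empty set; edges is an iterable of undirected
    (a, b) pairs.
    """
    neighbors = {v: set() for v in vertices}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    start = next(iter(vertices))
    visited = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors[current]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    # Connected exactly when the search reached every vertex.
    return visited == set(vertices)
```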

(show connected and not connected examples)

We can now define a tree in terms of graph terminology. A tree is an undirected graph that is connected and acyclic.

Strongly Connected Graphs

A strongly connected graph is a directed graph in which a path that follows the direction of the edges exists from every vertex to every other vertex.
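
One way to test for strong connectivity, sketched below in Python, is to pick any vertex and check two things: that it can reach every other vertex, and that every other vertex can reach it (found by following the edges backwards). This two-search approach and the function names are assumptions for illustration, not a method taken from the course material.

```python
def reachable(start, successors):
    """Set of vertices reachable from start by following successors."""
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for nxt in successors[current]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def is_strongly_connected(vertices, edges):
    """True if a directed path exists between every ordered pair."""
    forward = {v: [] for v in vertices}
    backward = {v: [] for v in vertices}
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)   # same edge, followed in reverse

    start = next(iter(vertices))
    everything = set(vertices)
    # start reaches everyone, and everyone reaches start.
    return (reachable(start, forward) == everything
            and reachable(start, backward) == everything)
```

For example, the directed graph with edges <0,1>, <1,0>, and <1,2> is not strongly connected, because no path leads away from vertex 2.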

(show examples)

Graph Exercise

I will pass out questions that you must answer for two graphs drawn on the board. This exercise is worth 10 points.

Königsberg Bridge Problem

Let's go back and look at the Königsberg Bridge Problem with what we now know about graphs.

(draw graph for the problem)

Euler called a path that started at one vertex, crossed over every edge exactly once, and returned to the starting vertex, an Eulerian Walk, or an Euler Circuit. He developed a theorem about when an Euler Circuit is possible:

Euler Circuit Theorem

An Euler Circuit exists if and only if the graph is connected and all of the vertices of the graph are of an even degree.

In other words, each vertex must have an even number of edges connecting it to other vertices. Thinking about this, it makes sense, because you have to use each edge exactly once: every time you arrive at a vertex along one edge, you must leave along a different, unused edge, so the edges at each vertex are used up in pairs. If a vertex had an odd number of edges, you would eventually travel to that vertex and not be able to leave.

If you look at the graph for the Königsberg Bridge Problem, you'll see that every vertex has an odd number of edges. A single odd-degree vertex is enough to make an Euler Circuit impossible, so we know that the Königsberg Bridge Problem cannot be solved.

Euler also came up with another idea, called an Euler Path. An Euler Path is a path that starts from one vertex, crosses each edge exactly once, and then ends in a vertex different from the start vertex.

Euler Path Theorem

An Euler Path exists if and only if the graph is connected and all the vertices have even degree, except for exactly two that have an odd degree. The two that have odd degree become the start and end vertices of the path.

Is there an Euler Path in the Königsberg Bridge Problem?

(show other examples and see if they have Euler paths or circuits)
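
Both theorems come down to counting the vertices of odd degree, which is easy to automate. The Python sketch below does only that counting, so it assumes the graph has already been confirmed to be connected. The function name classify_euler, the vertex labels A through D, and the way the seven bridges are listed are assumptions for illustration.

```python
from collections import Counter


def classify_euler(edges):
    """Classify a connected undirected graph by its odd-degree vertices.

    edges is a list of (a, b) pairs; repeated pairs are allowed so that
    several bridges can join the same two land masses.
    Returns "circuit", "path", or "neither".
    """
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    odd = [v for v, d in degree.items() if d % 2 == 1]
    if len(odd) == 0:
        return "circuit"
    if len(odd) == 2:
        return "path"
    return "neither"


# Königsberg: island A, river banks B and C, second land mass D
# (labels assumed). Two bridges each join A-B and A-C.
konigsberg = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]
```

Calling classify_euler(konigsberg) is one way to check your answer to the question above. Note that the bridge graph has more than one edge between some pairs of vertices, so it is not a normal graph in the sense of the restrictions listed earlier.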

Graph Representation

When we want to use graphs in computer programs, we need some way to store them in memory. There are two ways that are popular:

Adjacency matrix

Adjacency list

Adjacency Matrix

An adjacency matrix is a two-dimensional array. If the number of vertices in the graph is N, then the array is N rows by N columns. If an edge exists from vertex X to vertex Y, then we put a 1 in the array at location [X][Y]. If no edge exists, then we put a 0 at that location.
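
The rule above can be sketched in Python using a list of lists as the two-dimensional array. The function name adjacency_matrix and the three-vertex example are assumptions for illustration; the example is deliberately not one of the graphs used in the questions that follow.

```python
def adjacency_matrix(n, edges, directed=False):
    """Build an n-by-n matrix of 0s and 1s from a list of (x, y) edges."""
    matrix = [[0] * n for _ in range(n)]
    for x, y in edges:
        matrix[x][y] = 1
        if not directed:
            matrix[y][x] = 1   # an undirected edge can be crossed both ways
    return matrix


# A small three-vertex example: edges (0,1) and (1,2).
example = adjacency_matrix(3, [(0, 1), (1, 2)])
# example == [[0, 1, 0],
#             [1, 0, 1],
#             [0, 1, 0]]
```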

What would the adjacency matrix be for the following graph?

V(G) = {0, 1, 2, 3}
E(G) = {(0,1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)}

What about for this graph?

V(G) = {0, 1, 2, 3}
E(G) = {(0,1), (0, 2), (1, 3)}

These have all been undirected graphs, and as you can see the adjacency matrix is symmetric. We only need to store the upper or lower triangle of data to know about all the edges.

For a directed graph, the adjacency matrix does not have to be symmetric. Consider the following graph:

V(G) = {0, 1, 2, 3}
E(G) = {<0,1>, <0, 2>, <0, 3>, <1, 2>, <1, 3>, <2, 3>}

(show the adjacency matrix for it)

Using an adjacency matrix, we can easily determine if there is an edge connecting two nodes, and also calculate the degree of a node. There will be an edge connecting two nodes X and Y if element [X][Y] is 1.

The degree of node X for an undirected graph is simply the sum of all the elements of the adjacency matrix for row X. (show an example).

For a directed graph, the out-degree for node X is the sum of all the elements in row X. The in-degree is the sum of all elements in column X.
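
The row and column sums can be written as two short Python functions. The function names are assumptions for illustration; the matrix shown is for the small directed graph with edges <0,1>, <1,0>, and <1,2>. For an undirected graph, the row sum alone gives the degree.

```python
def out_degree(matrix, x):
    """Sum of row x: edges leaving vertex x."""
    return sum(matrix[x])


def in_degree(matrix, x):
    """Sum of column x: edges arriving at vertex x."""
    return sum(row[x] for row in matrix)


# Directed graph with E(G) = {<0,1>, <1,0>, <1,2>}.
matrix = [[0, 1, 0],
          [1, 0, 1],
          [0, 0, 0]]
```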

Other questions we might ask about graphs are:

How many edges are there in the graph?
Is the graph connected?
Is the graph strongly connected?

To answer these questions, we would need to look at every element of the adjacency matrix. For N nodes, that would be examining N*N elements, making it O(N²).
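
As an illustration of why the cost is O(N²), here is a Python sketch that counts edges by visiting every element of the matrix. The function name count_edges is an assumption. For an undirected graph the total is halved, because each edge is recorded twice in a symmetric matrix.

```python
def count_edges(matrix, directed=False):
    """Count edges by visiting all N * N entries of the matrix."""
    ones = 0
    for row in matrix:          # N rows
        for entry in row:       # N entries per row
            ones += entry
    # Each undirected edge appears twice: at [x][y] and at [y][x].
    return ones if directed else ones // 2
```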

For most graphs, though, there will be quite a few zeroes in the adjacency matrix, because most graphs are not complete. We call these graphs sparse graphs, because the number of ones in the adjacency matrix is smaller than the number of zeroes. If we want to count the number of edges, it's a waste of time to look through elements that are zero. To avoid this, we can use an adjacency list instead of an adjacency matrix.

Adjacency List

In an adjacency list, we only store elements for edges that do exist. This means that algorithms that must examine every edge can run in less than O(N²) time when the graph is sparse.

Instead of a two-dimensional array, an adjacency list is a single-dimensional array where every element of the array is a linked list. If vertex X connects to vertex Y with an edge, then the list at element X in the array will contain a node that has the value Y.

(show example)

We keep the lists sorted to make various algorithms more efficient. In a directed graph, some vertices may not have any outgoing edges, which means their linked list will be empty.
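
A minimal Python sketch of building an adjacency list follows. Note the substitution: ordinary Python lists stand in for the linked lists described above, which keeps the example short but is not the same data structure. The function name adjacency_list is also an assumption.

```python
def adjacency_list(n, edges, directed=False):
    """Build an adjacency list: element x holds the neighbors of x."""
    adj = [[] for _ in range(n)]
    for x, y in edges:
        adj[x].append(y)
        if not directed:
            adj[y].append(x)
    for neighbors in adj:
        neighbors.sort()        # keep each list in sorted order
    return adj


# Directed graph with E(G) = {<0,1>, <1,0>, <1,2>}.
adj = adjacency_list(3, [(0, 1), (1, 0), (1, 2)], directed=True)
# adj == [[1], [0, 2], []]   (vertex 2 has no outgoing edges)
```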

The degree of any vertex in an undirected graph represented by an adjacency list is the number of elements in the linked list. This is also the out-degree for a directed graph.

How can we determine the in-degree of a vertex? That's more complex, but we can simplify it by maintaining what is called an inverse adjacency list. The inverse adjacency list keeps track of edges coming into a vertex.

(show example)

So the in-degree of a vertex is the number of nodes in that element's inverse adjacency list.
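
An inverse adjacency list can be built from an ordinary adjacency list in one pass. In this Python sketch, plain lists again stand in for linked lists, and the function names are assumptions for illustration.

```python
def inverse_adjacency_list(adj):
    """Build the inverse list: element y holds every x with an edge <x, y>."""
    inverse = [[] for _ in adj]
    for x, neighbors in enumerate(adj):
        for y in neighbors:
            inverse[y].append(x)
    return inverse


def in_degree(inverse, x):
    """Number of edges coming into vertex x."""
    return len(inverse[x])


adj = [[1], [0, 2], []]              # E(G) = {<0,1>, <1,0>, <1,2>}
inverse = inverse_adjacency_list(adj)
# inverse == [[1], [0], [1]]
```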

So we said that an adjacency list would allow us to do things like count the number of edges in the graph in less than O(N²) time. But how much more efficient is it?

That depends partly on our linked list implementation. If our list keeps track of how many elements are in it, then all we must do is go through our single-dimensional array adding up the sizes of the lists. What would the big-Oh of that be?

If our list does not keep track of how many elements are in it, then we must go through our single-dimensional array, and for each element we must process every node in the list to count them. What would the big-Oh of that be?
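
The two cases can be sketched side by side in Python. The Node class is a hypothetical bare-bones linked list node, and the function names are assumptions. Compare how much work each function does per vertex and per edge when working out the answers to the two questions above. For an undirected graph, halve the result, since each edge appears in two lists.

```python
class Node:
    """One entry in a singly linked list of neighbors."""
    def __init__(self, vertex, next_node=None):
        self.vertex = vertex
        self.next = next_node


def count_by_walking(heads):
    """Count list nodes when the lists do not store their own sizes."""
    total = 0
    for head in heads:              # one array element per vertex
        node = head
        while node is not None:     # visit every node in this list
            total += 1
            node = node.next
    return total


def count_by_sizes(sizes):
    """Count list nodes when each list already stores its size."""
    return sum(sizes)               # one addition per vertex
```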

That's still usually faster than O(N²), because most graphs are not complete, so they have far fewer edges than the maximum of N * (N - 1) / 2.

Graph Exercise #2

I will pass out questions that you must answer for several graphs drawn on the board. This exercise is worth 10 points.

Review for Midterm 1

We will review for the first midterm. The list of topics covered in the midterm can be found in the week 5 lecture notes.

Next Week

Next week is the first midterm.