Hi, in the second part of the graph series, I want to go into how to build graphs programming-wise. The implementation is in Python and JSON (Javascript Object Notation).
Let's take again the graph from the last post, which was implemented in Prolog. Let's implement it in another language. I have added a small addition to the graph, the weights.
Define the JSON graph
If we represent the graph in a JSON, we can later work with it in applications, quite easily. The advantage here is that we can also work with it later in apps/web applications.
The implementation of the graph in JSON then looks like this:
{
"nodes": [
{
"name": "A",
"to": [
{
"node":"B",
"weight": 1.0
},
{
"node":"D",
"weight": 1.4
}
]
},
{
"name": "B",
"to": [
{
"node":"C",
"weight": 4.8
}
]
},
{
"name": "C",
"to": [
{
"node":"D",
"weight": 3.0
},
{
"node":"E",
"weight": 2.4
}
]
},
{
"name": "D",
"to": [
{
"node":"B",
"weight": 3.0
},
{
"node":"E",
"weight": 2.4
},
{
"node":"F",
"weight": 2.6
}
]
},
{
"name": "E",
"to": [
{
"node":"F",
"weight": 0.5
}
]
},
{
"name": "F",
"to": [
{
"node":"A",
"weight": 7.8
}
]
}
]
}
Here we can see how each node is implemented. We also see which nodes are connected to whom. With the connections we also see the weights. Looks quite nice. Let's program the nodes in Python.
Implementation in python
#!/usr/bin/env python
# -*- coding:utf-8 -*-
class Node(object):
__key = None
__edge = None
__visited = False
def __init__(self, key):
self.__key = key
self.__edge = []
def getEdges(self):
return self.__edge
def getKey(self):
return self.__key
def addEdge(self, edge):
self.__edge.append(edge)
def visit(self):
self.__visited = True
def unvisit(self):
self.__visited = False
def isVisited(self):
return self.__visited
If we look at the class Node, we see 3 private attributes. The attribute key stores later the name of the node, e.g. "Berlin". The attribute edge is a list, which later stores all edges, which are connected to this node. The last attribute simply stores the state whether this node has already been visited or not. We will see why this is important in the next post, by Prim and Kruskal.
The constructor simply takes the key, and assigns it to the key attribute. This is also where the list is initialized. The getters are quite obvious. The ability to add edges is important.
If we take a closer look at the Edge class, we see that it is also kept quite simple.
#!/usr/bin/env python
# -*- coding:utf-8 -*-
class Edge(object):
__node_a = None
__node_b = None
__weight = None
def __init__(self, a, b, w=1.0):
self.__node_a = a
self.__node_b = b
self.__weight = w
def getA(self):
return self.__node_a
def getB(self):
return self.__node_b
def getWeight(self):
return self.__weight
def setA(self, a):
self.__node_a = a
def setB(self, b):
self.__node_b = b
def setWeight(self, w):
self.__weight = w
Here we can store 2 nodes. E.g. The node and . Both are then assigned according to the private attributes. Furthermore, we also store the weight here. In principle, the node from where we go over the edge is found in the attribute node_a. Its opposite is always at the attribute node_b. The getters and setters are implemented according to the attributes.
Now we have everything to build a graph. Now we only need a class in which the graphs are assembled and output.
class Graph(object):
__nodes = None
__ways = None
Here we have now implemented our class graph. Here we have our attributes nodes, which stores our built graph, and ways, which will be very important for searches later. Ok, now let's build our constructor. It doesn't have to do much except initialize a dictionary and a list.
def __init__(self):
self.__nodes = {}
self.__ways = []
Feeding JSON into Python
Now the question is, how do we turn our JSON into a graph in the program? Python offers the module json, which converts data into JSON or into a dictionary. The dictionaries in Python and JSON are not so dissimilar, which is why one could simply typecast the whole thing as a dictionary. But that is too messy. First we create a method, in my example it is called load. This has a parameter file, which specifies the file to be read. This we can then read normally, with a FileStream, and convert with json into a dictionary.
def load(self, file):
f = open(file, 'r')
data = f.read()
f.close()
j = json.loads(data)
Now we can work through the dictionary, and create the individual nodes.
for entry in j['nodes']:
n = Node(entry['name'])
self.__nodes[entry['name']] = n
Now we need to create the edges that connect the nodes to each other. This means we go through all the nodes again and read from the JSON, the information from "Who is connected to whom, with what weight" and create the edges. We then simply add these to the nodes.
for entry in j['nodes']:
n = self.__nodes[entry['name']]
for to in entry['to']:
e = Edge(
a=self.__nodes[n.getKey()],
b=self.__nodes[to['node']],
w=to['weight']
)
n.addEdge(e)
Putting it all together now, our method looks like this:
def load(self, file):
f = open(file, 'r')
data = f.read()
f.close()
j = json.loads(data)
for entry in j['nodes']:
n = Node(entry['name'])
self.__nodes[entry['name']] = n
for entry in j['nodes']:
n = self.__nodes[entry['name']]
for to in entry['to']:
e = Edge(
a=self.__nodes[n.getKey()],
b=self.__nodes[to['node']],
w=to['weight']
)
n.addEdge(e)
Yay, we can now read in graphs and yoa nothing more really. This method has an overhead in the worst case.
Print the graph
We can build a method, which returns the graph. For this we have to go over all nodes and read the edges. This way we can see if our graph was read correctly.
def printGraph(self):
for key in sorted(self.__nodes.keys()):
print("Node", key)
for edge in self.__nodes[key].getEdges():
print(edge.getA().getKey(), "--->", edge.getB().getKey() , "weight:" , edge.getWeight())
print("-" * 30)
The output should look like this:
Node A
A ---> B weight: 1.0
A ---> D weight: 1.4
------------------------------
Node B
B ---> C weight: 4.8
------------------------------
Node C
C ---> D weight: 3.0
C ---> E weight: 2.4
------------------------------
Node D
D ---> B weight: 3.0
D ---> E weight: 2.4
D ---> F weight: 2.6
------------------------------
Node E
E ---> F weight: 0.5
------------------------------
Node F
F ---> A weight: 7.8
------------------------------
If we now compare this with our graphs in the picture above, we see that the graph was read in and built correctly.
Implement an adjacency list
We can build a method for fun, which creates the adjacency list for the graph. Here we also go through all the nodes, and simply list all the nodes that our current node is connected to.
def adjacenceList(self):
for key in sorted(self.__nodes.keys()):
entry = "[" + key + "]"
for edge in self.__nodes[key].getEdges():
entry += " ---> " + edge.getB().getKey()
print(entry)
If we call that now we should get our adjacency list.
[A] ---> B ---> D
[B] ---> C
[C] ---> D ---> E
[D] ---> B ---> E ---> F
[E] ---> F
[F] ---> A
Visit/Unvisit nodes
Cool now we can read in graphs, output them and create an adjacency list. But what if we want to list all paths that go from node to node ? After all, in Prolog we did that in two lines. We can do that in Python as well. Let's create some helper methods, which we will need later.
def getNode(self, key):
if key in self.__nodes.keys():
return self.__nodes[key]
def unvisitAllNodes(self):
for n in self.__nodes.values():
n.unvisit()
def allVisited(self):
f = []
for n in self.__nodes.values():
f.append(n.isVisited())
return False in f
Let's go through this. The method getNode, returns us the node we are looking for. First the key is taken. Then we check in the dictionary if the key exists at all. If it does, we return the node from the dictionary.
The method unvisitAllNodes, sets the attribute visited to False for every single node. We need this to work later with Prim and Kruskal as well as the path search.
Search paths
The last method simply checks if all nodes have been visited. With this we can build abort conditions later.
Let's move on to the path search.
def findPaths(self, a, b):
self.unvisitAllNodes()
visited = {}
for n in self.__nodes.values():
visited[n.getKey()] = False
pathList = [a]
self.__findPaths(a, b, visited, pathList)
ways = self.__ways
self.__ways = []
return ways
If we look at the code, we see that here first all nodes are marked as not visited. We create a new dictionary visited which should simply represent our control list. Then we create a list pathList which already contains our start node. As we see here is another method which works recursively. This method looks up and the paths and stores them in the private attribute ways. This is then returned and reset.
visited[src] = True
if src == dst:
self.__ways.append(' ---> '.join(path_list))
visited[src] = False
else:
nodes = []
joinable = False
for edge in self.__nodes[src].getEdges():
if edge.getA().getKey() == src:
joinable = True
nodes.append(edge.getB())
for node in nodes:
if not visited[node.getKey()] and joinable:
path_list.append(node.getKey())
self.__findPaths(node.getKey(), dst, visited, path_list)
path_list.remove(node.getKey())
visited[src] = False
nodes.clear()
If we look at the code here, we see that our start node is marked as visited. Here we have to consider 2 cases, once the case that we have arrived at the destination node, or are still on the way. If we have arrived, we have found a possible way. We add this to the attribute ways. Otherwise, we go through all edges at our current nodes, and check which of them are visitable. Since we don't want to get from node to again. The visitable nodes are cached in a list and processed in the lower part. In the lower part, we recursively go through all possible nodes and add them to the path list path_list. Duplicates are removed after each recursive call. At the end, all nodes are removed again. With this, our depth-first search should work at this point.
Let's run that, we get all the paths listed which go from to .
A ---> B ---> C ---> D ---> E ---> F
A ---> B ---> C ---> D ---> F
A ---> B ---> C ---> E ---> F
A ---> D ---> B ---> C ---> E ---> F
A ---> D ---> E ---> F
A ---> D ---> F
I hope I could give a little insight into the implementation 😄