Intuitions behind the Tarzan’s algorithm for finding strongly connected components:
So today,after waking up,I saw Zaheen reading the Tarzan comics and Aarish playing with his toys.Suddenly the elder one asked the younger one,
:Do you know why Tarzan is called the king of jungle?
-Yap,he was born and bred there,fed there,and lived all through his life there as the saviour of wildlife.
:Well,that's the easy one,not the tricky one ;)
-Like?
:Coz Tarzan had so many important works on forests :p
-Can’t understand bro :/
:I’m talking about Robert Endre Tarzan,an american computer scientist.He had important contributions to graph theory.
-Oh,I didn't know.Can you enlighten me a bit ?
:Of course,lil' bro :)
*****************************************************************
Tha'ts how their conversation about Tarzan’s SCC finding algorithm started.I .as per I always do,am going to mention the extract from there:
-So now,tell me,what’s his most important contribution?
:Well I guess,you know about Kosaraju’s algorithm for finding the SCCs of a graph right?
-Which one?The one with two DFS’s?
:Yap.
-Yap,I learnt it from kakai(the writer).
:What if,the same thing can be done using only a single DFS?
-I wish it could be done :(
:don’t be sad,it can actually be done ;).Tarzan did it?
-Wao!!but how?
:Remember the low(u) thing?in fact this low(u) idea was a brainchild of Tarzan,he simplified so many problems introducing this :)
-Yap,I can remember.But how can it help me in finding SCCs?
:Well,at first do a simple dfs like previous ones,this time with a directed graph.
-(after a bit) Ok,here it is ,done.
:Now look carefully,you can easily determine the SCCs manually,right?
-Yap
:Look at the tree,is it the case anywhere that parts of same SCCs are in different subtrees?
-No,how can they be?If so,they wouldn't be connected in the first place.
:Good,so u at least have an idea now:SCCs will be in subtree form.One SCC will have its own subtree.
-Ya.And I think there can’t be a backedge from a SCC related subtree to another SCC related subtree?
:why so,my brother?
-Because it would mean you can go from one SCC to another and vice versa;i.e-they will be on same SCC.But they aren’t!!!
:Love you man,exactly.And what does this backedge thing reminds you ?
-low(u)?oldest ancestor to be visited from u?
:Yes,suppose that an SCC is defined by a subtree rooted by u.can low(u) be higher than discovery time of u?
-No,as per I said earlier,no backedge between SCCs.so its not possible.
:Ok,now tell me,if low(u)=discovery time of u,what would it mean?
-Well,it means the subtree rooted by u ,i.e:the possible members of u’s SCC can at most have back edge to the root u itself.Wait,doesn’t it mean if i find low(u)=disc(u) for a node u,I just should output the subtree under u?Shouldn’t they be the member of a SCC?
:Excellent lad ;) .So how’r u gonna output these nodes?
-ah,what if,I use a stack ,put nodes while visiting them,and start printing the nodes after popping when I find low(u)==disc(u)?and stop doing so after popping u itself?
:Great.Do you think your algorithm is fulfilled now?
-I guess so ^_^
:Sorry lad,its not over yet.Don’t forget it’s a directed graph?
-Wait,I forgot about handling the case of forward age :o ! It was not a problem in undirected graph.But its a problem in directed one :(
:I can give you tips about handling the situation:Output only the nodes when they are in stack.
-Why?
:uff,read this extract from wikipedia -_-
The nodes are placed on a stack in the order in which they are visited. When the depth-first search recursively explores a node v and its descendants, those nodes are not all necessarily popped from the stack before this recursive call returns. The crucial invariant property is that a node remains on the stack after exploration if and only if it has a path to some node earlier on the stack.
At the end of the call that explores v and its descendants, we know whether v itself has a path to any node earlier on the stack.[i.e - low(v)>disc(v) ] If so, the call returns, leaving v on the stack to preserve the invariant. If not, then v must be the root of its strongly connected component, which consists of v together with any later nodes on the stack (such nodes all have paths back to v but not to any earlier node, because if they had paths to earlier nodes then v would also have paths to earlier nodes which is false ). This entire component is then popped from the stack and returned, again preserving the invariant.
”
-Now I understand ,forward edges’s endpoints won’t be in stack .So they won’t be printed.Right?
:Yes,you are right lad :)