maketargets, npm dependencies, task ordering — "do B and C before A" is a dependency graph, and putting it in a valid order is topological sorting. I built a CLI for it. Two implementation hinges: (1) Kahn's algorithm (in-degrees + a queue) with stable, lexicographic output, and (2) when there's a cycle, don't just say "cycle detected" — use DFS to name exactly which nodes form it, asauth -> cache -> db -> auth. The last two entries were Rust and Go, so this one's a Python CLI.
📦 GitHub: https://github.com/sen-ltd/topo-sort-cli
(It's a CLI — no live demo. Run with python3 main.py or Docker.)
Input format
One declaration per line. A: B C means A depends on B and C (B and C come first). # comments and blanks are ignored:
deploy: build test
build: compile
test: compile
compile:
Internally each dependency is stored as a "prerequisite → dependent" edge (A: B → edge B -> A), so a prerequisites-first topological sort is directly a build order.
Hinge 1: Kahn's algorithm with stable output
Topological sort comes in a DFS flavor and a Kahn flavor. Kahn's is intuitive: repeatedly take nodes whose in-degree (number of prerequisites) is zero:
- Count every node's in-degree.
- Queue the zero-in-degree nodes (no prerequisites → can go first).
- Pop one, emit it, decrement its dependents' in-degrees.
- Newly-zero nodes join the queue. Repeat.
A naive queue pops in arbitrary order, so the output wobbles run to run. A min-heap of ready nodes always takes the lexicographically smallest, making output deterministic:
import heapq
def topo_sort(g):
indeg = g.indegrees()
ready = [n for n, d in indeg.items() if d == 0]
heapq.heapify(ready) # ready set as a min-heap
order = []
while ready:
n = heapq.heappop(ready) # smallest ready node → stable
order.append(n)
for v in sorted(g.adj[n]):
indeg[v] -= 1
if indeg[v] == 0:
heapq.heappush(ready, v)
if len(order) == len(g.nodes):
return SortResult(order=order, cycle=None)
return SortResult(order=None, cycle=_find_cycle(g)) # leftovers = cycle
That final len(order) != len(g.nodes) is the cycle test: if some node's in-degree never hits zero, it's in a cycle. Determinism is pinned by a test:
def test_deterministic_tiebreak():
# three independent nodes → lexicographic order
assert order_of("b:\na:\nc:\n") == ["a", "b", "c"]
Hinge 2: name the cycle
Kahn's tells you a cycle exists, not where. Lots of tools stop at error: cycle detected, leaving you to stare at the graph. A second pass — DFS with white/gray/black coloring — extracts one concrete cycle:
- white = unvisited, gray = on the current recursion stack, black = finished
- reaching a gray node during DFS = a back-edge = a cycle
- the recursion stack from that node onward is the cycle
def _find_cycle(g):
WHITE, GRAY, BLACK = 0, 1, 2
color = {n: WHITE for n in g.nodes}
stack = []
def dfs(u):
color[u] = GRAY
stack.append(u)
for v in sorted(g.adj[u]):
if color[v] == GRAY: # back-edge → cycle
idx = stack.index(v)
return stack[idx:] + [v] # stack slice + return node
if color[v] == WHITE:
found = dfs(v)
if found is not None:
return found
stack.pop()
color[u] = BLACK
return None
for n in sorted(g.nodes):
if color[n] == WHITE:
found = dfs(n)
if found is not None:
return found
return []
So instead of "there is a cycle" you get:
error: graph has a cycle:
auth -> cache -> db -> auth
— the actual path, plus exit code 1 so shell scripts can detect it. The user fixes those three edges and moves on.
Cycles hidden behind innocent branches still get found
DFS starts from every node, so a cycle survives even when an unrelated acyclic subgraph is present:
def test_cycle_with_innocent_tail():
# x -> y is fine; a <-> b is the cycle. Must still be detected.
res = topo_sort(parse_graph("x: y\ny:\na: b\nb: a\n"))
assert res.order is None
assert set(res.cycle) <= {"a", "b"}
assert res.cycle[0] == res.cycle[-1] # path returns to its start
Self-loops (a: a), 2-node and 3-node cycles each get their own test too.
CLI
text = open(args.file).read() if args.file else sys.stdin.read()
graph = parse_graph(text)
result = topo_sort(graph)
if result.cycle is not None:
print("error: graph has a cycle:", file=sys.stderr)
print(" " + " -> ".join(result.cycle), file=sys.stderr)
return 1
for node in (reversed(result.order) if args.reverse else result.order):
print(node)
return 0
--reverse for dependents-first, --numbered for positions, and stdin support so cat deps.txt | toposort works.
Architecture
src/graph.py — parse + Kahn's sort + DFS cycle extraction (stdlib only)
src/cli.py — argparse, file/stdin, exit codes
main.py — entry point
Zero dependencies. 21 pytest tests: parsing, ordering validity, deterministic tie-breaking, self-loops, 2- and 3-node cycles, and cycles hidden behind acyclic tails. Docker: python:3.12-alpine build → alpine runtime, non-root.
Run it
python3 main.py examples/build.deps
python3 -m pytest # 21 tests
docker build -t toposort . && docker run --rm -i toposort < examples/build.deps
Takeaways
- Kahn's algorithm: emit zero-in-degree nodes; a min-heap makes the output stable.
-
len(order) != node_countis the cycle test — no separate check needed. - Don't stop at "cycle detected" — DFS gray-node revisit extracts the actual path (
auth -> cache -> db -> auth). - DFS from every node finds cycles hidden behind unrelated branches.
- Split exit codes (0 ok / 1 cycle) so the tool composes in shell.
This is OSS portfolio #268 from SEN LLC (Tokyo). https://sen.ltd/portfolio/

Top comments (0)