I ran it about 10 times, 2/10 times it ran in 5 ms, maybe I got lucky. The average run is 6 ms.

Cycle detection is inside the dfs routine - standard 3-color approach.

At first I tried to create an adjacency list with edges directed from the prerequisite to the given course, and then reverse the result of the topological sort.

This is not necessary, we can simply construct the tree in reverse - from given course to its prerequisites - and then simply output the result of the topological sort.

```
public int[] findOrder(int numCourses, int[][] prerequisites) {
// Create an adjacency list
List<Integer>[] edges = new ArrayList[numCourses];
// array to track visited nodes and detect cycles, 0 = white, 1 = gray, 2 = black
int[] visited = new int[numCourses];
// output array
int[] res = new int[numCourses];
// initialize adjacency list
for (int i = 0; i < edges.length; i++)
edges[i] = new ArrayList<Integer>();
for (int[] edge : prerequisites)
edges[edge[0]].add(edge[1]);
// run topological sort
for (int i = 0, j = 0; i < numCourses; i++) {
j = dfs(i, edges, visited, res, j);
if (j == -1) return new int[0]; // cycle, return empty array
}
return res;
}
/**
* One way to optimize the dfs is to initialize the output array of desired size
* and pass the index of the current slot in the array.
* 1. We start with index 0
* 2. We recursively pass this index through the calls until we get to our first black node.
* 3. We set res[index] = our black node.
* 4. We then return index + 1, so that the next black node we find will be inserted in the next slot of res.
*/
private int dfs(int v, List<Integer>[] edges, int[] visited, int[] res, int i) {
if (visited[v] == 2) return i; // black node
if (visited[v] == 1) return -1; // gray node -> cycle
visited[v] = 1;
for(int j : edges[v]) {
i = dfs(j, edges, visited, res, i);
if (i == -1) return -1; // if there is a cycle, propagate the error
}
visited[v] = 2;
res[i] = v;
return i+1;
}
```