Here is my code:

```
def insertionSortList(head):
dummy = ListNode(0)
dummy.next = head
last = p = q = dummy
while p and p.next:
q = dummy if p.next.val < last.next.val else last
while q != p and q.next and q.next.val < p.next.val:
q = q.next
#insert
if p != q:
print('swap',p.val, q.val)
tmp = p.next
p.next = tmp.next
tmp.next = q.next
q.next = tmp
else:
p = p.next
last = q
return dummy.next
```

I have some optimization to reduce the run time and the algorithm becomes linear time in extreme case like below:

```
0,6,5,4,3,2,1
swap 6 0
0,5,6,4,3,2,1
swap 6 0
0,4,5,6,3,2,1
swap 6 0
0,3,4,5,6,2,1
swap 6 0
0,2,3,4,5,6,1
swap 6 0
0,1,2,3,4,5,6
```

However, I still get TLE for the 5000 to 1 test case. Notice that the algo is O(n) time for this special case. Does it make any sense to further optimize this algo? No. In my whole life, I never see any case that sorting with linked list and a O(n^2) algorithm should be absolutely enough.