it's easy to see how it works once you realize two things: the "delta/distance" between the fast and slow pointers is INCREASING by 1 per step as they move, and once they have both entered the loop, that same pace means the fast pointer is catching up to the slow one by 1 per step.

https://docs.google.com/presentation/d/1b7Hwb71H5m6x4ePw_73hpF1JagC6oHJLkJXZf14wSPE/edit?usp=sharing

for simplicity of illustration, say that when the slow pointer arrives at the bifurcation point, the fast pointer has not yet wrapped around the loop and is X steps ahead of the slow pointer. as we said, the DELTA between them grows by 1 per step, the same speed as the slow pointer, so the DELTA must equal the distance traveled by the slow pointer. so the HEAD must be X steps back from the bifurcation point.
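a quick way to sanity-check this claim is to simulate positions only, without building a real list. this is just a sketch: `node_at`, `lead_at_entry`, `tail_len` and `loop_len` are names made up for illustration, nodes are numbered 0, 1, 2, ... from head, and the bifurcation point is node `tail_len`.

```python
def node_at(steps, tail_len, loop_len):
    """Index of the node reached after `steps` moves from head (node 0)."""
    if steps < tail_len:
        return steps
    # past the tail we just go around the loop
    return tail_len + (steps - tail_len) % loop_len


def lead_at_entry(tail_len, loop_len):
    """How many steps ahead the fast pointer is when slow reaches the bifurcation point."""
    slow = node_at(tail_len, tail_len, loop_len)      # slow has taken tail_len steps
    fast = node_at(2 * tail_len, tail_len, loop_len)  # fast has taken twice as many
    return fast - slow


# head is 3 steps before the bifurcation point, loop has 5 nodes
print(lead_at_entry(3, 5))
```

when the tail is shorter than the loop (the "has not wrapped around" case), the lead equals the tail length, i.e. X. if the fast pointer has already wrapped around, the lead comes out as the tail length modulo the loop length instead.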

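now, from the time the slow pointer reaches the bifurcation point to the time they meet, the distance caught up by the fast pointer must equal the distance traveled by the slow pointer. let's refer to this as D, as in the figure. after meeting, p2 keeps going from the meeting point around to the bifurcation point, which is X - D + D = X steps, the same distance as traveled by p1 from head to the bifurcation point. so if p1 starts at head and both move 1 step at a time, they arrive at the bifurcation point together.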
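putting the argument into code, here is a minimal sketch of the usual fast/slow pointer approach. the `Node` class, `build_list` helper and `find_loop_start` name are assumptions made up for this example; `p1` and `p2` play the same roles as in the explanation (p1 restarts from head, p2 continues from the meeting point).

```python
class Node:
    def __init__(self, val):
        self.val = val
        self.next = None


def build_list(tail_len, loop_len):
    """Hypothetical helper: tail_len nodes, then a loop of loop_len nodes (0 = no loop)."""
    nodes = [Node(i) for i in range(tail_len + loop_len)]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    if loop_len > 0:
        nodes[-1].next = nodes[tail_len]  # close the loop at the bifurcation point
    return nodes[0] if nodes else None


def find_loop_start(head):
    """Return the node at the bifurcation point, or None if there is no loop."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next       # 1 step
        fast = fast.next.next  # 2 steps
        if slow is fast:
            break              # met somewhere inside the loop
    else:
        return None            # fast fell off the end: no loop

    p1, p2 = head, slow        # p1 from head, p2 from the meeting point
    while p1 is not p2:
        p1 = p1.next
        p2 = p2.next
    return p1


start = find_loop_start(build_list(3, 5))
print(start.val)  # 3: the node where the tail joins the loop
```

the second loop is the "X steps" argument from above: both pointers move 1 step at a time and cover the same distance before landing on the bifurcation point.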