I tried many ways to optimize my code, but every version still hit TLE (Time Limit Exceeded).
Then I looked at other people's accepted solutions and was surprised to see that they use far fewer caches than I did. Later, I realized that my code was too general to perform well when k = nums1.length + nums2.length, and that my TLE occurred in exactly those test cases (e.g. k = 1000, with both arrays having 500 elements).
After I wrote special logic for the "k = nums1.length + nums2.length" case (no cache is needed there), my solution finally got accepted. So if you also run into TLE, first make sure that your solution performs well in this case.
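For anyone wondering what such a special case might look like, here is a minimal sketch in Python. It is not my accepted code, and it assumes the problem is the usual one: build the largest possible number of k digits from the two arrays while keeping the relative order of digits within each array. When k equals the combined length, every digit has to be used, so the task should reduce to a greedy merge with no cache at all. The function name `merge_all` is hypothetical.

```python
from typing import List


def merge_all(nums1: List[int], nums2: List[int]) -> List[int]:
    """Merge two digit lists into the largest sequence that uses every digit.

    Assumption: digits from the same list must keep their relative order.
    """
    i, j = 0, 0
    merged: List[int] = []
    while i < len(nums1) or j < len(nums2):
        # Compare the remaining suffixes lexicographically and take the next
        # digit from the larger one. An exhausted list yields an empty suffix,
        # which compares smaller than any non-empty suffix.
        if nums1[i:] >= nums2[j:]:
            merged.append(nums1[i])
            i += 1
        else:
            merged.append(nums2[j])
            j += 1
    return merged


print(merge_all([6, 7], [6, 0, 4]))
```

Comparing whole suffixes (rather than only the next digit) is what breaks ties correctly: in the call above both lists start with 6, and the suffix comparison picks the list whose following digit is larger. The slices copy the suffix on every step, so this is quadratic in the worst case; comparing by index instead would avoid the copies if that ever becomes a concern.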