My solution does not leverage much math insights into the problem, but rather apply bit manipulation to simulate the process to find all states after m steps.
Nevertheless, I do use some insights to reduce the number of bits used to represent light state from n to 3. This is because the n lights can be grouped into the following 3 groups. As lights within the same group will not have a different on/off state through out the m operations, their share one bit to represent their states.
 group 1: whose with odd index representable as 3k+1, use the last bit in a char.

group 2: whose with even index use the second last bit in a char.

group 3: whose with odd index, but not representable as 3k+1, use the third last bit in a char.
When a button is pressed, it is flipping the state (and the corresponding bit ) of one or more groups:

button 1 flips all 3 groups, we can do XOR 7 on the old state to get the new state.

button 2 flips group 2, do XOR 2 on the old state to get the new state.

button 3 flips group 1 and group 3, do XOR 5

button 4 flips group 1, do XOR 1
For small n such as 1 and 2, there maybe zero lights in some of the 3 groups, and we setup mask to disregard the corresponding bits every time we simulate the flip.
The finally observation is that the number of states cannot be more than 8, so once we get a set of 8 states at some step, we do not need to simulate all the remaining ones.
The code in C++ is in the following:
int flipLights(int n, int m) {
char s0,mask;
const int maxStates = 8;
vector<char> buttons = {1,2,5,7};
switch(n){
case 1:
s0 = mask = 1;
break;
case 2:
s0 = mask = 3;
break;
default:
s0 = mask = 7;
}
set<char> states;
states.insert(s0);
for (int i = 1; i<=m && states.size() < maxStates; i++){
set <char> newStates;
for (auto s : states)
for (auto b: buttons)
newStates.insert(s^b&mask);
states = newStates;
}
return states.size();
}
Although this approach is slower than the other ones with pure math insights, it is generic and justifies this problem as a valid coding problem.