@StefanPochmann said in Are hash tables ok here? They're not really O(1), are they?:

@edaengineer said in Are hash tables ok here? They're not really O(1), are they?:

But then why do you assume a hashmap will let an element lookup continue being O(n)?

Because I think it's so unlikely to happen that hashmap implementations don't necessarily bother avoiding it, as that would come at a price as well, a price perhaps not worth it.

I just checked two things:

C++: unordered_map::find says *"Constant on average, worst case linear in the size of the container"*. If it were constant amortized time, why wouldn't they say so? They do say it for vector::push_back.

Java: HashMap, if I understand that description correctly (I didn't read it thoroughly yet), uses linked lists for small buckets and trees with O(log n) lookup for large buckets. And I think this and this are where they're searched, and I don't see any reorganization efforts. Do you?

It will resize or modify hashing

Can you show evidence for that? Which implementations do that?

Very useful links!

This actually supports the idea that the notion of "amortized complexity" isn't meaningful for unordered_map/hash map at all (rather than the idea that the amortized complexity is O(n) for maps), which is what I was trying to get at in my first post on this topic (I couldn't fully phrase it then, and your post helped).

In other words, since no rehashing or collision avoidance is occurring on a map lookup, it's reasonable to talk of an average complexity of O(1) for a hash map rather than an amortized complexity of O(n). The reason: amortization (distributing an occasional high cost over a longer sequence of operations) is meaningful only when a substantially higher cost is incurred in an otherwise regularly occurring single step of an operation, as with vector insertion. There, after n steps, a single insertion will trigger an O(n) reallocation even for the most average/random input data, so it makes sense to talk about the amortized complexity of vector insertion.

But nothing like that happens for hash map find. For an average/random data sample the O(n) cost simply doesn't occur. So it doesn't make sense, and may not be correct, to talk about amortized complexity for hash map find. Using its O(1) average-case complexity is reasonable for all or most practical problems.

This may not cover the duplicates scenario, where n duplicates all end up in one linked list and hash map find becomes O(n). But that is a worst-case scenario (we probably shouldn't call it a case of amortized complexity, since the amortization notion doesn't make sense for a hashmap), and I think that when discussing the complexity of an approach it's reasonable to assume O(1) for hash map find.