**Solution**

**UTF-8 Validation** https://leetcode.com/problems/utf-8-validation/

**Algorithm**

- UTF can have only 1 to 4 characters.
- If just one character, then its byte must start with 0. So '0b10010001' is an error.
- If more than one character, then character 2 onwards must start from "10". How do we check that? First mask with 11000000. Then XOR with 10000000.

```
class Solution(object):
def get_first_unset_bit(self, x):
mask, result = 0x80, 0
for i in range(8):
if mask >> i & x:
result += 1
else:
break
return result
def validUtf8(self, data):
"""
:type data: List[int]
:rtype: bool
"""
end = 0
while end < len(data):
x = data[end]
first_unset_bit = self.get_first_unset_bit(x)
if first_unset_bit in (1,5,6,7,8):
return False
elif first_unset_bit == 0:
end += 1
else:
if end + first_unset_bit-1 >= len(data):
return False
for i in range(end+1, end + first_unset_bit):
if (data[i] & 0xC0) ^ (0x80) != 0:
return False
end = end + first_unset_bit
return True
```