Difference between bytes and bytearray in Python
Introduction
In this short Python snippet article, I borrow from the full article on the topic, which you can find here.
So, what is the difference between bytes and bytearray in Python?
There is no real difference between byte strings and byte arrays except that byte strings are immutable and byte arrays are mutable. If that is the case, why does Python have both? One explanation is that some applications perform poorly with immutable strings. Take IO operations that involve buffering: with an immutable byte string, every addition to the buffer allocates new memory for the concatenated result and copies the existing data into it. This is slow, which is why a mutable byte array comes to the rescue.
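To make the mutability and buffering points above concrete, here is a minimal Python 3 sketch. The `chunks` list is a made-up stand-in for data arriving from a file or socket, and the variable names are only illustrative.

```python
# Python 3 sketch: bytes is immutable, bytearray can be changed in place.
frozen = b"Python"
try:
    frozen[0] = ord("J")      # item assignment is not supported on bytes
    assignment_failed = False
except TypeError:
    assignment_failed = True

editable = bytearray(b"Python")
editable[0] = ord("J")        # in-place change on the same object
print(assignment_failed, editable)   # True bytearray(b'Jython')

# Hypothetical chunks standing in for data read from a file or socket.
chunks = [b"GET ", b"/index.html ", b"HTTP/1.1"]

buffer = bytearray()
for chunk in chunks:
    buffer.extend(chunk)      # appends to the existing buffer object

request = bytes(buffer)       # immutable snapshot once the buffer is complete
print(request)                # b'GET /index.html HTTP/1.1'
```

Building the same result with `data = data + chunk` on a bytes object would create a new object on every iteration, which is the cost the paragraph above describes.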
String handling is not the same in Python 2.x and 3.x. Let us see how…
Python 2.x
    u = "Python"
    # This should print: <type 'str'>
    print(type(u))

    v = b"Python"
    # This should print: <type 'str'>
    print(type(v))

    y = bytearray("Python")
    # This should print: <type 'bytearray'>
    print(type(y))

    z = bytearray(b"Python")
    # This should print: <type 'bytearray'>
    print(type(z))
Python 3.x
    u = "Python"
    # This should print: <class 'str'>
    print(type(u))

    v = b"Python"
    # This should print: <class 'bytes'>
    print(type(v))

    try:
        # This call raises TypeError in Python 3.x, so the print
        # on the next line is never reached.
        y = bytearray("Python")
        print(type(y))
    except TypeError as e:
        # This should print: string argument without an encoding
        # The reason why this fails in Python 3.x is that "Python" is
        # a Unicode string, so Python 3.x needs to know which encoding
        # scheme should be used to convert it to bytes. It does not
        # fail in Python 2.x because a str there is already a byte
        # string. In Python 3.x we must explicitly specify the
        # encoding type.
        print(e)

    z = bytearray(b"Python")
    # This should print: <class 'bytearray'>
    print(type(z))
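As a small follow-up to the failing call above, this sketch shows one way to make it work in Python 3.x by naming the encoding. UTF-8 is an assumption here; use whichever encoding your data actually requires.

```python
# Python 3: pass the encoding explicitly when starting from a str.
y = bytearray("Python", "utf-8")
print(type(y))    # <class 'bytearray'>

# Equivalent route: encode the str to bytes first, then wrap the result.
same = bytearray("Python".encode("utf-8"))
print(y == same)  # True
```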
OK, we have seen some examples, so what is the deal? Here is the summary…
Summary
- Strings in Python 2.x are of type str, which is nothing but a byte string. A str is immutable and cannot be modified. If you need to modify it, Python 2.x provides the bytearray type, which behaves like str but is mutable.
- On the other hand, Python 3.x strings are Unicode by default. Please refer to the full article to understand encoding. Python 3.x also defines the bytes and bytearray types, which are similar to str and bytearray in Python 2.x.
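The summary can be illustrated with a short Python 3 sketch. The sample word and the choice of UTF-8 are assumptions made for the example; the point is only that a str counts characters while bytes and bytearray count encoded bytes.

```python
text = "caf\u00e9"                # str: four Unicode characters ("café")
data = text.encode("utf-8")       # bytes: the UTF-8 encoded form
print(len(text), len(data))       # 4 5  (the accented letter takes two bytes)

print(data[0])                    # 99: indexing bytes yields an integer

mutable = bytearray(data)         # mutable copy of the same bytes
mutable[0] = ord("C")             # replace the first byte in place
print(mutable.decode("utf-8"))    # Café
```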
Thanks for visiting