Prefixes before a string in Python
Introduction
If you skim through Python source code, you may encounter strings preceded by a special character such as b or u. What do these characters mean?
Let us see…
The b string prefix
In Python 3.x, the b prefix marks the literal as a bytes object, that is, a sequence of raw bytes. If the prefix is not used, the literal is of type str, which holds Unicode text. In Python 2.x, on the other hand, the literal is of type str whether we use b or not, and a Python 2 str is a byte string.
Let us take an example…
Python 2.x example
In Python 2.x prefixing a string with (b) has no effect.
    y = b'Python'
    # This should print: Python and the b prefix has no
    # effect in Python 2
    print(y)
    # This should print: <type 'str'>
    print(type(y))
Python 3.x example
In Python 3.x, prefixing a string with (b) produces a bytes object rather than a str.
    y = b'Python'
    # This should print: b'Python'
    print(y)
    # This should print: <class 'bytes'>
    print(type(y))
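To make the difference a bit more concrete, here is a minimal sketch that compares a bytes literal with a plain string literal. It assumes Python 3, and the variable names (raw, text) are only illustrative.

```python
# Assumes Python 3; the variable names are illustrative only.
raw = b'Python'    # bytes: a sequence of integers in the range 0-255
text = 'Python'    # str: a sequence of Unicode code points

# The two types do not compare equal, even with the same characters.
print(raw == text)                   # False

# Indexing bytes gives an integer; indexing str gives a one-character str.
print(raw[0])                        # 80
print(text[0])                       # P

# Converting between the two requires naming an encoding.
print(raw.decode('ascii') == text)   # True
print(text.encode('ascii') == raw)   # True
```

The last two lines are where character encoding comes in: going from bytes to text, or back, always involves an encoding such as ASCII or UTF-8.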
Unicode prefix characters
While dealing with Unicode in Python, you may encounter the following symbols (U+, u', b', \x, \u, \U). They show up in string and bytes literals, or when referring to Unicode characters. Let us quickly clarify them…
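- U+ is followed by a hex number to denote a given Unicode code point. For example, U+0041 is the Unicode code point for the letter A. This is a notation used in text and documentation, not Python syntax.
- u' prefix denotes a Unicode string in Python 2. Python 3.3 and later still accept it, but it has no effect because every str is already Unicode.
- \x is followed by 2 hex digits (an 8-bit value)
- \u is followed by 4 hex digits (a 16-bit value)
- \U is followed by 8 hex digits (a 32-bit value)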
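The escape sequences listed above can be tried out with a short sketch like the following. It assumes Python 3.3 or later, and the variable names are only illustrative.

```python
# Assumes Python 3.3+; the names below are illustrative only.
a_hex = '\x41'          # \x + 2 hex digits
a_u16 = '\u0041'        # \u + 4 hex digits
a_u32 = '\U00000041'    # \U + 8 hex digits

# All three spell the same code point, U+0041 (the letter A).
print(a_hex == a_u16 == a_u32 == 'A')   # True
print(hex(ord('A')))                    # 0x41

# \U is needed for code points above U+FFFF.
clef = '\U0001D11E'     # U+1D11E, MUSICAL SYMBOL G CLEF
print(len(clef))        # 1
print(hex(ord(clef)))   # 0x1d11e

# The u prefix is accepted but changes nothing in Python 3.
print(u'A' == 'A')      # True

# Inside a bytes literal, \x denotes a single byte.
print(b'\x41' == b'A')  # True
```

Note that the number of hex digits only limits which code points an escape can express. It says nothing about how many bytes the character takes once the string is encoded.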
More details
This post was a quick summary; the topic goes beyond string prefixes and is tightly related to character encoding. If you are interested in learning more, please check the full article here.
Thanks for visiting.