Prefix before a string in Python

Introduction

If you skim through Python source code, you may encounter string literals preceded by a special character such as b or u. What do these characters mean?

Let us see…

The b string prefix

The b string prefix in Python 3.x indicates that the literal is of type bytes, a sequence of raw bytes rather than text. If the prefix is not used, the literal is of type str, which holds Unicode text. In Python 2.x, on the other hand, the literal is of type str whether we use b or not, and a Python 2 str is a byte string.

Let us take an example…

Python 2.x example

In Python 2.x, prefixing a string with b has no effect: the literal is an ordinary str (a byte string) either way.
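The original example is not shown here, so the following is only a minimal sketch of how the claim could be checked. It is written to run unchanged on both Python 2 and Python 3; the variable name same_type is just an illustrative choice.

```python
import sys

# Compare the type of a b-prefixed literal with the type of a plain literal.
# On Python 2 both are str, so this is True.
# On Python 3 they are bytes and str, so this is False.
same_type = type(b"abc") is type("abc")

print("Python %d: same type? %s" % (sys.version_info[0], same_type))
```

On a Python 2 interpreter the two literals share a type, which is why the prefix makes no difference there.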

Python 3.x example

In Python 3.x, prefixing a string with b produces a bytes object instead of a str.
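A short sketch of the difference in Python 3 follows. The sample word and variable names are arbitrary, and UTF-8 is assumed as the encoding purely for illustration.

```python
# A plain literal is text (str); a b-prefixed literal is raw bytes.
text = "caf\u00e9"   # the word "cafe" with an accented e (U+00E9)
data = b"cafe"       # bytes literals may contain ASCII characters only

print(type(text))    # <class 'str'>
print(type(data))    # <class 'bytes'>

# Converting between the two always goes through an encoding.
encoded = text.encode("utf-8")
print(encoded)                    # b'caf\xc3\xa9'
print(len(text), len(encoded))    # 4 5  (4 characters, 5 bytes)

decoded = encoded.decode("utf-8")
print(decoded == text)            # True

# Indexing bytes yields integers, not one-character strings.
first_byte = data[0]
print(first_byte)                 # 99
```

The length mismatch shows why the two types are kept separate: one character of text can occupy more than one byte once encoded.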

Unicode prefix characters

While dealing with Unicode in Python, you may encounter the following notations (U+, u', b', \x, \u, \U). They show up when Unicode strings and byte strings are written or displayed. Let us quickly clarify them…

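  • U+ is followed by a hex number to denote a given Unicode code point. For example, U+0041 is the Unicode code point for the letter A
  • u' prefix to denote a Unicode string in Python 2 (Python 3.3 and later still accept it, but it has no effect there)
  • b' prefix to denote a bytes literal in Python 3 (it has no effect in Python 2)
  • \x followed by exactly 2 hex digits: one byte in a bytes literal, or a code point up to U+00FF in a Unicode string
  • \u followed by exactly 4 hex digits: a 16-bit value, that is, a code point up to U+FFFF
  • \U followed by exactly 8 hex digits: a 32-bit value, which can express any code point up to U+10FFFF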
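The escape sequences listed above can be tried out in Python 3 with a small sketch like the one below. The variable names and the choice of U+1F600 as a sample code point beyond U+FFFF are illustrative assumptions.

```python
# Three ways to write the letter A (U+0041) inside a str literal.
a_hex = "\x41"          # \x + 2 hex digits
a_u4 = "\u0041"         # \u + 4 hex digits
a_u8 = "\U00000041"     # \U + 8 hex digits
print(a_hex == a_u4 == a_u8 == "A")   # True

# ord() returns the code point, i.e. the number written after U+.
print(hex(ord("A")))    # 0x41

# Code points above U+FFFF do not fit in \u, so \U is needed.
emoji = "\U0001F600"
print(len(emoji))                 # 1  (one code point)
print(emoji.encode("utf-8"))      # b'\xf0\x9f\x98\x80'  (four bytes in UTF-8)

# In a bytes literal, \x denotes a single byte value.
raw = b"\x41\x42"
print(raw)              # b'AB'
print(list(raw))        # [65, 66]
```

Note that \u and \U are meaningful only in str literals; a bytes literal works with \x (or plain ASCII characters) instead.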

More details

This post was a quick summary, but the topic goes beyond string prefixes: it is tightly related to character encoding. If you are interested in learning more, please check the full article here.

Thanks for visiting.
