Why does bytes(n) create zeroes instead of binary?

sakshikuchroo · December 3, 2024, 6:30pm

Why does `bytes(n)` create a length `n` byte string instead of converting `n` to a binary representation?

I was trying to create a bytes object in Python 3 with the value b'3\r\n', but I encountered some unexpected behavior:

>>> bytes(3) + b'\r\n'
b'\x00\x00\x00\r\n'

Apparently:

>>> bytes(10)
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

I couldn’t find an explanation for this behavior in the documentation, so I looked further and found some surprised comments in a Python issue about adding formatting to bytes (see also Python 3 bytes formatting):

Python Issue 3982

This behavior seems even more confusing when bytes(int) returns zeroes.

Ideally, I would prefer that bytes(int) return the ASCII representation of that integer, but honestly even an error would be better than the current behavior. If I wanted zero-filling, I’d expect it to be a class method, like bytes.zeroes(n).

Can someone explain the reasoning behind this behavior, and why bytes(n) works this way when converting an int to bytes in Python?

joe-elmoufak · December 3, 2024, 6:33pm

Hey All! and @sakshikuchroo

Here is your detailed answer:

Using bytes([n]) for a Single-Byte Value

Hey, if you’re trying to convert an integer into a byte, note that bytes(n) creates a byte string of length n filled with zeroes. Instead, pass a list containing the integer: bytes([n]) produces a single byte whose value is n (n must be in the range 0 to 255). Keep in mind that this is the raw byte value, not the ASCII digit: bytes([3]) is b'\x03', not b'3'.

For example:

# Convert integer to its byte representation
byte_value = bytes([3])  # This converts the integer 3 to a byte value
print(byte_value)  # Output: b'\x03'

So, this method gives you the byte representation directly and aligns well with Python’s way of handling integer-to-byte conversions.
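To make the difference concrete, here is a small sketch comparing the constructor forms with two ways of getting the ASCII digits that the original question asked for (b'3\r\n'). The variable names are only for illustration, and the %-formatting line assumes Python 3.5 or newer.

```python
n = 3

# bytes(int) allocates n zero bytes; it does not encode the number itself.
zero_filled = bytes(n)                  # b'\x00\x00\x00'

# bytes(iterable_of_ints) treats each int (0..255) as one raw byte value.
single_byte = bytes([n])                # b'\x03'

# ASCII digits of the number: go through str, then encode.
ascii_digits = str(n).encode("ascii")   # b'3'

# %-formatting on bytes (Python 3.5+) builds the whole line in one step.
formatted = b"%d\r\n" % n               # b'3\r\n'

print(zero_filled, single_byte, ascii_digits + b"\r\n", formatted)
```

So for the b'3\r\n' case, either str(n).encode("ascii") + b'\r\n' or the %-formatting form should do the job.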

dipen-soni · December 7, 2024, 7:01am

@sakshikuchroo @joe-elmoufak

Using struct.pack for Binary Representation

Ah, I see where you’re going with this. If you need a specific binary layout (a fixed size, signedness, or byte order), I suggest using the struct module. It converts the integer into a bytes object according to an explicit format string. For example:

import struct

# Convert integer to binary representation as bytes
binary_bytes = struct.pack('B', 3)  # 'B' stands for unsigned char (1 byte)
print(binary_bytes)  # Output: b'\x03'

With struct.pack(), you can handle binary data more precisely, which is especially useful when you need to control the exact format of your byte string.
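As a rough sketch of what "controlling the exact format" can look like, the example below packs a value that does not fit in one byte, using explicit sizes and byte orders. The value 1000 is just an illustrative choice, and int.to_bytes is shown as a built-in alternative rather than something the posts above used.

```python
import struct

value = 1000  # illustrative value; too large for a single byte

# 'B' only accepts 0..255, so packing 1000 with it raises struct.error.
try:
    struct.pack("B", value)
    overflow_raised = False
except struct.error:
    overflow_raised = True

# '>H' = big-endian unsigned short (2 bytes)
big_endian_2 = struct.pack(">H", value)      # b'\x03\xe8'

# '<I' = little-endian unsigned int (4 bytes)
little_endian_4 = struct.pack("<I", value)   # b'\xe8\x03\x00\x00'

# int.to_bytes gives the same result as '>H' without importing struct.
via_to_bytes = value.to_bytes(2, byteorder="big")

print(overflow_raised, big_endian_2, little_endian_4, via_to_bytes)
```

struct tends to be handier when packing several fields at once, while int.to_bytes is enough for a single integer.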

charity-majors · December 9, 2024, 7:05am

Hello @sakshikuchroo,

Here is your answer:

Using chr and encode for ASCII Conversion

Great points already! Now, let’s talk about using chr and encode: chr(n) converts the integer into the character with that code point, and encode turns that character into a byte string. This method is useful if you want to work with the character itself first and then convert it to bytes. Here’s how you do it:

# Convert integer to ASCII character and then to bytes
ascii_bytes = chr(3).encode('latin-1')  # Convert integer to ASCII byte string
print(ascii_bytes)  # Output: b'\x03'

This approach first converts the integer into the ASCII character it corresponds to, then encodes it into a byte string. It’s a neat way to handle integer-to-byte conversions when you want the character representation.
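One caveat worth sketching: the result depends on the encoding once the integer goes above 127. The helper name int_to_char_bytes below is hypothetical, made up for this illustration.

```python
def int_to_char_bytes(n, encoding="latin-1"):
    """Hypothetical helper: integer -> character -> encoded bytes."""
    return chr(n).encode(encoding)

# Below 128, latin-1, ascii and utf-8 all agree.
low = int_to_char_bytes(65)                  # b'A'

# From 128 to 255, latin-1 still yields exactly one byte...
high_latin1 = int_to_char_bytes(200)         # b'\xc8'

# ...but utf-8 yields two bytes for the same character.
high_utf8 = int_to_char_bytes(200, "utf-8")  # b'\xc3\x88'

# The ascii codec rejects anything above 127.
try:
    int_to_char_bytes(200, "ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(low, high_latin1, high_utf8, ascii_failed)
```

With latin-1 the result matches bytes([n]) for every n from 0 to 255, which is presumably why that encoding was chosen in the snippet above.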
