When stdout is a terminal, sys.stdout.encoding is set from the locale, but when stdout is a file, sys.stdout.encoding is None.
I agree that this is annoying, but I disagree that it's stupid.
The right behaviour would be for Python to use the correct encoding for the file when stdout is a file, whatever that encoding is. Unfortunately, Python has no way to find out: there's no standard operating system API or protocol for discovering a file's encoding. In particular, there's no good reason to expect the file's encoding to be the same as the one used by your terminal. So I agree with Python that the only sensible thing to do is to default to None for a file encoding.
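If you want to see which case you are in, something like the following might help. It is only a sketch: the helper name stdout_encoding_info is made up for illustration, and it relies on nothing beyond the stream's isatty() method and encoding attribute.

```python
import sys

def stdout_encoding_info(stream=None):
    """Return (is_terminal, encoding) for a stream.

    The helper name is hypothetical. The encoding may come back as None,
    which is what Python 2 reports when the stream is a plain file.
    """
    if stream is None:
        stream = sys.stdout
    # Not every file-like object provides isatty(), so look it up defensively.
    isatty = getattr(stream, "isatty", None)
    is_terminal = bool(isatty()) if callable(isatty) else False
    # A missing encoding attribute is treated the same as None.
    encoding = getattr(stream, "encoding", None)
    return is_terminal, encoding

if __name__ == "__main__":
    # Report on stderr so the result is visible even when stdout is redirected.
    sys.stderr.write(repr(stdout_encoding_info()) + "\n")
```

Run it once directly and once with stdout redirected to a file to compare the two results.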
Note that you can write bytes to the file without error; it's only characters that need to be encoded, and with no encoding set Python falls back to the ASCII codec:
$ python >/dev/null
Python 2.5.1 (r251:54869, Apr 18 2007, 22:08:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5367)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> print '\xa9' # string of bytes, writes successfully
>>> print u'\xa9' # string of characters, need to be encoded before writing
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in position 0: ordinal not in range(128)
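Since Python can't guess the file's encoding, the usual way out is to choose one yourself and encode the characters to bytes before writing. A minimal sketch, assuming UTF-8 is an acceptable fallback for your file (that choice is mine, not something Python or the operating system can tell you); the helper name encode_for_stream is hypothetical:

```python
def encode_for_stream(text, stream_encoding, fallback="utf-8"):
    """Encode text to bytes using the stream's encoding if it has one.

    stream_encoding is whatever sys.stdout.encoding reported and may be None.
    The fallback of UTF-8 is an assumption; pick whatever your file needs.
    """
    encoding = stream_encoding or fallback
    return text.encode(encoding)

# With no stream encoding, the copyright sign becomes two UTF-8 bytes.
data = encode_for_stream(u"\xa9", None)
```

In the session above, printing the result of encode_for_stream(u'\xa9', sys.stdout.encoding) would succeed, because by then it is a string of bytes rather than a string of characters.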
Date: 2007-12-15 10:10 pm (UTC)