How can I make sure my Python scripts correctly encode strings as UTF-8 when they contain special characters (like é or è)?
I am creating a string in Python that I save in a file, which contains a lot of data, including directory names and filenames. I want to keep everything in UTF-8 since I will save it in MySQL later. My MySQL database is also set to UTF-8, but I am encountering issues with some characters (like é or è), even though the string in the file is displayed correctly.
Here’s my script:
#!/usr/bin/python
# -*- coding: utf-8 -*-
def createIndex():
    import codecs
    toUtf8 = codecs.getencoder('UTF8')
    # lot of operations & building indexSTR the string that matters
    findex = open('config/index/music_vibration_' + date + '.index', 'a')
    findex.write(codecs.BOM_UTF8)
    findex.write(toUtf8(indexSTR)) # This throws an error!
When I run this script, I encounter the following error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 2171: ordinal not in range(128)
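A possible reading of this traceback, assuming the script runs under Python 2 and indexSTR is a byte string that already holds UTF-8 data: asking the encoder to encode a byte string makes Python first decode it with the default ASCII codec, which fails on byte 0xc3 (the first byte of é in UTF-8). A minimal sketch of one way around this is to open the file with an explicit encoding and write text to it. The helper name append_index and the demo path below are hypothetical, not part of the original script.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def append_index(path, index_text):
    """Append index_text to path, stored on disk as UTF-8 (hypothetical helper)."""
    # Assumption: if the text arrives as raw bytes, those bytes are UTF-8.
    # Decoding explicitly avoids the implicit ASCII decode seen in the traceback.
    if isinstance(index_text, bytes):
        index_text = index_text.decode('utf-8')
    # io.open encodes on write, so no manual encoder call is needed.
    with io.open(path, 'a', encoding='utf-8') as findex:
        findex.write(index_text)


# Usage: write one text string and one UTF-8 byte string, then read both back.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.index')
append_index(demo_path, u'#artiste[:::]Céline\n')
append_index(demo_path, u'#artiste[:::]Mylène\n'.encode('utf-8'))
with io.open(demo_path, encoding='utf-8') as f:
    demo_lines = f.read().splitlines()
```

The sketch leaves out the BOM. UTF-8 does not need one, and writing codecs.BOM_UTF8 on every append would place a BOM in the middle of the file after the first run.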
I noticed that after creating this file, I can read it and write it into MySQL, but I face issues with encoding. My MySQL database is set to utf8 (as confirmed by the SQL query SHOW VARIABLES LIKE 'char%', which only returns utf8 or binary).
Here’s the MySQL code I’m using:
#!/usr/bin/python
# -*- coding: utf-8 -*-
def saveIndex(index, date):
    import MySQLdb as mdb
    import codecs
    sql = mdb.connect('localhost', 'admin', '*******', 'music_vibration')
    sql.charset = "utf8"
    findex = open('config/index/' + index, 'r')
    lines = findex.readlines()
    for line in lines:
        if line.find('#artiste') != -1:
            artiste = line.split('[:::]')
            artiste = artiste[1].replace('\n', '')
            c = sql.cursor()
            c.execute('SELECT COUNT(id) AS nbr FROM artistes WHERE nom="' + artiste + '"')
            nbr = c.fetchone()
            if nbr[0] == 0:
                c = sql.cursor()
                iArt += 1
                c.execute('INSERT INTO artistes(nom, status, path) VALUES("' + artiste + '", 99, "' + artiste + '/")'.encode('utf8'))
Even though the artiste string is correctly displayed in the file, it is being written incorrectly into the MySQL database. What might be causing this issue?
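For reference, here is a hedged sketch of how the read-and-insert step could look if the connection itself is told to use UTF-8. It assumes that assigning sql.charset after connecting does not change the connection encoding, and that passing charset='utf8' and use_unicode=True to MySQLdb.connect does. The function names read_artists and insert_artist are hypothetical, and the lines are assumed to have been read with io.open(..., encoding='utf-8') so they are already text.

```python
# -*- coding: utf-8 -*-

# Assumed connection setup (not executed here, needs a running MySQL server):
#   sql = mdb.connect('localhost', 'admin', '*******', 'music_vibration',
#                     charset='utf8', use_unicode=True)


def read_artists(lines):
    """Return the artist names found in '#artiste[:::]name' index lines."""
    artists = []
    for line in lines:
        if '#artiste' in line:
            parts = line.split('[:::]')
            if len(parts) > 1:
                artists.append(parts[1].rstrip('\n'))
    return artists


def insert_artist(cursor, artiste):
    """Insert one artist through a DB-API cursor using %s placeholders."""
    # With placeholders, the driver handles quoting and encoding of the values,
    # so no string concatenation or manual .encode('utf8') is involved.
    cursor.execute(
        'INSERT INTO artistes(nom, status, path) VALUES(%s, %s, %s)',
        (artiste, 99, artiste + '/'),
    )
```

In the original query, .encode('utf8') binds only to the last string literal ('/")') because of operator precedence, so the concatenated statement as a whole is never encoded. Passing the values as parameters sidesteps that.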