Guide January 28, 2026 • 10 min read

CSV Encoding Explained: UTF-8 vs Windows-1252 vs ISO-8859-1 (2026)

Garbled characters in a CSV (Ã¤ instead of ä, Ã© instead of é) usually mean an encoding mismatch. This guide explains UTF-8, Windows-1252, and ISO-8859-1, how to detect the encoding of a file, and how to fix it in Excel, Python, and JavaScript.

Table of Contents

  1. Why Encoding Matters for CSV
  2. UTF-8: The Modern Standard
  3. Windows-1252 (CP1252)
  4. ISO-8859-1 (Latin-1)
  5. How to Detect and Fix Encoding
  6. Code Examples: Python & JavaScript
  7. Best Practices

CSV files are just text. The bytes that represent "Müller" or "Zürich" depend on the character encoding. If you open a UTF-8 CSV in a tool that assumes Windows-1252, you get mojibake—wrong characters. Here’s how to avoid and fix it.

1. Why Encoding Matters for CSV

Every CSV is a sequence of bytes. The same byte value can mean different characters in different encodings. For example, the bytes C3 A4 in UTF-8 are the character ä; in ISO-8859-1, E4 is ä. If your app or Excel assumes the wrong encoding, you see or ä instead of ä. For more on CSV structure, see What is a CSV File?.

Typical symptom: You export from a German/French/Spanish system, open in Excel or another tool, and names or places show wrong characters (e.g. Ã¶ instead of ö). That’s almost always encoding.

2. UTF-8: The Modern Standard

UTF-8 can represent every character in Unicode. It uses 1 byte for ASCII (a–z, 0–9) and 2–4 bytes for accented letters, umlauts, and symbols. UTF-8 is the default for the web, modern APIs, and most databases. For CSV, use UTF-8 whenever possible and save with a BOM (Byte Order Mark: EF BB BF) if you need Excel on Windows to open it correctly without asking.

Example (UTF-8):

name,city
Müller,Zürich
François,Paris
Niño,Madrid
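To make the byte counts and the BOM concrete, here is a short Python sketch. The sample characters are arbitrary picks; `utf-8-sig` is Python's name for "UTF-8 with BOM".

```python
import codecs

# Number of UTF-8 bytes per character: ASCII takes 1, everything else 2 to 4.
samples = ["a", "ü", "€", "😀"]
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}
print(lengths)  # {'a': 1, 'ü': 2, '€': 3, '😀': 4}

# The 'utf-8-sig' codec prepends the BOM (EF BB BF) when encoding.
with_bom = "name,city\nMüller,Zürich\n".encode("utf-8-sig")
print(with_bom[:3] == codecs.BOM_UTF8)  # True

# It also strips the BOM when decoding, so the first header name stays clean.
decoded = with_bom.decode("utf-8-sig")
print(decoded.startswith("name"))  # True
```

Reading a BOM-prefixed file with plain `utf-8` instead would leave an invisible U+FEFF at the start of the first column name, which is a common source of "column not found" surprises.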

3. Windows-1252 (CP1252)

Windows-1252 (Code Page 1252) is the default in older Windows apps and Excel in many locales. It’s a superset of ISO-8859-1 with extra characters in the 0x80–0x9F range (e.g. smart quotes, Euro €). CSV exported from legacy Windows systems is often Windows-1252. If you open such a file as UTF-8, characters like ä, ö, ü can appear as two-character sequences (e.g. ä).

4. ISO-8859-1 (Latin-1)

ISO-8859-1 covers Western European languages with one byte per character (256 code points). It’s common in older European systems. It’s very similar to Windows-1252; the main difference is the 0x80–0x9F range. Many tools treat “Latin-1” and “CP1252” interchangeably for CSV, but strictly they’re not identical.
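The difference in the 0x80–0x9F range is easy to see in Python. This is a small sketch that uses three bytes chosen for illustration and relies on Python's own `cp1252` and `iso-8859-1` codecs:

```python
# Bytes 0x80-0x9F: printable in Windows-1252, C1 control codes in ISO-8859-1.
raw = bytes([0x80, 0x93, 0x94])  # in cp1252: euro sign, left and right curly quotes

as_cp1252 = raw.decode("cp1252")
as_latin1 = raw.decode("iso-8859-1")

print(as_cp1252)         # €“”
print(ascii(as_latin1))  # '\x80\x93\x94'

# Outside that range the two encodings agree, e.g. for ä, ö, ü.
umlauts = bytes([0xE4, 0xF6, 0xFC])
same = umlauts.decode("cp1252") == umlauts.decode("iso-8859-1")
print(same)  # True
```

Note that a few bytes in this range (such as 0x81) are unassigned in Windows-1252, and Python's `cp1252` codec raises an error on them rather than guessing.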

Encoding | Use case | BOM
--- | --- | ---
UTF-8 | Web, APIs, new systems; all languages | Optional (helps Excel)
Windows-1252 | Legacy Windows/Excel exports (Western Europe) | No
ISO-8859-1 | Older European systems | No

5. How to Detect and Fix Encoding

Detect: If you see patterns like ä (ä), ö (ö), é (é), the file is likely UTF-8 being read as Windows-1252 or Latin-1. The reverse (wrong characters when you expect Latin-1) means the file is Windows-1252/Latin-1 but opened as UTF-8. Use a validator or a small script to try both. neatcsv’s CSV Validator can help spot encoding issues; Clean CSV can normalize output to UTF-8.
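A "try both" script can be quite small. The sketch below is a heuristic, not a guaranteed detector: `guess_encoding` is a hypothetical helper name, and the Windows-1252 fallback is an assumption that suits Western European exports but not, say, Cyrillic or Japanese files.

```python
def guess_encoding(data: bytes) -> str:
    """Return a best-guess encoding name for raw CSV bytes (heuristic sketch)."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8-sig"  # UTF-8 with BOM
    try:
        data.decode("utf-8")  # strict mode: fails on bytes that are not valid UTF-8
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: legacy single-byte text is most likely Windows-1252.
        return "cp1252"


print(guess_encoding("Müller,Zürich".encode("utf-8")))      # utf-8
print(guess_encoding("Müller,Zürich".encode("cp1252")))     # cp1252
print(guess_encoding("Müller,Zürich".encode("utf-8-sig")))  # utf-8-sig
```

This works because text with accented letters saved as Windows-1252 or Latin-1 is almost never valid UTF-8, so a strict UTF-8 decode is a reliable first test. Pure ASCII files are reported as UTF-8, which is harmless since ASCII is identical in all three encodings.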

Fix in Excel: Use “Data → From Text/CSV”, choose the file, then in the import wizard select the correct “File origin” (e.g. “65001: Unicode (UTF-8)” or “1252: Western European”) and load. Save As → CSV UTF-8 (Comma delimited) to export as UTF-8 with BOM. More in How to Open CSV in Excel.

6. Code Examples: Python & JavaScript

Reading CSV with the right encoding and writing UTF-8 keeps data consistent.

Python

# Read Windows-1252 CSV, write UTF-8
with open('input.csv', encoding='cp1252') as f:
    text = f.read()
with open('output.csv', 'w', encoding='utf-8-sig', newline='') as f:
    f.write(text)  # or use csv.reader/writer for structured data

JavaScript (browser / Node)

// File API: text() always decodes the file as UTF-8
const utf8Text = await file.text();
// If you have bytes from a legacy system, use TextDecoder
const decoder = new TextDecoder('windows-1252');
const legacyText = decoder.decode(uint8Array); // uint8Array: the raw file bytes

For full CSV parsing (headers, commas, quotes), use a library like Papa Parse (JS) or the built-in csv module (Python) and always pass the correct encoding when reading.
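For structured data, the same conversion can go through the csv module so that quoted fields and embedded commas survive intact. This is a minimal sketch: `convert_csv` and its parameters are hypothetical names, and the `cp1252` default is an assumption about the source file.

```python
import csv


def convert_csv(src_path, dst_path, src_encoding="cp1252"):
    """Re-encode a CSV to UTF-8 with BOM, row by row, and return the row count."""
    # newline="" lets the csv module handle line endings inside quoted fields.
    with open(src_path, encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8-sig", newline="") as dst:
        writer = csv.writer(dst)
        count = 0
        for row in csv.reader(src):
            writer.writerow(row)
            count += 1
    return count
```

Calling `convert_csv("input.csv", "output.csv")` would then produce a file that Excel on Windows opens as UTF-8. Processing row by row also keeps memory use flat for large files.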

7. Best Practices

For more on fixing broken CSV (columns, delimiters, dates), see 10 Common CSV Errors.
