Encoding issue #207

Open
marcelo2605 opened this issue Aug 26, 2021 · 1 comment

Comments

@marcelo2605

The file that I'm trying to parse was encoded using Windows-1252. I confirmed this using mb_check_encoding():

$csv = new \ParseCsv\Csv();
$csv->file_data = file_get_contents($file);
$detected_encoding = mb_check_encoding($csv->file_data, 'Windows-1252'); // returns true
$csv->encoding('Windows-1252');
$csv->auto();
$array = $csv->data;

But the content of the array still has broken characters (�).

@gogowitsch
Member

@marcelo2605 Could you attach a version of the problematic file? Feel free to remove/overwrite all confidential information.

You call the encoding() function with one parameter, which means the output character set will be 'ISO-8859-1'. Maybe the broken characters cannot be represented in ISO-8859-1? Are you sure you need/want to convert to 'ISO-8859-1'?

Depending on where the data should go, 'UTF-8' might be a good choice for the second parameter of encoding().
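
As a rough illustration of why the output character set matters, here is a minimal sketch that relies only on PHP's built-in mbstring functions. The sample bytes are made up for the example and are not taken from the reporter's file. The guarded ParseCsv call at the end simply passes 'UTF-8' as the second argument to encoding(), as suggested above; whether that alone fixes the reported file has not been confirmed in this thread.

```php
<?php
// Hypothetical sample: "café €" as Windows-1252 bytes (é = 0xE9, € = 0x80).
$raw = "caf\xE9 \x80";

// The euro sign (0x80 in Windows-1252) has no equivalent in ISO-8859-1,
// so UTF-8 is a safer conversion target for this input.
$utf8 = mb_convert_encoding($raw, 'UTF-8', 'Windows-1252');

var_dump(mb_check_encoding($utf8, 'UTF-8')); // bool(true)
echo bin2hex($utf8), PHP_EOL;                // 636166c3a920e282ac

// With ParseCsv, the equivalent request would be the two-argument call
// mentioned above. Guarded so the sketch also runs without the library.
if (class_exists('\ParseCsv\Csv')) {
    $csv = new \ParseCsv\Csv();
    $csv->encoding('Windows-1252', 'UTF-8'); // input charset, output charset
}
```

If the converted string still shows � when displayed, the place where it is printed (terminal, HTML page, database connection) may be expecting a different character set than UTF-8.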
