I need to export .csv files from ORIGAM, encoded as UTF-8 without a BOM (see Byte order mark - Simple English Wikipedia, the free encyclopedia). Is there any way to do this?
What do you use to save the files currently?
I tried putting UTF8 in the “Encoding” property. I also tried leaving this property empty (which should also mean UTF-8). In both cases the file ended up encoded as UTF-8 with a BOM.
My question was – which method do you use to save the files?
But I guess it is FileSystemService.SaveText. Then the problem lies in that method.
To skip the BOM, one would have to construct the encoding like this:
Encoding utf8WithoutBom = new UTF8Encoding(false);
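To illustrate the difference, here is a minimal sketch that writes the same text with both encodings and returns the raw bytes. The BomDemo class and its WriteAndRead helper are made up for this example and are not part of ORIGAM; only Encoding.UTF8, UTF8Encoding, and the File methods are real .NET APIs.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not part of ORIGAM.
public static class BomDemo
{
    // Writes the text to a temporary file with the given encoding
    // and returns the raw bytes that ended up on disk.
    public static byte[] WriteAndRead(string text, Encoding encoding)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text, encoding);
            return File.ReadAllBytes(path);
        }
        finally
        {
            File.Delete(path);
        }
    }

    // True when the bytes start with the UTF-8 BOM (EF BB BF).
    public static bool StartsWithUtf8Bom(byte[] bytes)
    {
        return bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }
}
```

With Encoding.UTF8 the file should start with the three BOM bytes, while with new UTF8Encoding(false) it should contain only the encoded text.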
The problem is how we should decide when to do this. Either we would have to parse the encoding parameter for something like utf-8-NoBOM, or we would have to introduce a new parameter to the method, e.g. EmitBOM with default true.
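Both options could be combined in one place. The following is only a rough sketch of how that might look, not the actual ORIGAM code: the EncodingResolver class, the Resolve method, and the emitBom parameter are assumptions, and utf-8-NoBOM is just the marker name suggested above.

```csharp
using System;
using System.Text;

// Hypothetical sketch; neither the class nor the marker name exists in ORIGAM today.
public static class EncodingResolver
{
    // Marker name taken from the discussion above (an assumption, not a .NET encoding name).
    private const string Utf8NoBomName = "utf-8-NoBOM";

    public static Encoding Resolve(string encodingName, bool emitBom = true)
    {
        // An empty value means UTF-8; the BOM follows the emitBom flag.
        if (string.IsNullOrWhiteSpace(encodingName))
        {
            return new UTF8Encoding(emitBom);
        }

        // The special marker always yields UTF-8 without a BOM.
        if (string.Equals(encodingName.Trim(), Utf8NoBomName,
                StringComparison.OrdinalIgnoreCase))
        {
            return new UTF8Encoding(false);
        }

        Encoding encoding = Encoding.GetEncoding(encodingName);

        // 65001 is the UTF-8 code page; rebuild it so the flag is honored.
        return encoding.CodePage == 65001
            ? new UTF8Encoding(emitBom)
            : encoding;
    }
}
```

Keeping emitBom at true by default would preserve the current behavior for existing models, so only exports that explicitly ask for no BOM would change.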
But the answer to the original question is – we cannot do it now. The only real option for you would be to use a command line to remove the BOM in a second step.
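If a small post-processing program is easier to run than a shell command, the second step could look roughly like this. It is a sketch under the assumption that the exported file fits in memory; the BomStripper class and the StripUtf8Bom method are made up for this example.

```csharp
using System;
using System.IO;

// Hypothetical post-processing helper; not part of ORIGAM.
public static class BomStripper
{
    // Removes a leading UTF-8 BOM (EF BB BF) from the file, if present.
    // Returns true when the file was rewritten, false when it had no BOM.
    public static bool StripUtf8Bom(string path)
    {
        // Reads the whole file into memory, so this suits moderately sized exports.
        byte[] bytes = File.ReadAllBytes(path);

        bool hasBom = bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
        if (!hasBom)
        {
            return false;
        }

        // Rewrite the file with everything after the first three bytes.
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            stream.Write(bytes, 3, bytes.Length - 3);
        }
        return true;
    }
}
```

Running it on a file that has no BOM leaves the file untouched, so it is safe to apply to every export.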
We could also introduce a breaking change to always skip the BOM, with a potential configuration fallback to the current behavior.
As discussed here, a BOM is not preferred with UTF-8 files.
UTF-8 can be written with or without a BOM; it is optional. PowerShell, for example, accepts utf8NoBOM as a value of its -Encoding parameter. If you implement this, that is just fine.
Well, I would not change the default. If there is a new possibility to have something like a “UTF8NoBOM” switch, it will work just fine. In general a BOM does not hurt anyone or anything, but I need this because some imports define the CSV structure as UTF-8 without a BOM.