Records combined around end-of-line characters #40

Open
onyxfish opened this issue Oct 31, 2016 · 0 comments

Via Chris Wright: "Finally, record counts in general should be checked when receiving and loading data. Excel gives people the option to add new lines within cells; each one is stored as a Line Feed (LF) character (at least under Windows, where I work). Some applications reading the file will treat everything after that character as a new record, potentially resulting in data being loaded into the wrong columns if you're loading into a database. Another fun trick is when you end up with an end-of-file character embedded in a text string. I've yet to work out how on earth these end up in the files (it's happened to me maybe 3 times over the last 5 years), but they essentially tell the process reading the data that it has reached the end of the file and should stop reading there. The ASCII code for it resolves to CTRL+Z, so my current working theory is that the source system is capturing people undoing a typo. I've never been able to replicate this, though. In both cases, knowing up front how many records you are expecting, and counting the number of records you've loaded into your working system, catches these problems."
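The check described in the quote could be sketched roughly as follows. This is only an illustration, not a method taken from the issue: the function names `count_records` and `check_load` are hypothetical, the data is assumed to be CSV with a header row, and the expected record count is assumed to come from the data provider. It uses Python's standard `csv` module, which keeps quoted line breaks inside a cell, and it looks for ASCII 26 (the character produced by CTRL+Z) explicitly.

```python
import csv
import io

EOF_CHAR = "\x1a"  # ASCII 26 (SUB), the character produced by CTRL+Z


def count_records(text, has_header=True):
    """Count CSV records, treating quoted line breaks as part of a cell."""
    # newline="" stops StringIO from translating line endings before csv sees them.
    reader = csv.reader(io.StringIO(text, newline=""))
    n = sum(1 for _ in reader)
    return n - 1 if has_header and n else n


def check_load(text, expected_records):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if EOF_CHAR in text:
        problems.append(f"embedded Ctrl+Z at offset {text.index(EOF_CHAR)}")
    actual = count_records(text)
    if actual != expected_records:
        problems.append(f"expected {expected_records} records, parsed {actual}")
    return problems


# A cell containing a line feed, as Excel writes it when a user adds a new line.
sample = 'id,note\r\n1,"first line\nsecond line"\r\n2,plain\r\n'

# Splitting on line breaks overcounts, because the cell's LF looks like a record end.
naive_count = len(sample.splitlines()) - 1

print(naive_count, count_records(sample), check_load(sample, 2))
```

Comparing the naive line count with the parsed record count shows the first failure mode in the quote. The same comparison against the count reported by the source catches truncation in tools that stop reading at CTRL+Z.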
