spreadsheet values #355
Replies: 6 comments 3 replies
-
After some Googling, it looks like the solution would be in how we open .txt files in Excel -- it's converting plain text into dates upon opening the file, not saving it. Clicking and dragging a .txt file into Excel will automatically convert plain text into numbers/dates wherever it can, so this should not be how people open our .txt to work on them. If a .txt file is opened through File > Open, Excel should open up a Text Import Wizard. In Step 3 of the Wizard, select all columns in the Data Preview, and under Column data format, select Text, then Finish. This will import all cells as plain text, and will avoid autoformatting MARC fields, character positions, etc. I haven't tested out what Excel does if you change the default application for .txt to Excel. |
Beta Was this translation helpful? Give feedback.
-
Here's an attempt at an explanation, if it matters: |
Beta Was this translation helpful? Give feedback.
-
@gerontakos and @lake44me volunteered to follow up on this before 202-05-04 |
Beta Was this translation helpful? Give feedback.
-
This Microsoft discussion is about a related problem - quote marks appearing when text is copied and pasted from one cell to another. It may be that I have unwittingly acquired a set of quote marks by doing this, and they just continued to be propagated. The relevant insights are that Excel is adding quotes if the text contains specific non-printable characters. "I found that Excel adds the quotes only to strings containing the non-printable characters 10 (line feed), 13 (carriage return) and 9 (tab). All other non-printable characters, such as 12 (form feed) or 30 (record separator) do NOT cause Excel to add quotes." So, the reason this seems to specifically happen to certain columns is those are the columns where long explanatory notes and examples would be entered, and we're trying to format them. The Microsoft person trying to provide an answer did not know why Excel was doing this. They suggested pasting copied text first into Word, before copying it back into another cell. Another article https://excel.tips.net/T002237_Getting_Rid_of_Extra_Quote_Marks_in_Exported_Text_Files.html (not Microsoft, but probably knowledgeable) states that "Sometimes the export filters used by Excel can produce undesired results in the exported file. Create bts; sitemask = "0110"; pcmlink = 40 This, however, is how the text in the cell is exported by Excel: "Create bts; sitemask = ""0110""; pcmlink = 40" Notice that Excel adds extra quotation marks, first around the entire cell contents, and then an extra set around any previously "quoted" text within the cell." So having quotation marks around, or within the text of a cell is probably also a trigger to acquire an extra layer of quotation marks when Excel re-converts from Excel to tab-delimited format as the file is being re-saved. Can we make our notes and examples understandable without using line feeds, the other 2 characters, any kinds of quote marks, or any other trigger we come across? Or, should we pursue workarounds? I'm not sure about VBA scripts, etc. Laura "I blame Microsoft" |
Beta Was this translation helpful? Give feedback.
-
Leaving the MARCFIELDLABEL column as "General" data type makes no difference in the addition of quote marks around the field content. They keep piling up if certain character(s) are contained in the text. |
Beta Was this translation helpful? Give feedback.
-
It looks, after all Laura’s work, that the double-quote problem is largely solved. First of all, when we open/import, we should always open our .txt tab-delimited files with text qualifier = " . The text qualifier is our friend. It is used to wrap cell values that contain either a tab, CRLF, a comma, or quotation marks as part of the value. The behavior of cells with values containing a CRLF or a tab is difficult to discern. It seems best we avoid using CRLF or tab inside cells. Commas however seem fine to use; our values just get wrapped in double-quotes. Similarly, if we want to put any part of our values in quotes, the full cell value will get wrapped in quotes. In this case, there’s a complication: the text delimiter is equivalent to the quotes that are part of the cell value. The solution: escape the internal double quotes with … double quotes! So the cell value that opens in Excel with text delimiter = " that displays as: This is the root of our problem. We’re opening with text delimiter = {none} but when Excel closes a .txt file, it adds text delimiters. So we open with text delimiter = {none} and we see: The quotation marks proliferate each time we save using a .txt file with text delimiter = {none}. If that sounds correct to everyone, we can write it into the workflow somehow. Then we can test it in practice and make adjustments as needed. As for the already-existing double-quotes: I don't think it will be too difficult, somewhere down the line (downstream!) to eliminate them. We can use others' macros, write our own, or just find and replace using a text editor. So even if they're there, it doesn't seem like a big deal. It certainly should not stop us from moving forward. |
Beta Was this translation helpful? Give feedback.
-
spreadsheet values in character positions (and elsewhere) are doing weird things, trying to turn into dates. need to figure out how to make them sit still in a .txt file. Punctuation?
Beta Was this translation helpful? Give feedback.
All reactions