-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Custom invalid characters (#57) #58
Changes from all commits
1ca1541
bb9016f
6f49e84
6abc74c
41dc758
5251bfb
91a1fb5
038971a
6f57187
07704fd
da93854
e4b609b
ea2bd1e
057e3e9
7365cdb
020a4ee
99cde11
0dd8b47
6ec3383
8843c20
05a7f43
4ba9696
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -5,6 +5,7 @@ | |||||
* Replaces characters in strings that are illegal/unsafe for filenames. | ||||||
* Unsafe characters are either removed or replaced by a substitute set | ||||||
* in the optional `options` object. | ||||||
* Additionally, more invalid characters can be passed using the `additionalInvalidStrings` attribute of the optional `options` object. | ||||||
* | ||||||
* Illegal Characters on Various Operating Systems | ||||||
* / ? < > \ : * | " | ||||||
|
@@ -24,7 +25,7 @@ | |||||
* http://unix.stackexchange.com/questions/32795/what-is-the-maximum-allowed-filename-and-folder-size-with-ecryptfs | ||||||
* | ||||||
* @param {String} input Original filename | ||||||
* @param {Object} options {replacement: String | Function } | ||||||
* @param {Object} options {replacement: String | Function, additionalInvalidStrings: String[] } | ||||||
* @return {String} Sanitized filename | ||||||
*/ | ||||||
|
||||||
|
@@ -36,11 +37,43 @@ var reservedRe = /^\.+$/; | |||||
var windowsReservedRe = /^(con|prn|aux|nul|com[0-9]|lpt[0-9])(\..*)?$/i; | ||||||
var windowsTrailingRe = /[\. ]+$/; | ||||||
|
||||||
function sanitize(input, replacement) { | ||||||
function containsInvalids(input, invalids) { | ||||||
for (var i = 0; i < invalids.length; i++) { | ||||||
if (input.indexOf(invalids[i]) !== -1) { | ||||||
return true; | ||||||
} | ||||||
} | ||||||
return false; | ||||||
} | ||||||
|
||||||
function sanitize(input, replacement, invalids) { | ||||||
if (typeof input !== 'string') { | ||||||
throw new Error('Input must be string'); | ||||||
} | ||||||
var sanitized = input | ||||||
if (!(typeof replacement === 'string' || typeof replacement === 'function')) { | ||||||
throw new Error("Replacement must be a string or a function returning a string!"); | ||||||
} | ||||||
if (typeof replacement === 'string' && invalids.indexOf(replacement) !== -1) { | ||||||
throw new Error("The replacement string can't be part of options.additionalInvalidStrings or contain substrings which are part of options.additionalInvalidStrings!"); | ||||||
} | ||||||
var sanitized = input; | ||||||
var counter = 0; | ||||||
while (containsInvalids(sanitized, invalids)) { | ||||||
Chaphasilor marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
for (var i = 0; i < invalids.length; i++) { | ||||||
if (typeof replacement === 'function') { | ||||||
var tempReplacement = replacement(invalids[i]); | ||||||
Chaphasilor marked this conversation as resolved.
Show resolved
Hide resolved
|
||||||
sanitized = sanitized.split(invalids[i]).join(tempReplacement); | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this will perform better:
Suggested change
Or what's the reason behind using split/join? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The replace function uses a RegExp to find the parts to be replaced. Passing for example I used a RegExp for this at first as well, but had to change it for this reason. Not sure if it will work when simply passing a string though? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can pass the strings directly to console.log('a|a'.replace('|', ''));
// => 'aa' There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. From MDN:
Actually, this isn't the problem. The test that's failing is about nested invalid strings. So we'd need to implement an inner loop to make sure all invalid strings on the same 'level' will be removed in the current iteration, or else the output will be something unexpected. Not invalid, but unexpected. Thoughts? @parshap @joelmukuthu There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Haha, I'm not even sure I fully understood that failure but if all else fails we can document our way out of it. But frankly, with all the issues coming up, I'm having a hard time backing the string-array implementation. Not that I'm shying away from hard work, but rather because I think the right way to implement this feature is by allowing users to pass in a regex. I've thought about it some more and I don't think we'd be introducing an extra attack vector. Calling it out in the documentation is also good enough in my opinion. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about we add both? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I understand the motivation to keeping it simple, for me it just doesn't equate to the cost of maintaining it -- having both implementations would be an even greater cost. I don't want to be a blocker for this, @parshap what do you think? |
||||||
} else { | ||||||
sanitized = sanitized.split(invalids[i]).join(replacement); | ||||||
} | ||||||
} | ||||||
if (counter > 1000) { | ||||||
throw new Error("Illeagal replacement function. Make sure that replacements generated by your function do not contain any of the strings specified in options.additionalInvalidStrings!"); | ||||||
} else { | ||||||
counter++; | ||||||
} | ||||||
} | ||||||
sanitized = sanitized | ||||||
.replace(illegalRe, replacement) | ||||||
.replace(controlRe, replacement) | ||||||
.replace(reservedRe, replacement) | ||||||
|
@@ -51,9 +84,10 @@ function sanitize(input, replacement) { | |||||
|
||||||
module.exports = function (input, options) { | ||||||
var replacement = (options && options.replacement) || ''; | ||||||
var output = sanitize(input, replacement); | ||||||
var invalids = (options && options.additionalInvalidStrings) || []; | ||||||
var output = sanitize(input, replacement, invalids); | ||||||
if (replacement === '') { | ||||||
return output; | ||||||
} | ||||||
return sanitize(output, ''); | ||||||
return sanitize(output, '', []); | ||||||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice fix with moving the declaration outside of the
while
loop -- at first I thought that lgtm bot was crazy but I guess it was right.Do you have a test that validates this working to prevent that bug from regressing in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've downloaded eslint right after realizing what I've done lol. Embarrassing :D
I added the test
illegal replacement function
, currently in line 159 intest.js
. This should check if the exception is thrown :)