Code Scanning Alerts

Code Scanning Alerts

krutovdlkrutovdl Posts: 36Questions: 7Answers: 1

Before I say anything, I would like to make it clear that I agree 100% that all the findings I am showing are indeed false positives. You will not get an argument from me on that. However, the fact that the multi-billion-dollar company my team works for has become very concerned about security and the company does get the final say which library packages are permitted inside its applications. On Friday, my team was told to completely uninstall datatables off all our applications and yes, all our applications are now broken. Also notice that “replace, and match” functionalities are flagged culprits. So, I am submitting this request to see if datatables can address these concerns very soon. Here are some of the the issues:

1. Incomplete multi-character sanitization
wwwroot/lib/datatables/dist/js/jquery.dataTables.js:1515 <-- line number

return d.replace( _re_html, '' );

This string may still contain , which may cause an HTML element injection vulnerability.

Please see “REPLACE COMMENTS” section from scanning.

2. Incomplete multi-character sanitization
wwwroot/lib/datatables/dist/js/jquery.dataTables.js:5817 <-- line number

    s = s.replace( __re_html_remove, '' );

Please see “REPLACE COMMENTS” section from scanning.

3. Incomplete string escaping or encoding
wwwroot/lib/datatables/dist/js/jquery.dataTables.js:4470 <-- line number

    return word.replace('"', '');

Please see “REPLACE COMMENTS” section from scanning.

4. Polynomial regular expression used on uncontrolled data
wwwroot/lib/datatables.net/js/dataTables.js:4479 <-- line number

var parts = search.match( /!?["\u201C][^"\u201D]+["\u201D]|[^ ]+/g ) || [''];

Please see “MATCH COMMENTS” section from scanning.

REPLACE COMMENTS
Sanitizing untrusted input is a common technique for preventing injection attacks such as SQL injection or cross-site scripting. Usually, this is done by escaping meta-characters such as quotes in a domain-specific way so that they are treated as normal characters.

However, directly using the string replace method to perform escaping is notoriously error-prone. Common mistakes include only replacing the first occurrence of a meta-character, or backslash-escaping various meta-characters but not the backslash itself.

In the former case, later meta-characters are left undisturbed and can be used to subvert the sanitization. In the latter case, preceding a meta-character with a backslash leads to the backslash being escaped, but the meta-character appearing un-escaped, which again makes the sanitization ineffective.

Even if the escaped string is not used in a security-critical context, incomplete escaping may still have undesirable effects, such as badly rendered or confusing output.

Recommendation
Use a (well-tested) sanitization library if at all possible. These libraries are much more likely to handle corner cases correctly than a custom implementation.

An even safer alternative is to design the application so that sanitization is not needed, for instance by using prepared statements for SQL queries.

Otherwise, make sure to use a regular expression with the g flag to ensure that all occurrences are replaced, and remember to escape backslashes if applicable.

Note, however, that this is generally not sufficient for replacing multi-character strings: the String.prototype.replace method only performs one pass over the input string, and will not replace further instances of the string that result from earlier replacements.

For example, consider the code snippet s.replace(/\/..\//g, ""), which attempts to strip out all occurences of /../ from s. This will not work as expected: for the string /./.././, for example, it will remove the single occurrence of /../ in the middle, but the remainder of the string then becomes /../, which is another instance of the substring we were trying to remove.

MATCH COMMENTS
Some regular expressions take a long time to match certain input strings to the point where the time it takes to match a string of length n is proportional to nk or even 2n. Such regular expressions can negatively affect performance, or even allow a malicious user to perform a Denial of Service ("DoS") attack by crafting an expensive input string for the regular expression to match.

The regular expression engines provided by many popular JavaScript platforms use backtracking non-deterministic finite automata to implement regular expression matching. While this approach is space-efficient and allows supporting advanced features like capture groups, it is not time-efficient in general. The worst-case time complexity of such an automaton can be polynomial or even exponential, meaning that for strings of a certain shape, increasing the input length by ten characters may make the automaton about 1000 times slower.

Typically, a regular expression is affected by this problem if it contains a repetition of the form r* or r+ where the sub-expression r is ambiguous in the sense that it can match some string in multiple ways. More information about the precise circumstances can be found in the references.

Recommendation
Modify the regular expression to remove the ambiguity, or ensure that the strings matched with the regular expression are short enough that the time-complexity does not matter.

Answers

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    Hi,

    Sorry to hear that you've had to uninstall DataTables.

    1) What version of DataTables are you using please? I don't see that expression in the current 2.0.3, or the legacy 1.13.11. It might be in an older version?

    2) This is one from v1.x (which is now considered legacy with v2 now available) and v2 has introduced a new way to address this. By default it still does more or less the same thing, but it's HTML stripping is replaceable. So if you want to use an HTML sanitizing library, you can with the DataTable.util.stripHtml() method, and providing your own (from a library).

    3) I believe this one is a false positive, due to how the split works. However, I have no problem changing it to /"/g if that would help you pass the scanner's test there.

    4) I should be able to something about that. Let me get back to you on that one.

    With these points addressed, will you be able to use DataTables again?

    Allan

  • krutovdlkrutovdl Posts: 36Questions: 7Answers: 1

    1) 2.0.3
    2) Please provide examples how to utilized HTML sanitizing library such as DataTable.util.stripeHtml().
    3) If you can address the security concerns in your next release so that the company code scanes picks up nothing, that will be great.
    4) Thank you for looking into this. I greatly appreciate it.

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    1) Line 1515 in 2.0.3 is a closing brace. Indeed, doing a search on the 2.0.3 code shows no results exactly matching the line you gave. Can you send me the file you are using perhaps?

    2) If you import something like sanitize-html, then you could do:

    DataTable.util.stripHtml(sanitizeHtml);
    

    Replace with whatever HTML sanitizer it is that your company approves of. Obviously the original library is unchanged, so it will still show the warning in the static analysis, so I don't know what your process is to allow for that.

    3) Will my suggested change allow it to pass? There is no point in making the change if it doesn't have an impact, since it isn't technically needed.

    4) Will let you know how I get on.

    Allan

  • krutovdlkrutovdl Posts: 36Questions: 7Answers: 1

    I tried using DOMPurify and DataTable.util.stripHtml by placing like so DOMPurify.sanitize($("#table").DataTable()) and the scanner still looks at the datatable js code. Below are actual findings with recommandations below each block. The verion is 2.0.3.

    wwwroot/lib/datatables.net/js/dataTables.mjs:3837  line number
    columnDef.sTitle = cell.innerHTML.replace( /<.*?>/g, "" );

    wwwroot/lib/datatables.net/js/dataTables.mjs:1235  line number
    return d
    .replace( _re_html, '' ) // Complete tags

    wwwroot/lib/datatables.net/js/dataTables.js:3890  line number
    columnDef.sTitle = cell.innerHTML.replace( /<.*?>/g, "" );

    wwwroot/lib/datatables.net/js/dataTables.js:1288  line number
    return d
    .replace( _re_html, '' ) // Complete tags

    Recommandation
    Modify the regular expression to remove the ambiguity, or ensure that the strings matched with the regular expression are short enough that the time-complexity does not matter.

    wwwroot/lib/datatables.net/js/dataTables.mjs:3837
    columnDef.sTitle = cell.innerHTML.replace( /<.*?>/g, "" );

    wwwroot/lib/datatables.net/js/jquery.dataTables.mjs:15000  line number
    data
    .replace( _re_new_lines, " " )
    .replace( _re_html, "" )

    wwwroot/lib/datatables.net/js/jquery.dataTables.mjs:6152  line number
    var sTitle = col.ariaTitle || col.sTitle.replace( /<.*?>/g, "" );

    wwwroot/lib/datatables.net/js/jquery.dataTables.mjs:5901  line number
    s = s.replace( __re_html_remove, '' );

    wwwroot/lib/datatables.net/js/jquery.dataTables.mjs:1464  line number
    return d
    .replace( _re_html, '' ) // Complete tags

    wwwroot/lib/datatables.net/js/jquery.dataTables.mjs:1464  line number
    return d
    .replace( _re_html, '' ) // Complete tags
    .replace(/<script/i, ''); // Safety for incomplete script tag

    Recommandation
    To prevent this issue, it is highly recommended to use a well-tested sanitization library whenever possible. These libraries are more likely to handle corner cases and ensure effective sanitization.
    If a library is not an option, you can consider alternative strategies to fix the issue. For example, applying the regular expression replacement repeatedly until no more replacements can be performed, or rewriting the regular expression to match single characters instead of the entire unsafe text.

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    I'm not entirely clear what the goal with:

    DOMPurify.sanitize($("#table").DataTable())

    is?

    $().DataTable() returns a DataTables API object, where has the DOMPurity.sanitize() method will expect a string that it can sanatize. The code scanner, I presume, doesn't care what what is done with the returned value, but rather builds an AST of the code and then runs on that.

    What code scanner are you using?

    Allan

  • krutovdlkrutovdl Posts: 36Questions: 7Answers: 1

    The code scanner is CodeQL.

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    I've enabled CodeQL on my distribution repo for DataTables, and addressed the issues that it flagged up.

    The nightly build contains the latest fixes if you would like to try it.

    It will still warn about Incomplete multi-character sanitization in the HTML stripping function, but that is a false positive due to the lines following it in the function which address that concern.

    Allan

  • krutovdlkrutovdl Posts: 36Questions: 7Answers: 1

    Allan, thank you for the updated code on the nightly built. It is doubtful that I am allowed to just copy and paste the updated code because of company policy. When do you expect the next minor release which will include the updated code?

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    Tomorrow :)

    Allan

  • krutovdlkrutovdl Posts: 36Questions: 7Answers: 1

    I do very much appreciate your quick turn around and thank you for your assistance. Ones it becomes available, I will update our applications.

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    Apologies - due to the need to debug a different issue, this release will now fall into next week. Tuesday most likely.

    Allan

  • krutovdlkrutovdl Posts: 36Questions: 7Answers: 1

    Hi Allan,

    Installed the latest datatables v2.0.4 today. There are 2 high issues remaining and the both are same. If you can take care of them, I shall greatly appreciate it.

    On line 4491 of wwwroot/lib/datatables.net/js/dataTables.mjs:4441
    var parts = search.match( /!?["\u201C][^"\u201D]+["\u201D]|[^ ]+/g ) || [''];

    On line 4494 of wwwroot/lib/datatables.net/js/dataTables.js:4494
    var parts = search.match( /!?["\u201C][^"\u201D]+["\u201D]|[^ ]+/g ) || [''];

    Some regular expressions take a long time to match certain input strings to the point where the time it takes to match a string of length n is proportional to nk or even 2n. Such regular expressions can negatively affect performance, or even allow a malicious user to perform a Denial of Service ("DoS") attack by crafting an expensive input string for the regular expression to match.

    The regular expression engines provided by many popular JavaScript platforms use backtracking non-deterministic finite automata to implement regular expression matching. While this approach is space-efficient and allows supporting advanced features like capture groups, it is not time-efficient in general. The worst-case time complexity of such an automaton can be polynomial or even exponential, meaning that for strings of a certain shape, increasing the input length by ten characters may make the automaton about 1000 times slower.

    Typically, a regular expression is affected by this problem if it contains a repetition of the form r* or r+ where the sub-expression r is ambiguous in the sense that it can match some string in multiple ways. More information about the precise circumstances can be found in the references.

    Recommendation
    Modify the regular expression to remove the ambiguity, or ensure that the strings matched with the regular expression are short enough that the time-complexity does not matter.

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    That one isn't showing up in Github's CodeQL for me I'm afraid. What scanner is it you are using? I'm not sure how I would fix that if I can't see it. I would welcome a patch to address the issue.

    Allan

  • krutovdlkrutovdl Posts: 36Questions: 7Answers: 1

    Allan, can you please tell me the version of the CodeQL scanner you are using?

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    It will be the latest as it is enabled in Github (I don't seem to be able to find a specific version mentioned in their UI). I've got their default configuration for Javascript / Typescript:

    Allan

  • krutovdlkrutovdl Posts: 36Questions: 7Answers: 1

    Our CodeQL was recently updated; "CodeQL 2.17.0: Support for Java 22, Swift 5.10, TS 5.4, C# 12"

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    If you are able to suggest a patch that doesn't alter the function of the code and addresses the issue, I'd be very happy to include it.

    Allan

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    I've installed CodeQL locally and run it as follows:

    # Create the codeql database
    codeql database create codeqldb --source-root src --language=javascript
    
    # Run Javascript code scanning
    codeql database analyze codeqldb ~/codeql/qlpacks/codeql/javascript-queries/0.8.13/codeql-suites/javascript-security-extended.qls --format=sarifv2.1.0 --output=results/javascript.sarif
    

    The CodeQL version is 2.17.0:

    $ codeql --version
    CodeQL command-line toolchain release 2.17.0.
    

    Doing this results in no issues found.

    Is it javascript-security-extended.qls (v0.8.13) that you are running as the test suite?

    Allan

  • krutovdlkrutovdl Posts: 36Questions: 7Answers: 1

    I will be unable to provide the codeQL patch my company is using because Security will not disclose it. However, the good news is that there is only one remaining issue and it has to do with the "match" reqex functionality. Please look at my post from April 16th. If you can satiisfy the alert then all the only remaining high issues is fixed.

    Thank you,

    Dimitri

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    I can't fix it, because I can't see the issue. If you or your security team can tell me how I need to configure CodeQL to see it then I can look into it. The other option would be for you do to patch it to address the issue you are seeing and I'll merge it in (assuming there is not functional change).

    Allan

  • allanallan Posts: 61,787Questions: 1Answers: 10,114 Site admin

    I've just stumbled across another regex checking tool and it is telling me that the regex in question (/!?["\u201C][^"\u201D]+["\u201D]|[^ ]+/g) is safe. It correctly catches others that CodeQL was picking up as vulnerable to an exponential attack.

    Allan

Sign In or Register to comment.