Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chi 1991 parameterized queries #401

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

stephenhand
Copy link
Collaborator

@stephenhand stephenhand commented Jun 16, 2023

Description

In order to make our data access logic safer from vulnerabilities, we should try to use PostgreSQL native parameterized queries, rather than relying on client side string interpolation provided as the default approach in pg-promise.

This PR attempts to make that easier by making using parameterized queries as straightforward as using client side interpolation. Out of the box, pg-promise mirrors the underlying postgres API, and only allows positional parameters. This PR adds an alternative factory function for pg-promise ParameterizedQuery objects which allow named parameters, as well as directly passing in JSON objects and arrays intended for use as a comma sewparated list (in an IN clause, for example).

It also converts the case data access, which contains some of our most complex data access logic, over to using parameterized queries as an example

  • Adds a helper function so allow the creation of PostgreSQL parameterized queries in pg-promise using named parameters.
  • Converts the case data access module to use parameterized queries (the others would need doing later)
  • Moves a few shared sql helpers into their own package

Checklist

  • Corresponding issue has been opened
  • New tests added

Related Issues

CHI-1991

Verification steps

Regression test case searching, lists and updating with this build of HRM deployed. You can also run a profiler and verify that the queries are hitting the database as parameters, not already interpolated into the SQL statement

@robert-bo-davis
Copy link
Collaborator

robert-bo-davis commented Jun 16, 2023

It feels like our low level SQL access is cascading into a need for even more custom tooling. I feel like we are using dev cycles to reinvent wheels and I don't fully understand why. I understand that there are some reasons (that I don't fully agree with) that we are avoiding an ORM, but are there reasons we need to be writing SQL at such an extremely low level? There are libraries like knex that will make our life easier in substantial ways without committing to an ORM.

@stephenhand
Copy link
Collaborator Author

It feels like our low level SQL access is cascading into a need for even more custom tooling. I feel like we are using dev cycles to reinvent wheels and I don't fully understand why. I understand that there are some reasons (that I don't fully agree with) that we are avoiding an ORM, but are there reasons we need to be writing SQL at such an extremely low level? There are libraries like knex that will make our life easier in substantial ways without committing to an ORM.

I looked at Knex when I was investigating Sequelize alternatives, I just didn't really see that it brought much benefit for bringing in an extra level of abstraction, and still having to ask the question 'so what query is this thing actually running?' (it might be obvious in simple examples but in complex queries with multiple nested joins & subqueries - not so much).

It seems like you would need to do everything twice, first work out the query you want to write, then work out how to get the library to write it, a bit like with an ORM except the mapping between the API calls you need to make and the query you want is a little more direct

Also I think we would still need to resort to using 'raw' APIs quite often to do what we currently do using Knex, especially some of the JSON stuff - I know it has some support for SQL JSON APIs but some of our manipulations are quite complex.

I just thought this was a way to have a drop in enhancement for making the way we currently write queries a bit more secure with a relatively small amount of custom code. The diffs exaggerate the extent to which I had to rewrite the queries, because I changed them with the initial fix for the vulnerability and then this one reverts them back closer to what they were originally. The main thing that needs to change is that you can only run one parameterised query at a time, rather than sending several at once, but otherwise you can normally just use the same query.

I was thinking we could discuss this at the next Aselo Tech Discussion - we could show the team Knex and see what they think, they might have a different take on it to me

There's no rush to merge this PR so we can keep it on hold until we have time to discuss alternatives

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants