aabashkin.github.io

Security Vulnerabilities In Robotic Process Automation (RPA) / Low Code Technology: SQL Injection & The Return of Bobby Tables

Comic



Audience


Summary


Background

Low Code (LC) technologies and Robotic Process Automation (RPA) are exploding in growth due to their ease of use and wide variety of practical application for ordinary business users. What was once the domain of traditional software engineers is now open to the average “citizen developer”.

Here are some eye opening statistics:

For more background on this growing technology trend, check out the following:


Security Risks

The opportunities that come with these new technologies creates a lot of exciting possibilities, but these possibilities carry some significant security risks as well. The comic above was published many years ago and has become a cult classic among application / software security experts and security conscious software engineers. But for those readers who don’t come from this background, let’s take a moment to briefly review what’s going on. For those who are already familiar with SQLi, feel free to skip this section.

The comic references a very common software vulnerability known as SQL injection (SQLi). As far as vulnerabilities go, this a particularly nasty one, given the various types of impacts it can have on a system. This vulnerability allows an attacker to potentially:

In short, SQL injection can wreck havoc on any system. This has been known for a long time, and is the reason why so much ink has already been spilled on the topic. In fact, there is an entire website dedicated to this sole issue! In addition to this, the OWASP Top Ten, a project dedicated to enumerating some of most common and high impact security vulnerabilities in software, has included this issue in every single version since the origin of the project in 2003.

The good news is that SQL injection has become less prevalent over time, mostly due to improved software frameworks, security testing tools, and more awareness among software engineers. But while traditional software has been steadily improving, the rise of Low Code / RPA creates the potential for this vulnerability (and many others like it) to raise its ugly head once again.

In order to understand why this is so, we need to begin by reviewing how a typical SQL injection vulnerability occurs. Databases are often accessed by special commands known as SQL queries. For our example, let’s say a school has a database where it stores some information about its students. If we want to insert a new student record into the database, we could create a query like this:

INSERT INTO Students (name) VALUES ('Alice');

The example above specifies the type of operation we want to perform (INSERT), the table in the database we want to add it to (students), as well as the field in the table (name), and finally the data itself (Alice).

Now let’s make this example a bit more realistic. Typically, we wouldn’t be writing code that has queries with a literal value such as Alice. Instead, we would create a dynamic query that could accept multiple values. We would gather student names from some kind of data source, such as a spreadsheet or a form on a website, and then pass that value into the SQL query before firing it off. The query would look something like this:

INSERT INTO Students (name) VALUES ('${student_name}');

In this example, the variable ${student_name} represents the data that we got from our web form. So what’s the issue here? Simply put, an attacker could submit some malicious input to our website form and completely alter the entire query. Going back to the comic, our cheeky parent decided to name their child Robert'); DROP TABLE Students;--. When that input is combined with our query, the final result looks like

INSERT INTO Students (name) VALUES ('Robert'); DROP TABLE Students;--');

The special characters '); at the end of Robert create the effect of completing the first query, but then allowing another query to be added to the end. This query (DROP TABLE Students;) has the effect that you think it does.

This is just one example, but it should be apparent how dangerous it can be to combine untrusted user input with regular SQL queries. Our attacker now has the ability to do anything with the database that the application itself can do. This is the main reason why this type of vulnerability has been given so much attention over the past two decades.


A Modern Example with a Low Code / RPA Platform

We’ve discussed traditional applications, but what about our new technologies? Let’s have a look at how this issue can manifest itself using a popular open source Low Code / RPA platform called the Robot Framework along with the RPA Framework designed for it. Unlike a typical programming language, Robot Framework “has an easy syntax, utilizing human-readable keywords. Its capabilities can be extended by libraries implemented with Python, Java or many other programming languages. Robot Framework has a rich ecosystem around it, consisting of libraries and tools that are developed as separate projects.”

Let’s recreate the example from above in RF with the help of the RPA.Database library:

*** Tasks ***

Insert Students Into Database Using Dynamic Query
    @{list_students}=    Get Students From Excel File    resources/new_students.xlsx
    FOR    ${student_name}    IN    @{list_students}
        ${query}=    Format String    INSERT INTO students (name) VALUES ('{}');    ${student_name}
        Query    ${query}
    END

*** Keywords ***

Get Students From Excel File
    # Implementation omitted for brevity
    ...
    RETURN    ${list_students}

In this example, our data source is an Excel file called new_students.xlsx. We read this file using a custom function (known as a keyword) that we define called Get Students From Excel File. For the sake of simplicity, I’ve omitted the implementation of this keyword, all that you need to know is that it returns a list of student names from the Excel file name that we provide. To see the full implementation, please go to the code repo

Next, we take that list and loop through it, generating a new SQL query for each student and sending the query to our database. This is where we encounter the SQL injection vulnerability from our initial example.

This line of code:

${query}=    Format String    INSERT INTO students (name) VALUES ('{}');    ${student_name}

Combined with Bobby’s full name:

Robert'); DROP TABLE Students;--

Becomes this query:

INSERT INTO Students (name) VALUES ('Robert'); DROP TABLE Students;--')

To see this example in action, please refer to the instructions in the code repo.


Prevention

There are various ways to prevent this vulnerability.

These include:

None of these techniques are novel or unique to our Robot Framework example. They are tried and tested methods of avoiding SQL injection. The only difference is how they are applied specifically within Robot Framework.

Let’s go step by step.


Parameterized Queries

When generating the query, replace the Format String keyword with the Set Variable keyword and replace the VALUES ('{}') placeholder with a VALUES (\%s) placeholder.

Next, add a data parameter containing untrusted input, ${student_name} in this case, to the Query keyword.

The data parameter accepts tuples, so even if you have a single input item you must still add an extra empty item at the end in order to execute the query. Simply providing ("${student_name}") won’t work, ("${student_name}", ) will.

*** Tasks ***

Insert Students Into Database Using Parameterized Query
    @{list_students}=    Get Students From Excel File    resources/new_students.xlsx
    FOR    ${student_name}    IN    @{list_students}
        ${query}=    Set Variable    INSERT INTO students (name) VALUES (\%s);  
        Query    ${query}  data=("${student_name}", )
    END

Please note that every database has a different style of placeholder syntax. Our example contains the syntax used by PostgreSQL. Other databases rely on a different syntax, so please refer to this guide when necessary.

When we run this example, the platform recognizes the clear separation between data and code, thus eliminating the attacker’s ability to manipulate the function of the query itself. This separation is the core principle for avoiding any type of injection vulnerability.


Stored Procedures

Start by creating a stored procedure in the database.

CREATE PROCEDURE insert_student(student_name VARCHAR(255))
LANGUAGE SQL
AS $$
    INSERT INTO students (name) VALUES (student_name);
$$;

Next, utilize the Call Stored Procedure keyword in your code.

*** Tasks ***

Insert Students Into Database Using Stored Procedure
    @{list_students}=    Get Students From Excel File    resources/new_students.xlsx
    FOR    ${student_name}    IN    @{list_students}
        @{params}     Create List   ${student_name}
        Call Stored Procedure   insert_student  ${params}
    END

Please note that due to an issue in the psycopg Python module this approach is currently incompatible with any Postgres 11+ database. Instead, one can use the Query keyword along with the CALL instruction. However, in order to properly generate this query string one must use parameterized queries as described in the previous section. For the sake of simplicity it may make more sense to just use parameterized queries and avoid the extra overhead of obtaining access to the database and creating a stored procedure. However, if a stored procedure already exists in your use case and you would like to leverage it, see the example below.

*** Tasks ***

Insert Students Into Database Using Stored Procedure (Postgres)
    @{list_students}=    Get Students From Excel File    resources/new_students.xlsx
    FOR    ${student_name}    IN    @{list_students}
        ${query}=    Set Variable    CALL insert_student (\%s)
        Query    ${query}  data=("${student_name}", )
    END

This example also conforms to the principle of separating code and data. Manipulation of the query is no longer possible.


Escaping

Escaping neutralizes any special control characters in our input, thus preventing it from altering the query. Apply escaping by surrounding the {} placeholder with double dollar signs $$.

Since Robot Framework recognizes a dollar sign followed by a curly brace as a variable declaration we have to add an extra backslash \ to $${}$$, so the final result should be $\${}$$.

*** Tasks ***

Insert Students Into Database Using Query With Escaping
    @{list_students}=    Get Students From Excel File    resources/new_students.xlsx
    FOR    ${student_name}    IN    @{list_students}
        ${query}=    Format String    INSERT INTO students (name) VALUES ($\${}$$);    ${student_name}
        Query    ${query}
    END

Please note that every database has a different style of escaping. Our example contains the syntax used by PostgreSQL. Please refer to your database’s documentation for guidance relevant to your use case.

Just like the techniques mentioned above, this example separates code and data and secures our query.


Input Validation

As an example, let’s say that we’ve decided that valid student names for our database should contain only letters from the Latin alphabet and hyphens. We use this rule to create a regular expression ^[a-zA-Z-]+$ and apply it to each student name using the Should Match Regexp keyword.

*** Tasks ***

Insert Students Into Database Query With Input Validation
    @{list_students}=    Get Students From Excel File    resources/new_students.xlsx

    ${pattern}=    Set Variable    ^[a-zA-Z-]+$

    FOR    ${student_name}    IN    @{list_students}
        TRY
            ${result}=    Should Match Regexp    ${student_name}    ${pattern}
            ${query}=    Format String    INSERT INTO students (name) VALUES ('{}');    ${student_name}
            Query    ${query}
        EXCEPT   * does not match *    type=glob
            Log To Console    Input validation failed. Student name does not meet requirements.\n
        END
    END

Please note that input validation is most effective when heavily constrained. The example above could actually become vulnerable if we allowed additional special characters, such as ', ), and ;. Deciding if input validation is sufficiently constrained is a difficult topic, even for tech savy users. For this reason, it is recommended to avoid relying entirely on input validation, and instead use it as a secondary security control. That being said, input validation provides benefits beyond just security, such as ensuring data accuracy and consistency, so it is still a good idea to consider it.


Future Challenges

As you can see, low code technologies are susceptible to some of the same risks as traditional applications. However, there are additional challenges in this space:


Additional Issues

Databases are not the only attack surface. Additional vulnerabilities may exist in automation which interacts with:

Bottom line, many of the issues that affect typical applications can affect Low Code / RPA as well. Stay tuned for more blog posts about these topics in the future.


Further Improvements

Building Additional Security Controls

When I first began my security research in this area, the RPA.Database library did not support parameterized queries. I quickly identified this as an area for improvement and submitted a pull request. Thankfully, the PR was quickly reviewed, merged, and released.

Looking at the list of additional issues above, I believe that over time it is very likely that we will discover more improvements that can be made. Just as in other areas of application security, a collaborative effort to solving this problem is key.

The folks at Robocorp, the company which maintains RPA Framework, have been very receptive to my work and provided all the support that I needed in order to be successful. If you are considering contributing your own work, please don’t hesitate. The community slack is a good place to start for validating ideas and figuring out next steps. Hopefully, other low code platforms follow this example.

Raising Awareness

Back in the early 2000’s, application security was still an obscure principle. Many people simply didn’t realize what type of risks were associated with insecure code. In order to raise awareness about the issue, the first version of the OWASP Top 10 was released in 2003. Since then, the project has undergone many revisions and today is considered one of the most authoritative sources of information about application security risks.

Using this as a model for success, the OWASP Low-Code / No-Code Top 10 Project was born. Just like the original Top 10, it will undoubtedly evolve over time, and hopefully become a useful resource for all Low Code users.

I call on all interested parties to contribute to this effort in whatever ways they can. A great place to start is the Slack channel and Google Group. A special thanks goes out to all of the folks that have contributed so far!


Conclusions

In short, Low Code is here, and it’s here to stay. When it comes to adopting Low Code in your organization it is simply a question of when, not if.

Monitoring the security space in Low Code is crucial, and will provide may exciting opportunities for future security research and innovation.


References

SQLi Robot Framework example source code on GitHub

Robot Framework / RPA Framework Security Cheatsheet

OWASP Low-Code / No-Code Top 10 Project


Follow Me

Twitter

LinkedIn


Join the Conversation

OWASP Low-Code / No-Code Slack

OWASP Low-Code / No-Code Google Group