Parsing SQL Statements

JDBC and the Limits of ResultSet Metadata

For my work in the area of data citation, I need to analyse the queries that are used for creating subsets. I am particularly interested in query parameters, sortings and filters. One of the most commonly used query languages is SQL, which is used by many relational database management systems such as MySQL. In some cases, the interaction with the database is abstracted, meaning that hardly any SQL statements are executed directly. Instead, the SQL statements are built on the fly by object-relational mappers such as Hibernate. Other scenarios use SQL statements as Strings or as prepared statements, which are executed via JDBC. However, analysing SQL statements is tricky, as the language is very flexible.

In order to understand which columns have been selected, it is sufficient to use the JDBC ResultSetMetaData and retrieve the column names from there. In my case, I need to extract this information from the query in advance and potentially enforce a specific sorting by adding columns to the ORDER BY clause. In this scenario, I need to parse the SQL statement and retrieve the information from the statement itself. Probably the best way to do this would be to implement a parser for the SQL dialect with ANTLR (ANother Tool for Language Recognition). But this is quite a challenge, so I decided to take a shortcut: FoundationDB.

The FoundationDB Parser

FoundationDB was a NoSQL database which provided several layers for supporting different paradigms at once. I am using the past tense here because Apple acquired the project in 2015 and has not pursued the open source project since. However, the Maven libraries for the software are still available at Maven Central. FoundationDB uses its own SQL parser, which understands standard SQL queries. These queries can be interpreted as a tree, and the parser library allows traversing SQL statements and analysing the nodes. We can use this tree to parse and interpret SQL statements and extract additional information.

The Foundations of FoundationDB

The FoundationDB parser can be included into your own project with the following Maven dependency:

The usage of the parser is straightforward. We use the following example SQL statement as input:

The following function calls the parser and prints the tree of the statement.

The resulting tree is listed below. The statement has also been normalized, which ensures a stable sequence of the parameters.

This tree offers a lot of information, which can be used programmatically as well. At the top of the output, we can see that the statement was a SELECT statement and that it was not DISTINCT. Then follows the ResultSet, which contains a list of the three ResultColumns that have been specified in the SELECT clause. We can see the column names and the table names from which they are drawn. The next block provides the referenced tables (the FROM list) and their alias names. The WHERE block contains the operands which have been used for filtering, and last but not least, there is the list of ORDER BY clauses and their sorting directions.

The Visitor

In order to access the information shown above programmatically, we need to access the content of the nodes one by one. This can be achieved with the visitor pattern, which traverses all the nodes of the tree. The following listing shows how the visitor pattern can be used for accessing the list of columns from the SELECT clause.

In this code example, we define a visitor which traverses all the ResultColumn nodes. Every time the current node is an instance of ResultColumn, we add this node to our list of columns. The nodes are only visited if they are children of a SELECT statement. This is our entry point into the tree. We leave the tree when we reach the FROM list. We then apply the visitor to the statement, which initiates the traversal. As a result, we receive a list of the columns which have been used for the result set.
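As a language-neutral illustration of the traversal just described, here is a small JavaScript sketch that walks a hand-built tree. It is only a toy under stated assumptions: the node shapes (`type`, `name`, `children`), the column and table names, and the `collectResultColumns` helper are hypothetical stand-ins that mimic the node names from the tree output. It is not the FoundationDB parser API.

```javascript
// Toy tree that mimics the printed structure; NOT the FoundationDB API.
const statement = {
  type: 'CursorNode',
  children: [{
    type: 'SelectNode',
    children: [
      { type: 'ResultColumnList', children: [
        { type: 'ResultColumn', name: 'a', children: [] },
        { type: 'ResultColumn', name: 'b', children: [] },
      ] },
      { type: 'FromList', children: [
        { type: 'FromBaseTable', name: 't', children: [] },
      ] },
    ],
  }],
};

// Visitor: enters at the SELECT node, collects ResultColumn nodes,
// and stops the traversal as soon as the FROM list is reached.
function collectResultColumns(root) {
  const columns = [];
  let inSelect = false;
  let done = false;

  function visit(node) {
    if (done) return;
    if (node.type === 'SelectNode') inSelect = true;        // entry point
    if (node.type === 'FromList') { done = true; return; }  // leave the tree
    if (inSelect && node.type === 'ResultColumn') columns.push(node.name);
    node.children.forEach((child) => visit(child));
  }

  visit(root);
  return columns;
}

console.log(collectResultColumns(statement)); // [ 'a', 'b' ]
```

The two flags play the role of the entry and exit conditions from the paragraph above: nothing is collected before the SELECT node, and nothing is visited after the FROM list.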

In order to get the list of ORDER BY columns, we can use a similar approach. The following function gives an example:

This time, we retrieve the list of ORDER BY columns directly from the CursorNode. Similar principles can be used for manipulating SQL statements, for instance for applying a different sorting.
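To make the idea of enforcing a sorting more concrete, the following JavaScript sketch appends a column to an ORDER BY list unless the statement already sorts on it. Again, this is a toy under assumptions: the `cursor` object, its `orderBy` entries, the column names, and both helper functions are hypothetical and only stand in for the real parser nodes.

```javascript
// Toy representation of an ORDER BY list; NOT the FoundationDB API.
const cursor = {
  type: 'CursorNode',
  orderBy: [{ column: 'last_name', ascending: true }],
};

// Returns a new ORDER BY list with `column` appended,
// unless the list already sorts on that column.
function enforceSorting(orderBy, column, ascending = true) {
  if (orderBy.some((entry) => entry.column === column)) return orderBy.slice();
  return [...orderBy, { column, ascending }];
}

// Renders the list back into an ORDER BY clause.
function renderOrderBy(orderBy) {
  return 'ORDER BY ' + orderBy
    .map((e) => `${e.column} ${e.ascending ? 'ASC' : 'DESC'}`)
    .join(', ');
}

const sorted = enforceSorting(cursor.orderBy, 'id');
console.log(renderOrderBy(sorted)); // ORDER BY last_name ASC, id ASC
```

Appending rather than replacing keeps the sorting the user asked for and only adds a tie-breaker, which is what a stable result order needs.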

 

 

An Interactive Map with Leaflet, GeoJSON, and jQuery Using Bootstrap

A Side Project – An Interactive Parking Map of Innsbruck

When I recently moved to Innsbruck, I noticed that there was no interactive map of the parking system available. The amount of time you can park your car depends on the zone your car is located in. There are 20 parking zones, and they are defined by their bordering streets in the city. Innsbruck is very dense and parking is always a hot topic. So I thought an interactive map would make it easier to find zones where you can leave your car longer, even if you are not so familiar with the street names. The city of Innsbruck offers the GIS data at the open data portal of Austria, which made it quite easy to implement such a map. I used the following technologies for creating this map:

The source code is available at my GitHub account. The implementation is available here and also at the Austrian Open Data Portal.

Code Snippets

Explaining the whole source code would be a bit too much for this post, and most of it is pretty self-explanatory, but in the following section I would like to highlight a few parts of the project.

Initializing

At the top of the HTML file, we load the JavaScript file which contains all the functions and variables for our implementation. The script is called parkraum.js (Parkraum means parking space in German).

The initialization of the JavaScript code is straightforward with jQuery. I structured the initialization into a few components, as you can see in the following example.

We first initialize the map itself, then place the parking zones, initialize the page scrolling to make it smoother, and then populate the drop-down menu with the available parking zones.
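A minimal sketch of this initialization order could look as follows. The component names (`initMap`, `placeZones`, `initScrolling`, `populateDropdown`) and the `initPage` wrapper are my own placeholders; the real parkraum.js may be organised differently.

```javascript
// Hypothetical component names; the real parkraum.js may differ.
function initPage(components) {
  components.initMap();          // create the Leaflet map
  components.placeZones();       // load and draw the parking zones
  components.initScrolling();    // smooth scrolling for navigation links
  components.populateDropdown(); // fill the zone selector
}

// Stand-in components that only log, so the call order can be tried anywhere.
const demoComponents = {
  initMap: () => console.log('1. map'),
  placeZones: () => console.log('2. zones'),
  initScrolling: () => console.log('3. scrolling'),
  populateDropdown: () => console.log('4. dropdown'),
};

// In the browser, jQuery would run this once the DOM is ready:
// $(function () { initPage(realComponents); });
initPage(demoComponents);
```

The order matters: the zones and the drop-down both need the map object to exist before they can add or highlight layers.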

Initialize the Map

The first step is the initialization of the map with Leaflet. Note that map is a global variable. The coordinates [47.2685694, 11.3932759] are the center of Innsbruck and 14 is the zoom level. In order to offer the map on a public page, you need to register with Mapbox, a service which provides the tiles for the map. Mapbox is free for 50k map views per month. Make sure to use your own key and show some attribution.
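The map setup can be sketched roughly like this. The center and zoom level come from the text above; the element id, the access token, the style id and the attribution string are placeholders, and the tile URL follows the pattern from the Leaflet quick-start guide, which may differ from what the original code used. Leaflet is passed in as `L` so the function does not depend on a global.

```javascript
// Center of Innsbruck and zoom level, as given in the text.
const INNSBRUCK = [47.2685694, 11.3932759];
const ZOOM = 14;

// `L` is the Leaflet namespace. `elementId`, `accessToken` and the
// style id below are placeholders (assumptions), not the original values.
function createMap(L, elementId, accessToken) {
  const map = L.map(elementId).setView(INNSBRUCK, ZOOM);

  L.tileLayer(
    'https://api.mapbox.com/styles/v1/{id}/tiles/{z}/{x}/{y}?access_token={accessToken}',
    {
      id: 'mapbox/streets-v11',  // assumed style; pick your own
      accessToken: accessToken,  // use your own Mapbox key
      tileSize: 512,
      zoomOffset: -1,
      attribution: 'Map data © OpenStreetMap contributors, Imagery © Mapbox',
    }
  ).addTo(map);

  return map;
}

// In the page, keeping the map global as described above:
// var map = createMap(window.L, 'map', 'YOUR_MAPBOX_TOKEN');
```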

We also add a legend to the map, which indicates the parking area type by color. There are four types of parking areas, and we just add little squares in those colors to the map. The legend improves the readability of the map.
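One way to build such a legend is a custom Leaflet control whose content is a short HTML string. In the sketch below, the four labels and colors in `AREA_TYPES` are invented for illustration, and the helper names are hypothetical; only `L.control`, `onAdd` and `L.DomUtil.create` are Leaflet's own.

```javascript
// Hypothetical area types and colors; the real map defines its own four.
const AREA_TYPES = [
  { label: 'Type A', color: '#d7191c' },
  { label: 'Type B', color: '#fdae61' },
  { label: 'Type C', color: '#abdda4' },
  { label: 'Type D', color: '#2b83ba' },
];

// Builds the legend markup: one small colored square per area type.
function legendHtml(types) {
  return types
    .map((t) => `<i style="background:${t.color}"></i> ${t.label}`)
    .join('<br>');
}

// Wires the markup into a Leaflet control (`L` and `map` come from the page).
function addLegend(L, map, types) {
  const legend = L.control({ position: 'bottomright' });
  legend.onAdd = function () {
    const div = L.DomUtil.create('div', 'info legend');
    div.innerHTML = legendHtml(types);
    return div;
  };
  legend.addTo(map);
  return legend;
}

console.log(legendHtml(AREA_TYPES));
```

The `<i>` elements only show up as squares if the stylesheet gives them a width, a height and a float, for example via a `.legend i` rule.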

The final map with the legend looks like this:

[Image: the map with its legend]

Reading Data, Adding Layers

I obtained the shapefiles with the geographic information from the Austrian Open Data Portal. The data seems to have been exported from ArcGIS, and I manually converted it into GeoJSON, which is directly supported by Leaflet.js without any plugins. To do this, I just copied the polygon data into the GeoJSON structure. I also separated the quite large file into smaller chunks, with each parking zone in its own file.

Below you can see an example of what the JSON looks like for parking zone C. The structure contains the short name (“C”), some additional information about the zone, attributes for the color, opacity and outline, and of course the actual coordinates, which make up a polygon covering the parking area.
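The shape of such a file can be sketched as a GeoJSON Feature with a Polygon geometry. Everything concrete in the following example is an assumption made for illustration: the property names, the description text, the style values and especially the coordinates, which are made-up positions near the city center and not the real outline of zone C.

```javascript
// Builds a GeoJSON Feature for one parking zone.
// Property names and all values below are illustrative assumptions.
function makeZoneFeature(name, info, style, ring) {
  const first = ring[0];
  const last = ring[ring.length - 1];
  // GeoJSON requires a closed ring: the last position repeats the first.
  const closed = first[0] === last[0] && first[1] === last[1]
    ? ring
    : [...ring, first];

  return {
    type: 'Feature',
    properties: { name, info, ...style },
    geometry: { type: 'Polygon', coordinates: [closed] },
  };
}

const zoneC = makeZoneFeature(
  'C',
  'Placeholder description of the zone',
  { color: '#ff7800', fillOpacity: 0.4, weight: 2 },
  // Made-up positions, in GeoJSON order [longitude, latitude].
  [[11.390, 47.266], [11.396, 47.266], [11.396, 47.270], [11.390, 47.270]]
);

console.log(JSON.stringify(zoneC, null, 2));
```

Note the coordinate order: GeoJSON stores positions as [longitude, latitude], whereas Leaflet calls such as setView take [latitude, longitude]. Leaflet's GeoJSON layer handles the conversion when it reads the file.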

 

I created a small object containing the name of each parking zone, the relative path of its JSON file and a placeholder for the layer information.
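Such a registry might look like the sketch below. The zone "D", the file paths and the `zoneNames` helper are hypothetical; only the three fields per entry follow the description above.

```javascript
// One entry per zone; `layer` stays null until the polygon is on the map.
// Zone names other than "C" and all paths are hypothetical.
const parkingZones = {
  C: { name: 'C', path: 'data/zone_c.json', layer: null },
  D: { name: 'D', path: 'data/zone_d.json', layer: null },
};

// Lists the zone keys in a stable order, e.g. for filling a selector.
function zoneNames(zones) {
  return Object.keys(zones).sort();
}

console.log(zoneNames(parkingZones)); // [ 'C', 'D' ]
```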

This object is globally available inside the JS file. The actual loading of the parking zones and the placement of the polygons on the map happens in the following function. It uses jQuery to load the JSON files. Note that jQuery expects the files to be delivered by a web server. So in order for this to work, you need to make sure that the files are served by a web server, also on your local development machine. You can try this very easily with http-server, a Node.js package, by running the following command within the root directory of your local project: sudo http-server -p 80 .

The for loop iterates over the object in which we stored all the parking zones and the locations of their JSON files. We load the files one by one and place the polygons on the map. This of course works in an asynchronous fashion. After this step, the polygons become visible on the map. We also add a clickable info box to every parking zone, which then displays additional information.
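The loop described above can be sketched as follows, assuming a registry object with `name`, `path` and `layer` fields per zone. jQuery and Leaflet are passed in as `$` and `L`; the function name and the `color`, `fillOpacity` and `info` property names are assumptions about how the GeoJSON files are laid out.

```javascript
// Loads every zone file and draws it on the map.
// `$` is jQuery, `L` is Leaflet, `map` is the Leaflet map.
// The `color`, `fillOpacity` and `info` properties are assumed names.
function placeZones($, L, map, zones) {
  for (const key in zones) {
    const zone = zones[key];
    // $.getJSON is asynchronous: the callback runs once the file has arrived.
    $.getJSON(zone.path).done(function (data) {
      const props = data.properties || {};
      zone.layer = L.geoJSON(data, {
        style: {
          color: props.color || '#3388ff',
          fillOpacity: props.fillOpacity === undefined ? 0.2 : props.fillOpacity,
        },
      })
        .bindPopup('<b>Zone ' + zone.name + '</b><br>' + (props.info || ''))
        .addTo(map);
    });
  }
}

// In the page: placeZones(jQuery, L, map, parkingZones);
```

Because the requests finish in arbitrary order, each callback writes only to its own `zone` entry. Declaring `zone` with `const` inside the loop gives every callback its own binding.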

Dropdown Selection for Marking and Highlighting a Parking Zone

Users are able to select one of the parking zones from a drop-down list. In the first step, we add all parking zones to the list by iterating over the parking zones object. Once a zone is selected, it changes its color and is therefore highlighted. To do that, we remove the layer of the parking zone and add it again in a different color.
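A rough sketch of both steps is shown below. It assumes that each zone entry also keeps its loaded GeoJSON in a `data` field, which is a hypothetical addition, and the function names, the selector id and the highlight color are placeholders as well.

```javascript
// Builds the <option> markup for the selector from the zone registry.
function zoneOptionsHtml(zones) {
  return Object.keys(zones)
    .sort()
    .map((key) => `<option value="${key}">Zone ${zones[key].name}</option>`)
    .join('');
}

// Highlights a zone by removing its layer and adding it again in another color.
// Assumes each zone keeps its loaded GeoJSON in `zone.data` (hypothetical field).
function highlightZone(L, map, zone, color) {
  if (zone.layer) map.removeLayer(zone.layer);
  zone.layer = L.geoJSON(zone.data, { style: { color: color } }).addTo(map);
  return zone.layer;
}

// In the page (the selector id is a placeholder):
// $('#zone-select').html(zoneOptionsHtml(parkingZones)).on('change', function () {
//   highlightZone(L, map, parkingZones[this.value], '#ff0000');
// });
```

Storing the new layer back into the zone entry is what makes a later selection able to remove the highlighted layer again.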

Show the Current Location

Showing the current location is also a nice feature. By clicking on a button, the map scrolls to the current location, which is transmitted by the browser. Note that this only works if you deliver your pages with HTTPS!
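Leaflet offers `map.locate` together with the `locationfound` and `locationerror` events for this. The sketch below is one possible wiring, not necessarily the one used in the project: the function name, the maximum zoom of 16 and the button id in the usage comment are my own choices.

```javascript
// Prepares the map for showing the user's position and returns the
// function that should run when the "locate me" button is clicked.
// `L` is Leaflet, `map` the Leaflet map; maxZoom 16 is an arbitrary choice.
function initLocation(L, map) {
  let marker = null;

  map.on('locationfound', function (e) {
    // Reuse one marker so repeated clicks do not pile up markers.
    if (marker) {
      marker.setLatLng(e.latlng);
    } else {
      marker = L.marker(e.latlng).addTo(map);
    }
  });

  map.on('locationerror', function (e) {
    console.warn('Location unavailable: ' + e.message);
  });

  return function locate() {
    // setView makes the map move to the position once it is known.
    map.locate({ setView: true, maxZoom: 16 });
  };
}

// In the page (the button id is a placeholder):
// $('#locate-button').on('click', initLocation(L, map));
```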

 

Navigation and Scrolling

The single page app uses Bootstrap for rendering the content nicely and providing the navigation features. The following code snippet shows how we can make all links scroll smoothly.
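A common jQuery approach, sketched here under the assumption that the navigation uses in-page links such as `#about`, is to intercept the click and animate `scrollTop` instead of letting the browser jump. The function name and the 600 ms default are my own choices.

```javascript
// Makes in-page links (href starting with "#") scroll smoothly.
// `$` is jQuery; the 600 ms default duration is an arbitrary choice.
function initScrolling($, duration) {
  $('a[href^="#"]').on('click', function (event) {
    if (!this.hash) return;              // plain "#" link, nothing to scroll to
    const target = $(this.hash);
    if (target.length === 0) return;     // no element with that id on the page
    event.preventDefault();              // suppress the default jump
    $('html, body').animate(
      { scrollTop: target.offset().top },
      duration || 600
    );
  });
}

// In the page: initScrolling(jQuery);
```

With a fixed Bootstrap navbar, the target offset usually needs to be reduced by the navbar height so the section heading is not hidden underneath it.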

 
