Using Databases with Dispersion Modeling: Simple Queries


We are in the midst of the information age. There are databases and datasets everywhere we go. One of the commissioners at the Texas Commission on Environmental Quality addressed agency management and staff regarding the stockpile of information the agency collects by saying, “we are drowning in information but are thirsting for knowledge.”

How does one navigate through the maze of nearly countless bits and bytes to gain knowledge? We do so by using the database query. If you ask the right question, you will get just the information you need in a manageable and understandable manner.

In our previous article (http://www.naviknow.com/2018/07/11/using-databases-with-dispersion-modeling-the-basics/), we described the basic structures of a database and some of the concepts and practices to build a database. In this article, we will briefly describe how to create the building blocks (database tables), populate those tables, and access and extract data from those tables. The focus will be on accessing data.

To not overwhelm you, we are going to start out with simple queries. We will go into more details and capabilities of queries in future articles. I have been working with databases for over 20 years, and I continue to learn new things (by no means am I an expert), so we will start out with just basic material.

Query Basics

Whether you see it or not (most database platforms have a graphical user interface, GUI) nearly every command is in structured query language (SQL). Each platform will have a slightly different dialect, but for the most part, SQL is SQL.

Making the Building Blocks

Though you could more easily create tables with your database platform GUI, it is helpful to know that a SQL statement is created and executed behind the scenes. The SQL statement to create a table has the following syntax and structure:

CREATE TABLE  <Table Name>
(
    <field_1> <data type> <NULL/NOT NULL> <PRIMARY KEY>,
    <field_2> <data type> <NULL/NOT NULL>,
    <field_3> <data type> <NULL/NOT NULL>,
    …
    <field_n> <data type> <NULL/NOT NULL>
)

Green elements (names and data types) are mandatory and the purple (nullability) and red elements (keys) are optional. The specific statement to create one of the tables in our example database would look like this:

CREATE TABLE  tblEPN_Source
(
    EPN varchar(255) NOT NULL PRIMARY KEY,
    source_id varchar(8) NOT NULL
)

With this statement, we are creating the table tblEPN_Source. This table is to store our correlations of EPNs listed on our permit to sources contained in our model input file. The NOT NULL and PRIMARY KEY terms are optional and can be omitted.

By using NOT NULL, both fields are required to have values, e.g. they cannot be left empty or blank. The EPN name can be as long as 255 characters, but source_id is limited to eight characters, as that is the maximum length of a source name in the most current version of AERMOD. By using PRIMARY KEY with the EPN field, each value of EPN must be unique. You could have one source_id associated with multiple values of EPN. If that doesn’t work for your project, the PRIMARY KEY term can be left out.

Adding Data To Tables

To populate data tables, the INSERT INTO statement is used. There are several ways it can be used.

For adding one record at a time, the SQL statement would have the following structure if you are supplying values for all fields and in the order the fields are defined in the table:

INSERT INTO <table_name>
VALUES (value_1, value_2, …, value_n)

If you desire to enter values to selected fields but not all or if you will be providing values in a different order than how the table is defined, the statement changes a little to look like:

INSERT INTO <table_name> (field_1, field_2 , …, field_n)
VALUES (value_1, value_2, …, value_n)

In both cases, the order of the fields and values must correspond.

Accessing Data From Tables

Once there is something in your data tables, you will want to access and put these data to use. To access data from a table or multiple tables (we will get to querying multiple tables in a future article), you will use a SELECT statement. The general structure and syntax are:

SELECT (field_1, field_2 , …, field_n)
FROM <table_name>
WHERE <some criteria>
ORDER BY (field_i, field_j…field_k) [ASC/DESC]

The field list on the SELECT clause is in the order you want the data returned. If the way the data are stored in a table is not to your liking, the query can return the data in the order you want.

If you want to return data that meets certain conditions, you would enter the condition  in the WHERE clause. A list of operators you can use in the WHERE clause is:

  • = ; Equal to
  • <>, != ; Not equal to
  • <, <=, >, >=, !<, !> ; Ineqaulities
  • IN (value_1, value_2,…value_n)
  • BETWEEN <value_1> AND <value_2>
  • LIKE (%’<text string>’%)

In addition, logical operator can be used with the list above, like:

  • AND
  • OR
  • NOT

In addition to selecting which fields and the order they return the data, you can also specify the order of the records returned by using the ORDER BY clause. With the ORDER BY clause, you can specify the sort order by listing fields by which you want the sort and whether the order is ascending (ASC) or descending (DESC).

Specific to our example database, a select statement returning all records associated with a source id of 013 or 014 would be:

SELECT   tblEPN_Source.source_id,
         tblEPN_Source.EPN
FROM     tblEPN_Source
WHERE    tblEPN_Source.source_id = ‘013’
OR       tblEPN_Source.source_id = ‘014’
ORDER BY tblEPN_Source.EPN DESC

Summary

To build tables, insert data, and access data from a table, we would create and execute SQL statements within our database platform. To build tables, we would use the CREATE TABLE statement. To populate a table, we can use the INSERT INTO statement. To access the data in a table, we would use the SELECT statement.

If you are wondering when we are going to do something more dispersion modeling related, in the next article in this series, which describes performing calculations with queries, our direction will appear more clear. We are laying the foundation for some really powerful things you can do when it comes to dispersion modeling. Your patience will be rewarded.

If you found this article informative, there is more helpful and actionable information for you.  Go to http://learn.naviknow.com to see a list of past webinar mini-courses. Every Wednesday (Webinar Wednesday), NaviKnow is offering FREE webinar mini-courses on topics related to air quality dispersion modeling and air quality permitting. If you want to be on our email list, drop me a line at [email protected].

One of the goals of NaviKnow is to create an air quality professional community to share ideas and helpful hints like those covered in this article. So if you found this article helpful, please share with a colleague.