SQL is what helps us communicate with databases and turn big data into information that we can use to make better decisions. Like every other programming language, to be successful in SQL, you have to be patient.
Database Structure
Datum is defined as a “piece of information”, so in short data is simply a datum. Data is found everywhere and contains everything. The simplest way to visualize data is by using a table. A table is a grid that consists of columns and rows.
The table above contains different types of data. Data such as name, position, age, and remote worker. Data in a nutshell is just information. All databases contain metadata, which describes the structure and formatting of the data itself.
A database can simply be defined as a collection of data arranged in a way that makes it fast and easy for retrieval by a computer. Data inside a database is usually stored in a collection of tables with each table containing a specific set of data.
The whole purpose of a database is to interact, organize, and analyze data across a multitude of sources. When data is in a table, they have the capacity to relate to one another.
Rows in a table are considered records.
Columns in a table are considered fields.
Relational Databases
A relational database is a database design created by Edgar Codd. A relational database will contain a lot of tables that are related to one another.
Every table in a relational database should have a primary key. The primary key serves as a unique identifier for a record in the table. Each primary key must be unique and must not be empty (null).
A foreign key is a field in a table that acts as a primary key for another table in the database.
A composite key will consist of two or more columns that are used to define the primary key.
Data Types
When developing a database using SQL, a specific data type must be designated for each column used. Normally you will have character or text-based data types, dates and times, numeric data types, and Boolean values.
Numeric Data
Numeric data types include integers which are whole numbers and do not contain any decimals. When you need decimal data, we would use decimal data (not integer data), which allows us to use decimals. You can limit the length of decimal and integer data, so instead of 3.14159265359, you can limit the length so it produces 3.14, or 3.1, or 3.1416, however, you choose.
Character or Text-Based Data
Character or text-based data holds both a fixed-length string of characters and a variable-length string. So if a database includes postal codes, you would use fixed-length configured for six characters (postal codes are six digits). If you wanted first or last name, then you would use variable length using minimum and maximum length limits.
Dates and Times
Dates and times are important data that you may end up using. You can choose a variety of different date and time layouts such as YYYY-MM-DD, YY-MM-DD, and HH:MI:SS. You can format it however you want.
Boolean
A Boolean value is either True or False. You may use it if you’re working for a government or private entity.
Database Management Systems
SQL operates in a variety of software packages known as relational database management systems (RDBMSs). Popular ones are Oracle Database, Microsoft SQL Server, MySQL, PostgreSQL, and SQLite.
RDBMSs are different from one another Some are more graphic than others while some are more text-based. Presenting database information is also different from one another.
SELECT Statement
The most commonly used SQL command is SELECT. Like many other programming languages, the choice of keywords and the overall structure is vital for it to work. We called this syntax.
The syntax varies slightly from different RDBMSs. It’s usually a little, so it’s more important to understand the logic of SQL.
Queries, Statements, Keywords, and Clauses
A query is a request that returns information from a database in form of records.
Statements is a valid piece of code that is executed by RDBMS.
A clause is a subsection of a query, containing at least one keyword and relevant information to be used with that keyword.
Queries
Here are a few simple commands that we’ll use. Note that I will not be showing pictures of the code in action as this is more of a simple guide, so it may be hard to understand. I would recommend checking my "Introduction to SQL” series for a more easy-to-understand guide that has pictures.
Adding Comments
Comments are just English sentences that you would write out to help add insight and explain what you have written. It's a best practice to use comments since it helps developers understand the intention and function of the SQL statement.
Using ORDER BY
Sometimes it’s useful to order the results by first name or last name, so we would use ORDER BY. ORDER BY clause helps us sort our query results by any field you’ve chosen. The default order is ascending, A to Z. The special keyword is ASC for ascending and for descending, it is DESC. You would put ASC and DESC after the field is sorted. The ORDER BY clause comes after the FROM clause.
For example:
ORDER BY LastName DESC;
Of course, we can add multiple columns, so the single column doesn’t feel lonely:
ORDER BY FirstName ASC, LastName DESC;
NOTE: Some values may show up NULL if there are rows with empty values.
LIMIT
If you’re not interested in looking through thousands and thousands of data then you can use LIMIT to limit the results to a specific number of rows. Just add LIMIT 10 after the ORDER BY clause to return the first ten values from the query. It doesn’t have to be 10, you can put 5, 7, 15, or 29, any number you choose.
For example:
ORDER BY LastName DESC
LIMIT 10
NOTE: You do not need to use the ORDER BY clause in order to use the LIMIT statement.
Data into Information
Displaying and ordering fields is the first step, but we do need more tools, so we can ask more specific questions.
Operators, Comparison, Logical, and Arithmetic
Operators are special keywords that we use to compare the values of fields, select subsets, or perform arithmetic operations.
The three types of operators are comparison, logical, and arithmetic.
Comparison:
= Equal to
> Greater Than
< Less Than
>= Greater Than or Equal To
<= Less Than or Equal To
<> Not Equal To
Logical:
BETWEEN
IN
LIKE
AND
OR
Arithmetic:
+ Add
- Subtract
/ Divide
* Multiply
% Modulo
The three different types of operators can be combined to make more complex queries. We can create unique conditions and search for ranges of data.
WHERE Clause
The WHERE clause allows us to add specific conditions to our queries. We can limit the results so that only data that satisfies our conditions appear on the result set. We can filter numbers, text, and dates. We insert the WHERE clause after the FROM clause. It always comes after the FROM clause but before the ORDER BY clause.
For example:
FROM payment
WHERE Total = 1.34
ORDER BY Date;
We can use AND and BETWEEN to find a more specific value. To find a precise value, we would need an IN operator. With the IN operator, we can add many values as we need (separated by a comma).
The OR operator allows you to find results that match any of the criteria you ask for.
Filtering by Text
We don’t have to use numbers, we can also use operators to return specific text.
For example:
FROM FirstName
WHERE first = ‘John’;
We would only see names from John.
When using text in the WHERE clause, the text values must be surrounded by single quotes (‘John’).
LIKE Operator
SQL allows us to search for parts of a text value and it can be done using the LIKE operator. This is useful when we don’t know how a text value is spelled in a database or if it’s even spelled correctly.
The LIKE operator is represented by ‘%’, which are considered wildcard characters. Wildcard characters must always be enclosed with single quotation marks. With quotation marks, % would just be a modulo.
For example:
FROM FirstName
WHERE first LIKE ‘%J%’;
The LIKE operator can also be used to exclude results by placing the NOT keyword in front of the LIKE.
For example:
FROM FirstName
WHERE first NOT LIKE ‘%J%’;
Wildcard usage:
‘J%’ will produce results beginning with J
‘%J’ will produce results ending with J
‘%J%’ will produce results with J in the middle of a string of text
'J%J’ will produce results beginning and ending with J
CASE Statement
The CASE statement allows you to create a new field in your database that serves as a unique user-specified condition. The CASE statement is like the IF statement in other programming languages. We would have to use the keyword END to end the statement. You can use the ELSE keyword with the CASE statement. The ELSE keyword doesn’t have to be included, but it’s better to include it. Without the ELSE clause, if results fall outside of your conditions then they will be returned as NULL.
We would need to create an alias that becomes the name of the new field we have created for our database. This will go after the END statement and we use the AS keyword. So we would call it END AS and then the alias name we would choose for the new field.
Multiple Tables
As of now, we have only looked at retrieving data from one table. In order to get the full power of SQL, we would need to combine tables. Accessing data from multiple tables simultaneously is important.
The reason that we have multiple tables is to keep databases smaller. We called this normalization and it is the process of distributing fields across related tables. You don’t want a massive data size as it could slow the query processing time.
Joins
Joins is a command that will help us combine tables. Each table should have at least one field that serves as a primary key. A primary key usually exists as a foreign key in another table.
There are several different types of joins and each join functions in a slightly different way.
Joins with Alias
It is necessary to specify the table name when listing a specific field. Joins are used with an alias to reduce the typing time required as well as make it easier for you and developers to read.
For example:
FROM first AS f
INNER JOIN last AS l
ON f.FullName = l.FullName;
Inner Join
Inner join returns only matching records. Any unmatched data from either table is ignored.
Because inner joins only return matching data, the order of the table doesn’t matter. However, for other joins, the order of the table will matter.
The inner join is the most common join. Just remember that the main use of inner join is to bring corresponding data together from different tables in a database.
Left Outer Join
The left outer join combines all the results from the left table with matching results from the right table.
For example:
FROM first AS f
LEFT OUTER JOIN last AS l
ON f.FullName = l.FullName;
Remember that the order of the table matters for this one. Switching the order will produce different results.
With the left outer join, the '“outer” keyword is optional, so you can put it in or not.
Right Outer Join
The right outer join combines all the results from the right table with matching results from the left table. It is the mirror image of the left join and functions in a similar way.
Remember that the order of the table matters for this one. Switching the order will produce different results.
Just like the left outer join, the “outer” can be removed from the query if you want to that is. Just write RIGHT JOIN to produce the same result.
Full Outer Join
The full outer join combines all the results from both tables.
Using Functions
Calculations can be performed by adding functions. Using COUNT(), we can count the number of rows a column has. It will add up all the fields.
For example:
SELECT COUNT(LastName)
Functions
Functions are special keywords that perform an operation, accept certain parameters, and return the result of that operation as a value. Just note that other RDBMSs may have different functions. I’m using the functions that are for SQLite. Some of these functions may appear in other RDBMSs.
STRING modifies character and text data
INSTR()
LENGTH()
LOWER()
LTRIM()
REPLACE()
RTRIM()
SUBSTR()
TRIM()
UPPER()
DATE modify date and time
DATE()
DATETIME()
JULIANDAY()
TIME()
STRFTIME()
AGGREGATE performs mathematical operations
AVG()
COUNT()
MAX()
MIN()
SUM()
Nesting Functions
A nested function is a function contained within another function. We use it to modify the format of the inner function. The ROUND() function is useful when you want to round a decimal up on our results.
For example:
SELECT AVG(Total), ROUND(AVG(Total), 2) FROM payment;
We use 2 to round the decimal to two places, so it would equal 2.35. If you want to round to 3 places then change 2 to 3 and then you would get 2.351.
You would want to be careful when using ROUND() as it could change the value. If you want precise value don’t use ROUND().
GROUP BY Clause
You would want to use GROUP BY with aggregate functions as a way to arrange data into groups.
For example:
SELECT City, AVG(Total) FROM payment
GROUP BY City;
WHERE and HAVING Clause
Using the WHERE clause will allow us to add new criteria.
For example:
SELECT City FROM invoice
WHERE City LIKE ‘S%’;
Just note that the WHERE clause does not work with aggregate data. The WHERE clause only works to pull from the fields in the SELECT clause. If we want to filter based on aggregate functions, we would have to use the HAVING clause.
The HAVING clause comes after the GROUP BY clause.
For example:
SELECT City, AVG(Total) FROM invoice
GROUP BY City
HAVING AVG(Total) > 10;
Just think of HAVING as a special keyword used with a GROUP BY clause. The WHERE clause is for filtering field data while the HAVING clause is for filtering aggregate data. If you use HAVING without GROUP BY it will cause an error.
Subqueries
A subquery is one query that is nested inside of another query.
For example, let’s add a query to WHERE:
SELECT Date, Total FROM invoice
WHERE Total < (select AVG(Total) from invoice);
The query that is surrounded by parentheses is called the inner query.
We can also use the subquery in the SELECT statement.
Also, a subquery doesn’t always contain an aggregate function. You can use it without an aggregate function.
DISTINCT Clause
Oftentimes, there is information that is redundant or duplicated. We want data that is unique and distinct. That is where the DISTINCT keyword comes into play.
For example:
SELECT DISTINCT Id FROM name;
This will show results that don’t contain any duplicates.
Views
A view is a virtual table. It is a query that can be executed repeatedly or referenced by other queries. They are useful when you are constructing the same query. Just add a CREATE VIEW to fix the problem.
If you want to save time then use a view.
Data Manipulation
There are times when we must change or alter the data stored in the table. Statements that are referred to as manipulation are INSERT, DELETE, and UPDATE. These statements are used to add, remove, or modify the data from the tables.
Insert
The INSERT statement is used to insert data into a table one row at a time.
For example:
INSERT INTO
celebrity (Name)
VALUES (‘Lebron James’)
The VALUES keyword is where the actual value is being inserted into the table, in this case, it’s the celebrity table.
If any value doesn’t exist in a field, you need to include a pair of empty quotation marks ‘' as the value doesn’t exist.
Updating
The UPDATE statement is used to modify the existing data in a table. It is often used with the WHERE clause. Without the WHERE clause, the UPDATE statement would update all the rows in the table.
For example:
UPDATE celebrity
SET Code = ‘3345’
WHERE CelebrityID = 9;
SET is used to specify the column within the celebrity column that you intend to update. Follow by an equal sign and the new value that you want to change into. This is later followed by a WHERE clause to specify exactly where celebrities’ results we want to update. By using CelebrityID = 9 we ensure that it’s only this one that is updated and not another one.
Be sure to preview what you are about to update by using the SELECT statement as you don’t want to update random stuff.
Delete
The DELETE statement is used to remove results from a table. Like UPDATE, it is often used with the WHERE clause. Without the WHERE clause, it will delete all rows in the table.
For example:
DELETE FROM celebrity
WHERE CelebrityID = 9;
Be sure to double-check where you want to delete, as a mistake could end up deleting the entire row.
[END]