Tuesday 9 June 2020

12. PGT Computer Science Question Discussion


A subquery in an SQL SELECT statement:

A. can only be used with two tables.
B. can always be duplicated by a join.
C. has a distinct form that cannot be duplicated by a join.
D. cannot have its results sorted using ORDER BY.


In most cases JOINs are faster than sub-queries and it is very rare for a sub-query to be faster.

In JOINs RDBMS can create an execution plan that is better for your query and can predict what data should be loaded to be processed and save time, unlike the sub-query where it will run all the queries and load all their data to do the processing.

The good thing in sub-queries is that they are more readable than JOI, but when it comes to performance, JOINS are better in most cases even though they are not hard to read too.

        Subquery can not be duplicate by a join. So Option (C) is the correct Answer. 


Sub-queries in Select-Fields

SELECT moo, (SELECT roger FROM wilco WHERE moo = me) AS bar FROM foo
Sub-queries in the Where-statement
SELECT moo FROM foo WHERE bar = (SELECT roger FROM wilco WHERE moo = me)


In the year 2010 I would have joined the author of this questions and would have strongly voted for JOIN, but with much more experience (especially in MySQL) I can state: Yes subqueries can be better. I've read multiple answers here; some stated subqueries are faster, but it lacked a good explanation. I hope I can provide one with this (very) late answer:

First of all, let me say the most important: There are different forms of sub-queries

And the second important statement: Size matters

If you use sub-queries, you should be aware of how the DB-Server executes the sub-query. Especially if the sub-query is evaluated once or for every row! On the other side, a modern DB-Server is able to optimize a lot. In some cases a subquery helps optimizing a query, but a newer version of the DB-Server might make the optimization obsolete.
Sub-queries in Select-FieldsSELECT moo, (SELECT roger FROM wilco WHERE moo = me) AS bar FROM foo



Be aware that a sub-query is executed for every resulting row from foo.
Avoid this if possible; it may drastically slow down your query on huge datasets. However, if the sub-query has no reference to foo it can be optimized by the DB-server as static content and could be evaluated only once.


Sub-queries in the Where-statementSELECT moo FROM foo WHERE bar = (SELECT roger FROM wilco WHERE moo = me)




If you are lucky, the DB optimizes this internally into a JOIN. If not, your query will become very, very slow on huge datasets because it will execute the sub-query for every row in foo, not just the results like in the select-type.
Sub-queries in the Join-statementSELECT moo, bar FROM foo LEFT JOIN ( SELECT MIN(bar), me FROM wilco GROUP BY me ) ON moo = me


No comments:

Post a Comment