Will the use of count(distinct depositor.account_number) make a difference here

Given Database Schema: [The primary key for the tables is mentioned in bold) account(account_no , branch_name,balance) depositor(customer_name,account_number) {No key was stated here} customer(customer_name,customer_street,customer_city)

The question required to write the SQL query for – Find average balance for each customer who lives in Harrison and has atleast 3 accounts.

I wrote the following SQL query:

select depositor.customer_name,avg(balance)
from depositor,account,customer
where depositor.account_number=account.account_number and
          depositor.customer_name=customer.customer_name  and
          customer_city='Harrison'
group by depositor.customer_name
having count(depositor.account_number) >=3

My textbook mentions of the query as:

select depositor.customer_name,avg(balance)
from depositor,account,customer
where depositor.account_number=account.account_number and
          depositor.customer_name=customer.customer_name  and
          customer_city='Harrison'
group by depositor.customer_name
having count( distinct depositor.account_number) >=3

Would placing distinct here lead to a change in result ? According to my analysis, The cross-product of the resultant relation (depositoraccountcustomer) will have the candidate key as customer_name account_number so distinct would not add any value here.

Answer

It seems that depositor table stores unique combinations of customer_name and account_number, meaning that adding DISTINCT in the count should not make a difference.

But if the table was a fact table which had repeating instances of the same account_number you wouldn’t want to count the same account number twice and in that case, it would have made a difference.

But in your case, it should not make a difference since the other two tables also seem to contain unique combinations of their respective fields.

Leave a Reply

Your email address will not be published. Required fields are marked *