Get most recent result from a LEFT JOIN column

I’m creating a custom forum from scratch and I’m attempting to use some LEFT JOIN queries to get information such as total posts, total threads and most recent thread. I’ve managed to get the data but the recent thread keeps returning a random value rather than the most recent thread.

CREATE TABLE forum_categories
    (`name` varchar(18), `label` varchar(52), `id` int)
;
    
INSERT INTO forum_categories
    (`name`, `label`, `id`)
VALUES
    ('General Discussion', 'Talk about anything and everything Digimon!', 1),
    ('Deck Discussion', 'Talk about Digimon TCG Decks and Strategies!', 2),
    ('Card Discussion', 'Talk about Digimon TCG Cards!', 3),
    ('Website Feedback', 'A place to discuss and offer feedback on the website', 4)
;

CREATE TABLE forum_topics
    (`name` varchar(18), `id` int, `parent_id` int, `author_id` int, date date)
;
    
INSERT INTO forum_topics
    (`name`, `id`, `parent_id`, `author_id`, `date`)
VALUES
    ('My First Topic', 1, 1, 16, '2021-03-29'),
    ('My Second Topic', 2, 1, 16, '2021-03-30')
;

CREATE TABLE forum_topics_content
    (`id` int, `topic_id` int, `author_id` int, date datetime, `content` varchar(300))
;
    
INSERT INTO forum_topics_content
    (`id`, `topic_id`, `author_id`, `date`, `content`)
VALUES
    (1, 1, 16, '2021-03-29 15:46:55', 'Hey guys! This is my first post!'),
    (2, 1, 16, '2021-03-30 08:05:13', 'This is my first topic reply!')
;

My Query:

SELECT forum_categories.name, label, forum_categories.id, COUNT(DISTINCT(forum_topics.id)) as 'topics', COUNT(DISTINCT(forum_topics_content.id)) as 'posts', SUBSTRING(forum_topics.name,1, 32) as 'thread'
FROM forum_categories 
LEFT JOIN forum_topics ON forum_categories.id = forum_topics.parent_id
LEFT JOIN forum_topics_content ON forum_topics.id = forum_topics_content.topic_id
GROUP BY forum_categories.id
ORDER BY forum_categories.id, forum_topics.date DESC

I figured having an ORDER BY of forum_topics.date DESC would work for me and output the most recent thread which is "My Second Topic" but it doesn’t. I’m a bit stumped and have tried different variations of ORDER BY to no avail. thread keeps returning a random result from the two possible results.

Full example with data is available on this fiddle: https://www.db-fiddle.com/f/auDzUABaEpYzLKDkRqE7ok/0

Desired result would 'thread' always being the latest thread which in this example is "My Second Topic". However it always seems to randomly pick between "My First Topic" and "My Second Topic".

The output for the first row should always be:

'General Discussion' , 'Talk about anything and everything Digimon!' 1, 2, 2, 'My Second Topic'

Answer

thread keeps returning a random result from the two possible results.

Provided query is simply undeterministic and equivalent to:

SELECT forum_categories.name, 
  forum_categories.label, 
  forum_categories.id,
  COUNT(DISTINCT(forum_topics.id)) as 'topics',
  COUNT(DISTINCT(forum_topics_content.id)) as 'posts',
  SUBSTRING(ANY_VALUE(forum_topics.name),1, 32) as 'thread'
FROM forum_categories 
LEFT JOIN forum_topics ON forum_categories.id = forum_topics.parent_id
LEFT JOIN forum_topics_content ON forum_topics.id = forum_topics_content.topic_id
GROUP BY forum_categories.id,forum_categories.name,forum_categories.label
ORDER BY forum_categories.id, ANY_VALUE(forum_topics.date) DESC;

Assuming that forum_categories.id is PRIMARY KEY, the name/label are functionally dependent but rest of the column is simply ANY_VALUE.

If a column in SELECT list is not functionally dependent or wrapped with aggregate function the query is incorrect. On MySQL 8.0 or when ONLY_FULL_GROUP_BY is enabled the result is error.

Related: Group by clause in mySQL and postgreSQL, why the error in postgreSQL?


There are different ways to achieve desired result(correlated subqueries, windowed functions, limit) and so on.

Here using GROUP_CONCAT:

SELECT forum_categories.name, 
  forum_categories.label, 
  forum_categories.id,
  COUNT(DISTINCT(forum_topics.id)) as `topics`,
  COUNT(DISTINCT(forum_topics_content.id)) as `posts`,
  SUBSTRING_INDEX(GROUP_CONCAT(SUBSTRING(forum_topics.name,1,32)
                 ORDER BY forum_topics.`date` DESC 
                 SEPARATOR '~'),
                 '~',1) AS `thread`
FROM forum_categories 
LEFT JOIN forum_topics ON forum_categories.id = forum_topics.parent_id
LEFT JOIN forum_topics_content ON forum_topics.id = forum_topics_content.topic_id
GROUP BY forum_categories.id,forum_categories.name,forum_categories.label
ORDER BY forum_categories.id;

How it works:

GROUP_CONCAT is aggregate function that allow to concatenate string preserving order.

My Second Topic~My First Topic~My First Topic

Then SUBSTRING_INDEX returns part of string up to first occurence of delimeter ~.

db<>fiddle demo