SQLAlchemy: use IN with only the first column of a subquery

I feel like this will be answered elsewhere, but for the life of me I can’t find the correct search term.

I have a subquery that selects 2 values, id and MAX(date). The date is required for getting the latest value when using GROUP BY, but it is not needed after this stage.

How can I “discard” the date column so that I’m able to use the id IN (subquery) statement? In the meantime I’m selecting func.max(Model.id) to get around the issue.

Here’s a trimmed down example of what I’m attempting to do:

# Get the IDs of each latest link to a relationship
>>> subquery = session.query(Model.id, func.max(Model.date)).group_by(Model.relationship_id)

# Use these IDs as part of another query
>>> query = session.query(Model.id).filter(Model.id.in_(subquery))
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) sub-select returns 2 columns - expected 1

Answer

The problem is that your subquery returns two columns, while IN (subquery) expects exactly one. When the main query compares the id against it, the database cannot tell which column to search through and rejects the statement.

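You can pick out a single column of the subquery through its c (columns) collection, as c.column_name. Note that comparing with == makes SQLAlchemy add the subquery to the FROM clause (an implicit join) rather than emit IN: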

query = session.query(Model.id).filter(Model.id == subquery.subquery().c.id)
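If you want to keep a real IN clause, one option is to wrap the grouped query and select only its id column. The following is a minimal sketch, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database; the table layout, the sample rows and the latest_date label are made up for illustration. It also relies on SQLite's behaviour of taking the non-aggregated id from the row that holds MAX(date), which other databases may reject.

```python
import datetime

from sqlalchemy import Column, Date, Integer, create_engine, func, select
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Model(Base):
    # Hypothetical table standing in for the question's Model
    __tablename__ = "model"
    id = Column(Integer, primary_key=True)
    relationship_id = Column(Integer, nullable=False)
    date = Column(Date, nullable=False)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all([
    Model(id=1, relationship_id=10, date=datetime.date(2020, 1, 1)),
    Model(id=2, relationship_id=10, date=datetime.date(2020, 6, 1)),
    Model(id=3, relationship_id=20, date=datetime.date(2020, 3, 1)),
    Model(id=4, relationship_id=20, date=datetime.date(2020, 2, 1)),
])
session.commit()

# The two-column grouped query from the question, as a named subquery
latest = (
    session.query(Model.id, func.max(Model.date).label("latest_date"))
    .group_by(Model.relationship_id)
    .subquery()
)

# Select only the id column from it, so IN receives a single column
query = session.query(Model.id).filter(Model.id.in_(select(latest.c.id)))
latest_ids = sorted(row.id for row in query)
print(latest_ids)  # [2, 3] with the sample rows above
```

Row 4 has the highest id in its group but not the latest date, so the func.max(Model.id) workaround from the question would pick a different row here.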

Alternatively, since your subquery shares a key with your main query, you can join against it, which can be easier to work with in some scenarios:

# SQLAlchemy generally cannot infer the ON clause for a subquery, so spell it out
subq = subquery.subquery()
query = session.query(Model.id).join(subq, Model.id == subq.c.id)
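A join also lets the outer query read the aggregated date if it turns out to be useful after all. Here is a rough, self-contained sketch of that, again assuming SQLAlchemy 1.4 or newer on in-memory SQLite; the table layout, sample rows and the latest_date label are illustrative, not taken from the question.

```python
import datetime

from sqlalchemy import Column, Date, Integer, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Model(Base):
    # Hypothetical table standing in for the question's Model
    __tablename__ = "model"
    id = Column(Integer, primary_key=True)
    relationship_id = Column(Integer, nullable=False)
    date = Column(Date, nullable=False)


engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all([
    Model(id=1, relationship_id=10, date=datetime.date(2020, 1, 1)),
    Model(id=2, relationship_id=10, date=datetime.date(2020, 6, 1)),
    Model(id=3, relationship_id=20, date=datetime.date(2020, 3, 1)),
])
session.commit()

# Latest row per relationship_id (relies on SQLite's MAX() row selection)
latest = (
    session.query(Model.id, func.max(Model.date).label("latest_date"))
    .group_by(Model.relationship_id)
    .subquery()
)

# Explicit ON clause: keep only rows whose id appears in the subquery
query = (
    session.query(Model.id, latest.c.latest_date)
    .join(latest, Model.id == latest.c.id)
)
joined = sorted((row.id, row.latest_date) for row in query)
print(joined)
```

Because id is unique, the join returns each matching row once, so it selects the same ids as the IN form.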
