Firestore query result pagination

I have an Android app using Firebase Firestore, and I need to run a calculation on a large number (>1000) of documents. Normally, a database's query response includes some metadata indicating whether the query returned all the results or just a portion, along with a pagination token for retrieving the rest through subsequent requests.

It appears from the Firestore documentation that there is none of that, only that the results of a single query are limited to 10 MB. How is this enforced? Will it return a client error? Will it just send back a truncated result? Do I need to apply my own limit, and if so, how can I know what a reasonable limit is when I don’t know the exact size of these documents?

My question is basically this: how do I run a query for a large set of documents that is likely to exceed the single query limit?

Answer

How to run a query for a large set of documents that is likely to exceed the single query limit?

The best solution I can think of is to create an alternative structure and duplicate some data. Since a query of 1000 results might exceed the limit, you should consider creating another collection that looks like this:

Firebase-root
  |
  --- revenues (collection)
       |
       --- May 2021 (document)
       |    |
       |    --- [{May 17, 2021 at 7:51:42 PM UTC+3: 123.33}, 
       |         {May 18, 2021 at 3:36:11 PM UTC+3: 444.74}]
       |
       --- June 2021 (document)
            |
            --- [{June 11, 2021 at 3:12:22 PM UTC+3: 523.18}, 
                 {June 23, 2021 at 2:39:54 PM UTC+3: 253.14}]

As you can see, I have created a new collection called “revenues” that contains a document for each month of the year. Each document contains an array of key-value pairs, where the key is the date and the value is the total amount of the invoice. In my example above, I have added only two invoices to each month. However, if you store only the above data (date/amount), you’ll be able to store far more than 1000 invoices within a single array while staying below the 1 MiB maximum document size.
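To get a feeling for how many date/amount pairs fit into one document, here is a rough back-of-the-envelope sketch. It assumes that a key is counted as its UTF-8 length plus 1 byte and that a number is counted as 8 bytes, which is how I understand Firestore's storage size rules. It ignores the overhead of the document name and of the array field itself, so treat the result as an upper bound. The class and method names are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Rough size estimate for one monthly "revenues" document (illustrative only).
final class DocumentSizeEstimate {
    // 1 MiB maximum document size, as mentioned above.
    static final long MAX_DOCUMENT_BYTES = 1024L * 1024L;

    // Assumed cost of one {date: amount} pair: the key counted as a
    // UTF-8 string plus 1 byte, and the numeric value counted as 8 bytes.
    static long entryBytes(String dateKey) {
        return dateKey.getBytes(StandardCharsets.UTF_8).length + 1 + 8;
    }

    // Upper bound on pairs per document, ignoring per-document overhead.
    static long maxEntries(String sampleKey) {
        return MAX_DOCUMENT_BYTES / entryBytes(sampleKey);
    }

    public static void main(String[] args) {
        String key = "May 17, 2021 at 7:51:42 PM UTC+3";
        System.out.println(entryBytes(key) + " bytes per entry");
        System.out.println(maxEntries(key) + " entries at most");
    }
}
```

With a key shaped like the ones in the structure above, each pair costs about 41 bytes under these assumptions, so a single monthly document has room for roughly 25,000 invoices.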

Since the user selects the start and end dates, you’ll always know which document to start with and which document to end with. So in this solution, you’ll calculate the revenue on the client. For example, to calculate the entire revenue from May 17th, 2021 to June 23rd, 2021, you have to read both documents and sum only the elements that fall between those dates. Instead of reading 1000 documents to get the total revenue, which in my opinion is a little costly, you’ll only need to read two documents, assuming that a document can hold 1000 invoices.
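The client-side calculation described above could look something like the following minimal sketch. It uses plain Java collections in place of the documents you would actually read from Firestore, and it simplifies the timestamps to dates. The `RevenueCalculator` and `Invoice` names are hypothetical, and `sampleDocs()` just reproduces the four invoices from the structure above.

```java
import java.time.LocalDate;
import java.time.YearMonth;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RevenueCalculator {
    // One element of a monthly document's array: invoice date and amount.
    static final class Invoice {
        final LocalDate date;
        final double amount;

        Invoice(LocalDate date, double amount) {
            this.date = date;
            this.amount = amount;
        }
    }

    // Sums the invoices dated from start to end (both inclusive), visiting
    // only the monthly documents that overlap the selected range.
    static double totalRevenue(Map<YearMonth, List<Invoice>> monthlyDocs,
                               LocalDate start, LocalDate end) {
        double total = 0.0;
        YearMonth last = YearMonth.from(end);
        for (YearMonth month = YearMonth.from(start); !month.isAfter(last);
             month = month.plusMonths(1)) {
            List<Invoice> invoices =
                    monthlyDocs.getOrDefault(month, Collections.<Invoice>emptyList());
            for (Invoice invoice : invoices) {
                if (!invoice.date.isBefore(start) && !invoice.date.isAfter(end)) {
                    total += invoice.amount;
                }
            }
        }
        return total;
    }

    // In-memory stand-in for the "May 2021" and "June 2021" documents.
    static Map<YearMonth, List<Invoice>> sampleDocs() {
        Map<YearMonth, List<Invoice>> docs = new HashMap<>();
        docs.put(YearMonth.of(2021, 5), Arrays.asList(
                new Invoice(LocalDate.of(2021, 5, 17), 123.33),
                new Invoice(LocalDate.of(2021, 5, 18), 444.74)));
        docs.put(YearMonth.of(2021, 6), Arrays.asList(
                new Invoice(LocalDate.of(2021, 6, 11), 523.18),
                new Invoice(LocalDate.of(2021, 6, 23), 253.14)));
        return docs;
    }
}
```

Calling `totalRevenue(sampleDocs(), LocalDate.of(2021, 5, 17), LocalDate.of(2021, 6, 23))` touches only two month entries and adds up all four invoices. In a real app you would probably store amounts as integer cents rather than doubles to avoid rounding drift.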

This practice is called denormalization and is common when it comes to Firebase. If you are new to NoSQL databases, I recommend watching the video “Denormalization is normal with the Firebase Database” for a better understanding. It covers the Firebase Realtime Database, but the same rules apply to Cloud Firestore.

Also, when you are duplicating data, there is one thing you need to keep in mind: in the same way you add data, you need to maintain it. In other words, if you want to update or delete an invoice, you need to do it in every place where it exists.

For more information please also see my answer from the following post:

If you want to directly map an array of objects from Cloud Firestore to a List of custom objects, please check my article at the following URL:

If you also want to implement pagination, please check my answer from the following post:

That answer shows how to paginate queries by combining query cursors with the limit() method. It is somewhat old, but I also recommend taking a look at this video for a better understanding.
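To illustrate the idea behind cursors plus a limit without pulling in the SDK, here is a small in-memory sketch. A sorted set stands in for an ordered query, `fetchPage` plays the role of a query using `startAfter()` and `limit()`, and `fetchAll` keeps asking for pages until one comes back shorter than the limit, which is how the client can tell that it has read everything. The class and method names are invented for this example and this is not Firestore code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

final class CursorPagingSketch {
    // Returns up to `limit` ids that sort strictly after `startAfter`
    // (or from the beginning when `startAfter` is null).
    static List<String> fetchPage(NavigableSet<String> orderedIds,
                                  String startAfter, int limit) {
        if (limit <= 0) {
            throw new IllegalArgumentException("limit must be positive");
        }
        Iterable<String> tail = (startAfter == null)
                ? orderedIds
                : orderedIds.tailSet(startAfter, false);
        List<String> page = new ArrayList<>();
        for (String id : tail) {
            if (page.size() == limit) {
                break;
            }
            page.add(id);
        }
        return page;
    }

    // Reads every id page by page, using the last id of each page as the
    // cursor for the next one. A page shorter than `limit` ends the loop.
    static List<String> fetchAll(NavigableSet<String> orderedIds, int limit) {
        List<String> all = new ArrayList<>();
        String cursor = null;
        while (true) {
            List<String> page = fetchPage(orderedIds, cursor, limit);
            all.addAll(page);
            if (page.size() < limit) {
                break;
            }
            cursor = page.get(page.size() - 1);
        }
        return all;
    }

    public static void main(String[] args) {
        NavigableSet<String> ids = new TreeSet<>();
        for (int i = 0; i < 2500; i++) {
            ids.add(String.format("%05d", i));
        }
        System.out.println(fetchAll(ids, 1000).size() + " ids read");
    }
}
```

With the real SDK, the same loop would issue one query per page and pass the last document snapshot of the previous page as the cursor. The page size is yours to choose, so you can keep each response comfortably small even when you don't know the exact size of the documents.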

If you need pagination on button click, please see my answer below: