# Bingo game database representation for scalability

This is my first scalability-related question.

In order to simplify the problem, I will use a bingo app idea:

We have a bingo application. Each users has a ticket with 15 random numbers out of 90. Each week the bingo is held to find the winner. Numbers are drawn live until there is a winner. Ex:

• 15th number is drawn -> search table -> no match
• 16th number is drawn -> search table -> no match
• 30th number is drawn -> search table -> winner -> stop

# Question 1:

Which way is better / faster for representing data in table and for searching that table? Table will have 10+ million rows

# Idea 1:

Table Tickets

```id           user_id          week    ticket                                                 created
====================================================================================================
1            100022312        1       1,3,5,7,9,14,15,77,78,79,80,81,82,83,84            <timestamp>
2            102232123        1       2,5,9,22,33,44,55,66,77,,78,79,80,88,89            <timestamp>
3            201141028        1       7,8,9,11,22,33,34,35,37,39,51,55,58,63,66          <timestamp>
...
9.000.000    126387125        1       8,18,28,38,48,58,68,78,79,80,81,82,83,84,85        <timestamp>
10.000.000   126387126        1       1,4,14,24,34,45,56,66,67,68,79,80,81,82,83         <timestamp>
```

Php

```\$drawn_numbers = '1,2,3,4,5,6,7,89,10,11,12,13,14,15,16,17,18,19,20,21,22, ...';

\$result = query("SELECT * FROM Tickets WHERE sksf('\$drawn_numbers')");
```

where `sksf` would be some kind of substring function / regex / LIKE done in MySql.

# Idea 2:

Table Tickets

```id           user_id          week    n1   n2   n3   ...   n15        created
=============================================================================
1            100022312        1       11   32   52   ...   76     <timestamp>
2            102232123        1       22   52   55         78     <timestamp>
3            201141028        1       77   82   83   ...   89     <timestamp>
...
9.000.000    126387125        1       81   55   32   ...   10     <timestamp>
10.000.000   126387126        1       12   42   13   ...   77     <timestamp>
```

Php

```\$drawn_numbers = '1,2,3,4,5,6,7,89,10,11,12,13,14,15,16,17,18,19,20,21,22, ...';

\$result = query("SELECT * FROM Tickets WHERE contidion1 AND condition2 AND ...");
```

Unfortunately, here I still don’t have any idea about the conditions.

# Idea 3:

I select all tickets, iterate though them by checking if the drawn numbers contain any ticket.

```\$drawn_numbers = '1,2,3,4,5,6,7,89,10,11,12,13,14,15,16,17,18,19,20,21,22, ...';
\$all_tickets = query("SELECT * FROM Tickets");

foreach (\$all_tickets as \$ticket) {
if \$drawn_numbers.contains(\$ticket['ticket'])
return \$ticket['id'];
}
```

# Question 2:

Will having sorted numbers help in anyway? (Those 15 numbers and the drawn numbers)

# Question 3:

What happens when week 2 arrives? Should I use the same table, adding a condition `WHERE week=2`, or is it better to have only 1 week per table?

# Update

In the original game, a ticket has 15 numbers in 3 rows, each row having 5 numbers. After each number is drawn live, they are able to also calculate the tickets which have one row found in the drawn numbers (also they know the tickets with 2 rows). A ticket having 3 rows found in the drawn numbers would mean winning ticket.

This information makes me think that the representation looks something like:

# Idea 4:

Table Tickets

```id           user_id          week    row1                row2                 row3                     created
===============================================================================================================
1            100022312         1      1, 3, 5, 7, 9       14,15,77,78,79,      80,81,82,83,84       <timestamp>
2            102232123         1      2, 5, 9,22,33,      44,55,66,77,78,      79,80,87,88,89       <timestamp>
3            201141028         1      7, 8, 9,11,22,      33,34,35,37,39,      51,55,58,63,66       <timestamp>
...
9.000.000    126387125         1      8,18,28,38,48,      58,68,78,79,80,      81,82,83,84,85       <timestamp>
10.000.000   126387126         1      1, 4,14,24,34,      45,56,66,67,68,      79,80,81,82,83       <timestamp>
```

# Example of ticket:

```__  13  __  33  40  __  __  70  __
2  __  22  37  __  52  62  __  81
__  19  23  __  44  __  63  __  89
```

# Example of drawn numbers (not necessarily in sorted order)

```1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 19, 23, 44, 63, 89
-> there is a winner, no more numbers are drawn.
-> Our ticket did not win jackpot, but we won one row [19, 23, 44, 63, 89] (free ticket)
-> One can win also 2 rows.
```

I would use two `BIGINT` columns, and have each bit represent one of the bingo values (1..90). It will take a little bit of manipulation to construct a set of balls, but the searching will be easy, the storage will be much more compact, etc.

Let’s have 2 columns, one with numbers 1..60, the other with 61..90. (The choice is somewhat arbitrary, but makes it easy to visualize.) Now we can do it with one `BIGINT` and one `INT`.

```IF(\$value <= 60, 1 << \$value, 0) -- the bit for the BIGINT
IF(\$value >  60, 1 << (\$value - 60), 0)  -- the bit for the INT
```

Now or’ing together bits gives you a pair of numbers to represent a winning ticket. Use the `|` Operator for this task.

Each player’s current balls will also be or’d — a new bit being turned on after each play.

Then the test becomes something like this:

```-- The balls owned by a user:
user_60 BIGINT,
user_30 INT

-- winning balls (12 rows, a total of 5 bits on for each row)
win_60 BIGINT,
win_30 INT

-- Note: FREE SPACE should be pre-populated in both structures

BIT_COUNT(user_60 & win_60) +
BIT_COUNT(user_30 & win_30)  = 5  -- he's a winner!
```

I have left out an important step — the arrangement of the numbers on each user’s board. That will takes some up-front work, especially since 15 of the numbers can only go in the first column of the card. Etc.

Your takeaway from my Answer is to work with bits not other structures.

Another thought is to have 5 `SMALLINT UNSIGNED`, one for each column on the cards.

(For the future, in MySQL 8.0, `BINARY(16)` would allow a single column to represent a set of 90 values, thereby making the code cleaner.)

re Idea 4

Build 3 bit patterns – one per “row” with 5 numbers. Compare them to the issued tickets: Foreach ticket, compare the 3 patterns; count how many match.