Use the length of each string element as the third dimension of a multi-dimensional array's shape in Python

I have a dataset whose first five rows (the head) look like this:

  a     b     c     d
00010 01001 01001 01000
01001 00101 01001 01001
00011 00011 10000 01001
00101 01000 01001 01000
01001 00101 01001 01001

Suppose the number of samples is 190, so when I check the shape of the data, it comes out as (190, 4).

I want the shape to be (190, 4, 5), where the third dimension is the number of digits in each value (for example, 00011 has 5 digits).

How can I achieve this?

import numpy as np
import pandas as pd

data1 = pd.read_csv(r"F:my work/binary_bm.csv", dtype=str)
df2 = np.array(data1)
df2
array([['00010', '01001', '01001', ..., '10000', '00110', '00101'],
   ['01001', '00101', '01001', ..., '10000', '00100', '00110'],
   ['00011', '00011', '10000', ..., '01001', '00011', '00011'],
   ...,
   ['01000', '01001', '01000', ..., '10000', '00100', '00110'],
   ['00010', '01001', '01001', ..., '10000', '00101', '00011'],
   ['00110', '00110', '01001', ..., '10000', '00101', '00101']],
  dtype=object)
data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1335 entries, 0 to 1334
Data columns (total 8 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   a       1335 non-null   object
 1   b       1335 non-null   object
 2   c       1335 non-null   object
 3   d       1335 non-null   object
 4   e       1335 non-null   object
 5   f       1335 non-null   object
 6   g       1335 non-null   object
 7   h       1335 non-null   object
dtypes: object(8)
memory usage: 83.6+ KB

x_train = df1.iloc[:930,:5]
x_test = df1.iloc[930:, :5]
y_train = df1.iloc[:930, 5:]
y_test = df1.iloc[930:, 5:]
x_train.head()
       a      b       c       d       e
1016 00010  01001   01001   01000   00110
445  01001  00101   01001   01001   00110
458  00011  00011   10000   01001   10000
251  00101  01000   01001   01000   00110
980  01001  00101   01001   01001   00110

x_tr = np.array(x_train)
print(x_tr)
[['00010' '01001' '01001' '01000' '00110']
 ['01001' '00101' '01001' '01001' '00110']
 ['00011' '00011' '10000' '01001' '10000']
 ...
 ['00100' '01000' '01001' '01000' '00111']
 ['00101' '01001' '01001' '01000' '00110']
 ['00100' '01000' '01001' '01000' '00111']]

x_tr.shape
(930, 5)

Answer

You could loop over the flattened array, convert each character of every string to an integer, then rebuild the array and reshape it so that the last axis holds the individual digits.

Using the following sample:

>>> x = np.array([['00010', '01001', '01001', '01000', '00110'],
                  ['01001', '00101', '01001', '01001', '00110'],
                  ['00011', '00011', '10000', '01001', '10000'],
                  ['00100', '01000', '01001', '01000', '00111'],
                  ['00101', '01001', '01001', '01000', '00110'],
                  ['00100', '01000', '01001', '01000', '00111']])

this would look like:

>>> np.array([int(k) for s in x.flatten() for k in s]).reshape(-1, 5, 5)
array([[[0, 0, 0, 1, 0],
        [0, 1, 0, 0, 1],
        [0, 1, 0, 0, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 1, 0]],

       [[0, 1, 0, 0, 1],
        [0, 0, 1, 0, 1],
        [0, 1, 0, 0, 1],
        [0, 1, 0, 0, 1],
        [0, 0, 1, 1, 0]],

       [[0, 0, 0, 1, 1],
        [0, 0, 0, 1, 1],
        [1, 0, 0, 0, 0],
        [0, 1, 0, 0, 1],
        [1, 0, 0, 0, 0]],

       [[0, 0, 1, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 1, 0, 0, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 1, 1]],

       [[0, 0, 1, 0, 1],
        [0, 1, 0, 0, 1],
        [0, 1, 0, 0, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 1, 0]],

       [[0, 0, 1, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 1, 0, 0, 1],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 1, 1]]])
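
The reshape above hard-codes (-1, 5, 5), which matches this sample only. As a rough sketch, the same idea can be wrapped so that the output shape is inferred from the input. The helper names strings_to_bits and strings_to_bits_view are hypothetical (not from the question), and both assume every string has the same length and contains only digits. The second variant swaps the Python loop for a NumPy dtype view, which may be faster on large arrays.

```python
import numpy as np

def strings_to_bits(arr):
    """Loop-based: split every fixed-width digit string into integer digits."""
    arr = np.asarray(arr)
    width = len(arr.flat[0])  # assumption: all strings share this length
    digits = [int(ch) for s in arr.ravel() for ch in s]
    # Keep the original axes and append one axis for the digits.
    return np.array(digits).reshape(*arr.shape, width)

def strings_to_bits_view(arr):
    """Vectorized: reinterpret each fixed-width string as single characters."""
    # Object arrays (as produced from a DataFrame) become fixed-width unicode.
    fixed = np.ascontiguousarray(np.asarray(arr).astype(str))
    width = fixed.dtype.itemsize // np.dtype("U1").itemsize
    chars = fixed.view("U1").reshape(*fixed.shape, width)
    return chars.astype(int)

# Small sample with object dtype, similar to np.array(x_train).
x = np.array([["00010", "01001"],
              ["00011", "10000"]], dtype=object)

bits = strings_to_bits(x)
print(bits.shape)
print(np.array_equal(bits, strings_to_bits_view(x)))
```

Applied to the x_tr array from the question, which has shape (930, 5), either helper should give an array of shape (930, 5, 5), provided every entry is a 5-character string.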