#Sam Ayo answers .. linkedin.com/in/sam-ayo
If your labels are one-hot encoded, like [0, 0, 0, 0, 0, 0, 1],
you need to use categorical_crossentropy.
But if your labels are integer class indices (sparse labels), like [1, 2, 3, 4, 5, 6, 7],
you need to use sparse_categorical_crossentropy. Note that the integer labels must lie in the range 0 to num_classes - 1.
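To make the difference concrete, here is a minimal framework-free sketch showing that the two losses compute the same quantity and differ only in the label format they accept. The helper names cce_one_hot and cce_sparse and the probability values are made up for illustration; they are not Keras functions.

```python
import math

def cce_one_hot(one_hot, probs):
    # Label is a one-hot vector: sum of -y_k * log(p_k) over the classes.
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y)

def cce_sparse(class_index, probs):
    # Label is an integer index: take -log of that class's probability.
    return -math.log(probs[class_index])

# Hypothetical softmax output for one sample with 3 classes.
probs = [0.1, 0.7, 0.2]

one_hot_label = [0, 1, 0]  # format expected by categorical_crossentropy
sparse_label = 1           # format expected by sparse_categorical_crossentropy

loss_one_hot = cce_one_hot(one_hot_label, probs)
loss_sparse = cce_sparse(sparse_label, probs)

# Both labels describe class 1, so the two losses match.
print(loss_one_hot, loss_sparse)
```

So the choice between the two is about how your labels are stored, not about the model itself.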
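Change categorical cross-entropy to binary cross-entropy (binary_crossentropy), since your output label is binary. Also change the output activation from softmax to sigmoid. Sigmoid is the appropriate activation for a binary output with a single unit, whereas softmax applied to a single unit always returns 1.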
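The reason softmax fails here can be checked by hand. The sketch below assumes the model ends in a single output unit (an assumption, since the original model is not shown) and uses hand-written helpers rather than Keras calls.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p):
    # Loss for one sample with label y_true in {0, 1} and predicted probability p.
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

logits = (-3.0, 0.0, 3.0)

# Softmax over a single unit normalizes the value against itself,
# so the output is 1.0 regardless of the logit.
single_unit_softmax = [softmax([z])[0] for z in logits]

# Sigmoid maps the same logits to probabilities between 0 and 1.
sigmoid_outputs = [sigmoid(z) for z in logits]

# Binary cross-entropy for a positive label and a sigmoid output of 0.5.
loss = binary_crossentropy(1, sigmoid(0.0))

print(single_unit_softmax, sigmoid_outputs, loss)
```

With softmax on one unit the prediction never changes, so the model cannot learn; sigmoid paired with binary cross-entropy gives a usable probability for the positive class.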
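from tensorflow.keras.utils import to_categorical  # or keras.utils, depending on your install
# One-hot encode the integer labels for 3 classes (for use with categorical_crossentropy)
y_train = to_categorical(y_train, 3)
y_test = to_categorical(y_test, 3)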
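For reference, this is roughly what the conversion produces. The one_hot helper below is a hypothetical plain-Python stand-in, not the Keras function; to_categorical itself returns a NumPy array rather than nested lists, and the sample labels are made up.

```python
def one_hot(labels, num_classes):
    # Build one row per label, with 1.0 at the label's index and 0.0 elsewhere.
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

# Hypothetical integer labels for three samples and 3 classes.
y = [0, 2, 1]
encoded = one_hot(y, 3)

for label, row in zip(y, encoded):
    print(label, row)
```

After this conversion each label is a vector of length 3, which is the format categorical_crossentropy expects.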