This might seem simple, but I wonder whether we are expected to use regression or classification. If we use regression, we can then deduce the class of the output (for instance, 1 if the return is positive and 0 if not). Alternatively, we could preprocess the data and set the output to 1 or 0 depending on the sign of the return, and then use classification algorithms. My question is: are we "authorized" to change the y_train values, or is it mandatory to give a float value?
You can definitely train on any y_train targets you want (this is true in general).
Note, however, that some predictions might be hard to make (because the target can be largely random), in which case predicting a value between the two classes (instead of only a hard positive/negative class) gives a better score.
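To make the two options concrete, here is a minimal sketch in plain NumPy. Everything in it is an assumption for illustration: the data is synthetic, the names (X_train, w_reg, w_clf, proba) are made up, and the hand-rolled logistic regression is just a stand-in for whatever classifier you would actually use. It shows regressing on the float return and thresholding afterwards, versus binarizing y_train first and getting a probability between the two classes.

```python
import numpy as np

# Synthetic stand-in data (assumption: not the real challenge data).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))
y_train = X_train @ np.array([0.5, -0.2, 0.1]) + rng.normal(size=200)

# Option A: regress on the float return, then deduce the class from the sign.
w_reg, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
pred_float = X_train @ w_reg
pred_class_a = (pred_float > 0).astype(int)

# Option B: binarize the targets first, then fit a classifier.
y_class = (y_train > 0).astype(int)

# Tiny logistic regression by gradient descent (illustrative only).
w_clf = np.zeros(X_train.shape[1])
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X_train @ w_clf)))
    grad = X_train.T @ (p - y_class) / len(y_class)
    w_clf -= 0.5 * grad

# Soft output: a value between the two classes instead of a hard 0/1.
proba = 1.0 / (1.0 + np.exp(-(X_train @ w_clf)))
pred_class_b = (proba > 0.5).astype(int)
```

Whether submitting proba or the hard class scores better depends on the metric being used, so it is worth checking both.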
Yes, actually you could try a cost function that combines two penalty terms:
1.- the difference between the predicted float value and the real float value, and
2.- the error on the predicted class of the output,
say weighted by alpha and (1-alpha) respectively, and try to make it learn in an end-to-end fashion.
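One possible way to write such a combined cost, as a rough sketch: I am assuming here that the model has two outputs (a float prediction and a logit for the class), that the regression term is a mean squared error, and that the class term is a binary cross-entropy. None of these choices come from the thread; combined_loss and its argument names are hypothetical.

```python
import numpy as np

def combined_loss(pred_value, pred_logit, y_float, alpha=0.5):
    """alpha * MSE on the float target + (1 - alpha) * BCE on its sign.

    pred_value: predicted float return (hypothetical regression output)
    pred_logit: predicted logit for "return is positive" (hypothetical class output)
    y_float:    real float return
    """
    pred_value = np.asarray(pred_value, dtype=float)
    pred_logit = np.asarray(pred_logit, dtype=float)
    y_float = np.asarray(y_float, dtype=float)

    # Class label derived from the sign of the real return.
    y_class = (y_float > 0).astype(float)

    # Term 1: squared difference between predicted and real float value.
    mse = np.mean((pred_value - y_float) ** 2)

    # Term 2: binary cross-entropy computed from logits in a stable form,
    # log(1 + exp(z)) - y * z.
    bce = np.mean(np.logaddexp(0.0, pred_logit) - y_class * pred_logit)

    return alpha * mse + (1.0 - alpha) * bce
```

With alpha=1 this reduces to pure regression and with alpha=0 to pure classification, so alpha can be tuned like any other hyperparameter. To train end-to-end you would write the same expression in an autodiff framework instead of NumPy.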