Right prediction, wrong reasoning

Current large language models can make the right prediction yet give the wrong reasoning behind it. This project demonstrates this with experiments in the healthcare domain.