Maxwell Ashford. Facilitating Cross-Domain Reasoning Generalization through Conservative Offline Reinforcement Learning Leveraging Pre-trained Large Language Model Representations. CIS [Internet]. 2026 May 19 [cited 2026 Jun 8];4(1). Available from: https://www.scivexus.org/index.php/CIS/article/view/196