InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning

The math abilities of large language models can represent their abstract reasoning ability. In this paper, we introduce and open-source our math reasoning LLMs InternLM-Math, which are continually pre-trained from InternLM2. We unify chain-of-thought reasoning, reward modeling, formal reasoning, data augmentation, and code interpreter use in a single seq2seq format and supervise our model to be a versatile math reasoner, verifier, prover, and augmenter. These abilities can be used to develop the next generation of math LLMs or for self-iteration. InternLM-Math obtains open-source state-of-the-art performance under the settings of in-context learning, supervised fine-tuning, and code-assisted reasoning on various informal and formal benchmarks, including GSM8K, MATH, the Hungarian math exam, MathBench-ZH, and MiniF2F. Our pre-trained model achieves 30.3 on the MiniF2F test set without fine-tuning. We further explore how to use LEAN to solve math problems and study its performance under a multi-task learning setting, which shows the possibility of using LEAN as a unified platform for solving and proving in math. Our models, code, and data are released at https://github.com/InternLM/InternLM-Math.
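To make the formal-reasoning setting concrete, MiniF2F-style evaluation asks a model to complete a proof of a formally stated math problem in LEAN. The snippet below is an illustrative sketch (not taken from the paper's data) of what such a problem-and-proof pair looks like in Lean 4 with Mathlib tactics; the theorem name and statement are hypothetical examples of the competition-style goals the benchmark contains.

```lean
import Mathlib.Tactic

-- Hypothetical MiniF2F-style goal: a small arithmetic identity.
-- The model receives the statement and must generate the proof term
-- or tactic script after `:= by`.
theorem sample_arith (n : ℕ) : (n + 1) ^ 2 = n ^ 2 + 2 * n + 1 := by
  ring

-- A numeric competition-style goal; `norm_num` discharges it
-- by normalizing the arithmetic.
theorem sample_num : (2 : ℕ) ^ 10 = 1024 := by
  norm_num
```

A proof assistant checks the generated script mechanically, which is what makes this form of reasoning verifiable: a proof either type-checks or it does not, with no grading heuristics involved.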
