ezrknn-llm/CHANGELOG.md
2024-05-09 17:31:27 +08:00

CHANGELOG

v1.0.1

  • Optimize memory usage during model conversion
  • Optimize memory usage during inference
  • Increase prefill speed
  • Reduce initialization time
  • Improve quantization accuracy
  • Add support for Gemma, ChatGLM3, MiniCPM, InternLM2, and Phi-3
  • Add server invocation
  • Add inference interruption interface
  • Add logprob and token_id to the return value

v1.0.0

  • Supports the conversion and deployment of LLMs on RK3588/RK3576 platforms
  • Compatible with Hugging Face model architectures
  • Currently supports the Llama, Qwen, Qwen2, and Phi-2 models
  • Supports quantization with w8a8 and w4a16 precision