ezrknn-llm/CHANGELOG.md
2024-05-09 17:31:27 +08:00

# CHANGELOG
## v1.0.1
- Reduce memory usage during model conversion
- Reduce memory usage during inference
- Increase prefill speed
- Reduce initialization time
- Improve quantization accuracy
- Add support for Gemma, ChatGLM3, MiniCPM, InternLM2, and Phi-3
- Add server invocation
- Add inference interruption interface
- Add logprob and token_id to the return value
## v1.0.0
- Supports conversion and deployment of LLMs on RK3588/RK3576 platforms
- Compatible with Hugging Face model architectures
- Currently supports Llama, Qwen, Qwen2, and Phi-2 models
- Supports quantization with w8a8 and w4a16 precision