
Qwen2.5-HA-0.5B-Instruct

Description

Qwen2.5-HA-0.5B-Instruct is a smart-home language model fine-tuned from Qwen2.5-0.5B-Instruct, with approximately 500 million parameters.
Its main features include:

  • Model Type: Causal Language Model
  • Training Stages: Pre-training and Post-training
  • Architecture: Transformer, with RoPE, SwiGLU, RMSNorm, Attention QKV bias, and tied word embeddings
  • Number of Parameters: 490 million (360 million non-embedding parameters)
  • Number of Layers: 24 layers
  • Number of Attention Heads (GQA): 14 query heads, 2 key-value heads
  • Context Length: Supports full 32,768 tokens, with a max generation of 8,192 tokens

This model shows significant improvements in instruction following, long-text generation, and structured-data understanding, and supports 29 languages including English, Chinese, and French.
It has been fine-tuned on a smart-home dataset and can produce output in a structured format when given an appropriate system prompt.
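As an illustration of structured output via a system prompt, the sketch below shows the message layout and response parsing only. The JSON schema and the canned reply are hypothetical placeholders for this example, not the schema the model was actually fine-tuned on; in practice the reply would come from the model runtime.

```python
import json

# Hypothetical system prompt requesting machine-readable output; the
# actual schema used during fine-tuning is not documented here.
SYSTEM_PROMPT = (
    "You are a smart home assistant. Reply ONLY with JSON of the form "
    '{"device": <name>, "action": <verb>, "value": <optional number>}.'
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Dim the living room light to 30%."},
]

# Placeholder for the model call; a canned reply stands in for the
# model's generation so the parsing step can be shown end to end.
reply = '{"device": "living_room_light", "action": "dim", "value": 30}'

command = json.loads(reply)
print(command["device"], command["action"], command["value"])
```

Parsing the reply with `json.loads` lets downstream home-automation code dispatch on `device` and `action` directly instead of scraping free-form text.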

Available NPU Models

Base Model

qwen2.5-HA-0.5B-ctx-ax630c

  • Context window: 1,024 tokens
  • Maximum output: 1,024 tokens
  • Supported platforms: LLM630 Computing Suite, Module LLM, and Module LLM Suite
  • TTFT (Time To First Token): 525.85 ms
  • Average generation speed: 10.04 tokens/s
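The latency figures above give a rough way to estimate end-to-end response time: total time ≈ TTFT + (output tokens ÷ generation speed). A minimal sketch using the numbers quoted for the ax630c build:

```python
TTFT_MS = 525.85       # time to first token, from the specs above
TOKENS_PER_S = 10.04   # average generation speed, from the specs above

def estimate_latency_ms(output_tokens: int) -> float:
    """Rough end-to-end latency: prefill (TTFT) plus decode time."""
    return TTFT_MS + (output_tokens / TOKENS_PER_S) * 1000.0

# A 100-token reply takes roughly 10.5 seconds on this platform.
print(round(estimate_latency_ms(100) / 1000.0, 2))  # → 10.49
```

This is only an estimate: TTFT grows with prompt length, and the quoted generation speed is an average, so real timings will vary.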

Installation

apt install llm-model-qwen2.5-ha-0.5b-ctx-ax630c
