Qiskit Code Assistantをローカルモードで使用する

Qiskit Code Assistantのモデルをローカルマシンにインストール、設定、および使用する方法を学びます。

Notes

Qiskit Code Assistantはプレビューリリース状態であり、変更される可能性があります。
フィードバックがある場合や開発チームに連絡したい場合は、Qiskit Slack Workspaceチャンネルまたは関連するパブリックGitHubリポジトリをご利用ください。

クイックスタート（推奨）

ローカルモードでQiskit Code Assistantを始める最も簡単な方法は、VS CodeまたはJupyterLab拡張機能用の自動セットアップスクリプトを使用することです。これらのスクリプトは、LLMを実行するためにOllamaを自動的にインストールし、推奨モデルをダウンロードして、拡張機能を設定します。

VS Code拡張機能のセットアップ

ターミナルで以下のコマンドを実行してください：

bash <(curl -fsSL https://raw.githubusercontent.com/Qiskit/qiskit-code-assistant-vscode/main/setup_local.sh)

このスクリプトは以下の手順を実行します：

Ollamaをインストールする（まだインストールされていない場合）
推奨されるQiskit Code Assistantモデルをダウンロードして設定する
VS Code拡張機能をローカルデプロイと連携するように設定する

JupyterLab拡張機能のセットアップ

ターミナルで以下のコマンドを実行してください：

bash <(curl -fsSL https://raw.githubusercontent.com/Qiskit/qiskit-code-assistant-jupyterlab/main/setup_local.sh)

このスクリプトは以下を実行します：

Ollamaをインストールする（まだインストールされていない場合）
推奨されるQiskit Code Assistantモデルをダウンロードして設定する
JupyterLab拡張機能をローカルデプロイと連携するように設定する

利用可能なモデル

現行モデル

Qiskit Code Assistantで使用するための最新の推奨モデルは以下のとおりです：

Qiskit/mistral-small-3.2-24b-qiskit - 2025年10月リリース
qiskit/qwen2.5-coder-14b-qiskit - 2025年6月リリース
qiskit/granite-3.3-8b-qiskit - 2025年6月リリース
qiskit/granite-3.2-8b-qiskit - 2025年6月リリース

GGUFモデル（個人環境・ノートPC向けに推奨）

GGUF形式のモデルはローカル使用に最適化されており、より少ない計算リソースで動作します：

mistral-small-3.2-24b-qiskit-GGUF – 2025年10月リリース Qiskitデータバージョン 2.1 までを使用してトレーニング済み
qiskit/qwen2.5-coder-14b-qiskit-GGUF – 2025年6月リリース Qiskitデータバージョン 2.0 までを使用してトレーニング済み
qiskit/granite-3.3-8b-qiskit-GGUF – 2025年6月リリース Qiskitデータバージョン 2.0 までを使用してトレーニング済み
qiskit/granite-3.2-8b-qiskit-GGUF – 2025年6月リリース Qiskitデータバージョン 2.0 までを使用してトレーニング済み

オープンソースのQiskit Code Assistantモデルは、safetensorsまたはGGUFファイル形式で提供されており、以下で説明するようにHugging Faceからダウンロードできます。

トレーニングに使用されたQiskitバージョン

モデル						ベンチマーク指標					リリース日	トレーニングに使用されたQiskitバージョン
	QiskitHumanEval-Hard	QiskitHumanEval	HumanEval	ASDiv	MathQA	SciQ	MBPP	IFEval	CrowsPairs (English)	TruthfulQA (MC1 acc)
mistral-small-3.2-24b-qiskit	32.45	47.02	77.49	3.77	49.68	97.50	64.00	48.44	67.08	39.41	2026年1月	2.2
qwen2.5-coder-14b-qiskit	25.17	49.01	91.46	4.21	53.90	97.00	77.60	49.64	65.18	37.82	2025年6月	2.0
granite-3.3-8b-qiskit	14.57	27.15	62.80	0.48	38.66	93.30	52.40	59.71	59.75	39.05	2025年6月	2.0
granite-3.2-8b-qiskit	9.93	24.50	57.32	0.09	41.41	96.30	51.80	60.79	66.79	40.51	2025年6月	2.0
granite-8b-qiskit-rc-0.10	15.89	38.41	59.76	—	—	—	—	—	—	—	2025年2月	1.3
granite-8b-qiskit	17.88	44.37	53.66	—	—	—	—	—	—	—	2024年11月	1.2

注：ベンチマーク表に記載されているすべてのモデルは、Hugging Faceモデルで定義されている各モデルのシステムプロンプトを使用して評価されました。

廃止されたモデル

これらのモデルは積極的なメンテナンスは行われていませんが、引き続き利用可能です：

qiskit/granite-8b-qiskit-rc-0.10 - 2025年2月リリース（廃止）
qiskit/granite-8b-qiskit - 2024年11月リリース（廃止）

高度なセットアップ

手動でローカル環境を設定したい場合や、インストールプロセスをより細かく制御したい場合は、以下のセクションを展開してください。

Hugging Faceウェブサイトからダウンロードする

Hugging FaceウェブサイトからQiskit Code Assistant関連のモデルをダウンロードするには、以下の手順に従ってください：

Hugging FaceでQiskitの目的のモデルページに移動します。
Files and Versionsタブに移動し、safetensorsまたはGGUFモデルファイルをダウンロードします。

Hugging Face CLIを使用してダウンロードする

Hugging Face CLIを使用して利用可能なQiskit Code Assistantモデルをダウンロードするには、以下の手順に従ってください：

Hugging Face CLIをインストールします
Hugging Faceアカウントにログインします
```
huggingface-cli login
```

前のリストから希望のモデルをダウンロードします

huggingface-cli download <HF REPO NAME> <MODEL PATH> --local-dir <LOCAL PATH>

OllamaでQiskit Code Assistantモデルをローカルに手動でデプロイする

ダウンロードしたQiskit Code Assistantモデルをデプロイして操作する方法は複数あります。このガイドでは、Ollamaを使用する方法を説明します。具体的には、Hugging Face Hubとの統合またはローカルモデルを使用したOllamaアプリケーションか、llama-cpp-pythonパッケージのいずれかです。

Ollamaアプリケーションの使用

OllamaアプリケーションはLLMをローカルで実行するためのシンプルなソリューションです。使いやすく、セットアッププロセス全体、モデル管理、および操作をかなり簡単にするCLIが備わっています。素早い実験や、技術的な詳細をあまり扱いたくないユーザーに最適です。

Ollamaのインストール

Ollamaアプリケーションをダウンロードします
ダウンロードしたファイルをインストールします
インストールしたOllamaアプリケーションを起動します

情報
Ollamaアイコンがデスクトップのメニューバーに表示されると、アプリケーションは正常に動作しています。http://localhost:11434/にアクセスしてサービスが実行中であることを確認することもできます。
ターミナルでOllamaを試し、モデルの実行を開始します。例：
```
ollama run hf.co/Qiskit/Qwen2.5-Coder-14B-Qiskit
```

Hugging Face Hub統合を使用したOllamaのセットアップ

Ollama/Hugging Face Hub統合は、新しいmodelfileを作成したり、GGUFまたはsafetensorsファイルを手動でダウンロードしたりすることなく、Hugging Face Hubにホストされているモデルと対話する方法を提供します。Hugging Face Hub上のモデルには、デフォルトのtemplateファイルとparamsファイルがすでに含まれています。

Ollamaアプリケーションが実行中であることを確認します。
目的のモデルページに移動し、URLをコピーします。例：https://huggingface.co/Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF
ターミナルから以下のコマンドを実行します：
```
ollama run hf.co/Qiskit/Qwen2.5-Coder-14B-Qiskit
```

hf.co/Qiskit/Qwen2.5-Coder-14B-Qiskitモデル、または現在推奨されているその他のGGUF公式モデルhf.co/Qiskit/mistral-small-3.2-24b-qiskit-GGUFやhf.co/Qiskit/granite-3.3-8b-qiskit-GGUFを使用できます。

手動でダウンロードしたQiskit Code Assistantのモデルファイルを使用したOllamaのセットアップ

https://huggingface.co/Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUFのようなGGUFモデルを手動でダウンロードし、異なるテンプレートやパラメーターを試したい場合は、以下の手順に従ってローカルのOllamaアプリケーションに読み込むことができます。

以下の内容を入力してModelfileを作成し、<PATH-TO-GGUF-FILE>をダウンロードしたモデルの実際のパスに更新してください。

FROM <PATH-TO-GGUF-FILE>
TEMPLATE """{{ if .System }}
System:
{{ .System }}

{{ end }}{{ if .Prompt }}Question:
{{ .Prompt }}

{{ end }}Answer:
```python{{ .Response }}
"""

PARAMETER stop "Question:"
PARAMETER stop "Answer:"
PARAMETER stop "System:"
PARAMETER stop "```"

PARAMETER temperature 0
PARAMETER top_k 1

Run the following command to create a custom model instance based on the Modelfile.
```
ollama create Qwen2.5-Coder-14B-Qiskit -f ./path-to-model-file
```
備考
This process may take some time for Ollama to read the model file, initialize the model instance, and configure it according to the specifications provided.

Run the Qiskit Code Assistant model manually downloaded in Ollama

After the Qwen2.5-Coder-14B-Qiskit model has been set up in Ollama, run the following command to launch the model and interact with it in the terminal (in chat mode).

ollama run Qwen2.5-Coder-14B-Qiskit

Some useful commands:

ollama list - List models on your computer
ollama rm Qwen2.5-Coder-14B-Qiskit - Delete the model
ollama show Qwen2.5-Coder-14B-Qiskit - Show model information
ollama stop Qwen2.5-Coder-14B-Qiskit - Stop a model that is currently running
ollama ps - List which models are currently loaded

Manually deploy the Qiskit Code Assistant models in local through the llama-cpp-python package

An alternative to the Ollama application is the llama-cpp-python package, which is a Python binding for llama.cpp. It gives you more control and flexibility to run the GGUF model locally, and is ideal for users who wish to integrate the local model in their workflows and Python applications.

Install llama-cpp-python
Interact with the model from within your application using llama_cpp. For example:

from llama_cpp import Llama

model_path = <PATH-TO-GGUF-FILE>

model = Llama(
        model_path,
        seed=17,
        n_ctx=10000,
        n_gpu_layers=37, # to offload in gpu, but put 0 if all in cpu
    )

input = 'Generate a quantum circuit with 2 qubits'
raw_pred = model(input)["choices"][0]["text"]

You can also add text generation parameters to the model to customize the inference:

generation_kwargs = {
        "max_tokens": 512,
        "echo": False, # Echo the prompt in the output
        "top_k": 1
    }

raw_pred = model(input, **generation_kwargs)["choices"][0]["text"]

Manually deploy the Qiskit Code Assistant models in local through llama.cpp

Use the `llama.cpp` library

Another alternative is to use llama.cpp, an open-source library for performing LLM inference on a CPU with minimal setup. It provides low-level control over the model execution and is typically run from the command line, pointing to a local GGUF model file.

There are several ways to install llama.cpp on your machine:

Install llama.cpp using brew, nix, or winget
Run with Docker: See out the Docker documentation by llama.cpp team
Download pre-built binaries from the releases page
Build from source by cloning this repository

Once installed, you can use llama.cpp to interact with GGUF models in conversation mode as follows:

# Use a local model file
llama-cli -m my_model.gguf -cnv

# Or download and run a model directly from Hugging Face
llama-cli -hf Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF -cnv

You can also launch an OpenAI-compatible API server for the model in the following way:

llama-server -hf Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF

Advanced parameters

With the llama-cli program, you can control the model generation using command-line options. For example, you can provide an initial “system” prompt using the -p/--prompt flag. In conversation mode (-cnv), this initial prompt acts as the system message. Otherwise, you can simply prepend any desired instruction to your prompt text. You can also adjust sampling parameters - for instance: temperature (--temp), top-k (--top-k), top-p (--top-p), repetition penalty (--repeat-penalty), and the seed to use (--seed). The following is an example invocation using these options:

llama-cli -hf Qiskit/Qwen2.5-Coder-14B-Qiskit-GGUF \
  -p "You are a friendly assistant." -cnv \
  --temp 0.7 \
  --top-k 50 \
  --top-p 0.95 \
  --repeat-penalty 1.1 \
  --seed 42

Qiskitモデルが正常に機能するよう、HF GGUFリポジトリで提供されているシステムプロンプトを使用することを推奨します：mistral-small-3.2-24b-qiskit-GGUF、Qwen2.5-Coder-14B-Qiskit-GGUF、granite-3.3-8b-qiskit-GGUF、およびgranite-3.2-8b-qiskit-GGUFのシステムプロンプトをご参照ください。

拡張機能をローカルデプロイに手動で接続する

Qiskit Code AssistantのVS Code拡張機能とJupyterLab拡張機能を使用して、ローカルにデプロイされたQiskit Code Assistantモデルにプロンプトを送ることができます。モデルとともにOllamaアプリケーションをセットアップしたら、拡張機能をローカルサービスに接続するように設定できます。

Qiskit Code Assistant VS Code拡張機能との接続

Qiskit Code Assistant VS Code拡張機能を使用すると、コードを書きながらモデルと対話し、コード補完を実行できます。これは、Pythonアプリケーション向けのQiskitコードを書く際に支援を求めるユーザーにとって効果的です。

Qiskit Code Assistant VS Code拡張機能をインストールします。
VS Codeで、ユーザー設定に移動し、Qiskit Code Assistant: UrlをローカルのOllamaデプロイのURL（例：http://localhost:11434）に設定します。
表示 > コマンドパレット...に移動してDeveloper: Reload Windowを選択することでVS Codeを再読み込みします。

Ollamaで設定されたQiskit Code Assistantモデルがステータスバーに表示され、使用可能な状態になります。

Qiskit Code Assistant JupyterLab拡張機能との接続

Qiskit Code Assistant JupyterLab拡張機能を使用すると、モデルと対話し、Jupyter Notebook内で直接コード補完を実行できます。主にJupyter Notebookを使用するユーザーは、この拡張機能を活用してQiskitコードを書く体験をさらに向上させることができます。

Qiskit Code Assistant JupyterLab拡張機能をインストールします。
JupyterLabで、設定エディターに移動し、Qiskit Code Assistant Service APIをローカルのOllamaデプロイのURL（例：http://localhost:11434）に設定します。

Ollamaで設定されたQiskit Code Assistantモデルがステータスバーに表示され、使用可能な状態になります。

クイックスタート（推奨）​

VS Code拡張機能のセットアップ​

JupyterLab拡張機能のセットアップ​

利用可能なモデル​

現行モデル​

GGUFモデル（個人環境・ノートPC向けに推奨）​

トレーニングに使用されたQiskitバージョン​

廃止されたモデル​

高度なセットアップ​

Ollamaアプリケーションの使用​

Ollamaのインストール​

Hugging Face Hub統合を使用したOllamaのセットアップ​

手動でダウンロードしたQiskit Code Assistantのモデルファイルを使用したOllamaのセットアップ​

Run the Qiskit Code Assistant model manually downloaded in Ollama​

Use the llama.cpp library​

Advanced parameters​

Qiskit Code Assistant VS Code拡張機能との接続​

Qiskit Code Assistant JupyterLab拡張機能との接続​