deepspeed / docs /ENV.md
xingzhikb's picture
init
002bd9b

Env

Use Docker

The docker image: nvidia/pytorch:23.07-py3

alias=`whoami | cut -d'.' -f2`
docker run -itd --runtime=nvidia --ipc=host --privileged --network host -v /home/${alias}:/home/${alias} -w `pwd` --name sca nvcr.io/nvidia/pytorch:23.07-py3 bash
# Maybe we need this version for evaluation
# docker run -itd --runtime=nvidia --ipc=host --privileged --network host -v /home/${alias}:/home/${alias} -w `pwd` --name sca nvcr.io/nvidia/pytorch:22.10-py3 bash
docker exec -it sca bash

# In the docker container
# cd to the code dir
. amlt_configs/setup.sh

Use Conda

conda create -n sca-v2 -y python=3.9 
conda activate sca-v2

# https://pytorch.org/, pytorch 2.0.1 (as of 07/12/2023)
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install -r requirements.txt
pip install -U datasets==2.16.1  # NotImplementedError(f"Loading a dataset cached in a {type(self._fs).__name__} is not supported.") https://github.com/huggingface/datasets/issues/6352

For dev

mkdir -p tmp/{data,code}
pip install -r requirements-dev.txt 

For gradio demo:

pip install -r requirements-app.txt 

transformers is based on v4.30.2, hash 66fd3a8d6

REPO_DIR=transformers
git clone git@github.com:huggingface/transformers.git $REPO_DIR
git fetch --all --tags --prune
git checkout v4.30.2 -b v4.30.2
git rev-parse --short HEAD

Data

Replace the data file paths in src/conf/data/*.yaml.

The data file links: