Detectron Source Code Analysis: Odds and Ends

Out of Memory

Run the following command in a shell:

python tools/infer_simple.py \
--cfg configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml \
--output-dir ./demo/detectron-visualizations \
--image-ext jpg \
--wts https://s3-us-west-2.amazonaws.com/detectron/ImageNetPretrained/MSRA/R-50.pkl \
demo

Error:

RuntimeError: [enforce fail at context_gpu.cu:329] error == cudaSuccess. 2 vs 0. Error at: /home/zerozone/Works/Competition/DF/pytorch/caffe2/core/context_gpu.cu:329: out of memory

Solution:

https://github.com/facebookresearch/Detectron/issues/21

python2 tools/infer_simple.py \
--cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml \
--output-dir /tmp/detectron-visualizations \
--image-ext jpg \
--wts https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl \
demo

Running the model with this command requires no more than 3 GB of GPU memory.
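Before rerunning, it can help to check whether any GPU actually has that much memory free. The following is a minimal sketch, not part of Detectron: the helper names are hypothetical, and it assumes `nvidia-smi` is on the PATH and reports free memory in MiB via `--query-gpu=memory.free`.

```python
import subprocess

# The 3 GB figure quoted above, expressed in MiB.
REQUIRED_MIB = 3 * 1024


def parse_free_memory(text):
    """Parse one free-memory value (MiB) per line, one line per GPU."""
    return [int(line.strip()) for line in text.splitlines() if line.strip()]


def query_free_memory():
    """Ask nvidia-smi for the free memory of every visible GPU (MiB)."""
    out = subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.free",
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )
    return parse_free_memory(out)


def gpus_with_enough_memory(free_mib, required_mib=REQUIRED_MIB):
    """Return the indices of GPUs whose free memory meets the requirement."""
    return [i for i, free in enumerate(free_mib) if free >= required_mib]
```

Calling `gpus_with_enough_memory(query_free_memory())` on the target machine lists the GPU indices that should be able to hold the model. An empty list suggests the out-of-memory error would recur.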

Connection Timeout

Command:

python2 tools/infer_simple.py \
--cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml \
--output-dir /tmp/detectron-visualizations \
--image-ext jpg \
--wts https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl \
demo

Error:

urllib.error.URLError: <urlopen error [Errno 110] Connection timed out>

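This means the pretrained model could not be fetched from the URL passed to --wts. The model may not have downloaded successfully, or it may already be downloaded while the connection still fails. In either case, find the corresponding model in the Detectron model zoo, download it manually to a local path, and change --wts to that local path, as shown below. (If the model was already downloaded and only the connection fails, simply change the --wts argument.)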

python tools/infer_simple.py \
--cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml \
--output-dir ./demo/detectron-visualizations \
--image-ext jpg \
--wts ./detectron-download-cache/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl \
demo
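The local path above is the --wts URL with the S3 prefix replaced by a cache directory. The following is a minimal sketch of that mapping plus a manual download with an explicit timeout. The helper names and the `.part` temporary suffix are hypothetical, and this is not Detectron's own download code. It assumes Python 3 and that the URL starts with the base prefix seen in the commands above.

```python
import os
import shutil
import urllib.request

# Assumption: the common prefix of the --wts URLs used in this post.
S3_BASE_URL = "https://s3-us-west-2.amazonaws.com/detectron"


def url_to_local_path(url, cache_dir, base_url=S3_BASE_URL):
    """Map a model URL to a path under cache_dir, keeping the URL's sub-path."""
    if not url.startswith(base_url):
        raise ValueError("URL does not start with the expected base: " + base_url)
    relative = url[len(base_url):].lstrip("/")
    return os.path.join(cache_dir, relative)


def download_weights(url, cache_dir, timeout=30):
    """Download url into cache_dir unless it is already there; return the path."""
    path = url_to_local_path(url, cache_dir)
    if os.path.exists(path):
        return path  # already downloaded, nothing to fetch
    os.makedirs(os.path.dirname(path), exist_ok=True)
    partial = path + ".part"  # write to a temp name so a failed run leaves no fake file
    with urllib.request.urlopen(url, timeout=timeout) as resp, open(partial, "wb") as out:
        shutil.copyfileobj(resp, out)
    os.replace(partial, path)
    return path
```

If the machine running Detectron cannot reach the server at all, run the download elsewhere, copy the file to the path returned by `url_to_local_path`, and pass that path to --wts.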

Multi-GPU Illegal Memory Access

Multi-GPU training throws an illegal memory access error.

Solution:
https://github.com/facebookresearch/Detectron/issues/32