SemanticFPN model trained on Cityscapes
SemanticFPN is a conceptually simple yet effective baseline for panoptic segmentation, trained here on Cityscapes. The method starts from Mask R-CNN with FPN and adds a lightweight semantic segmentation branch for dense pixel prediction. It was introduced in the 2019 paper Panoptic Feature Pyramid Networks by Alexander Kirillov et al.
We developed a modified version that is supported by AMD Ryzen AI.
Model description
SemanticFPN is a single network that unifies the tasks of instance segmentation and semantic segmentation. The network is designed by endowing Mask R-CNN, a popular instance segmentation method, with a semantic segmentation branch using a shared Feature Pyramid Network (FPN) backbone. This simple baseline not only remains effective for instance segmentation, but also yields a lightweight, top-performing method for semantic segmentation. It is robust and accurate on both tasks and can serve as a strong starting point for future research in panoptic segmentation.
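To make the shared-FPN idea concrete, the sketch below works out the feature-map sizes that a semantic branch would merge. It assumes the FPN strides 4, 8, 16 and 32 and the repeated 2x upsampling to 1/4 scale described in the Panoptic FPN paper; the modified Ryzen AI model may differ in detail, and `semantic_branch_plan` is a hypothetical helper, not part of this repository.

```python
# Assumed FPN strides (1/4 to 1/32 scale), following the paper's description.
FPN_STRIDES = (4, 8, 16, 32)


def semantic_branch_plan(height, width, strides=FPN_STRIDES):
    """For each FPN level, return its feature-map size and the number of
    2x upsampling stages needed to reach the finest (1/4-scale) level."""
    target = strides[0]
    plan = []
    for stride in strides:
        stages = 0
        s = stride
        while s > target:
            s //= 2
            stages += 1
        plan.append({
            "stride": stride,
            "size": (height // stride, width // stride),
            "upsample_stages": stages,
        })
    return plan


if __name__ == "__main__":
    # 256x512 is the input size listed in the Performance table.
    for level in semantic_branch_plan(256, 512):
        print(level)
```

For a 256x512 input this gives maps of 64x128, 32x64, 16x32 and 8x16, which need 0, 1, 2 and 3 upsampling stages respectively before they can be summed.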
Intended uses & limitations
You can use the raw model for image segmentation. See the model hub to look for all available SemanticFPN models.
How to use
Installation
Follow Ryzen AI Installation to prepare the environment for Ryzen AI. Then run the following command to install the prerequisites for this model.
pip install -r requirements.txt
Data Preparation (optional: for accuracy evaluation)
- Download the Cityscapes dataset (https://www.cityscapes-dataset.com/downloads):
  - ground truth folder: gtFine_trainvaltest.zip [241MB]
  - image folder: leftImg8bit_trainvaltest.zip [11GB]
- Organize the dataset directory as follows:
└── data
    └── cityscapes
        ├── leftImg8bit
        |   ├── train
        |   └── val
        └── gtFine
            ├── train
            └── val
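Before running the accuracy evaluation, it can help to confirm that the directory layout above is in place. The following is a minimal sketch using only the standard library; `missing_cityscapes_dirs` and `EXPECTED_SUBDIRS` are hypothetical names introduced for illustration and are not part of the repository scripts.

```python
from pathlib import Path

# Sub-directories expected under the dataset root, taken from the layout above.
EXPECTED_SUBDIRS = (
    "leftImg8bit/train",
    "leftImg8bit/val",
    "gtFine/train",
    "gtFine/val",
)


def missing_cityscapes_dirs(root):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    missing = missing_cityscapes_dirs("./data/cityscapes")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```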
Test & Evaluation
- Code snippet from infer_onnx.py showing how to use the model:
import argparse

import numpy as np
import onnxruntime

# build_img and colorize_mask are helpers from the accompanying script (not shown here).
parser = argparse.ArgumentParser(description='SemanticFPN model')
parser.add_argument('--onnx_path', type=str, default='FPN_int_NHWC.onnx')
parser.add_argument('--save_path', type=str, default='./data/demo_results/senmatic_results.png')
parser.add_argument('--input_path', type=str, default='data/cityscapes/cityscapes/leftImg8bit/test/bonn/bonn_000000_000019_leftImg8bit.png')
parser.add_argument('--ipu', action='store_true',
                    help='use ipu')
parser.add_argument('--provider_config', type=str, default=None,
                    help='provider config path')
args = parser.parse_args()

# Select the execution provider: Vitis AI when --ipu is set, otherwise CPU.
if args.ipu:
    providers = ["VitisAIExecutionProvider"]
    provider_options = [{"config_file": args.provider_config}]
else:
    providers = ['CPUExecutionProvider']
    provider_options = None

onnx_path = args.onnx_path
input_img = build_img(args)
session = onnxruntime.InferenceSession(onnx_path, providers=providers, provider_options=provider_options)
ort_input = {session.get_inputs()[0].name: input_img.cpu().numpy()}
ort_output = session.run(None, ort_input)[0]
if isinstance(ort_output, (tuple, list)):
    ort_output = ort_output[0]
# Take the first image in the batch and move the class axis last.
output = ort_output[0].transpose(1, 2, 0)
# The predicted class of each pixel is the argmax over the class axis.
seg_pred = np.asarray(np.argmax(output, axis=2), dtype=np.uint8)
color_mask = colorize_mask(seg_pred)
color_mask.save(args.save_path)
- Run inference for a single image
python infer_onnx.py --onnx_path FPN_int_NHWC.onnx --input_path /Path/To/Your/Image --ipu --provider_config Path/To/vaip_config.json
- Test accuracy of the quantized model
python test_onnx.py --onnx_path FPN_int_NHWC.onnx --dataset citys --test-folder ./data/cityscapes --crop-size 256 --ipu --provider_config Path/To/vaip_config.json
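The `--ipu` flag in the commands above selects `VitisAIExecutionProvider` unconditionally. The sketch below assumes you would rather fall back to the CPU provider when the installed ONNX Runtime build does not list the Vitis AI provider. `select_providers` is a hypothetical helper, not part of infer_onnx.py or test_onnx.py; `onnxruntime.get_available_providers()` is the standard ONNX Runtime call for listing providers.

```python
def select_providers(available, use_ipu, provider_config=None):
    """Return (providers, provider_options) for onnxruntime.InferenceSession.

    Hypothetical helper: falls back to the CPU provider when the Vitis AI
    provider is requested but not listed in `available`.
    """
    if use_ipu and "VitisAIExecutionProvider" in available:
        return ["VitisAIExecutionProvider"], [{"config_file": provider_config}]
    return ["CPUExecutionProvider"], None


if __name__ == "__main__":
    try:
        import onnxruntime
        available = onnxruntime.get_available_providers()
    except ImportError:
        # onnxruntime is not installed, so no provider can be listed.
        available = []
    providers, options = select_providers(
        available, use_ipu=True, provider_config="Path/To/vaip_config.json"
    )
    print(providers, options)
```

The returned pair can be passed as the `providers` and `provider_options` arguments of `onnxruntime.InferenceSession`, in the same way as in the snippet from infer_onnx.py.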
Performance
| Model | Input size | FLOPs | mIoU on Cityscapes validation |
|---|---|---|---|
| SemanticFPN(ResNet18) | 256x512 | 10G | 62.9% |

| Model | Input size | FLOPs | INT8 mIoU on Cityscapes validation |
|---|---|---|---|
| SemanticFPN(ResNet18) | 256x512 | 10G | 62.5% |
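The mIoU figures above are means of per-class intersection over union. As an illustration of how such a number is commonly computed, here is a minimal NumPy sketch based on a confusion matrix. It is not taken from test_onnx.py, which may differ in detail; the helper names are hypothetical, and the ignore label of 255 is an assumption based on the usual Cityscapes train-ID convention.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (ground truth, prediction) pairs; rows index the ground truth.

    Pixels whose ground-truth label equals ignore_index are skipped.
    Predictions are assumed to lie in [0, num_classes).
    """
    pred = np.asarray(pred).ravel().astype(np.int64)
    target = np.asarray(target).ravel().astype(np.int64)
    keep = target != ignore_index
    idx = target[keep] * num_classes + pred[keep]
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf):
    """Mean IoU over classes that occur in the prediction or ground truth."""
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    valid = union > 0
    return float((intersection[valid] / union[valid]).mean())


if __name__ == "__main__":
    # Tiny 2-class example; the last ground-truth pixel is ignored.
    pred = np.array([[0, 1], [1, 1]])
    target = np.array([[0, 1], [0, 255]])
    print(mean_iou(confusion_matrix(pred, target, num_classes=2)))
```

For a full evaluation, the confusion matrices of all validation images would be summed before calling `mean_iou`, so that each class is scored over the whole dataset rather than per image.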
@inproceedings{kirillov2019panoptic,
title={Panoptic feature pyramid networks},
author={Kirillov, Alexander and Girshick, Ross and He, Kaiming and Doll{\'a}r, Piotr},
booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
pages={6399--6408},
year={2019}
}