SemanticFPN model trained on Cityscapes

SemanticFPN is a conceptually simple yet effective baseline for panoptic segmentation, trained here on Cityscapes. The method starts from Mask R-CNN with FPN and adds a lightweight semantic segmentation branch for dense per-pixel prediction. It was introduced in the 2019 paper Panoptic Feature Pyramid Networks by Kirillov et al.

We provide a modified version that is supported by AMD Ryzen AI.

Model description

SemanticFPN is a single network that unifies instance segmentation and semantic segmentation. It extends Mask R-CNN, a popular instance segmentation method, with a semantic segmentation branch that shares the Feature Pyramid Network (FPN) backbone. This simple design remains effective for instance segmentation and also yields a lightweight, top-performing method for semantic segmentation, which makes it a robust and accurate baseline for both tasks and for future research in panoptic segmentation.
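
The semantic branch described in the paper brings every FPN level to 1/4 of the input resolution before summing them. Each upsampling stage is a 3x3 convolution, group norm, ReLU and 2x bilinear upsampling. The following is a minimal sketch of that bookkeeping only, assuming pyramid strides of 4, 8, 16 and 32 as in the paper; `semantic_branch_plan` is a hypothetical helper, and the Ryzen AI variant of the model may differ in its details.

```python
# Strides of the FPN levels merged by the semantic branch (as in the paper).
FPN_STRIDES = (4, 8, 16, 32)
# All levels are upsampled to this stride before being summed.
TARGET_STRIDE = 4


def semantic_branch_plan(height, width):
    """Return (stride, feature_size, num_2x_upsampling_stages) per FPN level.

    Assumes height and width are divisible by the largest stride.
    """
    plan = []
    for stride in FPN_STRIDES:
        feature_size = (height // stride, width // stride)
        # Each stage doubles the resolution, so we need log2(stride / 4) stages.
        stages = (stride // TARGET_STRIDE).bit_length() - 1
        plan.append((stride, feature_size, stages))
    return plan


# Plan for the 256x512 input size listed in the Performance section.
for stride, size, stages in semantic_branch_plan(256, 512):
    print(f"stride {stride:2d}: feature map {size}, {stages} upsampling stage(s)")
```

For a 256x512 input, every level ends up at 64x128 after its upsampling stages, and a final 4x upsampling restores the input resolution.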

Intended uses & limitations

You can use the raw model for image segmentation. See the model hub for all available SemanticFPN models.

How to use

Installation

Follow the Ryzen AI Installation guide to prepare the environment for Ryzen AI. Then run the following command to install the prerequisites for this model.

pip install -r requirements.txt

Data Preparation (optional: for accuracy evaluation)

Download the Cityscapes dataset (https://www.cityscapes-dataset.com/downloads):
- ground truth folder: gtFine_trainvaltest.zip [241MB]
- image folder: leftImg8bit_trainvaltest.zip [11GB]

Organize the dataset directory as follows:

└── data
     └── cityscapes
          ├── leftImg8bit
          |    ├── train
          |    └── val
          └── gtFine
               ├── train
               └── val
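
Before running the evaluation, it can help to confirm that the extracted archives match the layout above. The following is a minimal sketch, assuming the dataset root is `./data/cityscapes`; `missing_cityscapes_dirs` and `EXPECTED_SUBDIRS` are hypothetical names and not part of this repository.

```python
from pathlib import Path

# Sub-directories expected under the dataset root, taken from the tree above.
EXPECTED_SUBDIRS = [
    "leftImg8bit/train",
    "leftImg8bit/val",
    "gtFine/train",
    "gtFine/val",
]


def missing_cityscapes_dirs(root):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


# The root path is an assumption; adjust it to where the archives were extracted.
missing = missing_cityscapes_dirs("./data/cityscapes")
if missing:
    print("Missing directories:", ", ".join(missing))
else:
    print("Dataset layout looks complete.")
```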

Test & Evaluation

Code snippet from infer_onnx.py showing how to run the model:

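    import argparse
    import numpy as np
    import onnxruntime

    # build_img and colorize_mask are helper functions from the repository (not shown here)
    parser = argparse.ArgumentParser(description='SemanticFPN model')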
    parser.add_argument('--onnx_path', type=str, default='FPN_int_NHWC.onnx')
    parser.add_argument('--save_path', type=str, default='./data/demo_results/senmatic_results.png')
    parser.add_argument('--input_path', type=str, default='data/cityscapes/cityscapes/leftImg8bit/test/bonn/bonn_000000_000019_leftImg8bit.png')
    parser.add_argument('--ipu', action='store_true',
                    help='use ipu')
    parser.add_argument('--provider_config', type=str, default=None,
                    help='provider config path')
    args = parser.parse_args()

    if args.ipu:
        providers = ["VitisAIExecutionProvider"]
        provider_options = [{"config_file": args.provider_config}]
    else:
        providers = ['CPUExecutionProvider']
        provider_options = None

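    onnx_path = args.onnx_path
    # Build the input tensor from the image at args.input_path
    input_img = build_img(args)
    # Create the ONNX Runtime session with the selected execution provider
    session = onnxruntime.InferenceSession(onnx_path, providers=providers, provider_options=provider_options)
    ort_input = {session.get_inputs()[0].name: input_img.cpu().numpy()}
    # Take the first model output
    ort_output = session.run(None, ort_input)[0]
    if isinstance(ort_output, (tuple, list)):
        ort_output = ort_output[0]

    # Drop the batch dimension and move channels last: (C, H, W) -> (H, W, C)
    output = ort_output[0].transpose(1, 2, 0)
    # Per-pixel class index is the argmax over the class scores
    seg_pred = np.asarray(np.argmax(output, axis=2), dtype=np.uint8)
    # Map class indices to colors and save the result
    color_mask = colorize_mask(seg_pred)
    color_mask.save(args.save_path)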
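
The post-processing at the end of the snippet above can be tried in isolation. The following is a minimal sketch, assuming the model emits class scores of shape (1, C, H, W) with C = 19 Cityscapes training classes. `logits_to_color` and `CITYSCAPES_PALETTE` are hypothetical names rather than the repository's `colorize_mask`, and random scores stand in for the real ONNX output.

```python
import numpy as np

# Commonly used Cityscapes colors for train IDs 0..18 (assumed class order).
CITYSCAPES_PALETTE = np.array([
    (128, 64, 128),   # road
    (244, 35, 232),   # sidewalk
    (70, 70, 70),     # building
    (102, 102, 156),  # wall
    (190, 153, 153),  # fence
    (153, 153, 153),  # pole
    (250, 170, 30),   # traffic light
    (220, 220, 0),    # traffic sign
    (107, 142, 35),   # vegetation
    (152, 251, 152),  # terrain
    (70, 130, 180),   # sky
    (220, 20, 60),    # person
    (255, 0, 0),      # rider
    (0, 0, 142),      # car
    (0, 0, 70),       # truck
    (0, 60, 100),     # bus
    (0, 80, 100),     # train
    (0, 0, 230),      # motorcycle
    (119, 11, 32),    # bicycle
], dtype=np.uint8)


def logits_to_color(logits):
    """Convert (1, C, H, W) scores to an (H, W) label map and an (H, W, 3) RGB image."""
    scores = logits[0]                                  # drop batch dim -> (C, H, W)
    seg_pred = scores.argmax(axis=0).astype(np.uint8)   # best class per pixel -> (H, W)
    color = CITYSCAPES_PALETTE[seg_pred]                # palette lookup -> (H, W, 3)
    return seg_pred, color


# Random scores standing in for the model output.
rng = np.random.default_rng(0)
dummy = rng.standard_normal((1, 19, 4, 8)).astype(np.float32)
seg_pred, color = logits_to_color(dummy)
print(seg_pred.shape, color.shape)
```

Taking the argmax over axis 0 of the (C, H, W) array gives the same label map as transposing to (H, W, C) and taking the argmax over axis 2, as the snippet does.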

Run inference for a single image

python infer_onnx.py --onnx_path FPN_int_NHWC.onnx --input_path /Path/To/Your/Image --ipu --provider_config Path/To/vaip_config.json

Test accuracy of the quantized model

python test_onnx.py --onnx_path FPN_int_NHWC.onnx --dataset citys --test-folder ./data/cityscapes --crop-size 256 --ipu --provider_config Path/To/vaip_config.json

Performance

model	input size	FLOPs	mIoU on Cityscapes Validation
SemanticFPN(ResNet18)	256x512	10G	62.9%

model	input size	FLOPs	INT8 mIoU on Cityscapes Validation
SemanticFPN(ResNet18)	256x512	10G	62.5%
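
For reference, mIoU (mean intersection over union) averages the per-class ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The following is a minimal sketch of the metric on flat label lists. It is not the code in test_onnx.py, `mean_iou` is a hypothetical helper, and ignoring label 255 is an assumption based on the usual Cityscapes train-ID convention.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the prediction or the ground truth.

    pred and target are equal-length sequences of integer class ids.
    Pixels whose target equals ignore_index are skipped.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            # Correct pixel: counts once in both intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of both classes involved.
            union[t] += 1
            union[p] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Tiny example: four pixels, two classes, one ignored pixel.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 255]
print(mean_iou(pred, target, num_classes=2))  # 0.5
```

A full evaluation would typically accumulate the intersection and union counts over the whole validation set before dividing, rather than averaging per-image scores.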

@inproceedings{kirillov2019panoptic,
  title={Panoptic feature pyramid networks},
  author={Kirillov, Alexander and Girshick, Ross and He, Kaiming and Doll{\'a}r, Piotr},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  pages={6399--6408},
  year={2019}
}