| network | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|
| mobilenetv1_imagenet | 74.12 | 73.92 | 73.61 | ImageNet | 1x3x224x224 |
| mobilenetv2_imagenet | 72.65 | 72.51 | 72.11 | ImageNet | 1x3x224x224 |
| resnet18_imagenet | 72.04 | 72.03 | 72.03 | ImageNet | 1x3x224x224 |
| resnet50_imagenet | 77.37 | 76.99 | 76.94 | ImageNet | 1x3x224x224 |
| vargnetv2_imagenet | 73.94 | 73.56 | 73.64 | ImageNet | 1x3x224x224 |
| efficientnet_imagenet | 74.31 | 74.23 | 74.18 | ImageNet | 1x3x224x224 |
| horizon_swin_transformer_imagenet | 80.24 | 80.15 | 80.05 | ImageNet | 1x3x224x224 |
| mixvargenet_imagenet | 71.33 | 71.23 | 71.04 | ImageNet | 1x3x224x224 |
| efficientnasnetm_imagenet | 80.24 | 79.99 | 79.94 | ImageNet | 1x3x280x280 |
| efficientnasnets_imagenet | 76.63 | 76.23 | 76.03 | ImageNet | 1x3x300x300 |
| vit_small_imagenet | 79.50 | 79.40 | - | ImageNet | 1x3x224x224 |
| henet_tinye_imagenet | 77.68 | 77.22 | - | ImageNet | 1x3x224x224 |
| henet_tinym_imagenet | 78.38 | 77.95 | - | ImageNet | 1x3x224x224 |

FCOS

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| fcos_efficientnetb0_mscoco | efficientnetb0 | 36.26 | 35.79 | 35.59 | MS COCO | 1x3x512x512 |
| fcos_efficientnetb1_mscoco | efficientnetb1 | 41.37 | 41.21 | 40.71 | MS COCO | 1x3x640x640 |
| fcos_efficientnetb2_mscoco | efficientnetb2 | 45.35 | 45.10 | 45.00 | MS COCO | 1x3x768x768 |
| fcos_efficientnetb3_mscoco | efficientnetb3 | 48.03 | 47.65 | 47.58 | MS COCO | 1x3x896x896 |

DETR

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| detr_resnet50_mscoco | resnet50 | 35.70 | 31.42 | 31.31 | MS COCO | 1x3x800x1333 |
| detr_efficientnetb3_mscoco | efficientnetb3 | 37.21 | 35.95 | 35.99 | MS COCO | 1x3x800x1333 |

Deform DETR

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| deform_detr_resnet50_mscoco | resnet50 | 44.34 | 44.65 | - | MS COCO | 1x3x800x1333 |

FCOS3D

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| fcos3d_efficientnetb0_nuscenes | efficientnetb0 | 30.60 | 30.27 | 30.31 | nuscenes | 1x3x512x896 |

UNet

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| unet_mobilenetv1_cityscapes | MobileNetV1 | 68.02 | 67.56 | 67.53 | Cityscapes | 1x3x1024x2048 |

Deeplab

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| deeplabv3plus_efficientnetm0_cityscapes | EfficientNet-M0 | 76.30 | 76.22 | 76.12 | Cityscapes | 1x3x1024x2048 |
| deeplabv3plus_efficientnetm1_cityscapes | EfficientNet-M1 | 77.94 | 77.64 | 77.65 | Cityscapes | 1x3x1024x2048 |
| deeplabv3plus_efficientnetm2_cityscapes | EfficientNet-M2 | 78.82 | 78.65 | 78.63 | Cityscapes | 1x3x1024x2048 |

FastScnn

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| fastscnn_efficientnetb0tiny_cityscapes | EfficientNet-B0lite | 69.97 | 69.90 | 69.88 | Cityscapes | 1x3x1024x2048 |

PwcNet

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| pwcnet_pwcnetneck_flyingchairs | PwcNet | 1.4117 | 1.4112 | 1.4075 | FlyingChairs | 1x6x384x512 |

PointPillars

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| pointpillars_kitti_car | SequentialBottleNeck | 77.31 | 76.86 | 76.76 | KITTI3D | 150000x4 |

CenterPoint

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| centerpoint_pointpillar_nuscenes | SequentialBottleNeck | 58.32 | 58.11 | 58.14 | nuscenes | 1x5x20x40000, 40000x4 |

LidarMultiTask

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| centerpoint_mixvargnet_multitask_nuscenes | MixVarGENet | 58.09 | 57.72 | 57.62 | nuscenes | 1x5x20x40000, 40000x4 |

The metric reported for PointPillars is Box3d Moderate.

GaNet

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| ganet_mixvargenet_culane | MixVarGENet | 79.49 | 78.72 | 78.72 | CuLane | 1x3x320x800 |

Motr

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| motr_efficientnetb3_mot17 | efficientnetb3 | 58.02 | 57.62 | 57.76 | Mot17 | 1x3x800x1422, 1x256x2x128, 1x1x1x256, 1x4x2x128 |

StereoNet

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| stereonet_stereonetneck_sceneflow | StereoNeck | 1.1270 | 1.1677 | 1.1685 | SceneFlow | 1x6x540x960 |
| stereonetplus_mixvargenet_sceneflow | MixVarGENet | 1.1270 | 1.1329 | 1.1351 | SceneFlow | 2x3x544x960 |

Bev

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| bev_ipm_efficientnetb0_multitask_nuscenes | efficientnetb0 | 30.59 | 30.80 | 30.41 | nuscenes det | 6x3x512x960, 6x128x128x2 |
| bev_ipm_efficientnetb0_multitask_nuscenes | efficientnetb0 | 51.47 | 51.41 | 50.98 | nuscenes seg | 6x3x512x960, 6x128x128x2 |
| bev_lss_efficientnetb0_multitask_nuscenes | efficientnetb0 | 30.09 | 30.05 | 30.01 | nuscenes det | 6x3x256x704, 10x128x128x2, 10x128x128x2 |
| bev_lss_efficientnetb0_multitask_nuscenes | efficientnetb0 | 51.78 | 51.47 | 51.46 | nuscenes seg | 6x3x256x704, 10x128x128x2, 10x128x128x2 |
| bev_gkt_mixvargenet_multitask_nuscenes | MixVarGENet | 28.11 | 28.12 | 27.90 | nuscenes det | 6x3x512x960, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2 |
| bev_gkt_mixvargenet_multitask_nuscenes | MixVarGENet | 48.53 | 48.02 | 48.37 | nuscenes seg | 6x3x512x960, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2, 6x64x64x2 |
| bev_ipm_4d_efficientnetb0_multitask_nuscenes | efficientnetb0 | 37.21 | 37.19 | 37.17 | nuscenes det | 6x3x512x960, 6x128x128x2, 1x64x128x128, 1x128x128x2 |
| bev_ipm_4d_efficientnetb0_multitask_nuscenes | efficientnetb0 | 52.90 | 53.80 | 53.77 | nuscenes seg | 6x3x512x960, 6x128x128x2, 1x64x128x128, 1x128x128x2 |
| detr3d_efficientnetb3_nuscenes | efficientnetb3 | 34.04 | 33.87 | 33.39 | nuscenes det | 6x3x512x1408 |
| petr_efficientnetb3_nuscenes | efficientnetb3 | 37.60 | 37.32 | 37.31 | nuscenes det | 6x3x512x1408 |
| bevformer_tiny_resnet50_detection_nuscenes | resnet50 | 37.00 | 36.66 | - | nuscenes det | 6x3x480x800, 1x2500x256, 6x1x2500x4x2, 1x50x50x2 |
| bev_cft_efficientnetb3_nuscenes | efficientnetb3 | 32.93 | 32.68 | - | nuscenes det | 6x3x512x1408 |
| bev_sparse_resnet50_nuscenes | resnet50 | 56.28 | 55.23 | - | nuscenes det | 6x3x256x704, 6x4x4, 1x600x11, 1x600x256, 1x600, 1 |

HeatmapKeypointModel

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| keypoint_efficientnetb0_carfusion | efficientnetb0 | 94.33 | 94.30 | 94.31 | carfusion | 1x3x128x128 |

DenseTNT

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| densetnt_vectornet_argoverse1 | vectornet | 1.2974 | 1.2989 | 1.3038 | argoverse 1 | 30x9x19x32, 30x11x9x64, 30x1x1x96, 30x2x1x2048, 30x1x1x2048 |

QCNet

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| qcnet_oe_argoverse2 | - | 83.84 | 83.26 | - | argoverse 2 | see the list below |

The metric reported for qcnet_oe_argoverse2 is HitRate.

The input shapes of the qcnet_oe_argoverse2 model are:

1x30x50, 1x50x30x30, 1x30x1, 1x1x30x1, 1x1x30x1, 1x1x30x1, 1x1x30x1, 1x1x30x80, 1x1x30x80, 1x1x30x80, 1x1x30x10, 1x1x30x10, 1x1x30x10, 1x1x30x10, 1x1x30x30, 1x1x30x30, 1x1x30x30, 1x30x29x128, 1x30x10x128, 1x30x10x128, 1x80, 1x80, 1x1x80x80, 1x1x80x80, 1x1x80x80, 1x1x80x50, 1x1x80x50, 1x1x80x50, 1x80x50, 1x80x50, 1x80x50, 1x80x50, 1x30x30, 1x30x1, 1x1x30x30, 1x1x30x30, 1x1x30x30, 1x1x30x30, 1x1x30x80, 1x1x30x80, 1x1x30x80, 1x1x30x30, 1x1x30x30, 1x1x30x30, 1x80x80.
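
The shape strings in the input shape column, including the multi-input list for this model, use `x` between dimensions and commas between inputs. As a minimal sketch, assuming you want these as integer tuples (for example to allocate dummy inputs), they can be parsed as follows. `parse_shapes` is a hypothetical helper name for illustration and is not part of any toolchain referenced here.

```python
from math import prod
from typing import List, Tuple


def parse_shapes(spec: str) -> List[Tuple[int, ...]]:
    """Parse a comma-separated shape list such as '1x30x50, 1x80' into tuples."""
    shapes = []
    # Drop a trailing period (the list in the text ends with one), then split per input.
    for item in spec.strip().rstrip(".").split(","):
        item = item.strip()
        if item:
            shapes.append(tuple(int(dim) for dim in item.split("x")))
    return shapes


# Sample input: the first three entries of the qcnet_oe_argoverse2 list.
sample = "1x30x50, 1x50x30x30, 1x30x1"
for shape in parse_shapes(sample):
    # Print each shape with its total element count.
    print(shape, "elements:", prod(shape))
```

The same helper handles single-input entries such as `1x3x224x224` and two-dimensional point-cloud shapes such as `150000x4`.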

FlashOcc

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| flashocc_henet_lss_occ3d_nuscenes | henet_tinym_imagenet | 0.3646 | 0.3622 | - | occ3d_nuscenes | 6x3x512x960, 10x128x128x2, 10x128x128x2 |

MapTR

| network | backbone | float | qat | quantization | dataset | input shape |
|---|---|---|---|---|---|---|
| maptrv2_resnet50_bevformer_nuscenes | resnet50 | 0.5859 | 0.5843 | 0.5763 | nuscenes | 6x3x480x800, 6x1x2500x4x2 |